Convert data to/from json

In this post, we’ll transform data to/from json using the *-to-json and json-to-* tools from the json-toolkit. Transforming data to/from json allows us to use jq (and other json tools) on (almost) any data.

What are *-to-json and json-to-*?

*-to-json and json-to-* are collections of cli tools that transform data from one format to another. For example:

  • xml-to-json - converts xml on stdin to json on stdout

  • json-to-yaml - converts json on stdin to yaml on stdout

  • json-to-logfmt - converts json on stdin to logfmt on stdout

In total there are 12 tools covering 6 formats, plus json.

Each tool takes one format on stdin and returns the same data in a different format on stdout, for example:

cat file.xml | xml-to-json > file.json

Why use *-to-json and json-to-*?

In short, because they let you do this:

# Extracts the element .key.a[1].b from file.xml as xml
cat file.xml | xml-to-json | jq '.key.a[1].b' | json-to-xml

and

# Provides a structural diff of the xml files
json-diff <(cat a.xml | xml-to-json) <(cat b.xml | xml-to-json)

With *-to-json and json-to-* you can use jq to process 6 new data formats with ease. Why bother learning a new tool (like XMLStarlet) for each data format when we can use jq?

Plus, because xml-to-json emits plain json, it doesn’t just unlock jq and json-diff for xml; it unlocks every json-based program for xml data. The versatility of the *-to-json and json-to-* tools opens up many options to help us code faster.

Installing *-to-json and json-to-*

*-to-json and json-to-* are all provided in the json-toolkit. Check out the README for the latest installation instructions.

The 6 formats

The json-toolkit supports 6 formats, each of which can be converted to/from json:

  • xml

  • yaml

  • dsv (delimiter-separated values)

  • csv (comma-separated values, which, sadly, is not a special case of dsv)

  • logfmt

  • binary (as json, binary data is an array of integers from 0 to 255, one per byte)

Limitations

Sadly, not everything in these formats can be represented in json, and not all json can be represented in each format. Although I’ve found these limitations rarely affect my usage of the tools, they’re good to keep in mind. Here are a few:

binary

Binary data is always rendered as a json array of integers from 0 to 255. As a result, json-to-binary can’t handle, for example, an input that contains a string.
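To make that representation concrete, here is a minimal sketch that produces the same kind of array using only standard POSIX tools (od and awk) rather than the toolkit itself. The function name bytes_as_json_array is hypothetical, and the toolkit’s actual output may differ in whitespace.

```shell
# Hypothetical helper (not part of the json-toolkit): print the bytes on
# stdin as a json array of integers from 0 to 255.
bytes_as_json_array() {
  # od -An -v -t u1 prints every byte as an unsigned decimal number,
  # with no address column and no collapsing of repeated lines.
  od -An -v -t u1 |
    awk 'BEGIN { printf "[" }
         { for (i = 1; i <= NF; i++) { printf "%s%s", sep, $i; sep = "," } }
         END { print "]" }'
}
```

For example, printf 'Hi' | bytes_as_json_array prints [72,105]. An array like this, rather than a json string such as "Hi", is the shape described above.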

logfmt

logfmt data is a newline-delimited list of lines, each made up of key-value pairs whose keys and values are always strings, even if a value is clearly a number. As such, its json equivalent is a list of objects with string values. This means the input to json-to-logfmt must be a list of objects whose values are all strings.
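When the json at hand contains numbers or booleans, one way to meet that constraint is to stringify every value with jq before handing the data to json-to-logfmt. This is a sketch, not something the toolkit provides: the function name stringify_values is hypothetical, and it assumes the input is an array of flat objects.

```shell
# Hypothetical helper (not part of the json-toolkit): turn every value of
# every object in a json array into a string, so the result is a list of
# objects whose values are all strings.
stringify_values() {
  # map(...) visits each object in the array; map_values(tostring)
  # replaces each value with its string form (strings stay unchanged).
  jq -c 'map(map_values(tostring))'
}
```

For example, echo '[{"path":"/","status":200}]' | stringify_values prints [{"path":"/","status":"200"}], which could then be piped into json-to-logfmt.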

xml

All xml can be represented as json, but not all json can be represented as xml. For example, xml requires a single root element. As such, a top-level array like [1, 2, 3] cannot be represented as xml, although the json value {"a": {"b": ["1", "2", "3"]}} can be.
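If you do need to send a top-level array through json-to-xml, one workaround is to wrap it in a single-key object first so there is a root element. The sketch below does that with jq. The function name wrap_in_root is hypothetical, and the key names a and b simply mirror the example above.

```shell
# Hypothetical helper (not part of the json-toolkit): wrap a top-level json
# array in a single-key object so the document has exactly one root.
wrap_in_root() {
  # map(tostring) stringifies each element, mirroring the string values in
  # the example above; the outer object supplies the root key "a".
  jq -c '{a: {b: map(tostring)}}'
}
```

For example, echo '[1, 2, 3]' | wrap_in_root prints {"a":{"b":["1","2","3"]}}, the representable value from the example above.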

Conclusion

In this post, we introduced the *-to-json and json-to-* tools, which convert data in 6 different formats to/from json. We saw how they can be used not only with jq and json-diff, but with any json-based tool, for any of those formats. We also looked at their limitations to better understand how they’re used. Next week, we’ll move on to techniques in engineering design. Engineering design is difficult, but better engineering design can 10x or even 100x development velocity.