Experts' Insights

JSON: Web Dev’s-eye View

JSON appeared from nowhere, and it was a salvation. It was not a standard, nor was it designed by a committee. In fact, it seemed never to have been designed at all: it just appeared one day as a logical and entirely expected result of advances in web development. The truth, of course, is that JSON was no accident, and its popularity should be credited to the hard work of one very dedicated person. Compared to earlier formats such as XML, it was simple and elegant.

Since then, JSON has taken the web by storm. It was standardized as RFC 4627 and later as RFC 7159. Most modern web applications now use JSON as a communication format. Popular as it is, JSON has some shortcomings, and some of them are critical in specific areas.

In aiming for radical simplicity, JSON reduced its data types to numbers, strings, booleans, null, arrays, and objects, and left many practical use cases uncovered. For example, JSON lacks dedicated data types for dates and times. Usually, dates and times are represented as strings in ISO 8601 format (e.g. 2016-01-13T19:11:46+03:00), which works, yet creates unpleasant problems when converting and binding JSON back to programming language structures and objects: nothing in the document distinguishes such a string from ordinary text.
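The date problem is easy to reproduce. The sketch below uses Python's standard json module as one example of a language binding; the field name "created" is made up for illustration.

```python
import json
from datetime import datetime, timedelta, timezone

# A timezone-aware timestamp matching the example in the text.
moment = datetime(2016, 1, 13, 19, 11, 46, tzinfo=timezone(timedelta(hours=3)))

# The standard encoder has no JSON date type to map to, so it refuses the value.
try:
    json.dumps({"created": moment})
    encoder_rejected = False
except TypeError:
    encoder_rejected = True

# The usual workaround: write the value as an ISO 8601 string.
document = json.dumps({"created": moment.isoformat()})

# On the way back the parser sees an ordinary string; the type is gone.
decoded = json.loads(document)

# The reader has to know, out of band, which fields hold timestamps.
restored = datetime.fromisoformat(decoded["created"])
```

The round trip only succeeds because the reading code already knows that "created" holds a timestamp. That knowledge lives outside the JSON document.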

Another source of problems is JSON's inability to describe an object while preserving its type or class information in the document. The data types expressible in JSON are strictly limited to the six built-in ones. You cannot express anything more complex than that, which may pose serious problems when using JSON as a serialization format. Try to describe something as simple as a decimal or a collection (say, a set or a queue), and you will soon find that there is no JSON representation that is compatible between implementations and allows serialization and deserialization without losing data.
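A minimal sketch of that loss, again with Python's standard json module. The `fallback` function and the conventions it picks (decimal as a string, set as a sorted array) are assumptions for the example. Another implementation could just as reasonably choose differently, which is exactly the compatibility problem.

```python
import json
from decimal import Decimal

order = {"price": Decimal("19.99"), "tags": {"new", "sale"}}

def fallback(value):
    """Hypothetical fallback: squeeze unsupported types into JSON built-ins."""
    if isinstance(value, Decimal):
        return str(value)       # one convention; others emit a float instead
    if isinstance(value, (set, frozenset)):
        return sorted(value)    # a set becomes a plain array
    raise TypeError(f"Cannot serialize {type(value).__name__}")

document = json.dumps(order, default=fallback)

# Decoding yields a string and a list; nothing says they were ever
# a Decimal and a set.
decoded = json.loads(document)
```

After the round trip, `decoded["price"]` is a `str` and `decoded["tags"]` is a `list`, so `decoded` no longer equals `order`.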

Another shortcoming of JSON is its inability to serialize circular references. This can be a serious problem, since cycles are very common when serializing a DOM or database entities. There are a couple of standard and nonstandard ways to overcome this. Grails serialization of circular references is an example of a nonstandard, ad hoc approach. While serializing an entity to JSON, Grails marshallers keep a stack of all previously seen objects, and wherever they detect a cycle in the dependency graph, they emit a path-like relative reference instead. It is worth noting, though, that it is up to client code to parse this kind of JSON and resolve the references.
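The stack-based idea can be sketched in a few lines. This is not the Grails implementation, only an illustration of the technique described above: the `marshal` function, the "_ref" key, and the "../" notation are all assumptions made for the example.

```python
import json

def marshal(value, stack=()):
    """Copy `value`, replacing references back to an ancestor with a marker.

    `stack` holds the ids of the containers currently being serialized,
    outermost first. The "_ref" key and "../" notation are illustrative.
    """
    if isinstance(value, (dict, list)):
        if id(value) in stack:
            # Count how many levels up the already-seen ancestor sits.
            levels_up = len(stack) - stack.index(id(value))
            return {"_ref": "/".join([".."] * levels_up)}
        stack = stack + (id(value),)
        if isinstance(value, dict):
            return {key: marshal(item, stack) for key, item in value.items()}
        return [marshal(item, stack) for item in value]
    return value

# Two entities that point at each other, as database entities often do.
author = {"name": "Ann", "books": []}
book = {"title": "JSON", "author": author}
author["books"].append(book)

# The standard encoder detects the cycle and gives up.
try:
    json.dumps(author)
    rejected = False
except ValueError:
    rejected = True

# With the cycle replaced by a relative reference, encoding succeeds.
document = json.dumps(marshal(author))
```

In this sketch the reference is counted from the property that holds it: three steps up from "author" lead through the book and the "books" array back to the root object. A client that receives the document has to implement the matching resolution itself.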

A more standards-oriented (and also more intrusive) way to express circular dependencies is JSON Reference, a specification published as an IETF Internet-Draft rather than as a finished RFC. It allows referencing any property in the current or an external JSON file. Unfortunately, the “standard” is terse and largely nominal: it does not describe how object references should be constructed in a document. JSON References are nonetheless quite popular and are used extensively in JSON Schema definitions.
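To show what such a reference looks like in practice, here is a minimal resolver sketch. It assumes the common form {"$ref": "#/path/to/value"}, where the fragment is a JSON Pointer, and handles same-document references only. The function names and the sample schema are made up for the example, and percent-decoding of the fragment is left out.

```python
import json

def resolve_pointer(document, pointer):
    """Follow a JSON Pointer such as "/definitions/address"."""
    target = document
    # An empty pointer names the whole document.
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: "~1" stands for "/", "~0" for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(target, list):
            target = target[int(token)]
        else:
            target = target[token]
    return target

def resolve_ref(document, node):
    """Return the value that a {"$ref": "#..."} object points to."""
    ref = node["$ref"]
    if not ref.startswith("#"):
        raise ValueError("this sketch handles same-document references only")
    return resolve_pointer(document, ref[1:])

schema = json.loads("""
{
  "definitions": {"address": {"type": "object"}},
  "properties": {"billing": {"$ref": "#/definitions/address"}}
}
""")

billing = resolve_ref(schema, schema["properties"]["billing"])
```

A reference of "#" resolves to the document root, which is how a cycle back to the top-level object can be written down without repeating it.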

JSON also carries the limitations of its parent language. JavaScript strings are sequences of 16-bit code units, so only the Basic Multilingual Plane (BMP) of Unicode can be addressed directly. Characters such as emoji and extended mathematical symbols lie outside the BMP and have to be represented as surrogate pairs, and JSON's \uXXXX escape inherits this: an escaped non-BMP character must be written as two escapes rather than one. That cripples the uniformity of the syntax and places an additional burden on parsers.
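The effect is visible in any encoder that escapes non-ASCII output. A small sketch with Python's standard json module, which does so by default:

```python
import json

emoji = "\U0001F600"   # GRINNING FACE, a character outside the BMP
accent = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, inside the BMP

# A BMP character fits into a single \uXXXX escape.
escaped_accent = json.dumps(accent)

# A non-BMP character does not: it is split into a UTF-16 surrogate pair.
escaped_emoji = json.dumps(emoji)

# The parser has to recognize the pair and join it back into one character.
decoded = json.loads(escaped_emoji)

# Writing the character literally (as UTF-8) avoids the escape altogether.
literal_emoji = json.dumps(emoji, ensure_ascii=False)
```

Here `escaped_accent` holds a single escape, \u00e9, while `escaped_emoji` holds two, \ud83d\ude00, for what is one character to the reader.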

Finally, JSON does not allow comments. As more tools choose JSON for storing configuration (npm, for example, uses package.json to store project information), the inability to comment your configuration is a major source of frustration.
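A short sketch of what happens when a comment slips into a configuration file, using Python's standard json module as the parser. The "_comment" key in the second document is one workaround seen in practice, not part of any standard, and the file contents are invented for the example.

```python
import json

commented = """
{
  // pinned until the next major release
  "name": "demo",
  "version": "1.0.0"
}
"""

# A strict parser rejects the whole document because of the comment line.
try:
    json.loads(commented)
    accepted = True
except json.JSONDecodeError:
    accepted = False

# Workaround: smuggle the note in as data under a conventional key.
workaround = """
{
  "_comment": "pinned until the next major release",
  "name": "demo",
  "version": "1.0.0"
}
"""
config = json.loads(workaround)
```

The workaround parses, but the note is now part of the data: every consumer of the file sees an extra key and has to know to ignore it.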

Of course, one may argue that JSON is a fine format for specific needs, and it really is excellent in certain areas: working with a JSON web API is a true pleasure. However, you may want to consider other formats for serialization and for storing configuration. One widespread format for such purposes is YAML.

YAML not only aims to provide an always-readable representation of data, it also has some advanced features that JSON lacks. For example, you can serialize custom classes (the class name is preserved in YAML as a tag) and use content references (anchors and aliases) to describe data with circular dependencies. And, of course, you can use comments in YAML.

Admittedly, the use of YAML should be justified, as the format is not simple. Some users may not like how serialized data looks, its performance may be unacceptable for their needs, and the ability to describe and instantiate `any` class from user-supplied YAML may pose a security risk (just remember that hilarious vulnerability in Rails). However, it is clear that YAML offers not only fancy syntax but also a couple of much-needed features.



I'm a software engineer with broad front-end and back-end experience. Currently I'm busy finding a perfect combination of tools for building nice web applications.