the tsj file format

This is a proposal for a new file format.

The TSV file format is very simple. It requires that fields be delimited by tabs, and has a limitation that tabs can’t appear within fields. Newlines are also forbidden inside a field.

The JSON file format is both simple and powerful, but it can be a tiny bit clumsy compared to TSV. Also there is a tradition of putting one JSON object on a line, but that format isn’t JSON, and it doesn’t have a name that’s caught on.

The TSJ file format is a specialization of TSV, where a field is either a JSON expression or a string. Here is how you tell if it’s a string or a JSON expression:

  1. Fields that start with “{“, ‘”‘, “[“, “-“, or the digits (0-9) are treated as JSON expressions. If they don’t parse, it’s invalid TSJ. If you want to store such a value, use a JSON string expression.
  2. Fields that are exactly one of the words “true”, “false”, or “null” are treated as JSON expressions.
  3. All other fields are treated strings.

Strings and JSON expressions are both UTF-8. Fields are not trimmed after they are split by tab characters. A file containing “true\t false\n” will be read as [[true, ” false”]].