Installation

pip install jsv

Motivation

JSON is an excellent universal representation for dictionaries, arrays and basic primitives. When representing records that are identical or nearly identical in schema, however, it is extremely verbose, as the same dictionary keys must be repeated for every record. The json lines format (which is just a sequence of json objects, each on a separate line of a file) therefore achieves great generality, but sacrifices compactness.
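
To make the cost of repeated keys concrete, the sketch below uses only the standard json module to compare json lines against a layout that writes the keys once and then only the values. The "keys once" layout and the sample records are assumptions made for illustration; it is not the JSV format itself.

```python
import json

# Hypothetical sample data: many records sharing one schema.
records = [{'id': i, 'name': 'user', 'active': True} for i in range(100)]

def compact(value):
    # Serialize without the optional whitespace json.dumps adds by default.
    return json.dumps(value, separators=(',', ':'))

# json lines: every record repeats every key.
json_lines = '\n'.join(compact(rec) for rec in records)

# Illustrative alternative: keys written once, then one row of values per record.
keys = list(records[0])
rows = [compact([rec[k] for k in keys]) for rec in records]
keys_once = '\n'.join([compact(keys)] + rows)

print(len(json_lines), len(keys_once))
```

The second number is smaller because the key names appear once rather than once per record.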

Other formats, notably csv, are not standardized and can become leaky abstractions. They are also usually confined to flat data, whereas json objects can contain nested arrays and dictionaries.

This project replaces the json lines and csv formats with a rich, flexible, and compact representation of similar json objects. It aims to stay true to the simplicity and generality of json, and can represent any json object regardless of nesting. In addition, it provides for multiple record types in a single file or stream.

Basic Usage

JSV is a simple library for representing many JSON objects with a similar structure. It is intended to replace json lines, csv, and other similar formats.

Basic write usage:

>>> import jsv
>>> data = [{'key_1': 1}, {'key_1': ['any','json','object'], 'other_key': None}]
>>> with jsv.JSVWriter('out.jsv', 'at', {'_': '{"key_1"}'}) as w:
...     for rec in data:
...         w.write(rec)

The resulting file will be:

#_ {"key_1"}
{1}
{["any","json","object"],"other_key":null}
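
The mapping from record to line in the file above can be sketched in plain Python. This is a minimal illustration for flat dictionaries only, inferred from the output shown; encode_flat_record is a hypothetical helper and not part of the jsv API, and it assumes every template key is present in the record.

```python
import json

def encode_flat_record(record, template_keys):
    """Hypothetical helper: encode a flat dict against an ordered list of template keys."""
    # Values for template keys come first, in template order, with the keys omitted.
    parts = [json.dumps(record[k], separators=(',', ':')) for k in template_keys]
    # Keys that are not in the template are written out in full as "key":value.
    for key, value in record.items():
        if key not in template_keys:
            parts.append(json.dumps(key) + ':' + json.dumps(value, separators=(',', ':')))
    return '{' + ','.join(parts) + '}'

data = [{'key_1': 1}, {'key_1': ['any', 'json', 'object'], 'other_key': None}]
lines = [encode_flat_record(rec, ['key_1']) for rec in data]
print('\n'.join(lines))
```

Run on the data from the write example, this reproduces the two record lines shown above.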

Basic read usage:

>>> with jsv.JSVReader('out.jsv') as r:
...     for tid, obj in r.items():
...         print('{0}: {1}'.format(tid, obj))

The output will be:

_: {'key_1': 1}
_: {'key_1': ['any', 'json', 'object'], 'other_key': None}

Usage with multiple templates:

>>> data = [('t1', {'key_1': 1}),('t2', [{'key_2': 2}, {'key_2': None}])]
>>> with jsv.JSVWriter('out.jsv', 'wt') as w:
...     w['t1'] = '{"key_1"}'
...     w['t2'] = '[{"key_2"}]'
...     for tid, rec in data:
...         w.write(rec, tid)

The resulting file will be:

#t1 {"key_1"}
#t2 [{"key_2"}]
@t1 {1}
@t2 [{2},{null}]
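
The line layout of the two files shown in this section can be read with a small classifier. This is a sketch inferred from those examples, not the library's parser: classify_line is a hypothetical name, and treating an unprefixed line as a record of the default template '_' is an assumption based on the basic read example.

```python
def classify_line(line):
    """Hypothetical helper: split one line into (kind, template_id, body)."""
    if line.startswith('#'):
        # Template definition: "#<id> <template>"
        tid, _, body = line[1:].partition(' ')
        return ('template', tid, body)
    if line.startswith('@'):
        # Record tagged with a template id: "@<id> <record>"
        tid, _, body = line[1:].partition(' ')
        return ('record', tid, body)
    # Assumption: an unprefixed line is a record for the default template '_'.
    return ('record', '_', line)

sample = '#t1 {"key_1"}\n#t2 [{"key_2"}]\n@t1 {1}\n@t2 [{2},{null}]'
parsed = [classify_line(line) for line in sample.splitlines()]
for item in parsed:
    print(item)
```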

Developer Reference