JSON Separated Values (jsv)
Installation
pip install jsv
Motivation
JSON is an excellent universal representation for dictionaries, arrays and basic primitives. When representing records that are identical or nearly identical in schema, however, it is extremely verbose, as the same dictionary keys must be repeated for every record. The json lines format (which is just a sequence of json objects, each on a separate line of a file) therefore achieves great generality, but sacrifices compactness.
Other formats, notably csv, are not standardized, and can become leaky abstractions. In addition, they are usually confined to representing flat data, whereas json objects are rich with nested arrays and dictionaries.
This project replaces the json lines and csv formats with a rich, flexible, and compact representation of similar json objects. It aims to stay true to the simplicity and generality of json, and can represent any json object regardless of nesting. In addition, it provides for multiple record types in a single file or stream.
Basic Usage
JSV is a simple library for representing many JSON objects with a similar structure. It is intended to replace json lines, csv, and other similar formats.
Basic write usage:
>>> import jsv
>>> data = [{'key_1': 1}, {'key_1': ['any','json','object'], 'other_key': None}]
>>> with jsv.JSVWriter('out.jsv', 'at', {'_': '{"key_1"}'}) as w:
...     for rec in data:
...         w.write(rec)
The resulting file will be:
#_ {"key_1"}
{1}
{["any","json","object"],"other_key":null}
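The two record lines above suggest how a flat record is laid out: values for the template's keys appear positionally, and any key the template does not list is written as an ordinary JSON key/value pair. The following is a minimal sketch of that layout for flat dictionaries only. It is inferred from the example output rather than taken from the library, and `encode_flat` is a hypothetical helper, not part of the jsv API.

```python
import json


def encode_flat(template_keys, record):
    """Encode a flat dict against a list of template keys (illustrative only)."""
    def dumps(value):
        # Compact JSON, matching the separators seen in the example file.
        return json.dumps(value, separators=(",", ":"))

    # Values for template keys are written positionally, without their keys.
    parts = [dumps(record[key]) for key in template_keys]
    # Keys that the template does not list keep their names.
    parts += [
        dumps(key) + ":" + dumps(value)
        for key, value in record.items()
        if key not in template_keys
    ]
    return "{" + ",".join(parts) + "}"


print(encode_flat(["key_1"], {"key_1": 1}))
print(encode_flat(["key_1"], {"key_1": ["any", "json", "object"], "other_key": None}))
```

Under these assumptions the two calls print the same two record lines as the example file. Nested templates are deliberately left out, since their exact syntax is not shown here.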
Basic read usage:
>>> with jsv.JSVReader('out.jsv') as r:
...     for tid, obj in r.items():
...         print('{0}: {1}'.format(tid, obj))
The output will be:
_: {'key_1': 1}
_: {'key_1': ['any', 'json', 'object'], 'other_key': None}
Usage with multiple templates:
>>> data = [('t1', {'key_1': 1}),('t2', [{'key_2': 2}, {'key_2': None}])]
>>> with jsv.JSVWriter('out.jsv', 'wt') as w:
...     w['t1'] = '{"key_1"}'
...     w['t2'] = '[{"key_2"}]'
...     for tid, rec in data:
...         w.write(rec, tid)
The resulting file will be:
#t1 {"key_1"}
#t2 [{"key_2"}]
@t1 {1}
@t2 [{2},{null}]
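Judging from the example files, a line starting with `#` defines a template, a line starting with `@` is a record tagged with a template id, and an unprefixed line is a record for the default template `_`. The sketch below classifies lines on that basis. It is an assumption drawn from the examples, not the library's parser, and `split_line` is a hypothetical name.

```python
def split_line(line):
    """Classify one line of a .jsv file as (kind, template id, body)."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        # Template definition: "#<tid> <template>"
        tid, _, body = line[1:].partition(" ")
        return ("template", tid, body)
    if line.startswith("@"):
        # Tagged record: "@<tid> <record>"
        tid, _, body = line[1:].partition(" ")
        return ("record", tid, body)
    # Unprefixed record: assumed to belong to the default template "_".
    return ("record", "_", line)


for text in ['#t1 {"key_1"}', "@t2 [{2},{null}]", "{1}"]:
    print(split_line(text))
```

The body is returned undecoded; turning it back into an object requires the matching template.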
Developer Reference
API
Main Interface
class jsv.JSVReader(record_file, template_file=None)
Bases: jsv.template_io.JSVCollection
Context manager for reading data from files in JSV format.
This is the main class for reading JSV records from a file or stream. If either record_file or template_file is a string, then the reader must be used as a context manager. Otherwise, the context manager does nothing with the file pointers, and the object can be used directly.
All of the templates used by a JSVReader instance must come from either record_file or template_file. If it is present, template_file is read during initialization (if template_file is a file pointer) or when entering the context manager (if template_file is a file path).
Example
Take the file in.jsv, which looks like this:
#_ {"key_1","key_2","key_3"}
{1,2,3}
{4,5,6}
{7,8,9}
We can then run the following code:
>>> with jsv.JSVReader('in.jsv') as r:
...     for obj in r:
...         print(obj)
...
{'key_1': 1, 'key_2': 2, 'key_3': 3}
{'key_1': 4, 'key_2': 5, 'key_3': 6}
{'key_1': 7, 'key_2': 8, 'key_3': 9}
Parameters:
- record_file (filepath or io.TextIOBase) – Either a file path or a file pointer from which records should be read. Templates, if present, will also be read.
- template_file (filepath or io.TextIOBase) – Either a file path or a file pointer from which templates should be read. If present, templates and records are read from different files. By convention, records should use the file extension .jsvr and templates should use the file extension .jsvt.
__iter__()
Iterator magic method for the reader object. Templates are consumed to decode records, but are not returned by the iterator.
Returns: (object) where object is a json-compatible object representing a record.
items()
Iterator over both the values and the template ids for each record. Templates are consumed to decode records, but are not returned by the iterator.
Returns: (tid, object) where tid is the id of the template used, and object is a json-compatible object.
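Because items() yields (tid, object) pairs, a file holding several record types can be split by template id after reading. The sketch below shows one way to do that on any iterable of such pairs. `group_by_template` is a hypothetical helper, and the sample pairs are illustrative data in the shape described above rather than output captured from the library.

```python
from collections import defaultdict


def group_by_template(pairs):
    """Collect decoded records into lists keyed by template id."""
    groups = defaultdict(list)
    for tid, obj in pairs:
        groups[tid].append(obj)
    return dict(groups)


# Illustrative (tid, object) pairs; with the library these would come
# from iterating over a reader's items().
sample = [
    ("t1", {"key_1": 1}),
    ("t2", [{"key_2": 2}, {"key_2": None}]),
    ("t1", {"key_1": 5}),
]
print(group_by_template(sample))
```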
class jsv.JSVWriter(record_file, record_mode='at', template_dict=None, template_file=None, template_mode='at')
Bases: jsv.template_io.JSVCollection
Context manager for writing data to files in JSV format.
This is the main class for writing JSV records to a file or stream. If either record_file or template_file is a string, then the writer must be used as a context manager. Otherwise, the context manager does nothing with the file pointers, and the object can be used with or without a context manager.
As JSVWriter inherits from JSVCollection, it maintains any templates assigned to it. In addition, templates are immediately written to the file or stream when added.
Example
>>> out = []
>>> out.append({'key_1': 1, 'key_2': 2, 'key_3': 3})
>>> out.append({'key_1': 4, 'key_2': 5, 'key_3': 6})
>>> out.append({'key_1': 7, 'key_2': 8, 'key_3': 9})
>>> with jsv.JSVWriter('out.jsv', 'wt', {'_': '{"key_1","key_2","key_3"}'}) as w:
...     for obj in out:
...         w.write(obj)
This creates the file out.jsv, which looks like this:
#_ {"key_1","key_2","key_3"}
{1,2,3}
{4,5,6}
{7,8,9}
Parameters:
- record_file (filepath or io.TextIOBase) – Either a file path or a file pointer to which records should be written. If template_file is not given, templates will be written here as well.
- record_mode (str) – File mode for the record file. Only used if record_file is a string.
- template_dict (dict) – Dictionary of templates. See JSVCollection.
- template_file (filepath or io.TextIOBase) – Either a file path or a file pointer to which templates should be written. If present, templates and records will be written to different files. By convention, records should use the file extension .jsvr and templates should use the file extension .jsvt.
- template_mode (str) – File mode for the template file. Only used if template_file is a string.
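Both JSVReader and JSVWriter are described as opening a file only when given a path, and leaving caller-supplied file pointers alone. The sketch below illustrates that ownership pattern in isolation. `PathOrStream` is a hypothetical class written for this illustration and is not how the library is necessarily implemented.

```python
import io


class PathOrStream:
    """Open a path on entry and close it on exit; leave open streams untouched."""

    def __init__(self, source, mode="rt"):
        self._source = source
        self._mode = mode
        # Only a string path is owned (opened and closed) by this object.
        self._owns_file = isinstance(source, str)
        self.fp = None if self._owns_file else source

    def __enter__(self):
        if self._owns_file:
            self.fp = open(self._source, self._mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self._owns_file and self.fp is not None:
            self.fp.close()
        return False  # never swallow exceptions


# A caller-supplied stream is usable immediately and stays open afterwards.
stream = io.StringIO("{1}\n{2}\n")
with PathOrStream(stream) as source:
    lines = source.fp.read().splitlines()
print(lines, stream.closed)
```

With a string path, `fp` is only available inside the `with` block, which mirrors the requirement that the reader and writer be used as context managers in that case.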
class jsv.JSVCollection(template_dict=None)
Use multiple templates in a single data stream.
A JSVCollection object is an associative array of templates, each with an id of type str. This facilitates reading from and writing to a file or stream, as each line in the file or stream must be matched to a template.
Parameters: template_dict (dict) – A dictionary whose keys are template ids and whose values are JSVTemplate objects, or are values suitable for the JSVTemplate constructor.
get_record_line(obj, tid='_')
Returns a string defining a record in a .jsv file. For example:
>>> coll = jsv.JSVCollection()
>>> coll['template_1'] = '{"key_1"}'
>>> coll.get_record_line({'key_1': 'value_1'}, 'template_1')
'@template_1 {"value_1"}'
Parameters:
- obj (json-compatible object) – The object to be encoded.
- tid (str) – The id of the template.
get_template_line(tid='_')
Returns a string defining a template in a .jsv file. For example:
>>> coll = jsv.JSVCollection()
>>> coll['template_1'] = '{"key_1"}'
>>> coll.get_template_line('template_1')
'#template_1 {"key_1"}'
Parameters: tid (str) – The id of the template.
items()
Iterator that yields the tuple (tid, template).
read_line(line)
Used to read a single line from a .jsv file. For example:
>>> coll = jsv.JSVCollection()
>>> tid, tmpl = coll.read_line('#template_1 {"key_1"}')
>>> coll[tid] = tmpl
>>> coll.read_line('@template_1 {"value_1"}')
('template_1', {'key_1': 'value_1'})
Parameters: line (str) – String to be read. The string should be in the format used by a .jsv file. It can be either a record or a template.
Returns: (tid, template_or_record)
Return type: tuple
template_lines()
Iterator that yields the string defining each template in a .jsv file. For example:
>>> coll = jsv.JSVCollection()
>>> coll['template_1'] = '{"key_1"}'
>>> for s in coll.template_lines():
...     print(s)
...
#_ {}
#template_1 {"key_1"}
templates
Type: Object that allows reverse lookup of template id from a given template. For example:
>>> coll = jsv.JSVCollection()
>>> coll['template_1'] = '{"key_1"}'
>>> coll['template_2'] = '{"key_2"}'
>>> coll.templates['{"key_1"}']
'template_1'
Also allows testing for containment with the in operator:
>>> '{"key_1"}' in coll.templates
True
>>> '{"key_5"}' in coll.templates
False
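The reverse lookup described for templates can be pictured as a pair of dictionaries kept in step: one from id to template and one from template back to id. The sketch below is a stand-alone illustration of that idea using plain template strings. `TemplateRegistry`, `id_for` and `has_template` are hypothetical names, not part of the jsv API.

```python
class TemplateRegistry:
    """Keep id -> template and template -> id mappings in step."""

    def __init__(self):
        self._by_id = {}
        self._by_template = {}

    def __setitem__(self, tid, template):
        old = self._by_id.get(tid)
        # Drop the stale reverse entry only if it still points at this id.
        if old is not None and self._by_template.get(old) == tid:
            del self._by_template[old]
        self._by_id[tid] = template
        self._by_template[template] = tid

    def __getitem__(self, tid):
        return self._by_id[tid]

    def id_for(self, template):
        """Reverse lookup: template string to template id."""
        return self._by_template[template]

    def has_template(self, template):
        return template in self._by_template


registry = TemplateRegistry()
registry["template_1"] = '{"key_1"}'
registry["template_2"] = '{"key_2"}'
print(registry.id_for('{"key_1"}'), registry.has_template('{"key_5"}'))
```

If two ids are assigned the same template, this sketch lets the most recent assignment win in the reverse direction; how the library resolves that case is not stated here.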
class
jsv.
JSVTemplate
(key_source='{}')¶ Class for decoding and encoding json records in jsv format.
A Template object stores the key structure for a json-compatible python object. It can then serialize or deserialize any object that conforms to that key structure in a string which is similar in structure to json, but in which the keys are omitted.
Parameters: key_source (str or json-compatible object) – If a string, this must be a valid template string; otherwise a JSVTemplateDecodeError will be raised. If a json-compatible object, the key structure will be extracted, with keys in alphabetical order.
decode(s)
Decode a jsv string into a json-compatible object.
Parameters: s (str) – s represents a json object that has been encoded with the given template. If it does not conform to the template, or is not parsable as jsv, a JSVRecordDecodeError will be raised.
encode(obj)
Encode a json-compatible object into jsv.
Parameters: obj (json-compatible object) – obj must also conform to the key structure of the Template, otherwise an error will be raised.
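To make the template idea concrete, the sketch below handles the simplest case only: a flat dictionary whose values are ordinary JSON. It builds a template string with keys in alphabetical order and decodes a record line by pairing positional values with the template keys. Both functions are hypothetical helpers written for illustration. They raise ValueError rather than the library's exceptions, and they do not handle nested templates or keys outside the template.

```python
import json


def template_string(record):
    """Build a flat template string from a dict, keys in alphabetical order."""
    return "{" + ",".join(json.dumps(key) for key in sorted(record)) + "}"


def decode_flat(template_keys, s):
    """Rebuild a flat dict from a record string such as '{1,2,3}'."""
    if not (s.startswith("{") and s.endswith("}")):
        raise ValueError("a flat record must be wrapped in braces")
    # With the keys omitted, the body is a comma-separated run of JSON values,
    # so it can be parsed as the contents of a JSON array.
    values = json.loads("[" + s[1:-1] + "]")
    if len(values) != len(template_keys):
        raise ValueError("record does not match the template")
    return dict(zip(template_keys, values))


print(template_string({"key_2": 2, "key_1": 1}))
print(decode_flat(["key_1", "key_2", "key_3"], "{1,2,3}"))
```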
Exceptions
class jsv.JSVTemplateDecodeError(msg, pos)
An error occurred while decoding a str or object into a JSVTemplate.
class jsv.JSVRecordDecodeError(msg, pos)
An error occurred while decoding a str representing a record into an object.