Description
NDJSON and JSONL (JSON Lines) are convenient formats for storing or streaming structured data that can be processed one row at a time. A typical JSON file holds tabular data as a single array of row objects, whereas NDJSON and JSONL files hold one row of data per line, and each line is a valid JSON value (usually an object). Line-delimited JSON is widely used in data science as well as in the processing and analysis of large datasets. Google BigQuery, Apache Spark, and other platforms that consume large datasets support this format.
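To make the difference between the two layouts concrete, here is a minimal Python sketch that turns an in-memory list of row objects into NDJSON text and back. The sample rows and the helper names `to_ndjson` and `from_ndjson` are made up for illustration and are not part of this application.

```python
import json

# Hypothetical sample rows; a JSON file would store them as one array.
rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]


def to_ndjson(rows):
    """Serialize each row as compact JSON followed by a newline."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def from_ndjson(text):
    """Parse NDJSON text one line at a time, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


ndjson_text = to_ndjson(rows)
print(ndjson_text, end="")
```

Because every line stands alone, a consumer can parse, filter, or split the data line by line without reading the whole file first.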
This application streams the input file to perform the JSON to NDJSON conversion and does not load the entire file into memory. This allows it to convert very large (multi-gigabyte) files quickly, with low memory and CPU usage.
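The general idea of a streaming conversion can be sketched in Python as follows. This is not the application's actual implementation, which is not described here; it is one possible approach using only the standard library, and the function name `json_array_to_ndjson` and the `chunk_size` parameter are assumptions chosen for illustration. The sketch reads the input in small chunks, decodes one array element at a time with `json.JSONDecoder.raw_decode`, and writes each element out immediately, so only the current element and one chunk need to be held in memory.

```python
import io
import json


def json_array_to_ndjson(src, dst, chunk_size=65536):
    """Copy a top-level JSON array from text stream `src` to `dst` as NDJSON.

    Returns the number of rows written. Hypothetical helper for illustration.
    """
    decoder = json.JSONDecoder()
    buf, eof, started, count = "", False, False, 0

    def fill():
        # Pull the next chunk into the buffer, or record end of input.
        nonlocal buf, eof
        chunk = src.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True

    while True:
        # Skip whitespace and element separators (lenient about commas).
        buf = buf.lstrip(" \t\r\n,")
        if not buf:
            if eof:
                raise ValueError("unexpected end of input")
            fill()
            continue
        if not started:
            if buf[0] != "[":
                raise ValueError("expected a top-level JSON array")
            buf, started = buf[1:], True
            continue
        if buf[0] == "]":
            return count
        try:
            row, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # Most likely an element cut off at a chunk boundary: read more.
            if eof:
                raise
            fill()
            continue
        rest = buf[end:].lstrip()
        # Accept the element only once its terminator is visible, so a
        # number split across two chunks is never emitted half-read.
        if not rest or rest[0] not in ",]":
            if eof:
                raise ValueError("malformed or unterminated array")
            fill()
            continue
        dst.write(json.dumps(row, separators=(",", ":")) + "\n")
        count += 1
        buf = rest


# Usage with in-memory streams and a tiny chunk size to force many reads.
src = io.StringIO('[{"id": 1, "name": "Ada"},\n {"id": 2, "tags": ["a", "b"]}]')
dst = io.StringIO()
written = json_array_to_ndjson(src, dst, chunk_size=4)
```

With real files, the same function could be called on file objects opened in text mode with UTF-8 encoding. The retry-on-error loop re-parses an element each time more data arrives, which is simple but slow for a single very large row; a production converter would more likely use an incremental parser.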
For more information:
https://ndjson.org/
https://jsonlines.org/