Parsing JSON

Package: @lanexio/parser-grammar-json
Status: Stable
Layer: 2 (Grammar). Depends only on @lanexio/parser-core.
Runtime: Universal (browser, server, edge worker).

parseJson implements a full iterative JSON parser conforming to RFC 8259 (also ECMA-404). It handles objects, arrays, strings, numbers, booleans, and null with a single-pass explicit-container-stack state machine. No recursion.
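The explicit-container-stack idea can be illustrated with a much smaller problem: checking that objects and arrays nest correctly in a single pass over the bytes. This is only a sketch of the technique, not the package's implementation; `checkNesting` and `NestingResult` are hypothetical names, and the function deliberately ignores everything except strings and brackets.

```typescript
// Sketch only: validates container nesting with an explicit stack, no recursion.
// checkNesting is a hypothetical helper, not part of @lanexio/parser-grammar-json.
const QUOTE = 0x22;
const BACKSLASH = 0x5c;
const LBRACE = 0x7b;
const RBRACE = 0x7d;
const LBRACKET = 0x5b;
const RBRACKET = 0x5d;

type NestingResult =
  | { ok: true; maxDepth: number }
  | { ok: false; offset: number };

function checkNesting(bytes: Uint8Array): NestingResult {
  const stack: number[] = []; // currently open containers, innermost last
  let maxDepth = 0;
  let inString = false;

  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (inString) {
      if (b === BACKSLASH) i++; // skip the escaped byte
      else if (b === QUOTE) inString = false;
      continue;
    }
    if (b === QUOTE) {
      inString = true;
    } else if (b === LBRACE || b === LBRACKET) {
      stack.push(b);
      maxDepth = Math.max(maxDepth, stack.length);
    } else if (b === RBRACE || b === RBRACKET) {
      const open = stack.pop();
      const expected = b === RBRACE ? LBRACE : LBRACKET;
      if (open !== expected) return { ok: false, offset: i };
    }
  }
  // Unterminated string or unclosed container: report the end of input.
  if (inString || stack.length > 0) return { ok: false, offset: bytes.length };
  return { ok: true, maxDepth };
}
```

Because the stack lives on the heap rather than the call stack, deeply nested input cannot overflow the call stack; depth is limited only by available memory.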

Input is always a Uint8Array. Output is always a LexTree. Malformed JSON yields a tree whose root is a JsonKind.Error node; parseJson never throws.

The parser is tested against the complete JSONTestSuite corpus, all 311 cases: 95 must-accept (y_*), 183 must-reject (n_*), and 33 implementation-defined (i_*) cases that may be accepted or rejected but must never throw. All pass.
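The three JSONTestSuite categories translate into a simple pass rule per case. The sketch below shows that rule in isolation; `expectationFor`, `casePasses`, and `acceptsWithBuiltin` are hypothetical helpers, and the built-in `JSON.parse` stands in for a parser only so the example runs on its own. It is not the package's actual test harness.

```typescript
// Sketch only: the pass rule for y_ / n_ / i_ cases. All names are hypothetical.
type Expectation = 'accept' | 'reject' | 'either';

function expectationFor(fileName: string): Expectation {
  if (fileName.startsWith('y_')) return 'accept';
  if (fileName.startsWith('n_')) return 'reject';
  if (fileName.startsWith('i_')) return 'either';
  throw new Error(`not a JSONTestSuite case name: ${fileName}`);
}

// `run` reports whether the parser under test accepted the input.
function casePasses(fileName: string, run: () => boolean): boolean {
  const expected = expectationFor(fileName);
  let accepted: boolean;
  try {
    accepted = run();
  } catch {
    return false; // a never-throw parser fails every category if it throws
  }
  if (expected === 'either') return true;
  return accepted === (expected === 'accept');
}

// Stand-in parser: wraps JSON.parse so that rejection is a return value.
function acceptsWithBuiltin(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

console.log(casePasses('y_example.json', () => acceptsWithBuiltin('[1]')));  // true
console.log(casePasses('n_example.json', () => acceptsWithBuiltin('[1,]'))); // true
```

With parseJson, the `run` callback would presumably check whether the root kind is JsonKind.Error instead of catching an exception.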

import {
  parseJson,
  JsonKind,
  JsonField,
  JSON_KIND_NAMES_BY_ID,
  jsonGrammar,
  type ParseJsonOptions,
  type JsonKindType,
} from '@lanexio/parser-grammar-json';

import { parseJson } from '@lanexio/parser-grammar-json';
const encoder = new TextEncoder();
const tree = parseJson(encoder.encode('{"key": [1, true, null]}'));
console.log(tree.nodeCount); // total nodes
console.log(tree.root.kind); // Document root kind id (0x0800)

parseJson accepts a Uint8Array. Always use TextEncoder when converting a string to bytes.

const tree = parseJson(encoder.encode(`
{
  "users": [
    { "id": 1, "name": "Alice" },
    { "id": 2, "name": "Bob" }
  ]
}
`));
// Walks: Document → Object → Member("users") → Array → Object → Member("id", Number(1)), ...
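Since the parser works on bytes, any offset it reports is a UTF-8 byte offset, which can differ from a JavaScript string index as soon as the input contains non-ASCII characters. The sketch below uses only the standard TextEncoder and TextDecoder; `sliceBytes` is a hypothetical helper, and the half-open `[start, end)` range convention is an assumption made for the illustration.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// "\u00eb" (e with diaeresis) is one UTF-16 code unit but two UTF-8 bytes.
const source = '{"name": "Zo\u00eb"}';
const bytes = encoder.encode(source);

console.log(source.length); // 15
console.log(bytes.length);  // 16

// Hypothetical helper: recover text from a half-open [start, end) byte range.
function sliceBytes(input: Uint8Array, start: number, end: number): string {
  return decoder.decode(input.subarray(start, end));
}

// The quoted string value occupies bytes 9..14.
console.log(sliceBytes(bytes, 9, 15)); // the value with its quotes
console.log(source.slice(9, 15));      // one character too far: includes the closing brace
```

Slicing the original string with byte offsets silently drifts after the first multi-byte character, so decoding from the byte buffer is the safer habit.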

JSON conformance is all-or-nothing by design (RFC strictness): a malformed document produces a single-node error tree whose root is JsonKind.Error, with the node’s range pointing at the byte where parsing failed.

// Valid — Document → True
parseJson(encoder.encode('true'));
// Invalid — the ROOT is the Error node (never throws)
const tree = parseJson(encoder.encode('{"a": 1, broken'));
console.log(tree.root.kind === JsonKind.Error); // true
const [errStart] = tree.root.range;
console.log('failed at byte offset', errStart); // 9 — start of the offending token

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| strict | boolean | undefined | Reserved for future use. Currently has no effect. |

import { parseJson, JsonKind } from '@lanexio/parser-grammar-json';
const encoder = new TextEncoder();
const tree = parseJson(encoder.encode('[1,]')); // trailing comma — RFC 8259 disallows
if (tree.root.kind === JsonKind.Error) {
  const [at] = tree.root.range;
  console.log('invalid JSON — parse failed at byte', at);
}

parseJson never throws. Unlike the markup grammars (which recover and embed error nodes inside a larger tree), JSON is strict: any violation yields the single-node Error root above. The error node’s range carries the failure offset, so diagnostics can point at the exact spot.
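For user-facing diagnostics, a raw byte offset is usually turned into a line and column. A minimal sketch of that conversion, independent of the package: `positionAt` and `Position` are hypothetical names, and the column is counted in bytes, not characters.

```typescript
// Sketch only: map a byte offset to a 1-based line and 1-based byte column.
interface Position {
  line: number;
  column: number;
}

function positionAt(bytes: Uint8Array, offset: number): Position {
  // Clamp so that out-of-range offsets still yield a usable position.
  const end = Math.min(Math.max(offset, 0), bytes.length);
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    if (bytes[i] === 0x0a) { // "\n"
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: end - lineStart + 1 };
}

const text = '{\n  "a": 1,\n  broken';
const input = new TextEncoder().encode(text);
const failureOffset = text.indexOf('broken'); // ASCII-only input, so index equals byte offset

console.log(positionAt(input, failureOffset)); // { line: 3, column: 3 }
```

In real use, the offset would come from the start of the Error node's range rather than from `indexOf`.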

Member nodes carry field metadata: the key string is tagged key and the member’s value node is tagged value, so childByField() gives you direct key/value navigation — the most common JSON-tree operation:

import { parseJson, JsonKind } from '@lanexio/parser-grammar-json';
const encoder = new TextEncoder();
const tree = parseJson(encoder.encode('{"name": "Alice", "id": 42}'));
const obj = tree.root.child(0); // Document → Object
for (const member of obj!.children()) { // Member nodes
  const key = member.childByField('key'); // String node (includes quotes)
  const value = member.childByField('value'); // String / Number / Object / …
  console.log(key?.text, '→', value?.text);
  // "name" → "Alice"
  // "id" → 42
}

JsonField exposes the numeric ids (JsonField.Key, JsonField.Value) for hot loops that compare node.fieldId directly.

import { JsonKind, type JsonKindType } from '@lanexio/parser-grammar-json';
// JsonKind is a const object (the 'as const' pattern), not a TypeScript enum.
const kind: JsonKindType = JsonKind.Array;
// Node kind IDs (0x0800 block)
JsonKind.Document; // 0x0800
JsonKind.Object; // 0x0801
JsonKind.Member; // 0x0802
JsonKind.Array; // 0x0803
JsonKind.String; // 0x0804
JsonKind.Number; // 0x0805
JsonKind.True; // 0x0806
JsonKind.False; // 0x0807
JsonKind.Null; // 0x0808
JsonKind.Whitespace; // 0x0809 — reserved; whitespace is skipped, never emitted
JsonKind.Error; // 0x080a

JsonKind values are stable across versions. Never use raw numbers — always reference JsonKind.<name>.
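The const-object pattern mentioned above can be reproduced in a few lines. The sketch uses a hypothetical `DemoKind` with a subset of the ids listed earlier; it shows how a value union and a names-by-id lookup can be derived from one `as const` object, and is not a copy of the package's source.

```typescript
// Sketch only: DemoKind is a hypothetical stand-in with a subset of the ids.
const DemoKind = {
  Document: 0x0800,
  Object: 0x0801,
  Array: 0x0803,
  Error: 0x080a,
} as const;

// Union of the literal values: 0x0800 | 0x0801 | 0x0803 | 0x080a.
type DemoKindType = (typeof DemoKind)[keyof typeof DemoKind];

// Reverse lookup built once from the same object, so names and ids cannot drift apart.
const demoNames: Record<number, string> = {};
for (const [name, id] of Object.entries(DemoKind)) {
  demoNames[id] = name;
}
const DEMO_KIND_NAMES_BY_ID: Readonly<Record<number, string>> = demoNames;

function describeKind(kind: DemoKindType): string {
  return DEMO_KIND_NAMES_BY_ID[kind] ?? `unknown(0x${kind.toString(16)})`;
}

console.log(describeKind(DemoKind.Array)); // Array
console.log(describeKind(DemoKind.Error)); // Error
```

Unlike a TypeScript enum, the object compiles to a plain literal with no extra runtime scaffolding, and the derived union still rejects arbitrary numbers at compile time.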

| Export | Type | Description |
| --- | --- | --- |
| parseJson | (bytes: Uint8Array, options?: ParseJsonOptions) => LexTree | Parse JSON RFC 8259. Never throws. |
| ParseJsonOptions | { readonly strict?: boolean } | Options for parseJson. |
| JsonKind | const object | Numeric kind IDs for all JSON node types (0x0800 block). |
| JsonKindType | type union | Union of all JsonKind values. |
| JSON_KIND_NAMES_BY_ID | Readonly<Record<number, string>> | Kind-name lookup by numeric ID. |
| JsonField | const object | Field role ids: Key, Value (and None). |
| JSON_FIELD_NAMES_BY_ID | Readonly<Record<number, string>> | Field-name lookup; wired into every parsed tree. |
| jsonGrammar | LanexioParserPureGrammar | Grammar descriptor; pass to createParser from @lanexio/parser. |

| Metric | Value |
| --- | --- |
| JSONTestSuite y_ (valid) | 95 / 95 |
| JSONTestSuite n_ (rejected) | 183 / 183 |
| JSONTestSuite i_ (never-throw) | 33 / 33 |
| JSONTestSuite total | 311 / 311 |
| verify:no-throw | 59 entry points |
| Fuzz soak (300s) | 11.7M rounds / 0 crashes |

  • @lanexio/parser-core — shared buffer protocol, LexTree, LexNode, LexCursor.
  • @lanexio/parser — unified entry point; pass this package’s grammar descriptor to createParser for a string-accepting, never-throwing parser handle.