Bimodal AST
The Lanexio Parser AST has two layers that work together. The universal base layer handles storage and traversal. The grammar-specific typed layer gives you named kind constants and type-safe field access.
Layer 1: Universal base
Every node in every grammar is a LexNode. The universal layer is grammar-agnostic. It stores and traverses nodes, but it does not know whether a node is an HTML element, a Markdown heading, or any other domain concept.
The universal base is implemented in @lanexio/parser-core. It provides:
- LexTree: the parsed result, wrapping a flat ArrayBuffer
- LexNode: a view into a 16-byte record in that buffer
- LexCursor: a stateful cursor for efficient traversal
You can use the universal base alone if you want to write grammar-neutral code, but most application code also imports the grammar-specific layer for named kind constants.
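To make the "view into a 16-byte record" idea concrete, here is a minimal sketch of how such a view could be read from a flat ArrayBuffer with a DataView. The record layout (kind, start, end, parent as four little-endian 32-bit integers) and the names NodeView and writeRecord are assumptions made for illustration; the actual layout used by @lanexio/parser-core is not described on this page.

```typescript
// Hypothetical layout, NOT the real parser-core format:
// bytes 0-3 kind, 4-7 start offset, 8-11 end offset, 12-15 parent index.
const RECORD_SIZE = 16;

// A lightweight view: it holds no node data, only a buffer reference and an index.
class NodeView {
  readonly view: DataView;
  readonly index: number;

  constructor(view: DataView, index: number) {
    this.view = view;
    this.index = index;
  }

  get kind(): number {
    return this.view.getUint32(this.index * RECORD_SIZE, true);
  }
  get start(): number {
    return this.view.getUint32(this.index * RECORD_SIZE + 4, true);
  }
  get end(): number {
    return this.view.getUint32(this.index * RECORD_SIZE + 8, true);
  }
  get parent(): number {
    return this.view.getUint32(this.index * RECORD_SIZE + 12, true);
  }
}

// Helper that fills one record; a real parser would write these while parsing.
function writeRecord(
  view: DataView,
  index: number,
  kind: number,
  start: number,
  end: number,
  parent: number,
): void {
  const base = index * RECORD_SIZE;
  view.setUint32(base, kind, true);
  view.setUint32(base + 4, start, true);
  view.setUint32(base + 8, end, true);
  view.setUint32(base + 12, parent, true);
}

// Two records in one flat buffer: a document node and an element node.
const buffer = new ArrayBuffer(2 * RECORD_SIZE);
const records = new DataView(buffer);
writeRecord(records, 0, 1, 0, 12, 0);
writeRecord(records, 1, 2, 0, 12, 0);

const secondNode = new NodeView(records, 1);
```

Because each view only stores a buffer reference and an index, creating one allocates no per-node data, which is one plausible reason to keep nodes in a flat buffer.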
Layer 2: Typed projections (per grammar)
Each grammar pack adds a typed projection on top of the universal base. The projection is a set of const objects generated from the Zig grammar definition.
For @lanexio/parser-grammar-html, the projection is HtmlKind, HtmlField, and HTML_FIELD_NAMES_BY_ID. For @lanexio/parser-grammar-markdown, it is MdKind, MdField, and MD_FIELD_NAMES_BY_ID.
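As a rough illustration of how a field constant object and a names-by-id table could relate, the sketch below builds an id-to-name lookup from a const object. The field names (Name, Value, Body), the Map-based table, and the fieldName helper are all hypothetical; the real members of HtmlField and the real shape of HTML_FIELD_NAMES_BY_ID are not shown on this page.

```typescript
// Hypothetical field ids, standing in for a generated *Field object.
const DemoField = {
  Name: 1,
  Value: 2,
  Body: 3,
} as const;

// Build an id -> name lookup, in the spirit of a *_FIELD_NAMES_BY_ID table.
const DEMO_FIELD_NAMES_BY_ID = new Map<number, string>();
for (const [name, id] of Object.entries(DemoField)) {
  DEMO_FIELD_NAMES_BY_ID.set(id, name);
}

// Resolve a numeric field id to a readable name, for logging or debugging.
function fieldName(id: number): string {
  return DEMO_FIELD_NAMES_BY_ID.get(id) ?? `unknown(${id})`;
}
```

A table like this is mainly useful for diagnostics: application code compares against the named constants, and the reverse lookup turns a raw id back into something readable.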
HtmlKind example
    import { parseHtml, HtmlKind } from '@lanexio/parser-grammar-html';

    const encoder = new TextEncoder();
    const tree = parseHtml(encoder.encode('<p>Hello</p>'));

    // Pre-order walk: visit the node, descend if possible, otherwise move
    // to the next sibling, climbing to the parent when siblings run out.
    const cursor = tree.cursor();
    visit: while (true) {
      if (cursor.current.kind === HtmlKind.Element) {
        console.log('element node at', cursor.current.range);
      }
      if (cursor.gotoFirstChild()) continue;
      while (!cursor.gotoNextSibling()) {
        // Back at the root with nothing left to visit: stop.
        if (!cursor.gotoParent()) break visit;
      }
    }

HtmlKind.Element is a stable numeric constant. The numeric values are generated from zig/grammars/html/src/kinds.zig and never change within a major version.
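The traversal loop above can be tried without the real parser by running it against a small in-memory stand-in. In this sketch, CursorLike, MockNode, MockCursor, and collectKinds are all hypothetical names; only the three method names gotoFirstChild, gotoNextSibling, and gotoParent, plus current.kind, are taken from the example, and the kind numbers are illustrative.

```typescript
// Minimal cursor surface used by the loop (an assumption based on the example).
interface CursorLike {
  readonly current: { kind: number };
  gotoFirstChild(): boolean;
  gotoNextSibling(): boolean;
  gotoParent(): boolean;
}

interface MockNode {
  kind: number;
  children: MockNode[];
}

// In-memory cursor: a stack of (sibling list, position) pairs.
class MockCursor implements CursorLike {
  private path: Array<{ siblings: MockNode[]; index: number }>;

  constructor(root: MockNode) {
    this.path = [{ siblings: [root], index: 0 }];
  }

  get current(): MockNode {
    const top = this.path[this.path.length - 1];
    return top.siblings[top.index];
  }

  gotoFirstChild(): boolean {
    const children = this.current.children;
    if (children.length === 0) return false;
    this.path.push({ siblings: children, index: 0 });
    return true;
  }

  gotoNextSibling(): boolean {
    const top = this.path[this.path.length - 1];
    if (top.index + 1 >= top.siblings.length) return false;
    top.index += 1;
    return true;
  }

  gotoParent(): boolean {
    if (this.path.length <= 1) return false;
    this.path.pop();
    return true;
  }
}

// Same loop shape as the example, collecting kinds in visit order.
function collectKinds(cursor: CursorLike): number[] {
  const kinds: number[] = [];
  visit: while (true) {
    kinds.push(cursor.current.kind);
    if (cursor.gotoFirstChild()) continue;
    while (!cursor.gotoNextSibling()) {
      if (!cursor.gotoParent()) break visit;
    }
  }
  return kinds;
}

// Document(1) > Element(2) > [Text(3), Element(2) > Text(3)]
const text = (): MockNode => ({ kind: 3, children: [] });
const inner: MockNode = { kind: 2, children: [text()] };
const outer: MockNode = { kind: 2, children: [text(), inner] };
const root: MockNode = { kind: 1, children: [outer] };

const visited = collectKinds(new MockCursor(root));
```

The loop visits each node exactly once in document order and needs no recursion or per-node allocation, which fits a stateful cursor design.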
The as-const pattern
Lanexio Parser uses const objects with typeof union types instead of TypeScript enum. TypeScript enums emit extra runtime code (numeric enums also generate a reverse mapping from value to name), and const enums are erased at compile time. Both behaviors can create subtle bugs in strict code.
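To show the runtime difference, the sketch below writes out by hand the object shape that a numeric enum with two members produces at runtime and compares it with a plain const object. The names EnumShapedKind, Kind, KindValue, and isKind are stand-ins for illustration and are not part of any Lanexio package.

```typescript
// Shape of the runtime object emitted for `enum Kind { Document = 1, Element = 2 }`,
// written out by hand: names map to values AND values map back to names.
const EnumShapedKind: Record<string, string | number> = {
  Document: 1,
  Element: 2,
  1: 'Document',
  2: 'Element',
};

// The const-object alternative: names map to values, nothing else.
const Kind = {
  Document: 1,
  Element: 2,
} as const;

// Union of the literal values, here 1 | 2.
type KindValue = typeof Kind[keyof typeof Kind];

// Enumerating the enum-shaped object also yields the reverse-mapping keys.
const enumKeys = Object.keys(EnumShapedKind);
const constKeys = Object.keys(Kind);

// Runtime guard that narrows an arbitrary number to the union type.
function isKind(value: number): value is KindValue {
  return (Object.values(Kind) as number[]).includes(value);
}
```

Iterating the enum-shaped object sees four keys where the const object has two, which is the kind of surprise the const-object pattern avoids.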
    // What the generated code looks like
    export const HtmlKind = {
      Document: 1,
      Element: 2,
      Text: 3,
      // ...
    } as const;

    export type HtmlKindValue = typeof HtmlKind[keyof typeof HtmlKind];

When you need to type-check a kind value, use the union type:
    import { HtmlKind } from '@lanexio/parser-grammar-html';

    type HtmlKindValue = typeof HtmlKind[keyof typeof HtmlKind];

    function describeKind(kind: HtmlKindValue): string {
      switch (kind) {
        case HtmlKind.Element:
          return 'Element';
        case HtmlKind.Text:
          return 'Text';
        default:
          return 'Other';
      }
    }

How types are generated
Grammar kind constants are defined in Zig, in files like zig/grammars/html/src/kinds.zig. The pnpm codegen command generates the TypeScript .generated.ts file from the Zig source.
Never edit .generated.ts files by hand. They are overwritten by pnpm codegen. If you need to add or change a kind, edit kinds.zig and run pnpm codegen.
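For intuition only, here is a sketch of the emitting half of such a generator: given a list of kind names and values, it renders a const object and its union type as TypeScript source text. The KindEntry type and renderKindModule function are hypothetical, and the sketch assumes the entries have already been extracted from the Zig source. How pnpm codegen actually reads kinds.zig is not described on this page.

```typescript
// One kind as the generator might see it after reading the grammar definition.
interface KindEntry {
  name: string;
  value: number;
}

// Render a module with a const object and its value union type.
function renderKindModule(objectName: string, entries: KindEntry[]): string {
  const members = entries
    .map((entry) => `  ${entry.name}: ${entry.value},`)
    .join('\n');
  return [
    '// Generated file. Do not edit by hand.',
    `export const ${objectName} = {`,
    members,
    '} as const;',
    '',
    `export type ${objectName}Value = typeof ${objectName}[keyof typeof ${objectName}];`,
    '',
  ].join('\n');
}

const generatedSource = renderKindModule('HtmlKind', [
  { name: 'Document', value: 1 },
  { name: 'Element', value: 2 },
  { name: 'Text', value: 3 },
]);
```

Because the output is a pure function of the input entries, rerunning the generator on an unchanged definition yields an identical file, which is what makes committing the generated file alongside its source practical.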
Adding a new kind (grammar authors)
- Add the kind to the appropriate kinds.zig.
- Run pnpm codegen.
- Commit both kinds.zig and the generated .generated.ts file.
- Add tests and a corpus entry for the new kind.
- Run pnpm verify:no-throw to confirm the parse path still holds.
See AGENTS.md §7 for the full protocol.
Grammar packs are independent
Each grammar pack is its own npm package. Grammar packs depend only on parser-core. They never depend on each other. This means you can use grammar-html without loading grammar-markdown, and vice versa.
The lazy-load story depends on this independence. If a grammar pack imported another, bundlers could not tree-shake unused grammars.
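One way an application might take advantage of this independence is a small registry that loads each grammar on first use and caches the result. The sketch below is an assumption about application code, not part of Lanexio Parser: createLazyRegistry is a hypothetical helper, and the stub loaders stand in for dynamic imports such as () => import('@lanexio/parser-grammar-html').

```typescript
type Loader<T> = () => Promise<T>;

// Returns a load function that invokes each loader at most once.
function createLazyRegistry<T>(loaders: Record<string, Loader<T>>) {
  const cache = new Map<string, Promise<T>>();
  return function load(name: string): Promise<T> {
    const cached = cache.get(name);
    if (cached) return cached;
    const loader = loaders[name];
    if (!loader) {
      return Promise.reject(new Error(`unknown grammar: ${name}`));
    }
    const pending = loader();
    cache.set(name, pending);
    return pending;
  };
}

// Stub loaders that count invocations; real code would use dynamic import().
let htmlLoads = 0;
let markdownLoads = 0;
const loadGrammar = createLazyRegistry<{ name: string }>({
  html: () => {
    htmlLoads += 1;
    return Promise.resolve({ name: 'html' });
  },
  markdown: () => {
    markdownLoads += 1;
    return Promise.resolve({ name: 'markdown' });
  },
});

// Requesting HTML twice triggers one load; Markdown is never touched.
const firstRequest = loadGrammar('html');
const secondRequest = loadGrammar('html');
```

Caching the promise rather than the resolved module means concurrent callers share a single in-flight load.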