Stability Guarantees
Lanexio Parser makes three public stability commitments. They are tested in CI, backed by fuzz harnesses, and enforced by architectural constraints. Breaking any one of them is a P0 bug.
Guarantee 1: Never-Throw Integrity
`parse()`, `reparse()`, the streaming parse methods, and the serializers (`serializeHtml`, `serializeMarkdown`) never throw. Malformed input produces error nodes in the AST. Exceptions never leave the parse boundary.
```ts
import { parseHtml } from '@lanexio/parser-grammar-html';

// This never throws, even on deeply malformed input.
const encoder = new TextEncoder();
const tree = parseHtml(encoder.encode('<<<<<\x00\xFF\xFE'));

// Check for parse errors in the tree.
for (const node of tree.root.children()) {
  if (node.hasError) {
    console.log('parse error at', node.range);
  }
}
```
What this means
- You do not need to wrap `parseHtml` or `parseMarkdown` in a try-catch.
- Malformed input is always safe to pass to the parser.
- The return value is always a valid `LexTree`.
What this does not mean
- The returned tree may contain `LexError` nodes. It is your responsibility to detect and handle them.
- Third-party code you call with the parse output (e.g., `innerHTML = serializeHtml(tree)`) may behave unexpectedly on malformed input. That is outside the parse boundary.
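One way to take on that responsibility is to scan the whole tree for error nodes before handing it to downstream code. The sketch below is illustrative only: `NodeLike`, `ByteRange`, `collectErrorRanges`, and `fakeNode` are hypothetical names, the shape of `range` is assumed, and `children()` is assumed to return an array (only `hasError`, `range`, and `children()` appear in the example above). It runs against stand-in nodes rather than the real parser.

```typescript
// Hypothetical structural types. The `range` shape and the array return
// type of `children()` are assumptions made for this sketch.
interface ByteRange { start: number; end: number }
interface NodeLike {
  hasError: boolean;
  range: ByteRange;
  children(): ReadonlyArray<NodeLike>;
}

// Walk the tree with an explicit stack (no recursion, so deep nesting
// cannot overflow the call stack) and collect the range of every node
// flagged as an error.
function collectErrorRanges(root: NodeLike): ByteRange[] {
  const errors: ByteRange[] = [];
  const stack: NodeLike[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as NodeLike;
    if (node.hasError) errors.push(node.range);
    for (const child of node.children()) stack.push(child);
  }
  return errors;
}

// Stand-in node factory so the sketch runs without the real parser.
function fakeNode(start: number, end: number, hasError: boolean, kids: NodeLike[] = []): NodeLike {
  return { hasError, range: { start, end }, children: () => kids };
}

const sampleRoot = fakeNode(0, 10, false, [
  fakeNode(0, 4, false),
  fakeNode(4, 10, false, [fakeNode(5, 7, true)]),
]);
const errorRanges = collectErrorRanges(sampleRoot);

// Only pass the tree on (for example to a serializer feeding innerHTML)
// when this is true, or after handling each reported range.
const safeToRender = errorRanges.length === 0;
```

With a real tree you would presumably call `collectErrorRanges(tree.root)`; whether the real node type matches `NodeLike` exactly is not confirmed here.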
Guarantee 2: Panic-Free Byte Sequence Immunity
No byte sequence of any length causes a panic, unhandled exception, or silent memory corruption.
This guarantee is backed by:
- Iterative state machines: no recursion in parsers or serializers; 50,000-level-deep nesting is covered by regression tests for both `serializeHtml` and `serializeMarkdown`
- Structural forward-progress guards in loop-driven state machines (a dispatch gap degrades to an error node plus a one-byte advance, never a hang)
- Wrapping arithmetic operators in all Zig hot paths
- Bounds checking before every slice index on user input
- 24-hour continuous fuzz soak on every release
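The forward-progress guard from the list above can be sketched with a toy byte tokenizer. This is not the Lanexio state machine, which is written in Zig; `ToyToken`, `isTextByte`, and `toyTokenize` are hypothetical names, and the dispatch gap is left in deliberately so the guard has something to catch.

```typescript
// Toy illustration only; every name here is hypothetical.
type ToyToken = { kind: 'lt' | 'gt' | 'text' | 'error'; start: number; end: number };

// Printable ASCII other than '<' and '>'.
function isTextByte(byte: number): boolean {
  return byte >= 0x20 && byte < 0x7f && byte !== 0x3c && byte !== 0x3e;
}

function toyTokenize(input: Uint8Array): ToyToken[] {
  const tokens: ToyToken[] = [];
  let pos = 0;
  // The loop condition is the bounds check for every input[pos] below.
  while (pos < input.length) {
    const before = pos;
    const byte = input[pos];
    if (byte === 0x3c) {
      tokens.push({ kind: 'lt', start: pos, end: pos + 1 });
      pos += 1;
    } else if (byte === 0x3e) {
      tokens.push({ kind: 'gt', start: pos, end: pos + 1 });
      pos += 1;
    } else if (isTextByte(byte)) {
      let end = pos;
      while (end < input.length && isTextByte(input[end])) end += 1;
      tokens.push({ kind: 'text', start: pos, end });
      pos = end;
    }
    // Deliberate dispatch gap: control bytes and bytes >= 0x7f match no
    // branch above. The guard turns that into an error token plus a
    // one-byte advance, so the loop can never stall.
    if (pos === before) {
      tokens.push({ kind: 'error', start: pos, end: pos + 1 });
      pos += 1;
    }
  }
  return tokens;
}

// '<<<<<' followed by the raw bytes 0x00 0xFF 0xFE.
const toyInput = new Uint8Array([0x3c, 0x3c, 0x3c, 0x3c, 0x3c, 0x00, 0xff, 0xfe]);
const toyTokens = toyTokenize(toyInput);
```

Because every iteration either consumes input in a branch or falls through to the guard, the loop runs at most `input.length` times, and the emitted tokens cover every input byte exactly once.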
24-hour fuzz soak results (v1.0.0)
At tag v1.0.0, a 24-hour continuous fuzz soak ran 13 harnesses across all grammar packs:
| Harness | Cases / 24h | Result |
|---|---|---|
| core-soak (parse) | 594,620,674 | 0 crashes |
| html-char-eof | 3,829,498,486 | 0 crashes |
| html-fragment-ns | 3,278,712,434 | 0 crashes |
| html-frameset | 3,185,115,065 | 0 crashes |
| html-aa-byte-range | 1,899,683,016 | 0 crashes |
| html-foster-parent | 1,413,404,061 | 0 crashes |
| html-aa-tree-shape | 1,179,886,446 | 0 crashes |
| json-parse | 1,065,749,199 | 0 throws |
| yaml-parse | 970,495,753 | 0 throws |
| html-orphan-p | 823,743,573 | 0 crashes |
| html-rawtext-nul | 604,480,359 | 0 violations |
| css-parse | 403,933,293 | 0 throws |
| html-noscript | 247,989,717 | 0 violations |
| Total | 19,497,312,076 | 0 failures |
What this means
- You can pass any arbitrary byte sequence (null bytes, invalid UTF-8, binary data, adversarial inputs) to `parseHtml` or `parseMarkdown` without crashing the process.
- The output is always a valid `LexTree` with the appropriate `LexError` nodes.
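If you want to smoke-test this in your own suite, a small seeded random-bytes loop is enough to get started. The sketch below is an assumption-laden miniature, not one of the project's fuzz harnesses: `makeRng`, `smokeFuzz`, and `stubParse` are hypothetical names, and `stubParse` stands in for the real parser so the block runs on its own.

```typescript
// Small deterministic PRNG (xorshift32) so a failing case can be replayed
// from its seed. A zero seed is replaced with 1 because xorshift cannot
// leave the all-zero state.
function makeRng(seed: number): () => number {
  let state = (seed >>> 0) || 1;
  return () => {
    state ^= state << 13;
    state >>>= 0;
    state ^= state >>> 17;
    state ^= state << 5;
    state >>>= 0;
    return state;
  };
}

// Feed `cases` random byte arrays (0 to 63 bytes each) to `parse` and
// return the indices of the cases that threw. An empty result means no
// exception escaped.
function smokeFuzz(parse: (bytes: Uint8Array) => unknown, cases: number, seed: number): number[] {
  const rng = makeRng(seed);
  const failingCases: number[] = [];
  for (let i = 0; i < cases; i++) {
    const size = rng() % 64;
    const bytes = new Uint8Array(size);
    for (let j = 0; j < size; j++) bytes[j] = rng() & 0xff;
    try {
      parse(bytes);
    } catch {
      failingCases.push(i);
    }
  }
  return failingCases;
}

// Stand-in parser (hypothetical) so the sketch is self-contained.
const stubParse = (bytes: Uint8Array) => ({ byteLength: bytes.length });
const failures = smokeFuzz(stubParse, 1000, 42);
```

In a real test you would pass `parseHtml` or `parseMarkdown` in place of `stubParse` and assert that the returned list is empty. A loop like this complements the 24-hour soak; it does not replace coverage-guided fuzzing.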
What this does not mean
- The output may be unusable for its intended purpose if the input is deeply corrupted. The parser does its best, but tree shape is undefined for extreme malformations.
- Performance is not guaranteed on adversarial input. The parser may run slower on inputs designed to maximize backtracking.
Guarantee 3: Deterministic 3-Token Recovery Window
After a syntax error, the parser resyncs to a valid construct within 3 tokens.
This means:
- Parse errors do not cascade indefinitely.
- The tree shape after an error is bounded and predictable.
- For any given input, `parseHtml(input)` always produces the same `LexTree` (deterministic output).
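To make "local and bounded" concrete, here is a toy panic-mode recovery with a three-token skip budget. It is only a sketch of the general idea under invented assumptions: the grammar (a list of `word ;` items), the names `ToyItem`, `startsItem`, and `parseItems`, and the exact skipping rule are all hypothetical. The real recovery paths are grammar-specific and may differ.

```typescript
// Toy grammar: a document is a list of items, each item is `word ;`.
type ToyItem =
  | { kind: 'item'; word: string }
  | { kind: 'error'; skipped: string[] };

const RECOVERY_WINDOW = 3;

// True when a well-formed `word ;` item begins at `pos`.
function startsItem(tokens: string[], pos: number): boolean {
  return pos + 1 < tokens.length && tokens[pos] !== ';' && tokens[pos + 1] === ';';
}

function parseItems(tokens: string[]): ToyItem[] {
  const out: ToyItem[] = [];
  let pos = 0;
  while (pos < tokens.length) {
    if (startsItem(tokens, pos)) {
      out.push({ kind: 'item', word: tokens[pos] });
      pos += 2;
      continue;
    }
    // Syntax error: skip at least one token (forward progress) and at most
    // RECOVERY_WINDOW tokens, stopping early as soon as a valid item
    // begins at the current position.
    const skipped: string[] = [];
    do {
      skipped.push(tokens[pos]);
      pos += 1;
    } while (pos < tokens.length && skipped.length < RECOVERY_WINDOW && !startsItem(tokens, pos));
    out.push({ kind: 'error', skipped });
  }
  return out;
}

// `b c d` is garbage between two valid items.
const recovered = parseItems(['a', ';', 'b', 'c', 'd', 'e', ';', 'f', ';']);
// recovered: item a, error [b, c, d], item e, item f
```

In this toy, no error node ever swallows more than three tokens, and because the function has no hidden state, the same token list always yields the same result.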
What this means
- Calling `parseHtml(input)` twice with the same `input` always returns an equivalent tree.
- Error recovery is local and bounded.
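"Equivalent" here is read as structural equality, not object identity: two calls may return distinct objects with the same shape. A minimal sketch of such a check follows, assuming a hypothetical `TreeNodeLike` shape (the `range` fields and an array-returning `children()` are assumptions) and a stand-in parser; `fingerprint`, `isDeterministic`, and `standInParse` are illustrative names, not part of the library.

```typescript
// Hypothetical node shape; only `hasError`, `range`, and `children()`
// are taken from the examples in this page.
interface TreeNodeLike {
  hasError: boolean;
  range: { start: number; end: number };
  children(): ReadonlyArray<TreeNodeLike>;
}

// Preorder fingerprint built with an explicit stack. Recording the depth
// of every node makes the string unique per tree shape.
function fingerprint(root: TreeNodeLike): string {
  const parts: string[] = [];
  const stack: Array<{ node: TreeNodeLike; depth: number }> = [{ node: root, depth: 0 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop() as { node: TreeNodeLike; depth: number };
    parts.push(`${depth}:${node.range.start}-${node.range.end}:${node.hasError ? 'E' : 'ok'}`);
    const kids = node.children();
    // Push in reverse so children are visited left to right.
    for (let i = kids.length - 1; i >= 0; i--) stack.push({ node: kids[i], depth: depth + 1 });
  }
  return parts.join('|');
}

// Parse the same input twice and compare the two fingerprints.
function isDeterministic(parse: (input: Uint8Array) => TreeNodeLike, input: Uint8Array): boolean {
  return fingerprint(parse(input)) === fingerprint(parse(input));
}

// Stand-in parser: one child per byte, flagged as an error when >= 0x80.
function standInParse(input: Uint8Array): TreeNodeLike {
  const kids: TreeNodeLike[] = [];
  for (let i = 0; i < input.length; i++) {
    kids.push({ hasError: input[i] >= 0x80, range: { start: i, end: i + 1 }, children: () => [] });
  }
  return { hasError: false, range: { start: 0, end: input.length }, children: () => kids };
}

const sampleBytes = new Uint8Array([0x3c, 0x00, 0xff]);
const sameEveryTime = isDeterministic(standInParse, sampleBytes);
```

With the real parser you would presumably pass something like `(bytes) => parseHtml(bytes).root`, provided the real node type is compatible with `TreeNodeLike`.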
What this does not mean
- The recovered subtree is necessarily semantically correct. The parser resumes at a valid grammar point, but the semantic meaning of the recovered tree depends on the document.
- The 3-token window applies to spec-defined recovery paths. Certain error classes (e.g., deeply nested malformed trees) may have documented exceptions. See `zig/grammars/*/recovery.zig` for grammar-specific notes.
Summary
| Guarantee | What it covers | How it is enforced |
|---|---|---|
| Never-throw | `parse()`, `reparse()`, streaming methods, serializers | `pnpm verify:no-throw`, Vitest never-throw harness |
| Panic-free | Any byte sequence input | Zig fuzz harnesses, `zig build fuzz` |
| 3-token recovery | Bounded error recovery in parse trees | Corpus entries in `corpus/malformed/`, `recovery.zig` |