Building Recursive Descent Parsers for JSON
Recursive descent parsing is a good fit for JSON because the grammar is small, nested, and easy to dispatch with one token or one character of lookahead. The hard part is not the recursion itself. The hard part is being strict enough to accept real JSON and reject JavaScript-like input that only looks close.
If you are building a parser for learning, custom diagnostics, or a transformation pipeline, focus on the exact rules first: JSON keys must be double-quoted, trailing commas are invalid, numbers have tighter syntax than JavaScript literals, and the root can be any JSON value, not only an object or array. That gives you a parser that behaves like users expect when they paste data into a formatter or validator.
Start from the JSON rules that matter
The current JSON standard is RFC 8259, aligned with ECMA-404. For a hand-written recursive descent parser, this is the core grammar you actually need:
value = object / array / string / number / "true" / "false" / "null"
object = "{" [ member *( "," member ) ] "}"
member = string ":" value
array = "[" [ value *( "," value ) ] "]"
- A valid JSON document can be 42, true, or "hello", not just {...} or [...].
- Object member names should be unique. In practice, parsers differ on duplicates, so it is better to choose a policy explicitly than to leave it accidental.
- Trailing commas are invalid in both objects and arrays. A parser that accepts them is parsing a JavaScript-style extension, not strict JSON.
- JSON numbers do not allow +1, 01, NaN, Infinity, or a decimal point without following digits.
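The number rule in particular is easy to get subtly wrong, so it can help to see it in isolation. The following is a minimal sketch that expresses the JSON number grammar as one anchored regular expression; `JSON_NUMBER` and `isJsonNumber` are illustrative names chosen here, not part of any library.

```typescript
// Sketch: the JSON number grammar as a single anchored regex.
// Shape: optional minus, integer part without leading zeros,
// optional fraction, optional exponent.
const JSON_NUMBER = /^-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?$/;

// Hypothetical helper: true only when the whole text is one JSON number.
function isJsonNumber(text: string): boolean {
  return JSON_NUMBER.test(text);
}

// Forms the grammar accepts.
const acceptedNumbers = ["0", "-0", "42", "-3.14", "1e10", "2.5E-3"];

// JavaScript-style forms the grammar rejects.
const rejectedNumbers = ["+1", "01", "NaN", "Infinity", "1.", ".5", "-", "1e"];
```

A hand-written scanner reports better error positions than a regex does, but a pattern like this is a convenient cross-check when testing the scanner.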
The parser state can stay very small
For JSON, you do not need a large parser framework. A minimal recursive descent parser usually keeps only:
- The input string and a current cursor position.
- A few helpers such as peek(), skipWhitespace(), and expectChar().
- One parsing function per grammar concept: parseValue(), parseObject(), parseArray(), parseString(), and parseNumber().
- A consistent way to produce syntax errors with the current position.
A separate tokenizer is still a valid design, especially if you want line and column tracking, better error recovery, or streaming behavior. But for a strict JSON parser, a direct character scanner is often simpler and easier to reason about.
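For comparison, a separate tokenizer stage might look roughly like the sketch below. `Token` and `tokenize` are hypothetical names, and the sketch makes a simplifying assumption: strings and numbers are captured as raw lexemes only, so validating escapes and number syntax is left to a later stage.

```typescript
// Sketch of a standalone tokenizer; Token and tokenize are illustrative names.
type Token = {
  kind: "punct" | "string" | "number" | "literal";
  text: string;  // raw lexeme, exactly as it appears in the input
  index: number; // zero-based start offset, useful for error messages
};

function tokenize(input: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < input.length) {
    const ch = input.charAt(i);
    // JSON whitespace is limited to these four characters.
    if (ch === " " || ch === "\n" || ch === "\r" || ch === "\t") {
      i++;
      continue;
    }
    const start = i;
    if ("{}[]:,".includes(ch)) {
      i++;
      tokens.push({ kind: "punct", text: ch, index: start });
    } else if (ch === '"') {
      i++; // opening quote
      while (i < input.length && input.charAt(i) !== '"') {
        // A backslash also consumes the character it escapes.
        i += input.charAt(i) === "\\" ? 2 : 1;
      }
      if (i >= input.length) {
        throw new SyntaxError("Unterminated string at index " + start);
      }
      i++; // closing quote
      tokens.push({ kind: "string", text: input.slice(start, i), index: start });
    } else if (ch === "-" || (ch >= "0" && ch <= "9")) {
      i++;
      // Assumption: grab number-like characters now, validate the lexeme later.
      while (i < input.length && /[0-9.eE+-]/.test(input.charAt(i))) i++;
      tokens.push({ kind: "number", text: input.slice(start, i), index: start });
    } else if (/[a-z]/.test(ch)) {
      while (i < input.length && /[a-z]/.test(input.charAt(i))) i++;
      const text = input.slice(start, i);
      if (text !== "true" && text !== "false" && text !== "null") {
        throw new SyntaxError("Unexpected literal at index " + start);
      }
      tokens.push({ kind: "literal", text, index: start });
    } else {
      throw new SyntaxError("Unexpected character at index " + start);
    }
  }
  return tokens;
}
```

With this split, the parser proper consumes `Token` objects instead of characters, which is where line and column bookkeeping or streaming would naturally attach.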
How recursive descent maps to JSON
The top-level dispatcher is parseValue(). It looks at the next non-whitespace character and sends control to the matching rule:
- { starts an object.
- [ starts an array.
- " starts a string.
- A minus sign (-) or a digit starts a number.
- t, f, and n start the literal keywords.
Objects and arrays are recursive because their contents call back into parseValue(). That is the whole recursive descent pattern in JSON: containers recurse, leaf values terminate.
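The dispatch decision can be sketched on its own, separate from any parsing. `ValueKind` and `classifyValueStart` are names invented for this illustration. The point is that JavaScript-like starts such as a single quote, a plus sign, or the N of NaN simply have no branch.

```typescript
// Sketch: one-character lookahead dispatch, isolated from the parser.
type ValueKind = "object" | "array" | "string" | "number" | "true" | "false" | "null";

// Hypothetical helper: returns the rule a value would dispatch to,
// or undefined when no JSON value can start with this character.
function classifyValueStart(ch: string): ValueKind | undefined {
  switch (ch) {
    case "{":
      return "object";
    case "[":
      return "array";
    case '"':
      return "string";
    case "-":
      return "number";
    case "t":
      return "true";
    case "f":
      return "false";
    case "n":
      return "null";
    default:
      // Only a single ASCII digit starts a number; "+" and "." do not.
      return ch.length === 1 && ch >= "0" && ch <= "9" ? "number" : undefined;
  }
}
```

Here an `undefined` result corresponds to the "Expected a JSON value" error path in a full parser.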
A practical TypeScript implementation
The example below is still compact, but it is strict in the places that matter for real JSON input. It rejects trailing commas, enforces double-quoted keys, validates number syntax, and decodes standard escape sequences.
Strict JSON Parser (TypeScript)
type JsonValue =
  | null
  | boolean
  | number
  | string
  | JsonValue[]
  | { [key: string]: JsonValue };

class JsonParser {
  // Cursor into the input; every parse method advances it.
  private index = 0;

  constructor(private readonly input: string) {}

  // Entry point: parse one value, then require that only whitespace remains.
  parse(): JsonValue {
    const value = this.parseValue();
    this.skipWhitespace();
    if (!this.isAtEnd()) {
      throw this.error("Unexpected non-whitespace after root value");
    }
    return value;
  }

  // Dispatch on the first non-whitespace character.
  private parseValue(): JsonValue {
    this.skipWhitespace();
    const ch = this.peek();
    if (ch === "{") return this.parseObject();
    if (ch === "[") return this.parseArray();
    if (ch === '"') return this.parseString();
    if (ch === "-" || this.isDigit(ch)) return this.parseNumber();
    if (ch === "t") return this.parseLiteral("true", true);
    if (ch === "f") return this.parseLiteral("false", false);
    if (ch === "n") return this.parseLiteral("null", null);
    throw this.error("Expected a JSON value");
  }

  private parseObject(): { [key: string]: JsonValue } {
    const result: { [key: string]: JsonValue } = {};
    this.expectChar("{");
    this.skipWhitespace();
    if (this.peek() === "}") {
      this.index++;
      return result;
    }
    while (true) {
      this.skipWhitespace();
      if (this.peek() !== '"') {
        throw this.error("Object keys must be double-quoted strings");
      }
      const key = this.parseString();
      this.skipWhitespace();
      this.expectChar(":");
      const value = this.parseValue();
      if (Object.prototype.hasOwnProperty.call(result, key)) {
        throw this.error("Duplicate object key: " + key);
      }
      // defineProperty creates an own property even for the key "__proto__";
      // plain assignment would invoke the prototype setter instead.
      Object.defineProperty(result, key, {
        value,
        writable: true,
        enumerable: true,
        configurable: true,
      });
      this.skipWhitespace();
      if (this.peek() === "}") {
        this.index++;
        return result;
      }
      this.expectChar(",");
      this.skipWhitespace();
      if (this.peek() === "}") {
        throw this.error("Trailing commas are not allowed in objects");
      }
    }
  }

  private parseArray(): JsonValue[] {
    const result: JsonValue[] = [];
    this.expectChar("[");
    this.skipWhitespace();
    if (this.peek() === "]") {
      this.index++;
      return result;
    }
    while (true) {
      result.push(this.parseValue());
      this.skipWhitespace();
      if (this.peek() === "]") {
        this.index++;
        return result;
      }
      this.expectChar(",");
      this.skipWhitespace();
      if (this.peek() === "]") {
        throw this.error("Trailing commas are not allowed in arrays");
      }
    }
  }

  // Reads a double-quoted string and returns its decoded contents.
  private parseString(): string {
    this.expectChar('"');
    let result = "";
    while (!this.isAtEnd()) {
      const ch = this.input[this.index++];
      if (ch === '"') {
        return result;
      }
      if (ch === "\\") {
        result += this.parseEscapeSequence();
        continue;
      }
      // Characters below U+0020 must be escaped in JSON strings.
      if (ch < " ") {
        throw this.error("Unescaped control character in string");
      }
      result += ch;
    }
    throw this.error("Unterminated string");
  }

  // Called after a backslash has been consumed.
  private parseEscapeSequence(): string {
    const ch = this.input[this.index++];
    switch (ch) {
      case '"':
      case "\\":
      case "/":
        return ch;
      case "b":
        return "\b";
      case "f":
        return "\f";
      case "n":
        return "\n";
      case "r":
        return "\r";
      case "t":
        return "\t";
      case "u":
        return this.parseUnicodeEscape();
      default:
        throw this.error("Invalid escape sequence");
    }
  }

  // Decodes one \uXXXX unit; surrogate pairs arrive as two consecutive escapes.
  private parseUnicodeEscape(): string {
    const hex = this.input.slice(this.index, this.index + 4);
    if (!/^[0-9a-fA-F]{4}$/.test(hex)) {
      throw this.error("Expected four hex digits after \\u");
    }
    this.index += 4;
    return String.fromCharCode(parseInt(hex, 16));
  }

  // Validates the lexeme against the JSON number grammar, then converts it.
  private parseNumber(): number {
    const start = this.index;
    if (this.peek() === "-") {
      this.index++;
    }
    if (this.peek() === "0") {
      this.index++;
      if (this.isDigit(this.peek())) {
        throw this.error("Leading zeros are not allowed");
      }
    } else {
      // Only reachable without a digit when a lone "-" was consumed above.
      this.readDigits("Expected digit after minus sign");
    }
    if (this.peek() === ".") {
      this.index++;
      this.readDigits("Expected digit after decimal point");
    }
    if (this.peek() === "e" || this.peek() === "E") {
      this.index++;
      if (this.peek() === "+" || this.peek() === "-") {
        this.index++;
      }
      this.readDigits("Expected digit in exponent");
    }
    return Number(this.input.slice(start, this.index));
  }

  // Consumes one or more digits, or throws with the given message.
  private readDigits(message: string): void {
    if (!this.isDigit(this.peek())) {
      throw this.error(message);
    }
    while (this.isDigit(this.peek())) {
      this.index++;
    }
  }

  private parseLiteral<T extends JsonValue>(text: string, value: T): T {
    if (this.input.slice(this.index, this.index + text.length) !== text) {
      throw this.error("Unexpected literal");
    }
    this.index += text.length;
    return value;
  }

  private expectChar(expected: string): void {
    if (this.peek() !== expected) {
      throw this.error("Expected '" + expected + "'");
    }
    this.index++;
  }

  // JSON whitespace is only space, line feed, carriage return, and tab.
  private skipWhitespace(): void {
    while (this.peek() === " " || this.peek() === "\n" || this.peek() === "\r" || this.peek() === "\t") {
      this.index++;
    }
  }

  // Returns the current character, or "" at end of input.
  private peek(): string {
    return this.input[this.index] ?? "";
  }

  private isDigit(ch: string): boolean {
    return ch >= "0" && ch <= "9";
  }

  private isAtEnd(): boolean {
    return this.index >= this.input.length;
  }

  private error(message: string): SyntaxError {
    return new SyntaxError(message + " at index " + this.index);
  }
}

const parser = new JsonParser(
  '{"name":"Ada","scores":[10,20,30],"active":true,"profile":{"city":"London"}}'
);
console.log(parser.parse());
Implementation details that usually cause bugs
- Duplicate keys: RFC 8259 says object names should be unique, but it does not force one runtime behavior. Rejecting duplicates is often safer than silently keeping the last value.
- String escapes: You need to handle \", \\, \/, control escapes, and \uXXXX. Accepting raw control characters inside strings is a parser bug.
- Number precision: Converting directly to JavaScript number is convenient, but very large integers may lose precision. If exact numeric text matters, keep the original lexeme too.
- Error reporting: Index-only errors are enough for a basic parser. For a formatter or editor, line and column tracking is worth the extra bookkeeping.
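Two of these points are small enough to sketch directly. The helpers below are hypothetical and not part of the parser above. `toLineColumn` derives a one-based line and column from a string index, assuming "\n" is the only character that advances the line counter. `toNumberNode` keeps the original lexeme next to the converted value.

```typescript
// Sketch: map a zero-based index to a one-based line and column.
// Assumption: only "\n" advances the line, so "\r\n" input still counts
// lines correctly and the "\r" just occupies a column.
function toLineColumn(input: string, index: number): { line: number; column: number } {
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < index && i < input.length; i++) {
    if (input.charAt(i) === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: index - lineStart + 1 };
}

// Sketch: carry the raw number text along with the converted value,
// so callers can detect or avoid precision loss.
interface NumberNode {
  value: number; // convenient, but limited to double precision
  raw: string;   // exact text from the input
}

function toNumberNode(raw: string): NumberNode {
  return { value: Number(raw), raw };
}

// Hypothetical check: did the conversion preserve an integer lexeme exactly?
function isExactInteger(node: NumberNode): boolean {
  return String(node.value) === node.raw;
}
```

Scanning from the start of the input on every error is linear in the input size, which is usually acceptable because it only runs on the failure path.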
Useful tests for a real JSON parser
These cases catch most of the mistakes in first-pass implementations:
// valid
"hello"
42
{"a":[true,false,null]}
[]
// invalid
{'a':1} // single quotes are not JSON
{"a":1,} // trailing comma
{"a":01} // leading zero
[1,,2] // missing element
{"x":"\q"} // invalid escape
{"x":1} garbage // extra data after root
When to hand-write a parser and when not to
Writing a recursive descent parser for JSON is worthwhile when you need custom diagnostics, a teaching example, or strict control over how values are represented. It is usually not worth replacing highly optimized built-in parsers in production application code unless you have a specific behavior that the platform parser cannot provide.
In other words, recursive descent is excellent for understanding JSON and for building specialized tools. It is not automatically the fastest or safest production choice just because the grammar is simple.
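When comparing a hand-written parser with the platform one, the valid and invalid cases listed in the testing section can be run through a small table-driven harness. The sketch below is one possible shape. `ParseFn` and `runCases` are hypothetical names, and JSON.parse is used only as a stand-in for the parser under test.

```typescript
// Sketch: table-driven acceptance and rejection checks.
// Any function that throws on invalid input can be plugged in.
type ParseFn = (text: string) => unknown;

const validCases: string[] = ['"hello"', "42", '{"a":[true,false,null]}', "[]"];

const invalidCases: string[] = [
  "{'a':1}",         // single quotes are not JSON
  '{"a":1,}',        // trailing comma
  '{"a":01}',        // leading zero
  "[1,,2]",          // missing element
  '{"x":"\\q"}',     // invalid escape (the JSON text contains \q)
  '{"x":1} garbage', // extra data after root
];

// Returns a list of human-readable failures; empty means every case passed.
function runCases(parse: ParseFn): string[] {
  const failures: string[] = [];
  for (const text of validCases) {
    try {
      parse(text);
    } catch {
      failures.push("rejected valid input: " + text);
    }
  }
  for (const text of invalidCases) {
    let accepted = true;
    try {
      parse(text);
    } catch {
      accepted = false;
    }
    if (accepted) failures.push("accepted invalid input: " + text);
  }
  return failures;
}

// Baseline run against the built-in parser.
const baselineFailures = runCases((text) => JSON.parse(text));
```

Passing the hand-written parser's entry point to the same function gives a direct comparison with the baseline, and any difference points at a strictness bug or a deliberate policy choice such as duplicate-key rejection.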