Need help with your JSON?
Try our JSON Formatter tool to automatically identify and fix syntax errors in your JSON.
How JSON Formatters Detect and Display Syntax Errors
When you paste a malformed JSON document into a formatter, it instantly highlights errors with precise error messages. This seemingly simple feature is powered by sophisticated parsing techniques. In this article, we'll explore how JSON formatters detect syntax errors and the technical process behind displaying helpful error information.
The JSON Parsing Process
To understand error detection, we need to first understand how JSON parsing works. JSON formatters typically process documents in several stages:
1. Lexical Analysis (Tokenization)
The first step in parsing JSON is breaking the input string into meaningful tokens:
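- Structural tokens: braces {}, brackets [], colons :, commas ,
- Value tokens: strings, numbers, booleans (true, false), null
- Whitespace: spaces, tabs, line feeds, carriage returns (insignificant between tokens; a formatter typically discards it and re-emits its own indentation, but still counts it to track positions)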
Example of Tokenization:

    Input:
      {"name":"John", "age":30}

    Tokens:
      1. {       (left brace)
      2. "name"  (string)
      3. :       (colon)
      4. "John"  (string)
      5. ,       (comma)
      6. "age"   (string)
      7. :       (colon)
      8. 30      (number)
      9. }       (right brace)
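The tokenization step can be sketched in a few lines of JavaScript. This is a minimal illustration, not how any particular formatter is implemented; the token type names and the `tokenize` function are assumptions made for the example.

```javascript
// Minimal JSON tokenizer sketch. Token type names are illustrative.
const PUNCTUATION = {
  '{': 'LEFT_BRACE', '}': 'RIGHT_BRACE',
  '[': 'LEFT_BRACKET', ']': 'RIGHT_BRACKET',
  ':': 'COLON', ',': 'COMMA',
};

// Sticky (y) regexes match only at lastIndex, so each one tests the current offset.
const STRING = /"(?:[^"\\\u0000-\u001f]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*"/y;
const NUMBER = /-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/y;
const LITERAL = /true|false|null/y;
const VALUE_PATTERNS = [['STRING', STRING], ['NUMBER', NUMBER], ['LITERAL', LITERAL]];

function tokenize(input) {
  const tokens = [];
  let offset = 0;
  while (offset < input.length) {
    const ch = input[offset];
    if (ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r') {
      offset += 1; // whitespace between tokens is insignificant
      continue;
    }
    if (PUNCTUATION[ch]) {
      tokens.push({ type: PUNCTUATION[ch], text: ch, offset });
      offset += 1;
      continue;
    }
    let matched = null;
    for (const [type, pattern] of VALUE_PATTERNS) {
      pattern.lastIndex = offset;
      const m = pattern.exec(input);
      if (m) {
        matched = { type, text: m[0], offset };
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character ${JSON.stringify(ch)} at offset ${offset}`);
    }
    tokens.push(matched);
    offset += matched.text.length;
  }
  return tokens;
}

// Usage: the nine tokens from the example above
console.log(tokenize('{"name":"John", "age":30}').map((t) => t.type));
```

Each token keeps its starting offset, which is what later stages need in order to report where a problem occurred.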
2. Syntactic Analysis (Parsing)
The parser takes these tokens and attempts to build a structured representation according to JSON grammar rules:
- Objects must begin with { and end with }
- Arrays must begin with [ and end with ]
- Properties in objects follow the pattern "name": value
- Values can be strings, numbers, objects, arrays, booleans, or null
- Multiple values within objects or arrays must be separated by commas
During this phase, the parser builds a tree-like structure called an Abstract Syntax Tree (AST) that represents the hierarchical structure of the JSON data.
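As a rough sketch of how these grammar rules turn into code, the following parser has one function per rule and returns a small tree. For brevity it scans the text directly instead of consuming a separate token list, and the node shapes (`type`, `start`, `members`, `elements`) are assumptions made for this example rather than a standard AST format.

```javascript
// Recursive descent sketch: one function per grammar rule, each returning an AST node.
const STRING_PATTERN = /"(?:[^"\\\u0000-\u001f]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*"/y;
const NUMBER_PATTERN = /-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/y;
const LITERAL_PATTERN = /true|false|null/y;

function parseJson(text) {
  let pos = 0;

  function fail(expected) {
    const found = pos < text.length ? JSON.stringify(text[pos]) : 'end of input';
    throw new SyntaxError(`Expected ${expected} but found ${found} at offset ${pos}`);
  }
  function skipWhitespace() {
    while (pos < text.length && ' \t\n\r'.includes(text[pos])) pos += 1;
  }
  // Try a sticky regex at the current offset; consume and return the match, or null.
  function match(pattern) {
    pattern.lastIndex = pos;
    const m = pattern.exec(text);
    if (m === null) return null;
    pos += m[0].length;
    return m[0];
  }
  // Parse comma-separated items up to the closing delimiter.
  function parseItems(close, parseItem) {
    const items = [];
    pos += 1; // consume the opening delimiter
    skipWhitespace();
    if (text[pos] === close) { pos += 1; return items; }
    while (true) {
      items.push(parseItem());
      skipWhitespace();
      if (text[pos] === ',') { pos += 1; continue; }
      if (text[pos] === close) { pos += 1; return items; }
      fail(`',' or '${close}'`);
    }
  }
  function parseMember() {
    skipWhitespace();
    const rawKey = match(STRING_PATTERN);
    if (rawKey === null) fail('a string key');
    skipWhitespace();
    if (text[pos] !== ':') fail("':'");
    pos += 1;
    return { key: JSON.parse(rawKey), value: parseValue() };
  }
  function parseValue() {
    skipWhitespace();
    const start = pos;
    if (text[pos] === '{') return { type: 'Object', start, members: parseItems('}', parseMember) };
    if (text[pos] === '[') return { type: 'Array', start, elements: parseItems(']', parseValue) };
    const raw = match(STRING_PATTERN) ?? match(NUMBER_PATTERN) ?? match(LITERAL_PATTERN);
    if (raw === null) fail('a value');
    return { type: 'Literal', start, value: JSON.parse(raw) };
  }

  const root = parseValue();
  skipWhitespace();
  if (pos < text.length) fail('end of input');
  return root;
}
```

Because every rule knows which characters may legally come next, the error message can say what was expected as well as what was found.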
3. Semantic Analysis
Some advanced JSON formatters may perform semantic validation beyond syntax checking:
- JSON Schema validation
- Detecting duplicate keys
- Type validation for specific applications
- Format-specific validations (e.g., checking if a string is a valid URL or date)
How Errors Are Detected
During parsing, errors can occur at any stage. Here's how formatters detect different types of errors:
Lexical Errors
These errors occur during tokenization when the formatter encounters characters that don't conform to JSON syntax:
- Invalid escape sequences in strings (e.g., \z)
- Malformed numbers (e.g., 01.2.3)
- Unexpected symbols or control characters
Lexical Error Example:
{ "message": "Hello\zWorld" }
Error: Invalid escape sequence in string
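A hedged sketch of two such lexical checks follows. The function names `isValidNumber` and `findStringError` are hypothetical, and the checks follow the JSON grammar for numbers and string escapes.

```javascript
// Lexical checks sketch: test a candidate number against the JSON number grammar,
// and find the first invalid escape or raw control character in a string body.
const JSON_NUMBER = /^-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?$/;

function isValidNumber(text) {
  return JSON_NUMBER.test(text);
}

// `body` is the text between the quotes, exactly as written in the source.
// Returns the offset of the first problem, or -1 if the body is valid.
function findStringError(body) {
  for (let i = 0; i < body.length; i += 1) {
    const ch = body[i];
    if (ch < ' ') return i; // raw control characters must be escaped
    if (ch !== '\\') continue;
    const next = body[i + 1];
    if (next === 'u') {
      // \u must be followed by exactly four hex digits
      if (!/^[0-9a-fA-F]{4}$/.test(body.slice(i + 2, i + 6))) return i;
      i += 5;
    } else if (next !== undefined && '"\\/bfnrt'.includes(next)) {
      i += 1;
    } else {
      return i;
    }
  }
  return -1;
}

console.log(isValidNumber('01.2.3'));          // false
console.log(findStringError('Hello\\zWorld')); // 5, the offset of the backslash
```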
Syntactic Errors
These errors occur when the sequence of tokens doesn't follow proper JSON grammar:
- Missing or extra commas
- Unclosed structures (objects or arrays)
- Missing colons between property names and values
- Unexpected end of input
Syntactic Error Example:
{ "name": "John" "age": 30 }
Error: Expected comma or closing brace after property value
Semantic Errors
The following inputs are syntactically valid JSON, but some formatters still flag them as higher-level issues:
- Duplicate keys in objects
- Values not conforming to an expected schema
- Type mismatches for specific applications
Semantic Error Example:
{ "user": "john", "user": "smith" }
Warning: Duplicate key 'user' in object
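Duplicate keys are a good example of why this check sits outside the syntax layer: `JSON.parse` accepts them and keeps the last value. One possible way to surface them, sketched below, is a separate scan that keeps a set of seen keys per open object. The `findDuplicateKeys` name and the result shape are assumptions for illustration.

```javascript
// Duplicate-key scan sketch. The input is validated with JSON.parse first, so the
// scanner only needs to recognize strings and structural characters.
function findDuplicateKeys(text) {
  JSON.parse(text); // throws on invalid input; duplicate keys are silently accepted
  const token = /"(?:[^"\\]|\\.)*"|[{}\[\]:,]/g;
  const stack = []; // one Set of seen keys per open object, null per open array
  const duplicates = [];
  let lastString = null;
  for (const m of text.matchAll(token)) {
    const t = m[0];
    if (t === '{') {
      stack.push(new Set());
    } else if (t === '[') {
      stack.push(null);
    } else if (t === '}' || t === ']') {
      stack.pop();
    } else if (t === ':') {
      // In valid JSON, a colon outside a string always follows an object key.
      const seen = stack[stack.length - 1];
      const key = JSON.parse(lastString.text); // decode escapes before comparing
      if (seen.has(key)) duplicates.push({ key, offset: lastString.offset });
      seen.add(key);
    } else if (t[0] === '"') {
      lastString = { text: t, offset: m.index };
    }
  }
  return duplicates;
}

console.log(findDuplicateKeys('{ "user": "john", "user": "smith" }'));
// [ { key: 'user', offset: 18 } ]
```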
Error Localization Techniques
Quality JSON formatters don't just detect errors—they pinpoint their exact location using several techniques:
1. Position Tracking
During parsing, formatters keep track of:
- Line number
- Column position
- Character offset from the beginning of the document
When an error is encountered, the formatter knows precisely where it occurred.
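One common way to relate these three values, shown here as a sketch with hypothetical helper names, is to record the offset at which each line starts and then binary-search that list whenever an offset needs to be converted.

```javascript
// Position lookup sketch: record where each line starts once, then binary-search.
function buildLineIndex(text) {
  const lineStarts = [0];
  for (let i = 0; i < text.length; i += 1) {
    if (text[i] === '\n') lineStarts.push(i + 1);
  }
  return lineStarts;
}

// Convert a 0-based character offset to 1-based line and column numbers.
function locate(lineStarts, offset) {
  let low = 0;
  let high = lineStarts.length - 1;
  // Find the last line whose starting offset is <= the target offset.
  while (low < high) {
    const mid = Math.ceil((low + high) / 2);
    if (lineStarts[mid] <= offset) low = mid;
    else high = mid - 1;
  }
  return { line: low + 1, column: offset - lineStarts[low] + 1, offset };
}

const source = '{\n  "a": 1,\n  "b": }\n';
const index = buildLineIndex(source);
console.log(locate(index, source.indexOf('}'))); // { line: 3, column: 8, offset: 19 }
```

Building the index costs one pass over the document; each lookup afterwards is logarithmic in the number of lines, which matters when many diagnostics are reported for a large file.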
2. Context Awareness
Good parsers maintain a state stack that tracks:
- The current parsing context (inside object, array, string, etc.)
- The nesting level of structures
- Expected next tokens based on grammar rules
This allows formatters to provide context-specific error messages like "Expected closing brace to close object started at line 3."
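The following sketch shows the stack idea in isolation: it tracks only nesting (not the full grammar) and remembers the line on which each object or array was opened. The `checkNesting` function and its message wording are assumptions for illustration.

```javascript
// Context stack sketch: remember where each object or array was opened so that an
// unclosed or mismatched delimiter can be reported against its opening line.
const CLOSERS = { '{': '}', '[': ']' };
const NAMES = { '{': 'object', '[': 'array' };

function checkNesting(text) {
  const stack = [];
  let line = 1;
  let inString = false;
  for (let i = 0; i < text.length; i += 1) {
    const ch = text[i];
    if (ch === '\n') line += 1;
    if (inString) {
      if (ch === '\\') i += 1; // skip the escaped character
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === '{' || ch === '[') {
      stack.push({ open: ch, line });
    } else if (ch === '}' || ch === ']') {
      const top = stack.pop();
      if (!top) return `Unexpected '${ch}' at line ${line}: nothing is open`;
      if (CLOSERS[top.open] !== ch) {
        return `Expected '${CLOSERS[top.open]}' to close ${NAMES[top.open]} ` +
          `started at line ${top.line}, but found '${ch}' at line ${line}`;
      }
    }
  }
  const top = stack.pop();
  return top
    ? `Expected '${CLOSERS[top.open]}' to close ${NAMES[top.open]} started at line ${top.line}`
    : null;
}

console.log(checkNesting('{\n  "a": {\n    "b": 1\n'));
// Expected '}' to close object started at line 2
```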
3. Error Recovery
Advanced formatters implement error recovery strategies:
- Skipping invalid tokens to continue parsing
- Inserting missing tokens to maintain structure
- Attempting to parse the rest of the document despite errors
This allows them to detect multiple errors in a single pass rather than stopping at the first problem.
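As a small illustration of the first strategy only (skipping invalid tokens), the scanner below records a diagnostic and moves on instead of throwing. It is a sketch under simplifying assumptions: recovery happens at the token level, the string pattern is deliberately lenient about escapes, and the names are hypothetical.

```javascript
// Error recovery sketch: instead of throwing at the first bad token, record a
// diagnostic, skip the offending text, and keep scanning.
const TOKEN_PATTERN =
  /"(?:[^"\\\n]|\\.)*"|-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null|[{}\[\]:,]/y;
const WHITESPACE = ' \t\n\r';
const BOUNDARY = ' \t\n\r{}[]:,'; // characters at which scanning can safely resume

function scanWithRecovery(text) {
  const tokens = [];
  const errors = [];
  let pos = 0;
  while (pos < text.length) {
    if (WHITESPACE.includes(text[pos])) {
      pos += 1;
      continue;
    }
    TOKEN_PATTERN.lastIndex = pos;
    const m = TOKEN_PATTERN.exec(text);
    if (m) {
      tokens.push({ text: m[0], offset: pos });
      pos += m[0].length;
      continue;
    }
    // Recovery: skip ahead to the next whitespace or structural character.
    const start = pos;
    pos += 1;
    while (pos < text.length && !BOUNDARY.includes(text[pos])) pos += 1;
    errors.push({
      offset: start,
      message: `Unexpected text ${JSON.stringify(text.slice(start, pos))}`,
    });
  }
  return { tokens, errors };
}

// Both problems are reported in one pass.
console.log(scanWithRecovery(`{"a": 'x', "b": undefined, "c": 1}`).errors);
```

The choice of resynchronization points is the key design decision: resuming at structural characters keeps one bad value from hiding errors in the rest of the document.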
Technical Note:
Many JSON formatters use a recursive descent parser, a form of predictive top-down parsing, because the JSON grammar is small and unambiguous: the next token alone determines which grammar rule applies. This makes error detection efficient and lets each error message state exactly what the parser expected.
Displaying Error Information
Once an error is detected, formatters use various visual techniques to communicate the problem to users:
1. Visual Highlighting
- Underlines or squiggly lines - Marking the exact error location
- Color coding - Red for errors, yellow for warnings
- Background highlighting - Drawing attention to problematic lines
- Line number indicators - Making it easy to find errors in large documents
2. Error Messages
Quality formatters generate descriptive error messages that include:
- Error type - The category of error (syntax, value, structure, etc.)
- Location - Line and column numbers
- Context - What the parser was expecting vs. what it found
- Suggestions - Possible fixes for common errors
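To make these components concrete, here is a sketch of a message formatter that combines location, description, and an optional suggestion, and prints the offending line with a caret under the error column. The `formatDiagnostic` function and its field names are assumptions for illustration; the caret alignment assumes the line contains no tab characters.

```javascript
// Diagnostic formatting sketch: show the offending line with a caret under the column.
function formatDiagnostic(text, { line, column, message, hint }) {
  const sourceLine = text.split('\n')[line - 1] ?? '';
  const gutter = `${line} | `;
  const caret = ' '.repeat(gutter.length + column - 1) + '^';
  const parts = [
    `Syntax error at line ${line}, column ${column}: ${message}`,
    gutter + sourceLine,
    caret,
  ];
  if (hint) parts.push(`Hint: ${hint}`);
  return parts.join('\n');
}

const doc = '{\n  "name": "John"\n  "age": 30\n}';
console.log(formatDiagnostic(doc, {
  line: 3,
  column: 3,
  message: "Expected ',' or '}' after property value",
  hint: 'Add a comma after the previous property.',
}));
```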
3. Structural Visualization
Advanced formatters provide additional visual aids:
- Bracket matching - Highlighting paired delimiters to identify mismatches
- Collapsible sections - Allowing users to focus on problematic areas
- Tree views - Displaying the successfully parsed portions in a hierarchical structure
Behind the Scenes: Parser Implementation
Most JSON formatters implement parsing using one of these approaches:
1. Hand-written Parsers
Many formatters use custom-built parsers optimized for JSON syntax:
    // Simplified sketch of a JSON value parser; parseObject and parseArray
    // (not shown) handle the two container types in the same style.
    function parseValue(tokens, position) {
      const token = tokens[position];
      if (token === undefined) {
        throw new SyntaxError('Unexpected end of input');
      }
      if (token.type === 'LEFT_BRACE') {
        return parseObject(tokens, position);
      } else if (token.type === 'LEFT_BRACKET') {
        return parseArray(tokens, position);
      } else if (
        token.type === 'STRING' || token.type === 'NUMBER' ||
        token.type === 'BOOLEAN' || token.type === 'NULL'
      ) {
        // Scalar values consume exactly one token
        return { value: token.value, position: position + 1 };
      } else {
        throw new SyntaxError(
          `Unexpected token ${token.value} at line ${token.line}, column ${token.column}`
        );
      }
    }
2. Parser Generators
Some formatters use parser generators that automate the creation of parsing code from grammar definitions:
- Tools like ANTLR, Jison, and PEG.js (now maintained as Peggy)
- The grammar is defined declaratively
- Error handling is often more sophisticated
- Easier to maintain and extend
3. Native JSON APIs
Web-based formatters often utilize built-in browser capabilities with custom error handling:
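    // Using the browser's JSON.parse with custom error handling
    function parseWithErrorInfo(jsonString) {
      try {
        return { result: JSON.parse(jsonString), error: null };
      } catch (error) {
        // Some engines (for example V8) include "position N" in many messages.
        // The wording is not standardized, so the location may be unavailable.
        const match = /position (\d+)/.exec(error.message);
        const location = match
          ? findLineAndColumn(jsonString, parseInt(match[1], 10))
          : { line: null, column: null };
        return { result: null, error: { message: error.message, ...location } };
      }
    }

    // Convert a 0-based character offset into 1-based line and column numbers
    function findLineAndColumn(text, position) {
      const lines = text.slice(0, position).split('\n');
      return { line: lines.length, column: lines[lines.length - 1].length + 1 };
    }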
Advanced Error Detection Features
1. Error Prediction
Sophisticated formatters don't just identify what's wrong—they suggest what might be right:
- Suggesting missing punctuation (commas, quotes, brackets)
- Identifying likely typos based on common patterns
- Recommending structural fixes for nested objects and arrays
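A very simple form of this is a table of patterns for common mistakes, consulted only after parsing has failed. The sketch below is an assumption about how such hints could be produced, not a description of any specific tool; the patterns and wording are illustrative, and regex checks like these can misfire on text inside string values.

```javascript
// Suggestion heuristics sketch: map common mistakes to human-readable hints.
const HINTS = [
  { pattern: /,\s*[}\]]/, hint: 'Remove the trailing comma before the closing bracket or brace.' },
  { pattern: /'[^'\n]*'/, hint: 'Use double quotes; JSON does not allow single-quoted strings.' },
  { pattern: /[{,]\s*[A-Za-z_$][\w$]*\s*:/, hint: 'Wrap property names in double quotes.' },
  { pattern: /\b(?:True|False|None|undefined|NaN)\b/, hint: 'Use true, false, or null; other literals are not valid JSON.' },
];

function suggestFixes(text) {
  try {
    JSON.parse(text);
    return []; // valid JSON needs no suggestions
  } catch {
    return HINTS.filter(({ pattern }) => pattern.test(text)).map(({ hint }) => hint);
  }
}

console.log(suggestFixes('{"a": 1,}'));
console.log(suggestFixes("{name: 'John'}"));
```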
2. Format Detection
Some formatters detect and support JSON-adjacent formats:
- JSONC (JSON with Comments)
- JSON5 (relaxed JSON that allows trailing commas, single quotes, etc.)
- Detecting when the input might be YAML, XML, or other formats mistakenly used as JSON
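Format detection is usually heuristic. The sketch below tries strict JSON first and then falls back to a few cheap checks; the `guessFormat` function, its labels, and its patterns are assumptions made for this example, and a real tool would likely confirm the guess with an actual parser for the candidate format.

```javascript
// Format sniffing sketch. These checks are rough heuristics, not parsers.
function guessFormat(text) {
  const trimmed = text.trim();
  try {
    JSON.parse(trimmed);
    return 'json';
  } catch {
    // Not strict JSON; fall through to the heuristics below.
  }
  if (trimmed.startsWith('<')) return 'xml';
  // A line that begins with a comment marker suggests JSONC (or JSON5).
  if (/^\s*(?:\/\/|\/\*)/m.test(trimmed)) return 'jsonc';
  if (/^[{\[]/.test(trimmed)) return 'json5-or-malformed-json';
  // A document marker, "key: value" line, or "- item" line suggests YAML.
  if (/^(?:---|[\w-]+:(?:\s|$)|- )/m.test(trimmed)) return 'yaml';
  return 'unknown';
}

console.log(guessFormat('name: John\nage: 30')); // yaml
```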
3. Performance Optimizations
For large documents, formatters implement optimizations:
- Incremental parsing to validate changes without re-parsing the entire document
- Parallel parsing of independent sections in multi-threaded environments
- Lazy parsing that only fully processes sections being viewed or edited
Conclusion
JSON formatters use sophisticated parsing techniques to detect and display syntax errors. By breaking down the document into tokens, analyzing their structure according to JSON grammar rules, and tracking position information, these tools can provide precise error messages that help developers quickly identify and fix problems.
The next time you see a helpful error message highlighting exactly where your JSON went wrong, you'll have a better understanding of the complex parsing machinery working behind the scenes to make your debugging experience smoother and more efficient.