Need help with your JSON?

Try our JSON Formatter tool to automatically identify and fix syntax errors in your JSON.

How JSON Formatters Detect and Display Syntax Errors

When you paste a malformed JSON document into a formatter, it highlights the errors and reports where and why parsing failed. This seemingly simple feature is built on standard parsing techniques. In this article, we'll explore how JSON formatters detect syntax errors and how they turn a parse failure into helpful error information.

The JSON Parsing Process

To understand error detection, we need to first understand how JSON parsing works. JSON formatters typically process documents in several stages:

1. Lexical Analysis (Tokenization)

The first step in parsing JSON is breaking the input string into meaningful tokens:

  • Structural tokens: Braces {}, brackets [], colons :, commas ,
  • Value tokens: Strings, numbers, booleans (true, false), null
  • Whitespace: Spaces, tabs, newlines, carriage returns (skipped between tokens, though the tokenizer still counts them to track line and column positions)

Example of Tokenization:

Input: {"name":"John", "age":30}

Tokens:
1. { (left brace)
2. "name" (string)
3. : (colon)
4. "John" (string)
5. , (comma)
6. "age" (string)
7. : (colon)
8. 30 (number)
9. } (right brace)
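The tokenization step above can be sketched in a few lines of JavaScript. This is a minimal illustration rather than how any particular formatter does it: the token type names and the tokenize function are assumptions chosen for this example.

```javascript
// Token type names below are illustrative assumptions, not a standard.
const PUNCTUATION = new Map([
  ['{', 'LEFT_BRACE'], ['}', 'RIGHT_BRACE'],
  ['[', 'LEFT_BRACKET'], [']', 'RIGHT_BRACKET'],
  [':', 'COLON'], [',', 'COMMA'],
]);

// Sticky (/y) patterns match only at lastIndex, so no input is skipped silently.
const PATTERNS = [
  ['STRING', /"(?:[^"\\\u0000-\u001f]|\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4}))*"/y],
  ['NUMBER', /-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/y],
  ['BOOLEAN', /true|false/y],
  ['NULL', /null/y],
];

function tokenize(text) {
  const tokens = [];
  let offset = 0;
  while (offset < text.length) {
    const ch = text[offset];
    // JSON allows exactly four whitespace characters between tokens.
    if (ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r') {
      offset++;
      continue;
    }
    let token = null;
    if (PUNCTUATION.has(ch)) {
      token = { type: PUNCTUATION.get(ch), value: ch, offset };
    } else {
      for (const [type, pattern] of PATTERNS) {
        pattern.lastIndex = offset;
        const match = pattern.exec(text);
        if (match) {
          token = { type, value: match[0], offset };
          break;
        }
      }
    }
    if (token === null) {
      // Nothing matched here: this is a lexical error.
      throw new SyntaxError(`Unexpected character ${JSON.stringify(ch)} at offset ${offset}`);
    }
    tokens.push(token);
    offset += token.value.length;
  }
  return tokens;
}
```

Calling tokenize('{"name":"John", "age":30}') yields the nine tokens listed above, each tagged with the offset where it starts.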

2. Syntactic Analysis (Parsing)

The parser takes these tokens and attempts to build a structured representation according to JSON grammar rules:

  • Objects must begin with { and end with }
  • Arrays must begin with [ and end with ]
  • Properties in objects follow the pattern "name": value
  • Values can be strings, numbers, objects, arrays, booleans, or null
  • Multiple values within objects or arrays must be separated by commas

During this phase, the parser builds a tree-like structure called an Abstract Syntax Tree (AST) that represents the hierarchical structure of the JSON data.
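As a rough sketch of this phase, the function below tokenizes the input and then applies the grammar rules with one small function per rule. The node shapes (Object, Array, Literal), the parseJson name, and the wording of the messages are assumptions made for this example. Scalars are validated by handing each token to the built-in JSON.parse, which keeps the sketch short.

```javascript
function parseJson(text) {
  // Compact tokenizer: a string, a punctuation mark, or any other run of
  // non-whitespace characters (numbers, true, false, null, or garbage).
  // Simplification: \s skips any Unicode whitespace, not just JSON's four.
  const tokens = [...text.matchAll(/"(?:[^"\\]|\\.)*"|[{}\[\]:,]|[^\s{}\[\]:,]+/g)]
    .map((m) => ({ text: m[0], offset: m.index }));
  let pos = 0;

  const at = (symbol) => pos < tokens.length && tokens[pos].text === symbol;

  function fail(expected) {
    const tok = tokens[pos];
    const found = tok ? `${tok.text} at offset ${tok.offset}` : 'end of input';
    throw new SyntaxError(`Expected ${expected} but found ${found}`);
  }

  function parseValue() {
    if (at('{')) return parseObject();
    if (at('[')) return parseArray();
    const tok = tokens[pos];
    if (!tok) fail('a value');
    let value;
    try {
      value = JSON.parse(tok.text); // validates and decodes one scalar
    } catch {
      fail('a value');
    }
    pos++;
    return { type: 'Literal', value, offset: tok.offset };
  }

  function parseObject() {
    const node = { type: 'Object', members: [], offset: tokens[pos++].offset };
    if (at('}')) { pos++; return node; }
    while (true) {
      if (pos >= tokens.length || tokens[pos].text[0] !== '"') fail('a string key');
      const key = parseValue().value;
      if (!at(':')) fail("':'");
      pos++;
      node.members.push({ key, value: parseValue() });
      if (at('}')) { pos++; return node; }
      if (!at(',')) fail("',' or '}'");
      pos++;
    }
  }

  function parseArray() {
    const node = { type: 'Array', elements: [], offset: tokens[pos++].offset };
    if (at(']')) { pos++; return node; }
    while (true) {
      node.elements.push(parseValue());
      if (at(']')) { pos++; return node; }
      if (!at(',')) fail("',' or ']'");
      pos++;
    }
  }

  const root = parseValue();
  if (pos < tokens.length) fail('end of input');
  return root;
}
```

Because every rule knows which tokens may legally come next, the error message can name what was expected as well as what was found.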

3. Semantic Analysis

Some advanced JSON formatters may perform semantic validation beyond syntax checking:

  • JSON Schema validation
  • Detecting duplicate keys
  • Type validation for specific applications
  • Format-specific validations (e.g., checking if a string is a valid URL or date)

How Errors Are Detected

During parsing, errors can occur at any stage. Here's how formatters detect different types of errors:

Lexical Errors

These errors occur during tokenization when the formatter encounters characters that don't conform to JSON syntax:

  • Invalid escape sequences in strings (e.g., \z)
  • Malformed numbers (e.g., 01.2.3, which has both a leading zero and a second decimal point)
  • Unexpected symbols, or unescaped control characters inside strings

Lexical Error Example:

{
  "message": "Hello\zWorld"
}

Error: Invalid escape sequence in string
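Two of these lexical checks are easy to sketch. The helper names findInvalidEscape and isValidJsonNumber are hypothetical; the first assumes it receives the raw text of one string token including its surrounding quotes, and the second applies the number grammar from the JSON specification.

```javascript
// Return the index of the first invalid escape in a raw string token,
// or -1 if every escape is one JSON allows.
function findInvalidEscape(rawString) {
  for (let i = 1; i < rawString.length - 1; i++) {
    if (rawString[i] !== '\\') continue;
    const next = rawString[i + 1];
    if ('"\\/bfnrt'.includes(next)) {
      i++; // simple two-character escape such as \n
      continue;
    }
    if (next === 'u' && /^[0-9a-fA-F]{4}$/.test(rawString.slice(i + 2, i + 6))) {
      i += 5; // \u followed by exactly four hex digits
      continue;
    }
    return i;
  }
  return -1;
}

// JSON numbers: optional minus, no leading zeros, optional fraction and exponent.
function isValidJsonNumber(text) {
  return /^-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?$/.test(text);
}
```

For the example above, findInvalidEscape points at the backslash before the z, which is the position a formatter would underline.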

Syntactic Errors

These errors occur when the sequence of tokens doesn't follow proper JSON grammar:

  • Missing or extra commas
  • Unclosed structures (objects or arrays)
  • Missing colons between property names and values
  • Unexpected end of input

Syntactic Error Example:

{
  "name": "John"
  "age": 30
}

Error: Expected comma or closing brace after property value

Semantic Errors

The following inputs are syntactically valid JSON, but some formatters still flag higher-level issues in them:

  • Duplicate keys in objects
  • Values not conforming to an expected schema
  • Type mismatches for specific applications

Semantic Error Example:

{
  "user": "john",
  "user": "smith"
}

Warning: Duplicate key 'user' in object
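JSON.parse silently keeps the last value for a repeated key, so a duplicate check has to look at the text itself. The sketch below is one possible approach, assuming the input is already valid JSON: it walks the tokens and keeps one set of seen keys per open object. The function name findDuplicateKeys is hypothetical.

```javascript
function findDuplicateKeys(jsonText) {
  JSON.parse(jsonText); // throws on invalid input; the scan below assumes validity

  const tokenPattern = /"(?:[^"\\]|\\.)*"|[{}\[\]:,]|[^\s{}\[\]:,"]+/g;
  const stack = []; // a Set of seen keys per open object, null per open array
  const duplicates = [];
  let lastString = null;

  for (const match of jsonText.matchAll(tokenPattern)) {
    const tok = match[0];
    if (tok === '{') {
      stack.push(new Set());
    } else if (tok === '[') {
      stack.push(null);
    } else if (tok === '}' || tok === ']') {
      stack.pop();
    } else if (tok === ':') {
      // In valid JSON, the string just before a colon is always a key.
      const seen = stack[stack.length - 1];
      const key = JSON.parse(lastString.text); // decode escapes before comparing
      if (seen.has(key)) {
        duplicates.push({ key, offset: lastString.offset });
      } else {
        seen.add(key);
      }
    } else if (tok[0] === '"') {
      lastString = { text: tok, offset: match.index };
    }
  }
  return duplicates;
}
```

Keys are compared after decoding, so "user" and "\u0075ser" count as the same key. The same key in two different objects is not reported.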

Error Localization Techniques

Quality JSON formatters don't just detect errors—they pinpoint their exact location using several techniques:

1. Position Tracking

During parsing, formatters keep track of:

  • Line number
  • Column position
  • Character offset from the beginning of the document

When an error is encountered, the formatter knows precisely where it occurred.
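One common way to do this is to route every character read through a small cursor object that updates all three counters as it goes. The Cursor class below is a minimal sketch with assumed names; it uses 1-based lines and columns and a 0-based offset.

```javascript
class Cursor {
  constructor(text) {
    this.text = text;
    this.offset = 0; // 0-based character offset
    this.line = 1;   // 1-based line number
    this.column = 1; // 1-based column within the current line
  }

  done() {
    return this.offset >= this.text.length;
  }

  peek() {
    return this.text[this.offset];
  }

  // Consume one character and keep line and column in sync.
  advance() {
    const ch = this.text[this.offset++];
    if (ch === '\n') {
      this.line++;
      this.column = 1;
    } else {
      this.column++;
    }
    return ch;
  }

  // Snapshot to attach to a token or an error.
  position() {
    return { line: this.line, column: this.column, offset: this.offset };
  }
}
```

A tokenizer built on such a cursor can call position() at the start of each token, so any later error about that token already knows where it is.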

2. Context Awareness

Good parsers maintain a state stack that tracks:

  • The current parsing context (inside object, array, string, etc.)
  • The nesting level of structures
  • Expected next tokens based on grammar rules

This allows formatters to provide context-specific error messages like "Expected closing brace to close object started at line 3."
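The sketch below shows the stack idea in isolation. It only tracks braces and brackets, skipping over string contents, and is not a full parser; the checkBrackets name and the message wording are assumptions for this example.

```javascript
function checkBrackets(text) {
  const closerFor = { '{': '}', '[': ']' };
  const nameOf = { '{': 'object', '[': 'array' };
  const stack = []; // one entry per structure that is still open
  let line = 1;
  let inString = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '\n') line++;
    if (inString) {
      if (ch === '\\') i++; // skip the escaped character
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === '{' || ch === '[') {
      stack.push({ opener: ch, line });
    } else if (ch === '}' || ch === ']') {
      const open = stack.pop();
      if (!open) return `Unexpected '${ch}' at line ${line}: nothing is open`;
      if (closerFor[open.opener] !== ch) {
        return `Expected '${closerFor[open.opener]}' to close ${nameOf[open.opener]} ` +
          `started at line ${open.line}, but found '${ch}' at line ${line}`;
      }
    }
  }

  const open = stack.pop();
  if (open) {
    return `Expected '${closerFor[open.opener]}' to close ${nameOf[open.opener]} ` +
      `started at line ${open.line}, but reached end of input`;
  }
  return null; // all structures were closed in the right order
}
```

Because each stack entry remembers the line where its structure opened, the message can point at both ends of the problem.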

3. Error Recovery

Advanced formatters implement error recovery strategies:

  • Skipping invalid tokens to continue parsing
  • Inserting missing tokens to maintain structure
  • Attempting to parse the rest of the document despite errors

This allows them to detect multiple errors in a single pass rather than stopping at the first problem.
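A deliberately narrow sketch of the idea: the function below looks only at comma problems, records each one, and keeps scanning as if the comma had been inserted or removed. Real recovery in a full parser is more involved; the collectCommaErrors name and its messages are assumptions for this example.

```javascript
function collectCommaErrors(text) {
  const tokenPattern = /"(?:[^"\\]|\\.)*"|[{}\[\]:,]|[^\s{}\[\]:,"]+/g;
  const errors = [];
  let prev = null;

  for (const match of text.matchAll(tokenPattern)) {
    const tok = match[0];
    // A token starts a value unless it is a closer, a colon, or a comma.
    const startsValue = !['}', ']', ':', ','].includes(tok);
    // The previous token ended a value unless it was an opener, colon, or comma.
    const prevEndsValue = prev !== null && !['{', '[', ':', ','].includes(prev);

    if (startsValue && prevEndsValue) {
      errors.push({ offset: match.index, message: `Missing comma before ${tok}` });
      // Recovery: carry on as if the comma had been there.
    } else if ((tok === '}' || tok === ']') && prev === ',') {
      errors.push({ offset: match.index, message: `Trailing comma before ${tok}` });
      // Recovery: carry on as if the comma had been removed.
    }
    prev = tok;
  }
  return errors;
}
```

For an input such as {"name": "John" "age": 30, "tags": [1 2,],} this reports two missing commas and two trailing commas in one pass, where a parser that stops at the first error would report only the first.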

Technical Note:

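Many JSON formatters use a hand-written recursive descent parser. JSON's grammar is small and unambiguous, and a single token of lookahead is enough to decide which rule applies, so at every step the parser knows exactly which tokens are valid next. That makes error detection efficient and lets the error message state what was expected.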

Displaying Error Information

Once an error is detected, formatters use various visual techniques to communicate the problem to users:

1. Visual Highlighting

  • Underlines or squiggly lines - Marking the exact error location
  • Color coding - Red for errors, yellow for warnings
  • Background highlighting - Drawing attention to problematic lines
  • Line number indicators - Making it easy to find errors in large documents

2. Error Messages

Quality formatters generate descriptive error messages that include:

  • Error type - The category of error (syntax, value, structure, etc.)
  • Location - Line and column numbers
  • Context - What the parser was expecting vs. what it found
  • Suggestions - Possible fixes for common errors
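In a plain-text setting, these pieces are often combined into a short "code frame": the message, the offending line, and a caret under the error column. The formatError function below is a hypothetical sketch of that layout and assumes 1-based line and column numbers.

```javascript
function formatError(text, line, column, message) {
  const lines = text.split('\n');
  const source = lines[line - 1] ?? ''; // empty if the line is out of range
  const gutter = String(line).padStart(4) + ' | ';
  // Pad past the gutter, then up to the error column.
  const pointer = ' '.repeat(gutter.length + column - 1) + '^';
  return [
    `Syntax error at line ${line}, column ${column}: ${message}`,
    gutter + source,
    pointer,
  ].join('\n');
}
```

The caret lines up only when the source line is indented with spaces; a real formatter would also need to account for tabs and wide characters.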

3. Structural Visualization

Advanced formatters provide additional visual aids:

  • Bracket matching - Highlighting paired delimiters to identify mismatches
  • Collapsible sections - Allowing users to focus on problematic areas
  • Tree views - Displaying the successfully parsed portions in a hierarchical structure

Behind the Scenes: Parser Implementation

Most JSON formatters implement parsing using one of these approaches:

1. Hand-written Parsers

Many formatters use custom-built parsers optimized for JSON syntax:

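// Simplified sketch of the value-parsing step in a recursive descent parser.
// parseObject and parseArray (not shown) follow the same pattern and return
// { value, position }, where position is the index of the next unread token.
function parseValue(tokens, position) {
  const token = tokens[position];

  // Running out of tokens is an error too, e.g. for the input {"a":
  if (token === undefined) {
    throw new SyntaxError('Unexpected end of input: expected a value');
  }

  if (token.type === 'LEFT_BRACE') {
    return parseObject(tokens, position);
  } else if (token.type === 'LEFT_BRACKET') {
    return parseArray(tokens, position);
  } else if (token.type === 'STRING' || token.type === 'NUMBER' ||
             token.type === 'BOOLEAN' || token.type === 'NULL') {
    return {
      value: token.value,
      position: position + 1
    };
  } else {
    throw new SyntaxError(
      `Unexpected token ${token.value} at line ${token.line}, column ${token.column}`
    );
  }
}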

2. Parser Generators

Some formatters use parser generators that automate the creation of parsing code from grammar definitions:

  • Tools like ANTLR, Jison, and PEG.js (now continued as Peggy)
  • The grammar is defined declaratively
  • Error handling is often more sophisticated
  • Easier to maintain and extend

3. Native JSON APIs

Web-based formatters often utilize built-in browser capabilities with custom error handling:

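// Using the built-in JSON.parse with custom error handling.
// The wording of error.message is engine-specific, so this is best-effort:
// line and column stay null when the message carries no location.

// Convert a 0-based character offset into 1-based line and column numbers
function findLineAndColumn(text, offset) {
  const before = text.slice(0, offset).split('\n');
  return { line: before.length, column: before[before.length - 1].length + 1 };
}

function parseWithErrorInfo(jsonString) {
  try {
    return { result: JSON.parse(jsonString), error: null };
  } catch (error) {
    let line = null;
    let column = null;

    // Some engines report "line L column C" directly
    const lineColMatch = /line (\d+) column (\d+)/.exec(error.message);
    // Others report only a character offset as "position N"
    const offsetMatch = /position (\d+)/.exec(error.message);

    if (lineColMatch) {
      line = parseInt(lineColMatch[1], 10);
      column = parseInt(lineColMatch[2], 10);
    } else if (offsetMatch) {
      ({ line, column } = findLineAndColumn(jsonString, parseInt(offsetMatch[1], 10)));
    }

    return { result: null, error: { message: error.message, line, column } };
  }
}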

Advanced Error Detection Features

1. Error Prediction

Sophisticated formatters don't just identify what's wrong—they suggest what might be right:

  • Suggesting missing punctuation (commas, quotes, brackets)
  • Identifying likely typos based on common patterns
  • Recommending structural fixes for nested objects and arrays
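Suggestions like these are often driven by simple pattern heuristics that run after parsing has already failed. The sketch below is one hypothetical rule table; the patterns and hint texts are assumptions for this example and are far from exhaustive. Double-quoted strings are blanked out first so that their contents cannot trigger a rule.

```javascript
const HINTS = [
  { pattern: /,\s*[}\]]/, hint: 'Remove the trailing comma before the closing bracket.' },
  { pattern: /'[^']*'\s*:/, hint: 'Use double quotes instead of single quotes.' },
  { pattern: /[{,]\s*[A-Za-z_$][\w$]*\s*:/, hint: 'Wrap the property name in double quotes.' },
  { pattern: /\b(?:True|False|None|undefined|NaN)\b/, hint: 'Use the JSON literals true, false, or null.' },
  { pattern: /\/\/|\/\*/, hint: 'Remove comments; standard JSON does not allow them.' },
];

function suggestFixes(text) {
  // Blank out string contents so a URL like "http://..." is not seen as a comment.
  const withoutStrings = text.replace(/"(?:[^"\\]|\\.)*"/g, '""');
  return HINTS
    .filter(({ pattern }) => pattern.test(withoutStrings))
    .map(({ hint }) => hint);
}
```

For an input like {'name': 'John', age: 30, "ok": True,} this returns four hints: the trailing comma, the single quotes, the unquoted property name, and the non-JSON literal.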

2. Format Detection

Some formatters detect and support JSON-adjacent formats:

  • JSONC (JSON with Comments)
  • JSON5 (relaxed JSON that allows trailing commas, single quotes, etc.)
  • Detecting when the input might be YAML, XML, or other formats mistakenly used as JSON
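Detection of this kind is usually heuristic. The guessFormat function below is a rough sketch with assumed rules and return labels: it tries strict JSON first, then looks for a few telltale signs of other formats. A real tool would confirm the guess by running the matching parser.

```javascript
function guessFormat(text) {
  const trimmed = text.trim();
  try {
    JSON.parse(trimmed);
    return 'json';
  } catch {
    // Not strict JSON: fall through to the heuristics below.
  }
  if (trimmed.startsWith('<')) return 'xml';

  // Ignore string contents so "http://..." does not look like a comment.
  const withoutStrings = trimmed.replace(/"(?:[^"\\]|\\.)*"/g, '""');
  if (/\/\/|\/\*/.test(withoutStrings)) return 'jsonc';

  if (/^[{\[]/.test(trimmed)) {
    // Single quotes, trailing commas, or unquoted keys suggest JSON5.
    const relaxed = /'|,\s*[}\]]|[{,]\s*[A-Za-z_$][\w$]*\s*:/.test(withoutStrings);
    return relaxed ? 'json5' : 'invalid-json';
  }
  // Document marker, "key: value" lines, or "- item" lines suggest YAML.
  if (/^(?:---|[\w-]+:\s|- )/m.test(trimmed)) return 'yaml';
  return 'unknown';
}
```

With a guess in hand, the formatter can replace a generic syntax error with a message such as "this looks like YAML, not JSON".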

3. Performance Optimizations

For large documents, formatters implement optimizations:

  • Incremental parsing to validate changes without re-parsing the entire document
  • Parallel parsing of independent sections in multi-threaded environments
  • Lazy parsing that only fully processes sections being viewed or edited

Conclusion

JSON formatters use sophisticated parsing techniques to detect and display syntax errors. By breaking down the document into tokens, analyzing their structure according to JSON grammar rules, and tracking position information, these tools can provide precise error messages that help developers quickly identify and fix problems.

The next time you see a helpful error message highlighting exactly where your JSON went wrong, you'll have a better understanding of the complex parsing machinery working behind the scenes to make your debugging experience smoother and more efficient.
