Need help with your JSON?

Try our JSON Formatter tool to automatically identify and fix syntax errors in your JSON.

Best Practices for Validating JSON Before Formatting

Validating JSON before formatting is a critical step that helps catch errors early, prevents downstream issues, and ensures data integrity. While formatting makes JSON more readable, validation confirms that your JSON is structurally sound and adheres to expected schemas. This article explores best practices for implementing robust JSON validation as part of your data processing workflow.

Why Validate Before Formatting?

While many JSON formatters include basic validation, deliberately separating validation from formatting offers several key advantages:

  • Error isolation - Distinguishes between syntax errors and formatting preferences
  • Deeper validation - Enables schema validation beyond basic syntax checking
  • Controlled error handling - Provides more precise feedback and recovery options
  • Performance optimization - Prevents wasting resources formatting invalid data

Pro Tip:

Think of validation and formatting as distinct responsibilities in a pipeline: validation confirms your JSON is correct, while formatting makes it human-readable. This separation of concerns leads to clearer code and more reliable systems.

Levels of JSON Validation

Effective JSON validation occurs at multiple levels, with each providing different types of guarantees:

1. Syntax Validation

The most basic level ensures the JSON follows proper JSON syntax rules:

  • Properly formed objects and arrays
  • Correct use of quotes, commas, and colons
  • Properly escaped special characters
  • Valid data types (strings, numbers, objects, arrays, booleans, null)

Syntax Validation Example (JavaScript):

function validateJsonSyntax(jsonString) {
  try {
    JSON.parse(jsonString);
    return { valid: true };
  } catch (error) {
    // Extract character position if the engine reports it (the exact
    // message format varies between JavaScript engines, so treat this
    // as best-effort)
    const positionMatch = error.message.match(/position (\d+)/);
    return {
      valid: false,
      error: error.message,
      position: positionMatch ? Number(positionMatch[1]) : null
    };
  }
}

2. Schema Validation

Beyond syntax, schema validation ensures the JSON adheres to an expected structure and data types:

  • Required properties are present
  • Property values have the correct data types
  • Values fall within expected ranges or patterns
  • Arrays contain valid elements
  • Nested structures follow expected patterns

Schema Validation Example (JavaScript with Ajv):

// Using Ajv, a popular JSON Schema validator
const Ajv = require('ajv');
const addFormats = require('ajv-formats'); // required for "email", "uuid", etc. in Ajv 7+
const ajv = new Ajv();
addFormats(ajv);

const schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "number", minimum: 0 },
    email: { type: "string", format: "email" },
    tags: { 
      type: "array", 
      items: { type: "string" }
    }
  },
  required: ["name", "email"],
  additionalProperties: false
};

function validateJsonSchema(jsonString, schema) {
  try {
    const data = JSON.parse(jsonString);
    const validate = ajv.compile(schema);
    const valid = validate(data);
    
    if (valid) {
      return { valid: true };
    } else {
      return {
        valid: false,
        errors: validate.errors
      };
    }
  } catch (error) {
    return {
      valid: false,
      syntaxError: error.message
    };
  }
}

3. Semantic Validation

The deepest level of validation examines relationships, business rules, and domain-specific requirements:

  • Cross-field validations (e.g., end date after start date)
  • Business logic rules (e.g., discount cannot exceed price)
  • Referential integrity (e.g., referenced IDs must exist)
  • Domain-specific validations (e.g., valid product codes)

Semantic Validation Example:

function validateEventData(jsonString) {
  // Parse and perform basic schema validation first
  const baseResult = validateJsonSchema(jsonString, eventSchema);
  if (!baseResult.valid) return baseResult;
  
  const data = JSON.parse(jsonString);
  const errors = [];
  
  // Cross-field validation: end date must be after start date
  if (new Date(data.endDate) <= new Date(data.startDate)) {
    errors.push({
      field: "endDate",
      message: "End date must be after start date"
    });
  }
  
  // Capacity validation
  if (data.registrations > data.maxCapacity) {
    errors.push({
      field: "registrations",
      message: "Registrations cannot exceed maximum capacity"
    });
  }
  
  // Location validation: check if location exists in database
  if (!isValidLocation(data.locationId)) {
    errors.push({
      field: "locationId",
      message: "Location does not exist"
    });
  }
  
  return {
    valid: errors.length === 0,
    errors: errors
  };
}
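The example above assumes an `eventSchema` and an `isValidLocation` helper defined elsewhere; both names are placeholders for this article. A minimal stand-in for the location check might look like:

```javascript
// Hypothetical stand-in: a real implementation would query a database,
// cache, or service rather than a hard-coded set.
const knownLocationIds = new Set(["loc-001", "loc-002", "loc-003"]);

function isValidLocation(locationId) {
  return knownLocationIds.has(locationId);
}
```

Keeping lookups like this behind a small function makes the semantic validator easy to test in isolation.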

JSON Schema: A Standard for Validation

JSON Schema is the industry standard for defining the structure, content, and validation constraints of JSON data:

  • Provides a declarative way to describe JSON data structures
  • Enables automatic validation of JSON documents
  • Supports complex validation rules and dependencies
  • Can generate documentation and UI forms
  • Supports composition and reuse through references

JSON Schema Example:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "User Profile",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "format": "uuid",
      "description": "Unique identifier for the user"
    },
    "name": {
      "type": "string",
      "minLength": 2,
      "maxLength": 100
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 13,
      "maximum": 120
    },
    "preferences": {
      "type": "object",
      "properties": {
        "theme": {
          "type": "string",
          "enum": ["light", "dark", "system"]
        },
        "notifications": {
          "type": "boolean"
        }
      }
    },
    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "maxItems": 10
    },
    "createdAt": {
      "type": "string",
      "format": "date-time"
    }
  },
  "required": ["id", "name", "email"],
  "additionalProperties": false
}
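The composition feature mentioned earlier works through `$ref`: shared definitions live under `definitions` (or `$defs` in newer drafts) and are referenced rather than repeated. A minimal draft-07 illustration, independent of the profile schema above:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" }
      },
      "required": ["street", "city"]
    }
  },
  "type": "object",
  "properties": {
    "billingAddress": { "$ref": "#/definitions/address" },
    "shippingAddress": { "$ref": "#/definitions/address" }
  }
}
```

A change to the shared `address` definition automatically applies everywhere it is referenced.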

Best Practices for Implementing Validation

1. Create a Validation Pipeline

Implement a step-by-step validation process:

  1. Check for basic parsing/syntax errors
  2. Validate against schema (structure and types)
  3. Perform semantic validations (business rules)
  4. Log detailed errors at each stage

Validation Pipeline Example:

function validateJson(jsonString) {
  // Step 1: Syntax validation
  const syntaxResult = validateJsonSyntax(jsonString);
  if (!syntaxResult.valid) {
    return {
      valid: false,
      stage: "syntax",
      errors: [syntaxResult.error],
      position: syntaxResult.position
    };
  }
  
  // Step 2: Schema validation (validateJsonSchema takes the raw string
  // and parses it itself, so pass jsonString rather than parsed data)
  const data = JSON.parse(jsonString);
  const schemaResult = validateJsonSchema(jsonString, mySchema);
  if (!schemaResult.valid) {
    return {
      valid: false,
      stage: "schema",
      errors: schemaResult.errors
    };
  }
  
  // Step 3: Semantic validation
  const semanticResult = validateBusinessRules(data);
  if (!semanticResult.valid) {
    return {
      valid: false,
      stage: "semantic",
      errors: semanticResult.errors
    };
  }
  
  // All validations passed
  return {
    valid: true,
    data: data
  };
}

2. Provide Detailed Error Information

Make error messages actionable and helpful:

  • Include line and column numbers for syntax errors
  • Reference specific property paths for schema violations
  • Explain why a value is invalid (e.g., "minimum value is 1, got -5")
  • Provide suggestions when possible
  • Group related errors logically

User-Friendly Error Example:

// Poor error message
"Invalid value for price"

// Better error message
{
  "field": "product.price",
  "value": -10.99,
  "constraint": "minimum",
  "minimumValue": 0,
  "message": "Product price must be a positive number",
  "path": ["product", "price"],
  "location": {
    "line": 12,
    "column": 16
  },
  "suggestion": "Use a positive value or 0 for free products"
}

3. Implement Progressive Validation

For large or complex JSON documents, validate incrementally:

  • Validate the overall structure first (keys and types)
  • Validate each logical section independently
  • Use lazy validation for nested arrays to avoid validating thousands of items at once
  • Prioritize critical validations before detailed ones
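As a sketch of the lazy-validation point above (plain JavaScript, with `validateItem` standing in for whatever per-item check you use), an array validator can check a bounded number of elements eagerly and report how many were deferred:

```javascript
// Sketch: validate at most `limit` items up front and report how many
// were deferred, so huge arrays don't block the pipeline.
function validateArrayLazily(items, validateItem, limit = 100) {
  const errors = [];
  const checked = Math.min(items.length, limit);

  for (let i = 0; i < checked; i++) {
    const result = validateItem(items[i]);
    if (!result.valid) {
      errors.push({ index: i, errors: result.errors });
    }
  }

  return {
    valid: errors.length === 0,
    errors,
    checked,
    deferred: items.length - checked // validate later, or on access
  };
}
```

The `deferred` count lets callers decide whether to validate the remainder in the background or on first access.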

4. Cache and Reuse Validators

Optimize performance through proper validator management:

  • Compile schemas once and reuse the compiled validators
  • Share common validation logic across different document types
  • Consider performance implications of complex regex patterns
  • Use specialized validators for high-frequency operations

Validator Caching Example:

// Inefficient: Compiles schema for every validation
function validateUserData(userData) {
  const ajv = new Ajv();
  const validate = ajv.compile(userSchema);
  return validate(userData);
}

// Better: Compile once, reuse many times
const ajv = new Ajv(); // Global instance
const validatorCache = {};

function getValidator(schemaName) {
  if (!validatorCache[schemaName]) {
    const schema = require(`./schemas/${schemaName}.json`);
    validatorCache[schemaName] = ajv.compile(schema);
  }
  return validatorCache[schemaName];
}

function validateUserData(userData) {
  const validate = getValidator('user');
  return validate(userData);
}

Validation Tools and Libraries

Various libraries and tools are available to help with JSON validation:

Language/Platform | Popular Libraries | Key Features
JavaScript | Ajv, Joi, yup, zod | Fast validation, TypeScript integration, custom error messages
Python | jsonschema, pydantic, marshmallow | Data serialization, object mapping, automatic validation
Java | Jackson, Everit JSON Schema, Java JSON Schema | Strong typing, thorough validation, enterprise features
.NET | Newtonsoft.Json.Schema, NJsonSchema | Integration with .NET ecosystem, code generation
Ruby | json_schema, json-schema | Ruby-native API, schema generation
Go | gojsonschema, validator | Performance-focused, struct tag validation
CLI Tools | jsonschema, ajv-cli, jsonlint | Command-line validation, integration with build pipelines

Selecting the Right Validation Tool

Consider these factors when choosing a validation library:

  1. Performance - Validation speed and memory usage
  2. Feature completeness - Support for all JSON Schema features you need
  3. Error reporting - Quality and usefulness of error messages
  4. Extensibility - Support for custom validators and formats
  5. Community and maintenance - Active development and good documentation
  6. Integration - Compatibility with your tech stack

Integrating Validation with Formatting

1. Sequential Process

The most common approach is to validate first, then format:

function processJson(jsonString) {
  // Step 1: Validate
  const validationResult = validateJson(jsonString);
  if (!validationResult.valid) {
    return {
      success: false,
      errors: validationResult.errors,
      // Return the original JSON for debugging
      originalJson: jsonString
    };
  }
  
  // Step 2: Format (only if valid)
  try {
    const parsedJson = JSON.parse(jsonString);
    const formattedJson = JSON.stringify(parsedJson, null, 2);
    
    return {
      success: true,
      formattedJson: formattedJson,
      // Include any analytics or metadata
      stats: {
        lineCount: formattedJson.split('\n').length,
        // Note: .length counts UTF-16 code units, not bytes; in Node,
        // use Buffer.byteLength(formattedJson, 'utf8') for a true byte size
        byteSize: formattedJson.length
      }
    };
  } catch (error) {
    // This should never happen if validation passed
    return {
      success: false,
      errors: ["Unexpected error during formatting"],
      details: error.message
    };
  }
}

2. Validation-Informed Formatting

Some advanced systems use validation results to inform formatting:

  • Highlighting problematic sections in the formatted output
  • Adding comments next to potentially problematic values
  • Applying different formatting rules based on data types or contexts
  • Generating warnings inline for values that pass validation but are unusual
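A minimal sketch of the first idea, appending warning markers to flagged fields in the formatted output (assuming errors carry a `path` array as in the earlier examples; the annotated result contains comments, so it is for display only, not re-parsing):

```javascript
// Sketch: append a warning marker to the line where a flagged key appears.
// Deliberately naive: it matches the first occurrence of the key name, so
// duplicate keys at different nesting levels could be mis-annotated.
function formatWithWarnings(data, errors = []) {
  const lines = JSON.stringify(data, null, 2).split('\n');

  for (const err of errors) {
    const key = `"${err.path[err.path.length - 1]}":`;
    const index = lines.findIndex((line) => line.trim().startsWith(key));
    if (index !== -1) {
      lines[index] += `  // WARNING: ${err.message}`;
    }
  }

  return lines.join('\n');
}
```

A production implementation would track source positions during parsing instead of matching key names after the fact.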

3. Recovery-Oriented Formatting

For some use cases, formatting can proceed even with certain validation errors:

  • Format what is valid, mark what isn't
  • Apply fixes for common errors before formatting
  • Provide both the fixed version and error reports
  • Allow different levels of strictness

Lenient Processing Example:

function lenientProcessJson(jsonString, options = {}) {
  const { 
    fixTrailingCommas = true,
    allowComments = true,
    convertSingleQuotes = true
  } = options;
  
  let processedJson = jsonString;
  const fixes = [];
  
  // Apply fixes based on options, recording only the ones actually applied
  if (fixTrailingCommas && /,\s*[\]\}]/.test(processedJson)) {
    processedJson = processedJson.replace(/,\s*([\]\}])/g, '$1');
    fixes.push("Removed trailing commas");
  }
  
  if (allowComments && /\/\/|\/\*/.test(processedJson)) {
    // Note: these regexes are naive and will also strip "//" or "/*"
    // sequences inside string values (e.g. URLs)
    processedJson = processedJson
      .replace(/\/\/.*$/gm, '')          // single-line comments
      .replace(/\/\*[\s\S]*?\*\//g, ''); // multi-line comments
    fixes.push("Removed comments");
  }
  
  if (convertSingleQuotes && processedJson.includes("'")) {
    // Convert single-quoted keys and values to double quotes (with appropriate escaping)
    processedJson = processedJson.replace(/'([^'\\]*(?:\\.[^'\\]*)*)'(?=\s*:)/g, '"$1"');
    processedJson = processedJson.replace(/:\s*'([^'\\]*(?:\\.[^'\\]*)*)'(?=[,\}\]])/g, ': "$1"');
    fixes.push("Converted single quotes to double quotes");
  }
  
  // Now try to parse and format
  try {
    const parsedJson = JSON.parse(processedJson);
    const formattedJson = JSON.stringify(parsedJson, null, 2);
    
    return {
      success: true,
      formattedJson,
      appliedFixes: fixes.length > 0 ? fixes : ["No fixes needed"],
      originalHasIssues: processedJson !== jsonString
    };
  } catch (error) {
    return {
      success: false,
      originalJson: jsonString,
      processedJson: processedJson,
      error: error.message,
      appliedFixes: fixes
    };
  }
}

Validation in CI/CD and Production Environments

1. Automated Validation in CI Pipelines

Integrate JSON validation into your continuous integration workflows:

  • Validate all JSON configuration files with every commit
  • Add schema validation to API tests
  • Enforce schema compatibility between versions
  • Generate validation reports as part of build artifacts

2. Production-Grade Validation Strategies

For high-volume production systems, apply these practices:

  • Implement request validation at API boundaries
  • Use optimized validators for performance-critical paths
  • Apply appropriate error handling and logging
  • Consider validation impact on latency and throughput
  • Set up monitoring for validation errors to detect potential issues
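The monitoring point above can start as simply as counting failures per validation stage (in-memory counters shown here for illustration; a production system would export these to its metrics backend):

```javascript
// Sketch: per-stage failure counters to surface validation-error trends.
const validationFailures = { syntax: 0, schema: 0, semantic: 0 };

function recordValidationFailure(stage) {
  if (stage in validationFailures) {
    validationFailures[stage] += 1;
  }
}

// A sudden spike in one stage (e.g. "schema") often signals that an
// upstream producer changed its payload shape.
```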

Security Note:

Proper JSON validation is not just about data quality—it's also a security practice. Validating input helps prevent injection attacks, denial of service vulnerabilities, and other security issues related to untrusted data.

Conclusion

Implementing robust JSON validation before formatting is an essential practice that improves data quality, application reliability, and developer experience. By separating validation from formatting, you create a clearer separation of concerns and enable more thorough checking of your JSON data beyond simple syntax verification.

For optimal results, implement a multi-level validation strategy that includes syntax checking, schema validation, and semantic validation. Provide clear, actionable error messages, and leverage appropriate libraries for your technology stack. Whether you're building interactive tools, APIs, or data pipelines, proper validation is the foundation of reliable JSON processing.
