Handling Incomplete JSON Data Streams

Working with streaming JSON data presents unique challenges, especially when dealing with incomplete or partial data streams. Whether you're building real-time applications, processing large datasets, or consuming server-sent events, properly handling incomplete JSON is crucial for maintaining data integrity and application stability.

1. The Challenge of Incomplete JSON Streams

JSON data streams can become incomplete due to various reasons:

  • Network interruptions during data transmission
  • Server timeouts or crashes during data generation
  • Rate limiting or bandwidth restrictions
  • Chunked transfer encoding with interrupted connections
  • WebSocket disconnections during streaming

Example of an Incomplete JSON Stream:

{
  "events": [
    {"id": 1, "type": "login", "timestamp": 1625097600},
    {"id": 2, "type": "view_page", "timestamp": 1625097605},
    {"id": 3, "type": "cli

The stream was cut off mid-way through the third event, resulting in invalid JSON.

2. Implementing a Robust JSON Stream Parser

To handle incomplete JSON streams effectively, you need a more sophisticated approach than a simple JSON.parse(). Here's a buffer-based strategy:

Buffer-Based JSON Stream Processing:

class JsonStreamParser {
  constructor() {
    this.buffer = "";
    this.parsedObjects = [];
  }

  // Add new data to the buffer
  feed(chunk) {
    this.buffer += chunk;
    this.tryParse();
  }

  // Try to parse complete JSON objects from the buffer
  tryParse() {
    let startPos = 0;
    let depth = 0;
    let inString = false;
    let escaped = false;
    
    for (let i = 0; i < this.buffer.length; i++) {
      const char = this.buffer[i];
      
      // Toggle string state on unescaped quotes
      if (char === '"' && !escaped) {
        inString = !inString;
      }
      
      // Only count braces when not in a string
      if (!inString) {
        if (char === '{') {
          // A brace at depth 0 starts a new top-level object; anchoring
          // here skips any whitespace or separators between objects
          if (depth === 0) {
            startPos = i;
          }
          depth++;
        } else if (char === '}') {
          depth--;
          
          // If depth returns to 0, we have a complete object
          if (depth === 0) {
            const jsonStr = this.buffer.substring(startPos, i + 1);
            try {
              const parsed = JSON.parse(jsonStr);
              this.parsedObjects.push(parsed);
            } catch (error) {
              // The braces balanced but the JSON is malformed; skip it
            }
            
            // Move past this object whether or not it parsed
            startPos = i + 1;
          }
        }
      }
      
      // Track escape characters in strings
      escaped = inString && char === '\\' && !escaped;
    }
    
    // Remove processed data from buffer
    if (startPos > 0) {
      this.buffer = this.buffer.substring(startPos);
    }
  }

  // Get all successfully parsed objects
  getParsedObjects() {
    const objects = [...this.parsedObjects];
    this.parsedObjects = [];
    return objects;
  }
}
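
A quick usage sketch: the chunk boundary below falls mid-object, as it easily can on a real network, and the second object only appears once its closing brace arrives.

// Feed two chunks whose boundary falls mid-object
const parser = new JsonStreamParser();
parser.feed('{"id": 1, "type": "login"}{"id": 2, "ty');
console.log(parser.getParsedObjects()); // [{ id: 1, type: 'login' }]

parser.feed('pe": "view_page"}');
console.log(parser.getParsedObjects()); // [{ id: 2, type: 'view_page' }]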

3. Using JSON Streaming Libraries

Rather than implementing your own parser, you can leverage specialized libraries designed for handling JSON streams:

Using Streaming Libraries:

// Example with the 'oboe.js' library
import oboe from 'oboe';

oboe('/api/events-stream')
  // This triggers when a node matching the pattern is found
  .node('events[*]', (event) => {
    // Process each event as soon as it's parsed
    processEvent(event);
    
    // Return oboe.drop to release the parsed node from memory
    return oboe.drop;
  })
  .done((fullResponse) => {
    console.log('Stream complete');
  })
  .fail((error) => {
    // Handle errors, including incomplete streams
    console.error('Stream error:', error);
    
    // Implement recovery strategy
    handleStreamError(error);
  });

Popular JSON Streaming Libraries

  • JavaScript: oboe.js, JSONStream, stream-json (see the stream-json sketch below)
  • Python: ijson, yajl-py
  • Java: Jackson's streaming API, Gson Streams
  • Go: json.Decoder with token streaming
  • Rust: serde_json streaming parsers
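
For instance, here is a minimal sketch using stream-json in Node.js, assuming the {"events": [...]} payload from section 1 is read from a local events.json file and that a processEvent handler like the one in the oboe example exists:

// Stream the "events" array element by element with stream-json
const fs = require('fs');
const { chain } = require('stream-chain');
const { parser } = require('stream-json');
const { pick } = require('stream-json/filters/Pick');
const { streamArray } = require('stream-json/streamers/StreamArray');

const pipeline = chain([
  fs.createReadStream('events.json'),  // assumed input source
  parser(),                            // incremental JSON tokenizer
  pick({ filter: 'events' }),          // descend into the "events" array
  streamArray()                        // emit one { key, value } per element
]);

pipeline.on('data', ({ value }) => processEvent(value));
pipeline.on('error', (err) => console.error('Stream error:', err));
pipeline.on('end', () => console.log('Stream complete'));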

4. Implementing Chunk-Based Processing

For handling large JSON streams, a chunk-based approach allows you to process elements incrementally:

async function processJsonStream(readableStream) {
  // Create a reader from the stream
  const reader = readableStream.getReader();
  const decoder = new TextDecoder();
  const parser = new JsonStreamParser();
  
  try {
    while (true) {
      const { value, done } = await reader.read();
      
      if (done) {
        // Flush any bytes still buffered in the decoder
        const tail = decoder.decode();
        if (tail) {
          parser.feed(tail);
        }
        break;
      }
      
      // Convert the chunk to text and feed it to the parser
      const chunk = decoder.decode(value, { stream: true });
      parser.feed(chunk);
      
      // Process any complete objects; awaiting here applies backpressure,
      // since no new chunk is read until processing finishes
      const objects = parser.getParsedObjects();
      for (const obj of objects) {
        await processObject(obj);
      }
    }
    
    // Process any objects completed by the final flush
    for (const obj of parser.getParsedObjects()) {
      await processObject(obj);
    }
    
    // Handle any remaining data in the buffer
    if (parser.buffer.length > 0) {
      console.warn('Stream ended with incomplete data in buffer');
      handleIncompleteData(parser.buffer);
    }
  } catch (error) {
    console.error('Error processing stream:', error);
  } finally {
    reader.releaseLock();
  }
}
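
Wiring this up to the Fetch API is then a one-liner, assuming the same endpoint as the oboe example; response.body is a ReadableStream:

const response = await fetch('/api/events-stream');
await processJsonStream(response.body);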

5. Recovery Strategies for Incomplete JSON

When you encounter incomplete JSON, there are several recovery strategies:

5.1 Graceful Degradation

Process whatever complete objects you have and acknowledge the incomplete nature of the data.

function handleIncompleteStream(parsedObjects, incompleteBuffer) {
  // Work with what we have
  if (parsedObjects.length > 0) {
    processValidObjects(parsedObjects);
  }
  
  // Log the incomplete portion for debugging
  logIncompleteData(incompleteBuffer);
  
  // Inform the user
  notifyUser({
    message: "Some data could not be processed due to an incomplete transmission",
    recoveredItems: parsedObjects.length
  });
}

5.2 Automatic Completion Attempts

For simple cases, you might attempt to complete the JSON structure:

function attemptJsonCompletion(incompleteJson) {
  // Track open delimiters in order, ignoring any inside strings, so the
  // closers can be appended in the correct (reverse) nesting order.
  // Simply counting braces and brackets fails on nested input like
  // {"events": [... because the ']' must come before the final '}'.
  const stack = [];
  let inString = false;
  let escaped = false;
  
  for (const char of incompleteJson) {
    if (char === '"' && !escaped) {
      inString = !inString;
    }
    if (!inString) {
      if (char === '{' || char === '[') {
        stack.push(char);
      } else if (char === '}' || char === ']') {
        stack.pop();
      }
    }
    escaped = inString && char === '\\' && !escaped;
  }
  
  // Close an unterminated string, then unwind the remaining delimiters
  let completedJson = incompleteJson + (inString ? '"' : '');
  while (stack.length > 0) {
    completedJson += stack.pop() === '{' ? '}' : ']';
  }
  
  // Try to parse the completed JSON
  try {
    return JSON.parse(completedJson);
  } catch (error) {
    // Still invalid (e.g. cut off after a comma or colon)
    return null;
  }
}
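
Applied to the truncated stream from section 1, this recovers the data, with a caveat: the third event's type comes back as "cli", silently truncated, so recovered values should be treated with suspicion rather than trusted outright.

const recovered = attemptJsonCompletion(
  '{"events": [' +
  '{"id": 1, "type": "login", "timestamp": 1625097600},' +
  '{"id": 2, "type": "view_page", "timestamp": 1625097605},' +
  '{"id": 3, "type": "cli'
);
console.log(recovered.events.length);  // 3
console.log(recovered.events[2].type); // "cli" (truncated!)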

5.3 Retry Mechanisms

Implement retry logic with exponential backoff to attempt retrieving the complete data:

async function fetchWithRetry(url, maxRetries = 3) {
  let retries = 0;
  
  while (retries < maxRetries) {
    try {
      const response = await fetch(url);
      const data = await response.json();
      return data;
    } catch (error) {
      retries++;
      
      if (retries >= maxRetries) {
        throw new Error(`Failed after ${maxRetries} attempts: ${error.message}`);
      }
      
      // Exponential backoff with jitter
      const delay = Math.min(1000 * 2 ** retries + Math.random() * 1000, 10000);
      console.log(`Retry ${retries} after ${delay}ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
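
Note that fetchWithRetry re-downloads the entire document on each attempt, so it suits ordinary request/response endpoints; for long-lived streams, combine the retry loop with the resumption tokens discussed under the server-side practices below. A usage sketch, assuming a hypothetical /api/events endpoint:

// Fall back to an empty list if every retry fails
const events = await fetchWithRetry('/api/events').catch(() => []);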

6. Best Practices for JSON Stream Processing

  1. Use streaming parsers for large datasets: Always prefer specialized JSON streaming libraries over loading everything into memory.
  2. Implement proper error handling: Capture and respond to stream interruptions and parser errors.
  3. Apply backpressure: Control the rate of data consumption to prevent memory issues with very large streams.
  4. Keep state minimal: Process objects incrementally rather than accumulating everything before processing.
  5. Design for resilience: Your system should handle partial data gracefully rather than failing completely.
  6. Log incomplete segments: Store problematic JSON fragments for later analysis and debugging.
  7. Consider checkpoints: For long-running streams, implement checkpointing to resume from the last successful position, as in the sketch below.
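
A minimal sketch of the checkpoint idea, reusing processJsonStream from section 4 and assuming (hypothetically) that the server accepts a resume_from query parameter and that every event carries a seq field:

// loadCheckpoint/saveCheckpoint are hypothetical persistence helpers,
// e.g. backed by localStorage or a database row
let lastSeq = loadCheckpoint() || 0;

async function consumeWithCheckpoints(baseUrl) {
  // Ask the server to resume after the last event we processed
  const response = await fetch(`${baseUrl}?resume_from=${lastSeq}`);
  await processJsonStream(response.body);
}

async function processObject(obj) {
  await handleEvent(obj);   // hypothetical application logic
  lastSeq = obj.seq;        // assumes each object carries a sequence id
  saveCheckpoint(lastSeq);  // persist so a restart resumes from here
}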

Server-Side Considerations

When designing APIs that stream JSON data, implement these server-side practices, illustrated in the sketch below:

  • Send valid JSON objects in each chunk
  • Implement proper content-length headers for complete responses
  • Use chunked transfer encoding correctly
  • Include sequence IDs to help clients detect missing chunks
  • Provide resumption tokens for interrupted streams
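
A hypothetical Express handler illustrating several of these practices at once: each chunk is one complete JSON object per line (newline-delimited JSON), every object carries a sequence id, and a resume_from parameter lets clients pick up where they left off:

const express = require('express');
const app = express();

app.get('/api/events-stream', (req, res) => {
  // One complete JSON object per line; Node uses chunked transfer
  // encoding automatically when no Content-Length is set
  res.setHeader('Content-Type', 'application/x-ndjson');
  
  // Hypothetical resumption token: restart the sequence where asked
  let seq = Number(req.query.resume_from || 0);
  
  const timer = setInterval(() => {
    res.write(JSON.stringify({ seq: ++seq, type: 'tick', ts: Date.now() }) + '\n');
  }, 1000);
  
  req.on('close', () => clearInterval(timer));
});

app.listen(3000);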

Need help with your JSON?

Try our JSON Formatter tool to automatically identify and fix syntax errors in your JSON.