Erlang's Pattern Matching for JSON Processing
Introduction
Erlang is a powerful functional programming language known for its capabilities in building highly available, fault-tolerant, and scalable systems. One of its most distinctive and useful features is Pattern Matching. While often discussed in the context of function heads or case statements, pattern matching is an incredibly versatile tool that can simplify many data processing tasks, including working with structured data like JSON.
JSON (JavaScript Object Notation) is a ubiquitous data interchange format. It's human-readable, hierarchical, and widely used in web APIs, configuration files, and data storage. Processing JSON in many languages involves explicit steps: checking if a key exists, accessing properties using dot notation or square brackets, handling potential nulls or missing fields, and iterating through arrays. Erlang's pattern matching offers an alternative, often more declarative, way to approach these tasks.
Representing JSON in Erlang
Before we can pattern match on JSON, we need to represent it using Erlang's native data types. When JSON is parsed in Erlang (typically using a library like jsx or jsone, or the json module built into OTP 27 and later), it is transformed into Erlang terms:
- JSON Object: Represented as an Erlang Map (#{}). With default decoder options, keys are binaries (Erlang's byte strings, <<>>); some libraries can optionally convert them to atoms. Values are other Erlang terms.
    JSON: {"name": "Alice", "age": 30, "isStudent": false}
    Erlang: #{<<"name">> => <<"Alice">>, <<"age">> => 30, <<"isStudent">> => false}
  (Note: Erlang maps use => for key-value association when a map is constructed, and := to match an existing key in a pattern. A double-quoted literal such as "name" is a list of character codes in Erlang, not a binary, so it does not match a binary key.)
- JSON Array: Represented as an Erlang List ([]).
    JSON: [1, "apple", true, {"id": 101}]
    Erlang: [1, <<"apple">>, true, #{<<"id">> => 101}]
- JSON String: Erlang Binary (<<>>) is the most common representation for efficiency.
    JSON: "hello"
    Erlang: <<"hello">>
- JSON Number: Erlang Integer or Float.
    JSON: 42, -3.14
    Erlang: 42, -3.14
- JSON Boolean: Erlang Atoms true or false.
    JSON: true, false
    Erlang: true, false
- JSON Null: Erlang Atom null.
    JSON: null
    Erlang: null
Once parsed into these Erlang terms, pattern matching becomes a natural fit for inspecting and extracting data.
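As a minimal sketch of the parsing step, assuming OTP 27 or later (where the json module is part of the standard library), the hypothetical module decode_demo below decodes a JSON binary and immediately matches on the resulting map. With jsx or jsone the decoding call differs, but with default options the decoded term typically has the same shape.

```erlang
-module(decode_demo).
-export([name_and_age/1]).

%% Decodes a JSON document given as a binary and extracts two fields.
%% Assumption: OTP 27+, so json:decode/1 is available.
name_and_age(JsonBinary) when is_binary(JsonBinary) ->
    %% Object keys come back as binaries, so the pattern uses binary keys.
    #{<<"name">> := Name, <<"age">> := Age} = json:decode(JsonBinary),
    {Name, Age}.
```

Calling decode_demo:name_and_age(<<"{\"name\": \"Alice\", \"age\": 30}">>) returns {<<"Alice">>, 30}. If either key is missing, the match fails with a badmatch error.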
Pattern Matching Basics with JSON Data
Pattern matching allows you to test if a term has a specific structure and, if it does, bind variables to parts of that structure. This replaces explicit checks and accessors in many cases. Let's look at examples using a fictional Erlang function process_json(JsonTerm).
Matching a Simple Object
Suppose we expect a JSON object like {"id": 123, "status": "active"}.
    -module(json_processor).
    -export([process_json/1]).

    %% Keys are binaries because that is what JSON decoders produce by default.
    process_json(#{<<"id">> := Id, <<"status">> := <<"active">>}) ->
        io:format("Processing active item with ID: ~w~n", [Id]);
    process_json(#{<<"id">> := Id, <<"status">> := <<"inactive">>}) ->
        io:format("Processing inactive item with ID: ~w~n", [Id]);
    process_json(Other) ->
        io:format("Unrecognized JSON structure or status: ~p~n", [Other]).
Explanation:
- The first function head #{<<"id">> := Id, <<"status">> := <<"active">>} attempts to match an Erlang map.
- <<"id">> := Id matches the key <<"id">> and binds its value to the variable Id.
- <<"status">> := <<"active">> matches the key <<"status">> *only if* its value is the binary <<"active">>.
- If the input JsonTerm matches the first pattern, the first function body executes, and Id is available.
- The second function head similarly matches if the status is <<"inactive">>.
- The third function head process_json(Other) uses a plain variable, Other, which matches any term (use the wildcard _ if the value is not needed), to catch any input that didn't match the previous patterns. This is crucial for handling unexpected data or providing default behavior.
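The same matching can also be written as a case expression inside a single function. The sketch below is illustrative only: the module name status_demo and the returned tuples are assumptions, and the keys are binaries on the assumption that the map came from a JSON decoder with default options.

```erlang
-module(status_demo).
-export([classify/1]).

%% Returns a tagged tuple instead of printing, which is easier to test.
classify(JsonTerm) ->
    case JsonTerm of
        #{<<"id">> := Id, <<"status">> := <<"active">>} ->
            {active, Id};
        #{<<"id">> := Id, <<"status">> := <<"inactive">>} ->
            {inactive, Id};
        _ ->
            %% Catch-all clause; without it, a term that matches no
            %% clause raises a case_clause error.
            unknown
    end.
```

For example, status_demo:classify(#{<<"id">> => 123, <<"status">> => <<"active">>}) returns {active, 123}, while any other shape returns unknown.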
Matching a Simple Array
Let's process a JSON array of coordinates, like [10.5, 20.0].
    process_coordinates([X, Y]) when is_number(X), is_number(Y) ->
        %% ~.2f accepts only floats, and a JSON number such as 10 decodes
        %% to an integer, so convert explicitly before formatting.
        io:format("Coordinates are (~.2f, ~.2f)~n", [float(X), float(Y)]);
    process_coordinates(Other) ->
        io:format("Expected a list of two numbers, got: ~p~n", [Other]).
Explanation:
- [X, Y] matches a list with exactly two elements, binding the first to X and the second to Y.
- when is_number(X), is_number(Y) is a Guard. The pattern must match *and* the guard must evaluate to true for this clause to be selected. This adds constraints beyond structure.
- float/1 converts integers to floats so that the ~.2f format directive does not fail when the array contains whole numbers such as [10, 20].
- The second clause catches anything that isn't a list of exactly two numbers.
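Because a list pattern fixes the number of elements, separate clauses can distinguish arrays of different lengths. The following is a small sketch under that idea; the module name coord_demo and the point2d/point3d tags are hypothetical names chosen for illustration.

```erlang
-module(coord_demo).
-export([parse_point/1]).

%% Two-element arrays become 2D points, three-element arrays 3D points.
%% float/1 normalizes integers such as 20 to 20.0.
parse_point([X, Y]) when is_number(X), is_number(Y) ->
    {point2d, float(X), float(Y)};
parse_point([X, Y, Z]) when is_number(X), is_number(Y), is_number(Z) ->
    {point3d, float(X), float(Y), float(Z)};
parse_point(_Other) ->
    %% Wrong length, wrong element types, or not a list at all.
    error.
```

With this sketch, coord_demo:parse_point([10.5, 20]) returns {point2d, 10.5, 20.0}, and a list containing a non-number returns error.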
Extracting Nested Data
Consider a more complex structure: {"user": {"id": 456, "name": "Bob"}, "role": "admin"}.
    %% ~ts is used for names so that UTF-8 encoded binaries print correctly.
    process_nested_user(#{<<"user">> := #{<<"id">> := UserId, <<"name">> := UserName},
                          <<"role">> := <<"admin">>}) ->
        io:format("Admin user ID: ~w, Name: ~ts~n", [UserId, UserName]);
    process_nested_user(#{<<"user">> := #{<<"id">> := UserId, <<"name">> := UserName},
                          <<"role">> := <<"user">>}) ->
        io:format("Standard user ID: ~w, Name: ~ts~n", [UserId, UserName]);
    process_nested_user(Other) ->
        io:format("Unexpected nested structure: ~p~n", [Other]).
Explanation:
- Patterns can be nested. Here, we match a map that must have a key <<"user">> whose value is *itself* a map, containing keys <<"id">> and <<"name">>.
- We bind variables (UserId, UserName) directly to the values deep within the structure.
- Again, multiple clauses handle different cases (admin vs. standard user roles).
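When the role should be extracted rather than fixed by a literal, the nested pattern can bind it to a variable and constrain it with a guard. This is a sketch only: the module name nested_demo and the return shapes are assumptions, and keys are binaries on the assumption that the term came from a JSON decoder with default options.

```erlang
-module(nested_demo).
-export([user_summary/1]).

%% Binds three values from two nesting levels in a single function head.
user_summary(#{<<"user">> := #{<<"id">> := Id, <<"name">> := Name},
               <<"role">> := Role}) when is_integer(Id), is_binary(Role) ->
    {ok, Id, Name, Role};
user_summary(_Other) ->
    {error, unexpected_shape}.
```

For the structure above, the call returns {ok, 456, <<"Bob">>, <<"admin">>}. A map without the nested <<"user">> map returns {error, unexpected_shape}.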
Handling Optional/Missing Fields
JSON doesn't guarantee the presence of keys. A map pattern matches only if every key it names is present, so an optional key cannot be expressed within a single pattern. Common ways to handle optional keys or provide defaults are maps:get/3 in the function body or separate clauses.
    process_item_with_optional_description(#{<<"name">> := Name} = ItemMap) ->
        %% maps:get/3 returns the default when the key is absent.
        Description = maps:get(<<"description">>, ItemMap, <<"No description">>),
        io:format("Item: ~ts, Description: ~ts~n", [Name, Description]);
    process_item_with_optional_description(Other) ->
        io:format("Expected map with 'name' key, got: ~p~n", [Other]).
Explanation:
- #{<<"name">> := Name} = ItemMap matches a map that *must* have a <<"name">> key, binding its value to Name. The = ItemMap part binds the *entire* matched map to the variable ItemMap.
- Inside the function body, maps:get(<<"description">>, ItemMap, <<"No description">>) safely retrieves the value for the <<"description">> key from ItemMap. If the key is not present, it returns the third argument (the default value).
- This pattern ensures the minimum required field (<<"name">>) is present via pattern matching, while optional fields are handled gracefully in the body.
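The separate-clauses alternative can be sketched as follows. The module name optional_demo and the return values are illustrative assumptions; the point is that the clause naming more keys must come first, because clauses are tried in order and the shorter pattern would also match maps that have a description.

```erlang
-module(optional_demo).
-export([describe/1]).

%% Most specific pattern first: both keys present.
describe(#{<<"name">> := Name, <<"description">> := Desc}) ->
    {Name, Desc};
%% Only the required key is present: supply a default.
describe(#{<<"name">> := Name}) ->
    {Name, <<"No description">>};
describe(_Other) ->
    {error, missing_name}.
```

Here optional_demo:describe(#{<<"name">> => <<"Pen">>}) returns {<<"Pen">>, <<"No description">>}. This style scales poorly beyond one or two optional keys, which is where maps:get/3 tends to be the simpler choice.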
Processing Lists of JSON Objects
Pattern matching is powerful with lists, especially when combined with recursion or list comprehensions.
    %% Process a list of user objects.
    process_user_list([]) ->
        %% Base case: the list is empty.
        io:format("Finished processing list.~n");
    process_user_list([#{<<"id">> := Id, <<"name">> := Name} | Rest]) ->
        %% The head of the list matched the user-object pattern.
        io:format("Processing user ID: ~w, Name: ~ts~n", [Id, Name]),
        process_user_list(Rest);
    process_user_list([InvalidItem | Rest]) ->
        %% The head is not a valid user object: report it and continue.
        io:format("Skipping invalid list item: ~p~n", [InvalidItem]),
        process_user_list(Rest);
    process_user_list(Other) ->
        io:format("Expected a list, got: ~p~n", [Other]).
Explanation:
- The function process_user_list has multiple clauses to handle different list structures.
- The first clause process_user_list([]) is the base case for recursion when the list is empty.
- The second clause uses the list pattern [Head | Rest], where Head is the first element and Rest is the remainder of the list. Here the head position is *itself* a pattern: a map with <<"id">> and <<"name">> keys. If it matches, the body executes, and the function calls itself recursively on Rest.
- The third clause process_user_list([InvalidItem | Rest]) catches list heads that *didn't* match the desired user object pattern. It handles the invalid item and continues processing the rest of the list.
- The final clause catches anything that isn't a list at all.
Alternatively, list comprehensions with pattern matching can be used for transformations:
    %% Extract names from a list of user objects using a list comprehension.
    get_user_names(ListOfUsers) ->
        [Name || #{<<"name">> := Name} <- ListOfUsers].
Explanation:
- [Name || ...] is a list comprehension. It builds a new list.
- #{<<"name">> := Name} <- ListOfUsers is the generator. It iterates through each element in ListOfUsers. For each element, it attempts to match the pattern #{<<"name">> := Name}.
- If the element matches, the value bound to Name is included in the new list. Elements that *do not* match the pattern are silently skipped by default in list comprehensions.
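A generator pattern can also contain literals, and a comprehension can carry extra filter expressions, so selection and extraction happen in one step. The sketch below assumes user maps with binary keys and a <<"role">> field as in the nested example; the module and function names are hypothetical.

```erlang
-module(names_demo).
-export([admin_names/1]).

%% Collects the names of admin users only.
%% Elements that are not maps, lack a key, or have a different role are
%% skipped by the generator pattern; is_binary/1 is an additional filter.
admin_names(Users) when is_list(Users) ->
    [Name || #{<<"name">> := Name, <<"role">> := <<"admin">>} <- Users,
             is_binary(Name)].
```

Given a list holding an admin named Bob, a standard user, and the integer 42, the function returns [<<"Bob">>].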
Benefits of Pattern Matching for JSON
- Readability and Clarity: The code often reads like a direct description of the data structure you expect. It's clear what shape the input must have for a particular code path to execute.
- Conciseness: You can extract nested values and bind them to variables in a single line within the function head or case clause, avoiding verbose sequences of accessors and temporary variables.
- Implicit Validation: If an input term doesn't match any clause of a function, Erlang raises a function_clause error; a case expression with no matching clause raises case_clause. Either way, unexpected data fails immediately at the point of the match instead of propagating further, which acts as built-in validation.
- Exhaustive Handling: Writing one clause per expected shape encourages thinking about all possibilities. The Erlang compiler does not check that patterns are exhaustive, but it does warn about clauses that can never match because an earlier clause always matches, and Dialyzer can report patterns that cannot match the inferred types (this is less effective for dynamic data like parsed JSON).
- Data Transformation: Easily restructure or extract data based on its shape, as shown in the list comprehension example.
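To illustrate implicit validation, the sketch below deliberately omits a catch-all clause and converts the resulting function_clause error into a return value at the boundary. The module name validation_demo and the {error, unexpected_shape} tuple are assumptions made for this example.

```erlang
-module(validation_demo).
-export([safe_id/1]).

%% No catch-all clause: any other shape raises function_clause.
item_id(#{<<"id">> := Id}) when is_integer(Id) ->
    Id.

%% Turns the match failure into an ordinary return value.
safe_id(Term) ->
    try item_id(Term) of
        Id -> {ok, Id}
    catch
        error:function_clause -> {error, unexpected_shape}
    end.
```

Here validation_demo:safe_id(#{<<"id">> => 7}) returns {ok, 7}, while a map whose <<"id">> is a binary returns {error, unexpected_shape}. In supervised processes it is also common to leave the error uncaught and let the process crash.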
Considerations
- Learning Curve: Developers new to Erlang may find the pattern matching syntax and its application to data processing initially unfamiliar compared to imperative approaches.
- Data Representation: You are working with Erlang's representation of JSON (maps, lists, binaries, etc.), not the raw JSON string. This requires parsing the JSON first.
- Deep Nesting: For extremely deep or complex, highly variable JSON structures, pattern matching alone might become cumbersome. Combining it with other Erlang features (such as guards, which are limited to a fixed set of built-in functions like is_map_key/2, or helper functions for validation) is often necessary.
- Key Presence: Map patterns strictly require keys to be present. Handling optional keys often requires using maps:get/3 or structuring clauses carefully, which adds slight complexity compared to a simple pattern.
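When only the presence of a key matters and not its value, a guard can test for it directly. The sketch below uses the guard BIF is_map_key/2 (available since OTP 21); the <<"error">> and <<"result">> keys and the module name guard_demo are hypothetical and stand in for whatever fields a real payload would carry.

```erlang
-module(guard_demo).
-export([kind/1]).

%% Classifies a decoded response by which key is present.
%% Clause order decides the outcome when both keys exist.
kind(Map) when is_map(Map), is_map_key(<<"error">>, Map) ->
    error_response;
kind(Map) when is_map(Map), is_map_key(<<"result">>, Map) ->
    success_response;
kind(_Other) ->
    unknown.
```

For instance, guard_demo:kind(#{<<"result">> => 1}) returns success_response, and an empty map returns unknown.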
Conclusion
Erlang's pattern matching provides an elegant and robust mechanism for handling JSON data once it has been parsed into native Erlang terms. It encourages a declarative style where you define the expected structure of the data, making your code more readable and less prone to errors caused by unexpected data shapes. While it requires understanding Erlang's data types and the pattern matching syntax, the benefits in terms of code clarity, conciseness, and implicit validation make it a powerful tool for any Erlang developer working with JSON. By leveraging pattern matching, you can transform repetitive data access logic into expressive and self-documenting code.