Working with JSON in Python

Published 2026-04-12 · 9 min read

Python's standard library has had excellent JSON support since 2.6. The json module covers the basics with zero dependencies, and the ecosystem around it — dataclasses, Pydantic, orjson — covers the advanced cases. This guide walks through what you need to know.

The json module basics

import json

text = '{"id": 42, "name": "Ada"}'
data = json.loads(text)      # dict
data["name"]                 # "Ada"

back = json.dumps(data)      # compact string
pretty = json.dumps(data, indent=2)

# File helpers
with open("user.json") as f:
    data = json.load(f)
with open("out.json", "w") as f:
    json.dump(data, f, indent=2)

Four functions cover string and file I/O: loads/load for reading, dumps/dump for writing.

Type mapping

  • JSON object ↔ Python dict
  • JSON array ↔ Python list
  • JSON string ↔ Python str
  • JSON number ↔ Python int or float
  • JSON true/false ↔ Python True/False
  • JSON null ↔ Python None

Note that Python tuples serialize as JSON arrays but deserialize back as lists — the round trip isn't symmetric.
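The asymmetry is easy to demonstrate (variable names here are illustrative):

```python
import json

# A tuple serializes as a JSON array...
text = json.dumps({"point": (1, 2)})   # '{"point": [1, 2]}'

# ...but deserializes back as a list, not a tuple.
restored = json.loads(text)
# restored["point"] is [1, 2], a list
```

If the distinction matters, convert back explicitly after loading.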

Common pitfalls

Non-serializable types

Datetimes, Decimal, set, bytes, and custom classes all raise TypeError: Object of type X is not JSON serializable. Two ways to fix it:

# Option 1: custom default
import json, datetime

def default(o):
    if isinstance(o, (datetime.date, datetime.datetime)):
        return o.isoformat()
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")

json.dumps({"now": datetime.datetime.now()}, default=default)

# Option 2: subclass JSONEncoder
class MyEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime.datetime):
            return o.isoformat()
        return super().default(o)

json.dumps({"now": datetime.datetime.now()}, cls=MyEncoder)

NaN and Infinity

Python's json.dumps emits NaN, Infinity, and -Infinity as literal tokens by default — which is valid JavaScript but invalid JSON. Strict consumers will reject it. Pass allow_nan=False to raise an error instead, then decide how your app should handle those values.
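A quick sketch of both behaviors — note that strict mode raises ValueError, not TypeError:

```python
import json, math

# Default: emits a bare NaN token, which strict JSON parsers reject.
lenient = json.dumps({"x": math.nan})      # '{"x": NaN}'

# Strict: refuse to emit non-finite floats at all.
try:
    json.dumps({"x": math.nan}, allow_nan=False)
    strict_failed = False
except ValueError:
    strict_failed = True
```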

Unicode

json.dumps escapes non-ASCII characters by default. If you want raw UTF-8 output (smaller and more readable), pass ensure_ascii=False. Make sure your file is opened with encoding="utf-8" when you do.
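For example:

```python
import json

data = {"city": "Zürich"}
escaped = json.dumps(data)                    # '{"city": "Z\\u00fcrich"}'
raw = json.dumps(data, ensure_ascii=False)    # '{"city": "Zürich"}'

# When writing raw UTF-8 to disk, be explicit about the encoding:
# with open("out.json", "w", encoding="utf-8") as f:
#     json.dump(data, f, ensure_ascii=False)
```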

Dataclasses and serialization

from dataclasses import dataclass, asdict
import json

@dataclass
class User:
    id: int
    name: str

u = User(42, "Ada")
json.dumps(asdict(u))  # '{"id": 42, "name": "Ada"}'

asdict recursively converts a dataclass (and any nested dataclasses) into a dict you can pass to json.dumps. This is the cleanest zero-dep approach for typed JSON in Python.
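Going the other direction is a one-liner, with the caveat that it assumes the JSON keys match the field names exactly — there is no validation or type coercion:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class User:
    id: int
    name: str

text = json.dumps(asdict(User(42, "Ada")))

# Unpack the parsed dict straight into the constructor. Extra or
# missing keys raise TypeError; wrong types pass through silently.
restored = User(**json.loads(text))
```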

Pydantic for runtime validation

When you need to validate JSON coming from an API or user input, Pydantic is the standard choice:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

text = '{"id": 42, "name": "Ada", "email": "ada@example.com"}'
user = User.model_validate_json(text)   # raises ValidationError on bad input
user.model_dump_json(indent=2)          # serialize back

Performance: orjson

The stdlib json module is fast enough for most cases, but if you're processing millions of documents, orjson is typically 3–10x faster and handles datetimes and dataclasses out of the box.

Debugging broken JSON

If json.loads fails, the exception message includes a line and column. For more interactive debugging, paste the input into the JSON Validator — it will surface the same error with highlighted context, and the formatter will pretty-print valid JSON so you can see the structure at a glance.
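Programmatically, the same information is available on json.JSONDecodeError, which exposes msg, lineno, and colno attributes:

```python
import json

bad = '{"id": 42 "name": "Ada"}'   # missing comma after 42
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    # e.msg describes the failure (here, a missing ',' delimiter);
    # e.lineno and e.colno locate it in the input.
    err = e
```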
