JSON vs XML vs YAML vs TOML: When to Use Each Format

Compare JSON, XML, YAML, and TOML with real config examples. 70% of APIs use JSON (Postman, 2025) - learn which format fits your use case.

iyda

Key Takeaways

  • JSON dominates APIs and data exchange. Over 70% of public APIs use it (Postman, 2025).
  • YAML is the default for DevOps config: Docker Compose, Kubernetes, CI/CD pipelines.
  • TOML wins for simple app config, especially in the Rust and Go ecosystems.
  • XML still rules enterprise systems, document formats, and anywhere you need schemas with validation.

What Do These Four Formats Actually Look Like?

The best way to compare data formats is to see the same configuration expressed in each one. Over 70% of public APIs now return JSON by default (Postman State of APIs, 2025), but APIs are just one use case. Configuration files, document markup, and infrastructure definitions each favor different formats.

Here’s a simple app configuration written four ways.

JSON

{
  "app": {
    "name": "my-api",
    "version": "2.1.0",
    "port": 8080,
    "debug": false,
    "allowed_origins": ["https://example.com", "https://app.example.com"],
    "database": {
      "host": "db.internal",
      "port": 5432,
      "pool_size": 10
    }
  }
}

YAML

app:
  name: my-api
  version: "2.1.0"
  port: 8080
  debug: false
  allowed_origins:
    - https://example.com
    - https://app.example.com
  database:
    host: db.internal
    port: 5432
    pool_size: 10

TOML

[app]
name = "my-api"
version = "2.1.0"
port = 8080
debug = false
allowed_origins = ["https://example.com", "https://app.example.com"]

[app.database]
host = "db.internal"
port = 5432
pool_size = 10

XML

<?xml version="1.0" encoding="UTF-8"?>
<app>
  <name>my-api</name>
  <version>2.1.0</version>
  <port>8080</port>
  <debug>false</debug>
  <allowed_origins>
    <origin>https://example.com</origin>
    <origin>https://app.example.com</origin>
  </allowed_origins>
  <database>
    <host>db.internal</host>
    <port>5432</port>
    <pool_size>10</pool_size>
  </database>
</app>

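Notice that XML is the longest version at 16 lines, since every value needs both an opening and a closing tag. YAML (12 lines) and TOML (10 lines plus a blank separator) are the most compact, with JSON in between at 14. TOML reads like an INI file on steroids.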

How Does JSON Compare to the Alternatives?

| Feature | JSON | XML | YAML | TOML |
| --- | --- | --- | --- | --- |
| Comments | No | Yes | Yes | Yes |
| Native types | 6 (string, number, bool, null, array, object) | All text (schema-typed) | Rich (int, float, bool, null, date, etc.) | Strong (string, int, float, bool, datetime, array, table) |
| Nesting depth | Unlimited | Unlimited | Unlimited | Unlimited in the spec (2-3 levels practical) |
| Human readability | Good | Verbose | Excellent | Excellent for flat configs |
| Parse speed | Fast | Moderate | Slow | Fast |
| Schema validation | JSON Schema | XSD, DTD, RelaxNG | No standard | No standard |
| Trailing commas | Not allowed | N/A | N/A | Allowed in arrays, not in inline tables (TOML 1.0) |
| Multi-line strings | No (use \n) | Yes (CDATA) | Yes (multiple styles) | Yes (triple quotes) |
| Ecosystem size | Massive | Massive (legacy) | Large (DevOps) | Growing (Rust, Go) |
| Spec complexity | 1 page | Hundreds of pages | 85 pages | Short and precise |

We’ve found that parsing speed differences rarely matter for config files. A 500-line YAML file parses in under 5ms on modern hardware. The real cost is human time, debugging indentation errors or missing commas at 2 AM.

When Should You Use JSON?

Over 70% of public APIs use JSON (Postman State of APIs, 2025). Douglas Crockford derived it from JavaScript in the early 2000s, and it became the de facto standard for web data exchange within a decade. Its simplicity is its greatest strength.

Where JSON dominates

REST APIs and web services. Every language has a fast, battle-tested JSON parser. JSON.parse() in JavaScript, json.loads() in Python, encoding/json in Go. The browser understands JSON natively.

Package manifests. package.json (npm) and composer.json (PHP) are strict JSON, while tsconfig.json (TypeScript) is parsed as JSON with comments. These files are read mostly by tools, so the lack of comments in strict JSON is tolerable.

Data storage and interchange. MongoDB stores BSON (binary JSON). Many message queues and event systems use JSON payloads. It’s the lingua franca of microservices.

Where JSON falls short

No comments. This is JSON’s biggest practical limitation. You can’t annotate config files, explain why a value is set, or leave TODO notes. Workarounds like "_comment" keys are ugly hacks.

No trailing commas. Adding an item to the end of a list means editing two lines, the new item and the comma on the previous line. This creates noisy diffs in version control.

No multi-line strings. Long text, SQL queries, or HTML templates become unreadable escape sequences.
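
To see what the multi-line limitation looks like in practice, the short sketch below serializes a three-line SQL query with Python’s standard json module. The query text and key name are made up for illustration.

```python
import json

# A readable, three-line query as it might appear in source code.
query = "SELECT id, name\nFROM users\nWHERE active = true"

# JSON has no multi-line string syntax, so each newline becomes a \n escape
# and the whole query ends up on a single line.
encoded = json.dumps({"query": query})
print(encoded)

# The round trip is lossless; only the readability of the file suffers.
decoded = json.loads(encoded)
print(decoded["query"] == query)  # True
```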

JSON with comments

Several tools support JSONC (JSON with Comments): VS Code settings, TypeScript’s tsconfig.json, and ESLint configs. JSONC isn’t a standard, though. Don’t assume your parser supports it.

Citation capsule: Over 70% of public APIs use JSON according to Postman’s 2025 State of APIs report. Its native browser support and universal parser availability make it the default choice for web data exchange, despite lacking comments and trailing commas.

When Should You Use XML?

XML still processes an estimated 40% of enterprise B2B data exchange (MuleSoft Connectivity Report, 2024). It’s been the backbone of enterprise systems since the late 1990s. Writing it off as “legacy” ignores how much of the world still runs on it.

Where XML dominates

Enterprise integrations. SOAP APIs, EDI replacements, healthcare (HL7 v3 and CDA, plus FHIR’s XML representation), and financial messaging (FIXML, ISO 20022) rely on XML. These industries move slowly and need the schema validation that XML provides.

Document formats. XHTML, SVG, RSS, Atom, EPUB, and DOCX (which is zipped XML internally) all use XML. So does Android’s layout system.

Configuration with validation. Maven’s pom.xml, Spring’s application context, and .NET’s web.config use XML with XSD schemas. The schema catches misconfiguration before runtime.

<!-- Maven pom.xml - real-world XML config -->
<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-core</artifactId>
  <version>6.1.4</version>
  <scope>compile</scope>
</dependency>
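
As a rough illustration of how a tool might read that fragment, the sketch below parses it with Python’s standard xml.etree.ElementTree module. It assumes the fragment stands alone; a real pom.xml declares a default namespace on its root element, so tag names would carry a namespace prefix there.

```python
import xml.etree.ElementTree as ET

POM_FRAGMENT = """
<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-core</artifactId>
  <version>6.1.4</version>
  <scope>compile</scope>
</dependency>
""".strip()

dependency = ET.fromstring(POM_FRAGMENT)

# Iterating an element yields its direct children.
# Without a schema, every value arrives as plain text.
coords = {child.tag: child.text for child in dependency}

print(coords["groupId"], coords["artifactId"], coords["version"])
print(type(coords["version"]).__name__)  # str
```

Typing and validation come from the schema layer (XSD, for example), not from the XML syntax itself.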

Where XML falls short

Verbosity. Even a simple key-value pair requires an opening tag, content, and a closing tag. XML files are typically 2-3x larger than equivalent JSON.

Complexity. The XML ecosystem (XSD, XSLT, XPath, XQuery, namespaces, DTDs) is powerful but overwhelming. Most developers only need 10% of what XML offers.

We’ve seen teams migrate from XML to YAML or JSON purely to reduce cognitive overhead. The actual data structures didn’t change, just the syntax around them.

Citation capsule: XML processes roughly 40% of enterprise B2B data exchange per MuleSoft’s 2024 Connectivity Report. Its unmatched schema validation (XSD, DTD, RelaxNG) makes it irreplaceable in regulated industries like healthcare and finance where data contracts must be enforced.

When Should You Use YAML?

YAML powers over 90% of CI/CD pipeline definitions according to the CircleCI 2024 State of Software Delivery report (CircleCI, 2024). Originally standing for “Yet Another Markup Language” (later rebranded to “YAML Ain’t Markup Language”), it’s become the configuration language of the cloud-native era.

Where YAML dominates

Infrastructure and DevOps. Docker Compose, Kubernetes manifests, Ansible playbooks, GitHub Actions, GitLab CI, and CloudFormation templates all use YAML. If you work in DevOps, you write YAML daily.

# docker-compose.yml - the file every developer knows
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./html:/usr/share/nginx/html
    depends_on:
      - api
  api:
    build: ./api
    environment:
      DATABASE_URL: postgres://db:5432/app

Application config. Rails, Spring Boot, and many other frameworks use YAML for configuration. Comments make it self-documenting.

Where YAML falls short

Indentation sensitivity. A single misplaced space breaks everything. Unlike Python, where indentation errors produce clear messages, YAML parsers often give cryptic errors pointing nowhere near the actual problem.

The Norway problem. YAML 1.1 infamously interprets bare NO, yes, on, and off as booleans. The country code NO for Norway becomes false. YAML 1.2 fixed this, but many parsers still use 1.1 behavior.

# This is a boolean, not a country code (YAML 1.1)
country: NO  # parsed as false

# Fix: quote it
country: "NO"
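
A simple guard is to flag values that a YAML 1.1 parser could read as booleans before writing them out unquoted. The sketch below uses only the standard library; the helper name needs_quotes is hypothetical, and the word list follows the YAML 1.1 bool type definition. Individual parsers may recognize only a subset of these spellings.

```python
import re

# Boolean spellings from the YAML 1.1 "bool" type definition.
YAML11_BOOL = re.compile(
    r"(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)"
)


def needs_quotes(value: str) -> bool:
    """Return True if a bare scalar could be read as a YAML 1.1 boolean."""
    return YAML11_BOOL.fullmatch(value) is not None


country_codes = ["NO", "SE", "DK", "FI"]
risky = [code for code in country_codes if needs_quotes(code)]
print(risky)  # ['NO']
```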

Complexity in the spec. YAML supports anchors, aliases, merge keys, custom tags, and multiple document streams. Most of this power goes unused and creates security risks. Deserializing untrusted YAML has caused remote code execution vulnerabilities in Ruby, Python, and Java.

YAML security

Never use yaml.load() on untrusted input. Use yaml.safe_load() (Python), YAML.safe_load() (Ruby), or a safe parser in your language. YAML deserialization vulnerabilities have been exploited in the wild.

But is the indentation trade-off worth it? For most config files, yes. The readability gains outweigh the debugging cost, especially when your editor handles indentation for you.

Citation capsule: YAML defines over 90% of CI/CD pipeline configurations per CircleCI’s 2024 State of Software Delivery report. Its human-readable, comment-friendly syntax makes it the standard for Docker Compose, Kubernetes, and GitHub Actions, despite indentation sensitivity causing frequent debugging headaches.

When Should You Use TOML?

TOML has seen rapid adoption since becoming Rust’s official config format via Cargo.toml. GitHub reported over 14 million repositories containing TOML files as of early 2025 (GitHub Search, 2025). Tom Preston-Werner (GitHub co-founder) created it in 2013 specifically because he found YAML too complex for simple config files.

Where TOML dominates

Rust ecosystem. Every Rust project starts with a Cargo.toml. The entire crate registry, build system, and dependency management runs on TOML.

# Cargo.toml - every Rust developer writes this
[package]
name = "my-app"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { version = "1.0", features = ["derive"] }
tokio = { version = "1", features = ["full"] }

[profile.release]
opt-level = 3
lto = true

Go tools. go.mod isn’t TOML, but many Go tools read TOML config files. Hugo (the static site generator) defaults to hugo.toml, which was named config.toml in older releases.

Python packaging. pyproject.toml replaced setup.py and setup.cfg as the standard way to configure Python projects. PEP 518 (2016) introduced the file for build requirements, and PEP 621 later standardized project metadata in it. pip, Poetry, Ruff, and Black all read it.

Where TOML falls short

Deep nesting. TOML gets awkward beyond 2-3 levels of nesting. Representing a Kubernetes manifest in TOML would be painful. The format works best for flat or shallow configuration.

# This is fine
[database]
host = "localhost"
port = 5432

# This gets ugly fast
[services.api.routes.health.middleware.cors]
allowed_origins = ["*"]

Smaller ecosystem. Not every language has a mature, well-maintained TOML parser. JavaScript’s TOML support, for instance, is functional but less polished than its JSON tooling.

TOML occupies a sweet spot that YAML and JSON both miss: it’s comment-friendly like YAML but doesn’t punish you for indentation mistakes, and it’s strict like JSON but doesn’t forbid comments. For application config (not data interchange), it’s arguably the best choice today.

Citation capsule: TOML appears in over 14 million GitHub repositories as of 2025, driven by its role as Rust’s official config format (Cargo.toml) and Python’s packaging standard (pyproject.toml via PEP 518). Its explicit typing and flat structure make it ideal for application configuration with 2-3 levels of nesting.

What Are the Most Common Mistakes in Each Format?

Syntax errors in config files waste more developer time than most people admit. A 2023 JetBrains survey found that 62% of developers spend at least an hour per week debugging configuration issues (JetBrains Developer Ecosystem Survey, 2023). Here are the traps to avoid.

JSON pitfalls

  • Trailing commas. The most common JSON parse error. Always check the last item in arrays and objects.
  • Single quotes. JSON requires double quotes. No exceptions.
  • Unquoted keys. Valid in JavaScript objects, invalid in JSON.
  • Comments. There’s no way to add them. Don’t try // or /* */.
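
Each of these mistakes can be reproduced with Python’s standard json module. The sample documents and the helper try_parse below are illustrative only; the exact error wording varies between Python versions, so the sketch prints whatever the parser reports.

```python
import json

BAD_DOCUMENTS = {
    "trailing comma": '{"ports": [80, 443,]}',
    "single quotes": "{'name': 'my-api'}",
    "unquoted key": '{name: "my-api"}',
    "comment": '{"debug": false} // temporary',
}


def try_parse(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"


errors = {label: try_parse(text) for label, text in BAD_DOCUMENTS.items()}
for label, message in errors.items():
    print(f"{label}: {message}")
```

All four inputs are rejected. The line and column in the error usually point at or just after the offending character.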

YAML pitfalls

  • Tabs vs spaces. YAML forbids tabs for indentation. Use spaces only.
  • The boolean trap. YAML 1.1 parsers treat bare yes, no, on, and off as booleans, just like true and false. Quote strings that look like booleans.
  • Colon in values. title: My App: v2 breaks because the second colon starts a new mapping. Quote it: title: "My App: v2".
  • Copy-paste indentation. Pasting YAML from a website often introduces invisible tab characters or wrong indentation levels.
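
Tabs hidden in pasted YAML are easy to catch before the parser sees them. The helper below is a hypothetical pre-check written with the standard library only. It is not part of any YAML library.

```python
def find_tab_indentation(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything after the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


# Line 3 is indented with a tab instead of spaces.
SAMPLE = "app:\n  name: my-api\n\tport: 8080\n"
print(find_tab_indentation(SAMPLE))  # [3]
```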

TOML pitfalls

  • Table ordering. All keys under a [table] header belong to that table until the next header. Putting a key in the wrong section is easy to miss.
  • Inline tables are immutable. You can’t add keys to an inline table {key = "val"} later in the file.
  • Array of tables syntax. [[array]] creates an array of tables. Forgetting the double brackets gives you a single table.

XML pitfalls

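  • Unescaped special characters. < and & in content must always be escaped as entities (&lt; and &amp;) or wrapped in a CDATA section. Quotes (' and ") only need escaping inside attribute values that use the same delimiter, and > only when it would complete the sequence ]]>.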
  • Namespace conflicts. Elements in different namespaces are different elements even when their local names match. A lookup that omits the namespace doesn’t fail loudly, it silently matches nothing.
  • Encoding mismatches. Declaring UTF-8 in the XML header but saving the file as ISO-8859-1 creates hard-to-diagnose encoding bugs.
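
The escaping rule is the one most likely to bite when XML is built from strings. This sketch uses Python’s standard library (xml.sax.saxutils.escape and xml.etree.ElementTree); the sample text is made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & Jerry <3"

# Dropping raw text into markup produces a document that is not well-formed.
try:
    ET.fromstring(f"<title>{raw}</title>")
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False
print(parsed_raw)  # False

# escape() replaces &, <, and > with entities, so the parser accepts it.
safe = escape(raw)
title = ET.fromstring(f"<title>{safe}</title>")
print(safe)        # Tom &amp; Jerry &lt;3
print(title.text)  # Tom & Jerry <3
```

Building documents through an XML library’s element API instead of string formatting avoids the problem entirely, since the serializer escapes content for you.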

How Do You Choose the Right Format?

The decision comes down to three questions: what’s the data for, who reads it, and what does the ecosystem expect? According to the Stack Overflow 2024 Developer Survey, 65% of developers work with JSON daily, 45% with YAML, 25% with XML, and 15% with TOML (Stack Overflow, 2024).

Use JSON when:

  • You’re building or consuming a REST API
  • The data is machine-to-machine with no human editing
  • You need universal parser support across every language
  • File size matters (JSON is compact)

Use YAML when:

  • Humans edit the file regularly and need comments
  • You’re writing DevOps config (Docker, Kubernetes, CI/CD)
  • The configuration is deeply nested
  • Readability matters more than strictness

Use TOML when:

  • The config is flat or shallow (2-3 levels max)
  • You want comments without YAML’s indentation complexity
  • You’re in the Rust, Go, or Python ecosystem
  • Explicit typing matters (dates, integers vs floats)

Use XML when:

  • You need formal schema validation (XSD)
  • You’re integrating with enterprise, healthcare, or financial systems
  • The data is document-oriented (mixed content, attributes)
  • An existing ecosystem mandates it (Maven, SOAP, SVG)

The honest answer

Most of the time, the ecosystem chooses for you. You don’t pick YAML for Kubernetes, Kubernetes picks YAML for you. Focus your energy on learning the format your tools require, not debating which is “best.”

Can You Convert Between These Formats?

Converting between formats is possible but not always lossless. JSON to YAML is nearly 1:1 since YAML 1.2 is a superset of JSON. Going from YAML to JSON loses comments. XML to JSON loses attributes and mixed content semantics. TOML to YAML works for simple configs but TOML’s strict typing gets flattened.

Several tools handle conversions. yq converts between YAML, JSON, and XML on the command line. Python’s standard library handles JSON and XML natively, with tomllib added in Python 3.11 for TOML reading (Python docs, 2022).

The safest approach: define your canonical format, store data in that format, and convert at the boundary where a different format is needed.

Citation capsule: Converting between JSON, YAML, TOML, and XML is possible but not always lossless. JSON-to-YAML conversion preserves structure since YAML is a JSON superset, but YAML-to-JSON loses comments and XML-to-JSON loses attributes and mixed content semantics.

Frequently Asked Questions

Is JSON faster to parse than YAML?

Yes. JSON parsing is typically 5-10x faster than YAML because JSON’s grammar is simpler. A 2023 benchmark by the Bun runtime team showed JSON parsing at roughly 800MB/s versus YAML at 80-150MB/s in JavaScript environments (Bun blog, 2023). For config files under a few hundred KB, the difference is imperceptible. For large data transfers, JSON’s speed advantage matters.

Should I replace XML with JSON in my project?

Not automatically. If your system works with XML and your team knows it, migration adds risk with no guaranteed payoff. Replace XML with JSON (or YAML) when starting new projects, adding new APIs, or when the verbosity actively hurts productivity. Keep XML where schemas and validation are critical to correctness.

Why does YAML not allow tabs?

The YAML spec forbids tab characters for indentation because tab width varies across editors and terminals. One editor might render a tab as 2 spaces, another as 8. This inconsistency would make YAML’s whitespace-significant syntax ambiguous. Spaces ensure every editor renders the file identically.

Is TOML better than YAML for configuration?

For flat or shallow config, yes. TOML is stricter, has explicit types, and doesn’t have YAML’s indentation footguns. For deeply nested structures like Kubernetes manifests, YAML handles the complexity better. Both support comments. The choice depends on nesting depth and ecosystem conventions.

Can I add comments to JSON?

Not in standard JSON. The JSON specification (RFC 8259, 2017) defines no comment syntax, so strict parsers reject them. Some tools support JSONC (JSON with Comments), including VS Code, TypeScript, and ESLint, but this is a non-standard extension. If you need comments in a JSON-like format, consider JSON5 or switch to YAML/TOML.