Skip to main content

YAML

Full form - YAML Ain't Markup Language Human-readable, comments allowed, complex hierarchies. Indentation-sensitive.

Previously we had .yml extention but since 2006 official website recommendation suggested to use .yaml extension The .yml extension was originally a workaround for the filename limit in older systems like MS-DOS and early Windows.


Syntax

app: MyApp
version: "1.0.0"
debug: true
port: 8080

database:
host: localhost
port: 5432
name: mydb

features:
- auth
- logging
- cache

# This is a comment
timeout: 30

Data types: string, number, boolean, null, array (lists), object (maps)


Advanced Features

Multi-line strings

# Literal (preserves line breaks)
description: |
Line 1
Line 2
Line 3

# Folded (becomes single line)
summary: >
This is a very long string
that spans multiple lines
but becomes one line.

Anchors & Aliases (DRY)

defaults: &defaults
timeout: 30
retries: 3

production:
<<: *defaults
host: prod.example.com

development:
<<: *defaults
host: localhost

Environment variables

database_url: ${DATABASE_URL:-sqlite:///app.db}
api_key: ${API_KEY} # Required, will error if not set

Pros ✅

  • Human-readable
  • Comments supported
  • Multi-line strings
  • Anchors & aliases (DRY)
  • Works with Docker/Kubernetes

Cons ❌

  • Indentation sensitive (2 or 4 spaces!)
  • Slower to parse than JSON
  • Edge cases can be confusing
  • Security risk with untrusted input (use yaml.safe_load)

When to Use

  • Docker Compose files
  • Kubernetes manifests
  • CI/CD pipelines (GitHub Actions)
  • Configuration with documentation
  • Complex hierarchical data

Python Usage

import yaml

# Read (ALWAYS use safe_load)
with open('config.yaml') as f:
config = yaml.safe_load(f)

# Write
config = {'host': 'localhost', 'port': 5432}
with open('config.yaml', 'w') as f:
yaml.dump(config, f, default_flow_style=False)

# Access
host = config['database']['host']

Common Mistakes

Using yaml.load() with untrusted input (security risk)

# WRONG
config = yaml.load(f) # Can execute code!

Always use yaml.safe_load()

# RIGHT
config = yaml.safe_load(f)

Inconsistent indentation

database:
host: localhost
port: 5432 # ← Wrong! 4 spaces when parent is 2

Consistent 2-space indentation

database:
host: localhost
port: 5432

Validation

from pydantic import BaseModel

class Config(BaseModel):
app: str
port: int
debug: bool = False

config = Config(**yaml.safe_load(f))

Tools

# Pretty print / format
yq '.' config.yaml

# Query values
yq '.database.host' config.yaml

# Lint YAML
yamllint config.yaml

# Validate
python -c "import yaml; yaml.safe_load(open('config.yaml'))"

YAML vs JSON Example

Same config, different formats:

YAML (readable):

database:
host: localhost
port: 5432

JSON (compact):

{
"database": {
"host": "localhost",
"port": 5432
}
}

Both load to same Python dict.