Automation Diagramming Playbook

Goal: Design, build, review, and operate automation systems faster with fewer mistakes.

Mission: Reduce risk in automation design and operation.

Principle: Diagrams exist to reduce risk, not to look pretty.

0. The Mental Model (explain like I’m 5)

Automation is a robot doing work for us.

Diagrams are maps for the robot:

Where it starts
What decisions it makes
What can go wrong
How it remembers things

If the map is unclear, the robot breaks.

1. Why Diagrams Exist (Automation-First)

Diagrams are thinking tools, not documentation for later.

They help you:

Catch bugs before writing code
Estimate real ROI (time, cost, risk)
Avoid fragile automations
Teach others without re-explaining everything

What diagrams do NOT do

They do not replace code
They do not explain libraries or SDK internals
They do not guarantee correctness

If a diagram doesn’t change a decision, it probably shouldn’t exist.

2. The Only 7 Diagrams You Ever Need

Golden Rule

One diagram = one question

If it answers more than one question → split it.

Summary table

Diagram Type	Answers	Mandatory?
System Context	Why / Where	✅ Always
Architecture	What	✅ Always
Workflow	How	✅ Always
Decision Logic	Brain	⚠️ If AI/rules
State	Memory	⚠️ If long-lived
Failure & Recovery	Reality	⚠️ For production
Security	Trust	⚠️ If enterprise / sensitive data

1️⃣ System Context Diagram

Question it answers: Why does this system exist and where does it sit?

Shows:

Users / roles
External systems
High-level inputs & outputs

Does NOT show:

Internal services
Databases
Business logic

Mandatory: Always

Smell test:

Can you explain the system in 30–60 seconds using only this?

2️⃣ Architecture Diagram

Question it answers: What are the major building blocks?

Shows:

Services / layers
Datastores
Clear boundaries

Does NOT show:

Step-by-step flows
Retry logic

Mandatory: Always

Smell test:

Can you assign one owner per box?

3️⃣ Workflow Diagram

Question it answers: How does the automation work step by step?

Shows:

Sequence of actions
Decision points
Loops & retries

Does NOT show:

UI styling
Code details

Mandatory: Always

Smell test:

Can someone implement this without asking questions?

4️⃣ Decision / Logic Diagram

Question it answers: How does the system decide?

Shows:

Conditions
Thresholds
Outcomes
Human overrides

Mandatory: If rules or AI exist

Smell test:

Can a non-engineer understand why a decision was made?

5️⃣ State Diagram

Question it answers: What states can this object be in over time?

Shows:

Valid states
Allowed transitions
Terminal states

Mandatory: If data lives beyond one request

Smell test:

Is every transition intentional?

6️⃣ Failure & Recovery Diagram

Question it answers: What breaks, and what happens next?

Shows:

Failure points
Retries
Dead ends
Human escalation

Mandatory: Before production

Smell test:

At 2 AM, is the response obvious?

7️⃣ Security & Audit Diagram

Question it answers: Who can access what, and who did what?

Shows:

Auth boundaries
Sensitive data flow
Audit logs

Mandatory: For enterprise systems or sensitive data

Smell test:

Can you answer “who accessed this and when?”

3. Diagram → Execution Mapping (CRITICAL)

Diagrams must map to real artifacts.

Diagram	Maps To
Context	README, pitch, scope
Architecture	Repo structure, services
Workflow	n8n / Celery / Temporal / Airflow
Decision	Rules engine, config, prompts
State	DB tables, enums
Failure	Retries, DLQs, alerts
Security	IAM, auth middleware, audit logs

If you cannot point to the code or config a diagram represents, the diagram is lying.
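One way to enforce this mapping is a small check that every diagram lists at least one artifact that really exists in the repo. The sketch below is illustrative only: the function name `unmapped_diagrams`, the manifest shape (diagram filename to a list of relative paths), and the example paths are assumptions, not part of this playbook.

```python
from pathlib import Path


def unmapped_diagrams(mapping: dict[str, list[str]], repo_root: Path) -> dict[str, list[str]]:
    """Return diagrams whose mapping is empty or points at missing paths.

    `mapping` is a hypothetical manifest: diagram filename -> artifact paths
    relative to `repo_root`. The value in the result is the list of missing
    paths (an empty list means the diagram maps to nothing at all).
    """
    problems: dict[str, list[str]] = {}
    for diagram, artifacts in mapping.items():
        absent = [a for a in artifacts if not (repo_root / a).exists()]
        if absent or not artifacts:
            problems[diagram] = absent
    return problems


if __name__ == "__main__":
    # Hypothetical manifest; adjust the paths to your own repo layout.
    manifest = {
        "context_email_triage.excalidraw": ["README.md"],
        "state_email_triage.excalidraw": ["db/schema.sql"],
    }
    print(unmapped_diagrams(manifest, Path(".")))
```

Run in CI, a non-empty result is a signal that a diagram has drifted away from what is actually deployed.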

4. Excalidraw Standards (Non-Negotiable)

Naming

One diagram = one file
Filename answers the question, for example: workflow_email_triage.excalidraw
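The naming rule can be checked mechanically. The sketch below assumes a `<type>_<subject>.excalidraw` scheme generalized from the single example filename above; the list of type prefixes and the function name `parse_diagram_filename` are hypothetical.

```python
import re

# Assumed prefixes: one per diagram type in this playbook (hypothetical scheme).
DIAGRAM_TYPES = (
    "context", "architecture", "workflow", "decision", "state", "failure", "security",
)

FILENAME_PATTERN = re.compile(
    r"^(?P<kind>" + "|".join(DIAGRAM_TYPES) + r")"
    r"_(?P<subject>[a-z0-9]+(?:_[a-z0-9]+)*)\.excalidraw$"
)


def parse_diagram_filename(filename: str) -> tuple[str, str]:
    """Return (diagram type, subject), or raise ValueError if the name breaks the convention."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"bad diagram filename: {filename!r}")
    return match.group("kind"), match.group("subject")


if __name__ == "__main__":
    print(parse_diagram_filename("workflow_email_triage.excalidraw"))
```

A name such as `diagram.excalidraw` is rejected because it does not say which question the file answers.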

Colors (keep minimal)

Blue: systems
Green: happy path
Red: failure
Yellow: decision

Boundaries

Draw system boundaries explicitly
External systems always outside

Versioning

Diagrams live in the repo
Diagrams are updated whenever the logic changes
PRs that change logic must include the diagram update

5. AI-Aware Automation Diagrams

AI introduces uncertainty. Diagrams must show it.

Always mark

AI decision points
Confidence thresholds
Human-in-the-loop gates

Never let AI

Change money
Delete data
Escalate to humans

Without a human checkpoint.

If AI decides, humans must be able to override.
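A human-in-the-loop gate can be as small as one predicate. The sketch below is an assumption-heavy illustration: the action names, the `ProposedAction` type, and the function name are hypothetical, and the 0.85 default is borrowed from the example in section 8 rather than being a recommended value.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the three gated categories listed above.
GATED_ACTIONS = frozenset({"change_money", "delete_data", "escalate_to_human"})


@dataclass(frozen=True)
class ProposedAction:
    name: str
    confidence: float  # model confidence in [0, 1]


def needs_human_checkpoint(action: ProposedAction, threshold: float = 0.85) -> bool:
    """Return True when a person must approve before the action runs."""
    if action.name in GATED_ACTIONS:
        return True  # gated regardless of how confident the model is
    return action.confidence < threshold


if __name__ == "__main__":
    print(needs_human_checkpoint(ProposedAction("delete_data", 0.99)))
    print(needs_human_checkpoint(ProposedAction("create_task", 0.90)))
```

Keeping the gated set in one place makes it easy to mark the same checkpoints on the diagram.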

6. Failure & Ops Readiness

Before production, you must answer:

What fails first?
What retries?
When do we stop retrying?
Who gets notified?

If it’s not drawn, it’s not owned.

7. Diagram Review Checklist

Before building:

Context diagram approved
Workflow fully defined
Decisions explicit

Before production:

State diagram validated
Failure paths clear
Security reviewed

8. Example: Automation System (Email → AI Triage → Action Automation)

System context:

[ External Senders ]
          ↓
       [ Email ]
          ↓
 ┌────────────────────┐
 │  Email Triage Bot  │
 └────────────────────┘
    ↓        ↓        ↓
 [ Slack ] [ CRM ] [ Task DB ]

Architecture:

┌──────────────┐
│ Email Ingest │  ← IMAP / Gmail API
└──────┬───────┘
       ↓
┌──────────────┐
│ Preprocessor │  ← validation, cleaning
└──────┬───────┘
       ↓
┌──────────────┐
│ Decision     │  ← AI + rules
│ Engine       │
└──────┬───────┘
       ↓
┌──────────────┐
│ Action Layer │  ← Slack / CRM / Tasks
└──────┬───────┘
       ↓
┌──────────────┐
│ Persistence  │  ← DB + audit logs
└──────────────┘

Workflow:

Email Received
      ↓
Validate Sender
      ↓
Extract Content
      ↓
Is Email Processable?
   ┌──────┴──────┐
   ↓ Yes         ↓ No
Run AI Triage   Ignore / Log
   ↓
Decide Action
   ↓
Execute Action
   ↓
Store Result

Decision:

Is Urgent?
  |
  |-- Yes → Confidence ≥ 0.85?
  |           |
  |           |-- Yes → Notify Slack (Immediate)
  |           |
  |           |-- No  → Human Review Queue
  |
  |-- No  → Create Task (Normal Priority)
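The decision tree above translates almost line for line into code. This is a minimal sketch: the `Action` enum, its values, and the function name `decide` are assumptions; only the branches and the 0.85 threshold come from the diagram.

```python
from enum import Enum


class Action(Enum):
    NOTIFY_SLACK = "notify_slack_immediate"
    HUMAN_REVIEW = "human_review_queue"
    CREATE_TASK = "create_task_normal_priority"


CONFIDENCE_THRESHOLD = 0.85  # threshold shown in the decision diagram


def decide(is_urgent: bool, confidence: float) -> Action:
    """Map (urgency, confidence) to an action, following the diagram's branches."""
    if not is_urgent:
        return Action.CREATE_TASK
    if confidence >= CONFIDENCE_THRESHOLD:
        return Action.NOTIFY_SLACK
    return Action.HUMAN_REVIEW


if __name__ == "__main__":
    print(decide(is_urgent=True, confidence=0.91))
    print(decide(is_urgent=True, confidence=0.60))
    print(decide(is_urgent=False, confidence=0.99))
```

If the code grows a branch that is not on the diagram, one of the two is out of date.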

State:

RECEIVED
   ↓
PROCESSING
   ↓
DECIDED
   ↓
ACTION_TAKEN
   ↓
COMPLETED


Failure transition:

PROCESSING → FAILED → RETRYING → PROCESSING
                    ↓
                 DEAD_LETTER
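The state diagram can be enforced with an enum and a table of allowed transitions. In this sketch the names `EmailState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical, and one edge is an assumption: the diagram does not say exactly where DEAD_LETTER branches off, so it is modeled here as reachable from FAILED once retries are exhausted.

```python
from enum import Enum


class EmailState(Enum):
    RECEIVED = "RECEIVED"
    PROCESSING = "PROCESSING"
    DECIDED = "DECIDED"
    ACTION_TAKEN = "ACTION_TAKEN"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    RETRYING = "RETRYING"
    DEAD_LETTER = "DEAD_LETTER"


S = EmailState
ALLOWED_TRANSITIONS = {
    S.RECEIVED: {S.PROCESSING},
    S.PROCESSING: {S.DECIDED, S.FAILED},
    S.DECIDED: {S.ACTION_TAKEN},
    S.ACTION_TAKEN: {S.COMPLETED},
    S.FAILED: {S.RETRYING, S.DEAD_LETTER},  # DEAD_LETTER edge is an assumption
    S.RETRYING: {S.PROCESSING},
    S.COMPLETED: set(),    # terminal
    S.DEAD_LETTER: set(),  # terminal
}


def transition(current: EmailState, target: EmailState) -> EmailState:
    """Return `target` if the move is drawn on the diagram, else raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


def is_terminal(state: EmailState) -> bool:
    """A state is terminal when no transition leaves it."""
    return not ALLOWED_TRANSITIONS[state]
```

Any transition that raises here is one nobody drew, which is exactly what the smell test asks about.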

Failure:

Email API Fails
     ↓
Retry (x3, backoff)
     ↓
Still Fails?
   | Yes
   ↓
Store in DLQ
   ↓
Alert Ops (Slack / Email)
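The failure path above can be sketched as a retry wrapper. Everything named here is an assumption: `call_with_retry`, the `on_dead_letter` callback (standing in for "Store in DLQ" plus "Alert Ops"), exponential backoff as the backoff strategy, and the reading of "x3" as three retries after the first attempt.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    *,
    on_dead_letter: Callable[[Exception], None],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`; on failure retry up to `retries` times with exponential backoff.

    When the retries are used up, hand the last error to `on_dead_letter`
    and re-raise it so the caller still sees the failure.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception as exc:  # broad on purpose for a sketch; narrow it in real code
            if attempt >= retries:
                on_dead_letter(exc)
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s with the defaults
            attempt += 1
```

The `sleep` parameter exists so tests can pass a fake and run instantly. The stop condition is explicit, which answers "when do we stop retrying?" in code as well as on the diagram.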

Security:

Email API
   ↓ (OAuth)
Ingestion Service
   ↓ (Service Token)
Decision Engine
   ↓
Encrypted Database
   ↓
Audit Log (append-only)


Audit record example:

email_id | decision | confidence | action | timestamp
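The audit record above could be written as one JSON line per decision. This is a sketch under assumptions: the field types, the helper names `make_record` and `append_record`, the ISO 8601 UTC timestamp, and the JSON Lines file are all illustrative choices. Opening a file in append mode only prevents this code from overwriting earlier lines; a real append-only guarantee has to come from the storage layer.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    email_id: str
    decision: str
    confidence: float
    action: str
    timestamp: str  # ISO 8601, UTC


def make_record(
    email_id: str,
    decision: str,
    confidence: float,
    action: str,
    now: Optional[datetime] = None,
) -> AuditRecord:
    """Build a record; `now` is injectable so tests can pin the time."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return AuditRecord(email_id, decision, confidence, action, moment.isoformat())


def append_record(path: str, record: AuditRecord) -> None:
    """Append one JSON line; existing lines are never rewritten by this function."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(record)) + "\n")
```

With records in this shape, "who accessed this and when?" becomes a filter over `email_id` and `timestamp`.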

Final Rule (Memorize This)

Context → Architecture → Workflow → Decision → State → Failure → Security

This order will never fail you.
