Automation Diagramming Playbook

Goal: Design, build, review, and operate automation systems faster with fewer mistakes.

Mission: Reduce risk in automation design and operation.

Principle: Diagrams exist to reduce risk, not to look pretty.


0. The Mental Model (explain like I’m 5)

Automation is a robot doing work for us.

Diagrams are maps for the robot:

  • Where it starts
  • What decisions it makes
  • What can go wrong
  • How it remembers things

If the map is unclear, the robot breaks.


1. Why Diagrams Exist (Automation-First)

Diagrams are thinking tools, not documentation for later.

They help you:

  • Catch bugs before writing code
  • Estimate real ROI (time, cost, risk)
  • Avoid fragile automations
  • Teach others without re-explaining everything

What diagrams do NOT do

  • They do not replace code
  • They do not explain libraries or SDK internals
  • They do not guarantee correctness

If a diagram doesn’t change a decision, it probably shouldn’t exist.


2. The Only 7 Diagrams You Ever Need

Golden Rule

One diagram = one question

If it answers more than one question → split it.

Summary table

Diagram Type       | Answers     | Mandatory?
System Context     | Why / Where | ✅ Always
Architecture       | What        | ✅ Always
Workflow           | How         | ✅ Always
Decision Logic     | Brain       | ⚠️ If AI/rules
State              | Memory      | ⚠️ If long-lived
Failure & Recovery | Reality     | ⚠️ For production
Security           | Trust       | Optional

1️⃣ System Context Diagram

Question it answers: Why does this system exist and where does it sit?

Shows:

  • Users / roles
  • External systems
  • High-level inputs & outputs

Does NOT show:

  • Internal services
  • Databases
  • Business logic

Mandatory: Always

Smell test:

  • Can you explain the system in 30–60 seconds using only this?

2️⃣ Architecture Diagram

Question it answers: What are the major building blocks?

Shows:

  • Services / layers
  • Datastores
  • Clear boundaries

Does NOT show:

  • Step-by-step flows
  • Retry logic

Mandatory: Always

Smell test:

  • Can you assign one owner per box?

3️⃣ Workflow Diagram

Question it answers: How does the automation work step by step?

Shows:

  • Sequence of actions
  • Decision points
  • Loops & retries

Does NOT show:

  • UI styling
  • Code details

Mandatory: Always

Smell test:

  • Can someone implement this without asking questions?

4️⃣ Decision / Logic Diagram

Question it answers: How does the system decide?

Shows:

  • Conditions
  • Thresholds
  • Outcomes
  • Human overrides

Mandatory: If rules or AI exist

Smell test:

  • Can a non-engineer understand why a decision was made?

5️⃣ State Diagram

Question it answers: What states can this object be in over time?

Shows:

  • Valid states
  • Allowed transitions
  • Terminal states

Mandatory: If data lives beyond one request

Smell test:

  • Is every transition intentional?

6️⃣ Failure & Recovery Diagram

Question it answers: What breaks, and what happens next?

Shows:

  • Failure points
  • Retries
  • Dead ends
  • Human escalation

Mandatory: Before production

Smell test:

  • At 2 AM, is the response obvious?

7️⃣ Security & Audit Diagram

Question it answers: Who can access what, and who did what?

Shows:

  • Auth boundaries
  • Sensitive data flow
  • Audit logs

Mandatory: Enterprise / sensitive data

Smell test:

  • Can you answer “who accessed this and when?”

3. Diagram → Execution Mapping (CRITICAL)

Diagrams must map to real artifacts.

Diagram      | Maps To
Context      | README, pitch, scope
Architecture | Repo structure, services
Workflow     | n8n / Celery / Temporal / Airflow
Decision     | Rules engine, config, prompts
State        | DB tables, enums
Failure      | Retries, DLQs, alerts
Security     | IAM, auth middleware, audit logs

If you cannot point to the code or config a diagram represents, the diagram is lying.


4. Excalidraw Standards (Non-Negotiable)

Naming

  • One diagram = one file

  • Filename answers the question

    • workflow_email_triage.excalidraw
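
The naming rule can be checked mechanically, for example in a pre-commit hook. The sketch below is only an illustration: it assumes the convention is `<diagram type>_<subject>.excalidraw` in lowercase snake case, which is inferred from the single example above, and `is_valid_diagram_name` is a hypothetical helper name.

```python
import re

# Assumption: the filename prefix is one of the seven diagram types in this playbook.
DIAGRAM_TYPES = (
    "context", "architecture", "workflow",
    "decision", "state", "failure", "security",
)

# <type>_<subject>.excalidraw, lowercase snake case (assumed convention).
NAME_PATTERN = re.compile(
    r"^(%s)_[a-z0-9]+(_[a-z0-9]+)*\.excalidraw$" % "|".join(DIAGRAM_TYPES)
)


def is_valid_diagram_name(filename: str) -> bool:
    """Return True if the filename states its diagram type and subject."""
    return NAME_PATTERN.match(filename) is not None
```

Under these assumptions, `workflow_email_triage.excalidraw` passes, while a name such as `diagram1.excalidraw` is rejected because it does not say which question the diagram answers.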

Colors (keep minimal)

  • Blue: systems
  • Green: happy path
  • Red: failure
  • Yellow: decision

Boundaries

  • Draw system boundaries explicitly
  • External systems always outside

Versioning

  • Diagrams live in the repo
  • Updated with logic changes
  • PRs must include diagram updates

5. AI-Aware Automation Diagrams

AI introduces uncertainty. Diagrams must show it.

Always mark

  • AI decision points
  • Confidence thresholds
  • Human-in-the-loop gates

Never let AI do any of these without a human checkpoint:

  • Change money
  • Delete data
  • Escalate humans

If AI decides, humans must be able to override.
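
One way to make the human checkpoint hard to bypass is to route every AI-proposed action through a single gate. The following is a minimal sketch, not a prescribed design: `ActionKind`, `Action`, `requires_human`, and `execute` are hypothetical names, and the gated kinds simply mirror the three items listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionKind(Enum):
    CHANGE_MONEY = "change_money"
    DELETE_DATA = "delete_data"
    ESCALATE = "escalate"
    CREATE_TASK = "create_task"  # example of a low-risk action (assumption)


# Assumption: these kinds always need a human checkpoint when AI proposes them.
GATED_KINDS = {ActionKind.CHANGE_MONEY, ActionKind.DELETE_DATA, ActionKind.ESCALATE}


@dataclass
class Action:
    kind: ActionKind
    proposed_by_ai: bool
    approved_by: Optional[str] = None  # id of the human reviewer, if any


def requires_human(action: Action) -> bool:
    """AI-proposed actions of a gated kind must pass a human checkpoint."""
    return action.proposed_by_ai and action.kind in GATED_KINDS


def execute(action: Action) -> str:
    """Run the action, or park it for review when approval is missing."""
    if requires_human(action) and action.approved_by is None:
        return "queued_for_review"
    return "executed"
```

In this sketch an AI-proposed `DELETE_DATA` action returns `"queued_for_review"` until `approved_by` is set, which corresponds to a human-in-the-loop gate on the diagram.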


6. Failure & Ops Readiness

Before production, you must answer:

  • What fails first?
  • What retries?
  • When do we stop retrying?
  • Who gets notified?

If it’s not drawn, it’s not owned.


7. Diagram Review Checklist

Before building:

  • Context diagram approved
  • Workflow fully defined
  • Decisions explicit

Before production:

  • State diagram validated
  • Failure paths clear
  • Security reviewed

8. Example: Automation System (Email → AI Triage → Action Automation)

System context:

 [ External Senders ]
          ↓
      [ Email ]
          ↓
┌────────────────────┐
│  Email Triage Bot  │
└────────────────────┘
    ↓        ↓         ↓
[ Slack ] [ CRM ] [ Task DB ]

Architecture:

┌──────────────┐
│ Email Ingest │ ← IMAP / Gmail API
└──────┬───────┘
       ↓
┌──────────────┐
│ Preprocessor │ ← validation, cleaning
└──────┬───────┘
       ↓
┌──────────────┐
│   Decision   │ ← AI + rules
│    Engine    │
└──────┬───────┘
       ↓
┌──────────────┐
│ Action Layer │ ← Slack / CRM / Tasks
└──────┬───────┘
       ↓
┌──────────────┐
│ Persistence  │ ← DB + audit logs
└──────────────┘

Workflow:

Email Received
       ↓
Validate Sender
       ↓
Extract Content
       ↓
Is Email Processable?
   ┌── Yes ──┴── No ──┐
   ↓                  ↓
Run AI Triage    Ignore / Log
   ↓
Decide Action
   ↓
Execute Action
   ↓
Store Result
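
To show how the workflow boxes can map onto code, here is a rough sketch of the same steps as one function. Everything in it is an assumption made for illustration: the `Email` type, the allow-list rule in `validate_sender`, and the `triage`, `execute`, and `store` callables, which stand in for the real Decision Engine, Action Layer, and Persistence.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Email:
    sender: str
    body: str


def validate_sender(sender: str, allowed_domains: Set[str]) -> bool:
    """Assumed rule: the sender's domain must be on an allow-list."""
    if "@" not in sender:
        return False
    return sender.rsplit("@", 1)[1].lower() in allowed_domains


def run_workflow(
    email: Email,
    allowed_domains: Set[str],
    triage: Callable[[str], str],
    execute: Callable[[str], None],
    store: Callable[[Email, str], None],
) -> str:
    sender_ok = validate_sender(email.sender, allowed_domains)  # Validate Sender
    content = email.body.strip()                                # Extract Content
    if not (sender_ok and content):                             # Is Email Processable?
        store(email, "ignored")                                 # Ignore / Log
        return "ignored"
    action = triage(content)                                    # Run AI Triage + Decide Action
    execute(action)                                             # Execute Action
    store(email, action)                                        # Store Result
    return action
```

Each comment names the diagram step it implements, so a reviewer can check the code against the drawing line by line.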

Decision:

Is Urgent?
|
|-- Yes → Confidence ≥ 0.85?
|          |
|          |-- Yes → Notify Slack (Immediate)
|          |
|          |-- No → Human Review Queue
|
|-- No → Create Task (Normal Priority)
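
The decision tree is small enough to express as one pure function, which also makes it easy to test. This is a sketch only: the 0.85 threshold comes from the diagram, while `decide` and the returned action strings are hypothetical names.

```python
CONFIDENCE_THRESHOLD = 0.85  # taken from the decision diagram


def decide(is_urgent: bool, confidence: float) -> str:
    """Map an AI triage result to one of the three outcomes in the diagram."""
    if not is_urgent:
        return "create_task_normal"
    if confidence >= CONFIDENCE_THRESHOLD:
        return "notify_slack_immediate"
    # Urgent but low confidence: a human decides.
    return "human_review_queue"
```

Keeping the threshold in a named constant (or in config) means the diagram and the code can be compared at a glance.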

State:

RECEIVED
   ↓
PROCESSING
   ↓
DECIDED
   ↓
ACTION_TAKEN
   ↓
COMPLETED


Failure transition

PROCESSING → FAILED → RETRYING → PROCESSING
                         ↓ (retries exhausted)
                    DEAD_LETTER
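
A state diagram maps naturally to an enum plus a table of allowed transitions. The sketch below is illustrative: `EmailState`, `ALLOWED`, and `transition` are hypothetical names, and it assumes DEAD_LETTER is entered from RETRYING once retries are exhausted.

```python
from enum import Enum


class EmailState(Enum):
    RECEIVED = "RECEIVED"
    PROCESSING = "PROCESSING"
    DECIDED = "DECIDED"
    ACTION_TAKEN = "ACTION_TAKEN"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    RETRYING = "RETRYING"
    DEAD_LETTER = "DEAD_LETTER"


S = EmailState

# Every allowed transition is listed explicitly; anything else is rejected.
ALLOWED = {
    S.RECEIVED: {S.PROCESSING},
    S.PROCESSING: {S.DECIDED, S.FAILED},
    S.DECIDED: {S.ACTION_TAKEN},
    S.ACTION_TAKEN: {S.COMPLETED},
    S.FAILED: {S.RETRYING},
    S.RETRYING: {S.PROCESSING, S.DEAD_LETTER},  # DEAD_LETTER edge is an assumption
    S.COMPLETED: set(),    # terminal
    S.DEAD_LETTER: set(),  # terminal
}


def transition(current: EmailState, target: EmailState) -> EmailState:
    """Return the new state, or raise if the diagram does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"Illegal transition: {current.name} -> {target.name}")
    return target
```

Because the table is the only place transitions are defined, an unintended transition shows up as a `ValueError` instead of silently corrupting data.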

Failure:

Email API Fails
      ↓
Retry (x3, backoff)
      ↓
Still Fails?
      ↓ Yes
Store in DLQ
      ↓
Alert Ops (Slack / Email)
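
The failure path can be sketched as a small retry helper. This is one possible reading of the diagram, not the definitive one: it assumes "x3" means three retries after the first attempt, uses exponential backoff, and treats `fetch`, `dead_letter`, and `alert` as hypothetical stand-ins for the email API call, the DLQ, and the ops notification.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    fetch: Callable[[], T],
    dead_letter: List[str],
    alert: Callable[[str], None],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Try once, then retry up to `retries` times with exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception as exc:  # real code should catch narrower error types
            last_error = exc
            if attempt < retries:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
    # Retries exhausted: store in the DLQ, then alert ops.
    dead_letter.append(repr(last_error))
    alert(f"Email API failed after {retries} retries: {last_error}")
    return None
```

Passing `sleep` as a parameter keeps the helper testable without real delays, and the explicit `retries` limit answers "when do we stop retrying?" in code.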

Security:

Email API
    ↓ (OAuth)
Ingestion Service
    ↓ (Service Token)
Decision Engine
    ↓
Encrypted Database
    ↓
Audit Log (append-only)


Audit record example

email_id | decision | confidence | action | timestamp
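
As an illustration, the audit record could be modeled as an immutable dataclass written to a JSON Lines file. The field names come from the record above; the field types, the UTC ISO 8601 timestamp, and the helpers `new_record` and `append_audit` are assumptions. Opening the file in append mode only prevents this code from overwriting entries; a truly append-only log needs enforcement at the storage layer.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a record cannot be modified after creation
class AuditRecord:
    email_id: str
    decision: str
    confidence: float
    action: str
    timestamp: str


def new_record(email_id: str, decision: str, confidence: float, action: str) -> AuditRecord:
    """Build a record stamped with the current UTC time (ISO 8601)."""
    now = datetime.now(timezone.utc).isoformat()
    return AuditRecord(email_id, decision, confidence, action, now)


def append_audit(path: str, record: AuditRecord) -> None:
    """Append one record as a JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```

With one line per decision, the question "who accessed this and when?" becomes a simple search over the log.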


Final Rule (Memorize This)

Context → Architecture → Workflow → Decision → State → Failure → Security

This order will never fail you.