For the technical reader
How it is built
A deterministic, workflow-first ops platform. Control flow lives in plain Python; AI is confined to three narrow jobs (copywriting, ambiguous classification, unstructured extraction). Notion is the shared data layer across two workspaces, GitHub Actions is the scheduler, and Cloudflare serves the human surfaces.
The two-workspace contract
System A, Class Ops v2
The live operations platform, treated as sacred. No automation ever writes here. Pipelines read sessions and events; the only writable surface is a carve-out Email Triage table (relation fields only).
- Sessions
- Events
- Email Triage (carve-out)
System B, El Brain-o
The agent-owned brain and its observability. Everything the workers produce or learn lives here. Outbound drafts route through the Outbox (TKC root) for human approval before anything is sent.
- Outbox, human gate
- + all databases below
The stack
| Layer | Choice | Why / detail |
|---|---|---|
| Pipelines + workers | Python 3.11 | anthropic, notion-client, pydantic, boto3, langfuse, gspread |
| Dashboard + forms | TypeScript + Hono on Cloudflare Workers | edge-rendered, KV-cached, Vitest-tested |
| Scheduling | GitHub Actions cron (29 workflows) | staggered by the minute to respect Notion's 3 req/sec |
| Data layer | Notion API (2025-09-03 data sources) | two-workspace contract, one shared token bucket |
| AI | Anthropic Claude (+ Gemini) | Haiku / Sonnet / Opus by confidence; narrow calls only |
| Backups | Cloudflare R2 | nightly encrypted snapshot with a tested restore drill |
System B data model
Knowledge
- Context Items
- Commitments
- Decisions
- Distilled Insights
- Lessons
Operations
- Pipeline State
- Pipeline Runs
- Pipeline Issues
- Pipeline Overrides
Reference
- Programs
- People
- Class Manager Templates
- Program Comms Overrides
Intake / surface
- Inbox
- Event Requests
Guardrails, shipped with the feature, not bolted on
Cost caps
Every Claude call is metered by TracedAnthropic and rolled into Pipeline Runs. A $4/day soft cap warns; the $100/month hard ceiling raises SpendCapExceeded, sets the pipeline Halt Flag, and stops the next run. No silent overrun.
scripts/claude_wrapper.pyKill switches
Each pipeline has a Halt Flag in the Pipeline State DB, checked at worker entry; a set flag exits with code 2 before any work. The dashboard adds DASHBOARD_ENABLED (whole surface) and WRITES_ENABLED (just the write routes).
scripts/notion_helpers.pyRetry + rate limit
All Notion I/O funnels through one _with_retry chokepoint (backoffs 2s, 8s, 20s on 5xx). Crons are staggered by the minute and the dashboard caches 60s, holding traffic well under Notion's 3 req/sec token bucket.
scripts/notion_helpers.pyObservability
Every run writes a Pipeline Runs row (tokens, cost, items, status); failures raise a Pipeline Issue with severity. Traces also append to claude_traces.jsonl and optionally push to Langfuse.
scripts/cost_rollup.pyModel tiers and price (USD per million tokens)
Routine, high-confidence work uses the cheap model; the expensive one is reserved for genuinely ambiguous calls. Caching makes a repeated prompt ten times cheaper.
| Model | Input | Cached | Output | Used for |
|---|---|---|---|---|
| Opus 4.7 | $15.00 | $1.50 | $75.00 | Low-confidence only (email matching) |
| Sonnet 4.6 | $3.00 | $0.30 | $15.00 | Drafting + synthesis |
| Haiku 4.5 | $1.00 | $0.10 | $5.00 | High-confidence routine triage |
Source of truth: scripts/config.py (MODEL_PRICING, DAILY_SOFT_CAP_USD = 4, MONTHLY_HARD_CEILING_USD = 100, relevance gate 0.85).
The most-bled-on bug, now a hard rule
Every date is computed in Eastern Time, never from a raw timestamp
Notion stores a session time as a UTC instant. A Monday 8:30pm ET class is stored as 2026-06-16T00:30Z, which is the next day in UTC. Slicing the date off a raw timestamp gives the wrong day for every evening class. So nothing derives a calendar day without converting to America/New_York first.
Why workflow-first, not agentic
"An AI decides what to do next" is never the answer here. Each workflow is deterministic code with an AI call at exactly one narrow step, and it declares its cost ceiling, its confidence tiers, and its cache strategy up front. The Outbox handoff is what makes the system feel agent-like to a human reviewer; the implementation underneath is boring, logged, and reversible on purpose.