For the technical reader

How it is built

A deterministic, workflow-first ops platform. Control flow lives in plain Python; AI is confined to three narrow jobs (copywriting, ambiguous classification, unstructured extraction). Notion is the shared data layer across two workspaces, GitHub Actions is the scheduler, and Cloudflare serves the human surfaces.

The two-workspace contract

read only

System A, Class Ops v2

The live operations platform, treated as sacred. Pipelines read sessions and events and never write to them. The only writable surface is a carve-out Email Triage table, and there only the relation fields.

  • Sessions
  • Events
  • Email Triage (carve-out)

read / write

System B, El Brain-o

The agent-owned brain and its observability. Everything the workers produce or learn lives here. Outbound drafts route through the Outbox (TKC root) for human approval before anything is sent.

  • Outbox, human gate
  • + all databases below
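
The contract can be enforced at a single write chokepoint. The sketch below is one way to express it in Python, not the platform's actual code: the workspace labels, WriteRequest, WriteNotAllowed, and the writable property names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the two workspaces described above.
SYSTEM_A = "class_ops_v2"  # live operations, read only
SYSTEM_B = "el_brain_o"    # agent-owned, read / write

# Assumed carve-out: placeholder names for the relation fields on Email Triage.
EMAIL_TRIAGE_WRITABLE = frozenset({"Session", "Event"})


class WriteNotAllowed(Exception):
    """Raised when a write falls outside the two-workspace contract."""


@dataclass(frozen=True)
class WriteRequest:
    workspace: str
    database: str
    properties: tuple[str, ...]


def assert_write_allowed(req: WriteRequest) -> None:
    """Allow System B writes and the Email Triage carve-out; reject the rest."""
    if req.workspace == SYSTEM_B:
        return
    if req.workspace == SYSTEM_A and req.database == "Email Triage":
        blocked = [p for p in req.properties if p not in EMAIL_TRIAGE_WRITABLE]
        if not blocked:
            return
        raise WriteNotAllowed(f"non-relation fields on Email Triage: {blocked}")
    raise WriteNotAllowed(f"{req.workspace}/{req.database} is read only")
```

Calling a guard like this before every write means a new pipeline cannot touch System A by accident; it has to be added to the carve-out on purpose.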

The stack

| Layer | Choice | Why / detail |
| --- | --- | --- |
| Pipelines + workers | Python 3.11 | anthropic, notion-client, pydantic, boto3, langfuse, gspread |
| Dashboard + forms | TypeScript + Hono on Cloudflare Workers | edge-rendered, KV-cached, Vitest-tested |
| Scheduling | GitHub Actions cron (29 workflows) | staggered by the minute to respect Notion's 3 req/sec |
| Data layer | Notion API (2025-09-03 data sources) | two-workspace contract, one shared token bucket |
| AI | Anthropic Claude (+ Gemini) | Haiku / Sonnet / Opus by confidence; narrow calls only |
| Backups | Cloudflare R2 | nightly encrypted snapshot with a tested restore drill |

System B data model

Knowledge

  • Context Items
  • Commitments
  • Decisions
  • Distilled Insights
  • Lessons

Operations

  • Pipeline State
  • Pipeline Runs
  • Pipeline Issues
  • Pipeline Overrides

Reference

  • Programs
  • People
  • Class Manager Templates
  • Program Comms Overrides

Intake / surface

  • Inbox
  • Event Requests

Guardrails, shipped with the feature, not bolted on

Cost caps

Every Claude call is metered by TracedAnthropic and rolled into Pipeline Runs. A $4/day soft cap warns; the $100/month hard ceiling raises SpendCapExceeded, sets the pipeline Halt Flag, and stops the next run. No silent overrun.

scripts/claude_wrapper.py
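
A minimal sketch of the cap logic, assuming the caller already knows the day's and month's spend. SpendCapExceeded and the two constants come from the text above; check_spend, its parameters, and the callbacks are hypothetical, and the real wrapper may be structured differently.

```python
from typing import Callable

DAILY_SOFT_CAP_USD = 4
MONTHLY_HARD_CEILING_USD = 100


class SpendCapExceeded(Exception):
    """Raised when month-to-date spend reaches the hard ceiling."""


def check_spend(
    spent_today_usd: float,
    spent_month_usd: float,
    set_halt_flag: Callable[[], None],
    warn: Callable[[str], None] = print,
) -> None:
    """Warn on the soft cap; halt and raise on the hard ceiling."""
    # Assumption: both caps trigger at "greater than or equal".
    if spent_month_usd >= MONTHLY_HARD_CEILING_USD:
        set_halt_flag()  # the set flag is what stops the next run
        raise SpendCapExceeded(
            f"month-to-date ${spent_month_usd:.2f} >= ${MONTHLY_HARD_CEILING_USD}"
        )
    if spent_today_usd >= DAILY_SOFT_CAP_USD:
        warn(f"daily spend ${spent_today_usd:.2f} >= ${DAILY_SOFT_CAP_USD} soft cap")
```

Setting the flag before raising means the pipeline stays halted even if the exception is swallowed somewhere upstream.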

Kill switches

Each pipeline has a Halt Flag in the Pipeline State DB, checked at worker entry; a set flag exits with code 2 before any work. The dashboard adds DASHBOARD_ENABLED (whole surface) and WRITES_ENABLED (just the write routes).

scripts/notion_helpers.py
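
The worker-entry check can be sketched as follows. Exit code 2 comes from the text above; exit_if_halted and the read_halt_flag callback are hypothetical stand-ins for however the real helper reads the Pipeline State DB.

```python
import sys
from typing import Callable

HALT_EXIT_CODE = 2


def exit_if_halted(pipeline: str, read_halt_flag: Callable[[str], bool]) -> None:
    """Exit before doing any work when the pipeline's Halt Flag is set."""
    if read_halt_flag(pipeline):
        print(f"{pipeline}: Halt Flag is set, exiting", file=sys.stderr)
        sys.exit(HALT_EXIT_CODE)
```

A worker would call this as its first statement, so a halted pipeline makes no Claude or Notion write calls at all.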

Retry + rate limit

All Notion I/O funnels through one _with_retry chokepoint, which backs off 2s, 8s, then 20s on 5xx responses. Crons are staggered by the minute and the dashboard caches for 60s, holding traffic well under Notion's 3 req/sec token bucket.

scripts/notion_helpers.py
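
A minimal sketch of a retry chokepoint with the backoff schedule named above. It is not the real _with_retry: HttpStatusError is a hypothetical stand-in for whatever error the Notion client raises, and the injectable sleep parameter exists only to keep the sketch testable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Delays from the text above: wait 2s, then 8s, then 20s.
BACKOFFS_SECONDS = (2, 8, 20)


class HttpStatusError(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def with_retry(
    call: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a 5xx error wait and retry, up to three retries."""
    for delay in BACKOFFS_SECONDS:
        try:
            return call()
        except HttpStatusError as err:
            if not 500 <= err.status < 600:
                raise  # 4xx and other statuses are not retried
            sleep(delay)
    return call()  # final attempt; any error now propagates
```

Routing every Notion call through one function like this keeps the backoff policy in a single place instead of scattered across workers.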

Observability

Every run writes a Pipeline Runs row (tokens, cost, items, status); failures raise a Pipeline Issue with severity. Traces also append to claude_traces.jsonl and optionally push to Langfuse.

scripts/cost_rollup.py
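
One way to picture the trace-then-rollup flow, as a sketch only. The record fields (input_tokens, output_tokens, cost_usd) and the function names are assumptions; the real trace schema and the real cost_rollup.py are not shown in this page.

```python
import json
from pathlib import Path
from typing import Iterable


def format_trace_line(record: dict) -> str:
    """Serialize one Claude call as a single JSONL line."""
    return json.dumps(record, sort_keys=True)


def append_trace(path: Path, record: dict) -> None:
    """Append one trace record to a JSONL file such as claude_traces.jsonl."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(format_trace_line(record) + "\n")


def rollup(lines: Iterable[str]) -> dict:
    """Sum calls, tokens, and cost across trace lines for a run summary."""
    totals = {"calls": 0, "input_tokens": 0, "output_tokens": 0, "cost_usd": 0.0}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the file
        rec = json.loads(line)
        totals["calls"] += 1
        totals["input_tokens"] += rec["input_tokens"]
        totals["output_tokens"] += rec["output_tokens"]
        totals["cost_usd"] += rec["cost_usd"]
    totals["cost_usd"] = round(totals["cost_usd"], 6)
    return totals
```

With a file on disk, the summary would come from rollup(path.read_text(encoding="utf-8").splitlines()), and its totals are the kind of values a Pipeline Runs row records.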

Model tiers and price (USD per million tokens)

Routine, high-confidence work uses the cheapest model; the most expensive one is reserved for genuinely ambiguous calls. Cached input tokens are billed at one tenth of the normal input rate, so a repeated prompt prefix costs ten times less to read.

| Model | Input | Cached | Output | Used for |
| --- | --- | --- | --- | --- |
| Opus 4.7 | $15.00 | $1.50 | $75.00 | Low-confidence only (email matching) |
| Sonnet 4.6 | $3.00 | $0.30 | $15.00 | Drafting + synthesis |
| Haiku 4.5 | $1.00 | $0.10 | $5.00 | High-confidence routine triage |

Source of truth: scripts/config.py (MODEL_PRICING, DAILY_SOFT_CAP_USD = 4, MONTHLY_HARD_CEILING_USD = 100, relevance gate 0.85).
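
The arithmetic behind a per-call cost, as a sketch. The prices are copied from the table above; the dictionary keys, the shape of MODEL_PRICING, and call_cost_usd are assumptions about how such a config might look. Any cache-write pricing, which the table does not list, is ignored.

```python
# USD per million tokens, copied from the table above.
# The key names and layout are hypothetical.
MODEL_PRICING = {
    "opus": {"input": 15.00, "cached": 1.50, "output": 75.00},
    "sonnet": {"input": 3.00, "cached": 0.30, "output": 15.00},
    "haiku": {"input": 1.00, "cached": 0.10, "output": 5.00},
}


def call_cost_usd(
    model: str,
    input_tokens: int,
    cached_tokens: int,
    output_tokens: int,
) -> float:
    """Cost of one call: each token class times its per-million price."""
    price = MODEL_PRICING[model]
    total = (
        input_tokens * price["input"]
        + cached_tokens * price["cached"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000
```

For example, a Sonnet call with 10,000 uncached input tokens, 40,000 cached tokens, and 2,000 output tokens comes to about $0.072: $0.03 for input, $0.012 for the cached prefix, and $0.03 for output.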

The most-bled-on bug, now a hard rule

Every date is computed in Eastern Time, never from a raw timestamp

Notion stores a session time as a UTC instant. A Monday 8:30pm ET class on 2026-06-15 is stored as 2026-06-16T00:30Z, which is already Tuesday in UTC. Slicing the date off the raw timestamp therefore gives the wrong day for any class that starts at 8pm ET or later during daylight time (7pm or later in winter). So nothing derives a calendar day without converting to America/New_York first.

todayET(), etDateOf(iso), formatDateET(iso)
dashboard/src/dates.ts
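
The same rule on the Python side can be sketched with the standard library's zoneinfo. The dashboard helpers above are TypeScript; et_date_of here is a hypothetical Python analogue, not a port of the real function.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")


def et_date_of(iso: str) -> str:
    """Return the Eastern Time calendar date (YYYY-MM-DD) of a UTC timestamp."""
    # Normalize a trailing "Z" so older Python versions can parse it too.
    instant = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    if instant.tzinfo is None:
        # Assumption: a timestamp with no offset is UTC.
        instant = instant.replace(tzinfo=timezone.utc)
    return instant.astimezone(ET).date().isoformat()


stored = "2026-06-16T00:30Z"
naive_day = stored[:10]          # slicing keeps the UTC day
correct_day = et_date_of(stored)  # converting first keeps the class's local day
```

For the Monday evening class in the example, slicing yields 2026-06-16 while the conversion yields 2026-06-15, which is the day the class actually meets.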

Why workflow-first, not agentic

"An AI decides what to do next" is never the answer here. Each workflow is deterministic code with an AI call at exactly one narrow step, and it declares its cost ceiling, its confidence tiers, and its cache strategy up front. The Outbox handoff is what makes the system feel agent-like to a human reviewer; the implementation underneath is boring, logged, and reversible on purpose.