# Agent Context Evolution (Repo-Specific) + Forward Roadmap

This document describes how documentation for **our production system** (and “agent context”) has evolved in this repo, what patterns are emerging, and how that evolution is likely to continue as the codebase grows.

It is intentionally *not* an end-state proposal. It’s a “maturity model” you can use to keep pruning/refining the context system without losing critical engineering knowledge.

---

## Current State (What Exists Today)

Our production system now has **two parallel doc layers**:

1. **Human-readable documentation** (source of truth)
   - `README.md`
   - `docs/architecture/*`
   - `docs/testing/*`
   - `docs/backlog.md`
   - “Historical / point-in-time” docs (excluded from default agent reads): `docs/release-history.md`, `docs/bug-tracking/*`, `docs/code-review/*`, ADRs, older milestone plans
2. **Token-optimized “AI context”** (curated projection of the above)
   - `docs/context-mgmt/context-core.yaml`
   - `docs/context-mgmt/context-api-patterns.yaml`
   - `docs/context-mgmt/context-development.yaml`
   - `docs/context-mgmt/context-backlog.yaml`
   - Loaded via `.claude/commands/read-context.md`

There is also a **process/control layer** (the docs that tell the agent how to behave):

- `AGENTIC-CODING-STANDARD.md` (workflow + checklists)
- `.claude/commands/*` (read context, update docs, plan review, etc.)
- `.claude/hooks/*` (guardrails that enforce conventions)

This is the key shift: documentation is no longer only “descriptive”; it is also **operational control** for agentic development.

---

## What Changed Over Time (Based on Git History)

Dates below are from git history. The exact code changes matter less than the *structural* changes to the doc system.

### Phase 1: Foundational Docs (single overview + ad-hoc milestone notes)

- **Start of documentation**: README + one architecture overview appear with Milestone 1.
  - `2025-12-04` adds `README.md`, `docs/architecture/overview.md`, and early milestone documents.
- Milestone docs start out as “folders of notes” (including non-Markdown artifacts), not yet a consistent system.

**Theme:** documentation begins as “human narrative + planning notes”.

### Phase 2: Domain Decomposition (split architecture by subsystem)

- Architecture splits into frontend/backend docs.
  - `2025-12-04` “split architecture docs”

**Theme:** once the system grows, “one overview” stops scaling; docs split by *cognitive load boundaries* (frontend vs backend).

### Phase 3: Operational Runbooks (testing + migrations + conventions)

- Testing becomes first-class; docs expand to include commands, patterns, fixtures.
  - `2025-12-05` introduces the testing framework and related docs work.
- A formal agent workflow with checklists appears; `.claude` starts showing up.
  - `2025-12-05` includes early `.claude` files; the `AGENTIC-CODING-STANDARD.md` history begins around here.
  - `2025-12-05` adds the `/docs-update` command (making “update docs” an explicit step).

**Theme:** docs evolve from “what the system is” to “how to change it safely”.

### Phase 4: Separation of Active vs Historical (release-history + tracking hygiene)

- Release history is introduced to archive completed milestones.
  - `2025-12-17` adds `docs/release-history.md`
- Later, milestone history is consolidated and the roadmap is simplified.
  - `2025-12-29` “Consolidate milestone history and simplify README roadmap”

**Theme:** once tracking docs grow, you avoid loading “everything ever” by moving closed work into an archive.

### Phase 5: Token-Optimized Agent Context (YAML projection of source docs)

- A single token-optimized context file is created.
  - `2025-12-22` adds `docs/context-mgmt/CONTEXT.yaml`
- It immediately splits into multiple topic files + a strategy doc.
  - `2025-12-22` replaces `CONTEXT.yaml` with the 4 `context-*.yaml` files and adds `docs/context-mgmt/docs-analysis.md`
- Pruning begins as the context backlog grows.
  - `2025-12-30` condenses milestone history in `context-backlog.yaml` (large deletion, small summary)

**Theme:** the “agent context” becomes a curated artifact that gets *actively edited for size and utility*, not just appended to.

### Phase 6: Enforced Guardrails (hooks + preventive checklists)

- Documentation is consolidated and agent tooling gets stronger.
  - `2026-01-06` moves/cleans doc layout and adds `.claude/hooks/*`
- Checklists expand from “workflow quality” into “preventive engineering conventions” (including security).
  - `2026-01-07` adds preventive conventions checklist content
  - `2026-01-07` adds preventive security checklist items

**Theme:** when complexity rises, “context” alone isn’t enough: **enforcement** reduces reliance on memory and reduces agent/human error rates.

---

## Emerging Patterns (What Your Doc System Is Optimizing For)

### 1) Split by *decision surface*, not by file count

The successful splits in this repo track decision boundaries:

- Architecture: overview vs backend vs frontend
- Context: invariants/core vs API/patterns vs dev/runbooks vs backlog/planning
- History: “current work” vs “archive”

### 2) Shift from prose → “control primitives”

Agent-friendly docs increasingly use:

- invariants (“must always be true”)
- state machines (“valid states + transitions”)
- checklists (“must-do steps before/after coding”)
- conventions (“one true way” for recurring patterns)

This is the same evolution you see in high-scale human teams: playbooks replace tribal knowledge.

### 3) Layering: source-of-truth vs projection vs enforcement

Our production system now has a three-layer system:

1. **Truth**: human docs + code
2. **Projection**: curated YAML context (compressed but semantically complete)
3. **Enforcement**: commands + hooks + checklists (behavior shaping)

The more the codebase grows, the more value shifts from (1) to (2)+(3).
### 4) “Archive pressure” is the first scalability lever

The earliest big win is always:

- move completed work out of active backlog
- keep long histories out of default reads
- link to history instead of embedding it

This is cheaper than introducing automation and usually buys a lot of time.

### 5) Drift becomes the dominant failure mode

Once you have:

- human docs
- token-optimized context copies
- checklists / commands

the biggest risk is **semantic drift** (they disagree).

The repo already reflects this risk by making “update AI context” a required part of `docs-update`.

---

## Forward Roadmap: Likely Next Phases (With Triggers)

You said you’re comfortable with ~50–75K tokens of total agentic context. The key is to manage *what is in the default read* vs *what is on-demand*.

### Phase 7 (next): Context Budgeting + Default Read Slimming

**Trigger signals**

- The 4 YAML context files trend toward “read everything always” but start crowding out code context in large tasks.
- Agents start missing relevant code because context consumes too much of the window.

**What changes**

- Make `context-core.yaml` a true “always load” file; keep it lean.
- Treat other context files as “modules”: load by task type (backend vs frontend vs planning).
- Add a tiny “context index” (1–2 pages) that helps route which modules to load.

**Pruning rule**

- Move “examples” and “long lists” out of default modules; keep only one canonical example per pattern, and link to optional deep dives.

### Phase 8: Context as a Build Artifact (semi-automated generation)

**Trigger signals**

- Manual sync cost becomes noticeable.
- You see recurring drift bugs (“human doc says X, YAML says Y”).
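
One narrow form of the drift named in these signals, a context file that still points at a deleted or renamed file, can be caught mechanically. The sketch below is illustrative only: `find_stale_references` and `PATH_PATTERN` are hypothetical names, and it assumes paths appear in backticks with a file extension (as they do in this document), which may not match how the real `context-*.yaml` files reference paths.

```python
import re
import tempfile
from pathlib import Path

# Assumption: a "path reference" is a backticked token ending in a file
# extension, e.g. `docs/backlog.md`. Globs like `docs/testing/*` are skipped.
PATH_PATTERN = re.compile(r"`([\w./-]+\.[A-Za-z]+)`")


def find_stale_references(context_file: Path, repo_root: Path) -> list[str]:
    """Return referenced paths in context_file that do not exist under repo_root."""
    text = context_file.read_text(encoding="utf-8")
    referenced = sorted(set(PATH_PATTERN.findall(text)))
    return [ref for ref in referenced if not (repo_root / ref).exists()]


if __name__ == "__main__":
    # Self-contained demo in a throwaway directory (no real repo needed).
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "docs").mkdir()
        (root / "docs" / "backlog.md").write_text("# Backlog\n", encoding="utf-8")
        context = root / "context-core.yaml"
        context.write_text(
            "notes: see `docs/backlog.md` and `docs/old-plan.md`\n",
            encoding="utf-8",
        )
        # Only the reference to the missing file is reported.
        print(find_stale_references(context, root))
```

A check like this only detects missing files; it says nothing about whether the YAML and the human docs agree on meaning, so it complements rather than replaces the manual sync step.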
**What changes**

- Add a simple generator/linter that:
  - reports size (lines/chars) per context module
  - checks for obvious staleness indicators (e.g., referenced files deleted/renamed)
  - optionally extracts structured lists (endpoints/models) from source docs
- Treat YAML context as “compiled output”: it can still be hand-edited, but generation/linting prevents silent drift.

### Phase 9: Retrieval-Native Context (on-demand deep loads)

**Trigger signals**

- Even modular YAML context grows beyond the comfortable default-read budget.
- Work is increasingly “localized” (e.g., auth work doesn’t need audio capture details).

**What changes**

- Default read becomes: core invariants + workflow + index.
- Domain context (auth/audio/LLM/calendar/etc.) becomes opt-in modules.
- Agent workflows include a step: “load the module for the subsystem I’m changing”.

This is the point where “don’t load everything” becomes a feature, not a compromise.

---

## Practical Pruning Playbook (What to Remove First)

When you need to shrink context, prune in this order:

1. **Duplication** (same fact in multiple files)
2. **History in active context** (closed milestones, old decisions)
3. **Verbose examples** (keep one canonical example; move the rest to optional docs)
4. **Exhaustive inventories** (lists of every file/function) → replace with entrypoints + search instructions
5. **Narrative prose** that doesn’t change decisions (convert to invariants/checklists/state machines)

When you’re unsure whether to prune a section, ask: *does this change what code an agent would write?* If not, move it out of default context.

---

## Maintenance Loop (Keeping It Healthy Over Time)

The repo already encodes this workflow in `.claude/commands/docs-update.md`.
The key additions as complexity grows:

- **Budget check** (monthly/quarterly): measure growth and decide what becomes “on-demand”
- **Drift check**: when updating human docs, update YAML in the same PR/branch
- **Archive cadence**: move “done” items out of active tracking on a predictable schedule
- **Changelog the context system**: when you change how context is structured, write it down here

---

## Notes on Your Original Description

Your story is broadly accurate in *shape* (human docs → split → add operational docs → backlog bloat → archive → token-optimized agent context → prune examples), but in this repo the “agent copy” is **YAML** (not XML), and the timeline shows a distinct later phase where enforcement (hooks + preventive checklists) becomes a primary scaling strategy.
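
As a closing illustration of the budget check from the Maintenance Loop, the sketch below measures lines and characters per context module and compares the total against a cap. Everything here is hypothetical: `ModuleSize`, `measure`, `over_budget`, the sample module contents, and the `max_chars` value are made up for the example. Character counts are only a proxy for tokens, since the ratio depends on the tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ModuleSize:
    name: str
    lines: int
    chars: int


def measure(name: str, text: str) -> ModuleSize:
    """Measure one context module given its text content."""
    return ModuleSize(name=name, lines=len(text.splitlines()), chars=len(text))


def over_budget(sizes: list[ModuleSize], max_chars: int) -> bool:
    """True when the combined character count exceeds the budget."""
    return sum(s.chars for s in sizes) > max_chars


if __name__ == "__main__":
    # Stand-in contents; a real check would read the files from disk.
    modules = {
        "context-core.yaml": "invariants:\n  - a\n  - b\n",
        "context-backlog.yaml": "items:\n" + "  - done\n" * 50,
    }
    sizes = [measure(name, text) for name, text in modules.items()]
    # Largest module first, since that is usually the first pruning candidate.
    for s in sorted(sizes, key=lambda s: s.chars, reverse=True):
        print(f"{s.name}: {s.lines} lines, {s.chars} chars")
    print("over budget:", over_budget(sizes, max_chars=400))
```

Sorting by size makes the report double as a pruning queue: the module at the top is where the pruning playbook is most likely to pay off.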