# Agent Context Evolution (Repo-Specific) + Forward Roadmap

This document describes how the documentation for **our production system**, including its “agent context”, has evolved in this repo, which patterns are emerging, and how that evolution is likely to continue as the codebase grows.

It is intentionally *not* an end‑state proposal. It’s a “maturity model” you can use to keep pruning/refining the context system without losing critical engineering knowledge.

---

## Current State (What Exists Today)

Our production system now has **two parallel doc layers**:

1. **Human-readable documentation** (source of truth)
   - `README.md`
   - `docs/architecture/*`
   - `docs/testing/*`
   - `docs/backlog.md`
   - “Historical / point-in-time” docs (excluded from default agent reads): `docs/release-history.md`, `docs/bug-tracking/*`, `docs/code-review/*`, ADRs, older milestone plans

2. **Token-optimized “AI context”** (curated projection of the above)
   - `docs/context-mgmt/context-core.yaml`
   - `docs/context-mgmt/context-api-patterns.yaml`
   - `docs/context-mgmt/context-development.yaml`
   - `docs/context-mgmt/context-backlog.yaml`
   - Loaded via `.claude/commands/read-context.md`

There is also a **process/control layer** (the docs that tell the agent how to behave):

- `AGENTIC-CODING-STANDARD.md` (workflow + checklists)
- `.claude/commands/*` (read context, update docs, plan review, etc.)
- `.claude/hooks/*` (guardrails that enforce conventions)

This is the key shift: documentation is no longer only “descriptive”; it is also **operational control** for agentic development.

---

## What Changed Over Time (Based on Git History)

Dates below are from git history. The exact code changes matter less than the *structural* changes to the doc system.

### Phase 1 — Foundational Docs (single overview + ad-hoc milestone notes)

- **Start of documentation**: README + one architecture overview appear with Milestone 1.
  - `2025-12-04` adds `README.md`, `docs/architecture/overview.md`, and early milestone documents.
- Milestone docs start out as “folders of notes” (including non-Markdown artifacts), not yet a consistent system.

**Theme:** documentation begins as “human narrative + planning notes”.

### Phase 2 — Domain Decomposition (split architecture by subsystem)

- Architecture splits into frontend/backend docs.
  - `2025-12-04` “split architecture docs”

**Theme:** once the system grows, “one overview” stops scaling; docs split by *cognitive load boundaries* (frontend vs backend).

### Phase 3 — Operational Runbooks (testing + migrations + conventions)

- Testing becomes first-class; docs expand to include commands, patterns, fixtures.
  - `2025-12-05` introduces the testing framework and related docs work.
- A formal agent workflow + checklists appears; `.claude` starts showing up.
  - `2025-12-05` includes early `.claude` files; the history of `AGENTIC-CODING-STANDARD.md` begins around here.
  - `2025-12-05` adds `/docs-update` command (making “update docs” an explicit step).

**Theme:** docs evolve from “what the system is” to “how to change it safely”.

### Phase 4 — Separation of Active vs Historical (release-history + tracking hygiene)

- Release history is introduced to archive completed milestones.
  - `2025-12-17` adds `docs/release-history.md`
- Later, milestone history is consolidated and roadmap simplified.
  - `2025-12-29` “Consolidate milestone history and simplify README roadmap”

**Theme:** once tracking docs grow, you avoid loading “everything ever” by moving closed work into an archive.

### Phase 5 — Token-Optimized Agent Context (YAML projection of source docs)

- A single token-optimized context file is created.
  - `2025-12-22` adds `docs/context-mgmt/CONTEXT.yaml`
- It immediately splits into multiple topic files + a strategy doc.
  - `2025-12-22` replaces `CONTEXT.yaml` with the 4 `context-*.yaml` files and adds `docs/context-mgmt/docs-analysis.md`
- Pruning begins as the context backlog grows.
  - `2025-12-30` condenses milestone history in `context-backlog.yaml` (large deletion, small summary)

**Theme:** the “agent context” becomes a curated artifact that gets *actively edited for size and utility*, not just appended to.

### Phase 6 — Enforced Guardrails (hooks + preventive checklists)

- Documentation is consolidated and agent tooling gets stronger.
  - `2026-01-06` moves/cleans doc layout and adds `.claude/hooks/*`
- Checklists expand from “workflow quality” into “preventive engineering conventions” (including security).
  - `2026-01-07` adds preventive conventions checklist content
  - `2026-01-07` adds preventive security checklist items

**Theme:** when complexity rises, “context” alone isn’t enough—**enforcement** reduces reliance on memory and reduces agent/human error rates.

---

## Emerging Patterns (What Your Doc System Is Optimizing For)

### 1) Split by *decision surface*, not by file count

The successful splits in this repo track decision boundaries:

- Architecture: overview vs backend vs frontend
- Context: invariants/core vs API/patterns vs dev/runbooks vs backlog/planning
- History: “current work” vs “archive”

### 2) Shift from prose → “control primitives”

Agent-friendly docs increasingly use:

- invariants (“must always be true”)
- state machines (“valid states + transitions”)
- checklists (“must-do steps before/after coding”)
- conventions (“one true way” for recurring patterns)

This is the same evolution you see in high-scale human teams: playbooks replace tribal knowledge.
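
As an illustration of what a “control primitive” looks like compared with prose, here is a minimal sketch of a state machine expressed as data plus one check. The states and transitions are hypothetical (a milestone lifecycle invented for this example), not taken from the repo's context files.

```python
# Hypothetical example: these states and transitions are illustrative only.
ALLOWED_TRANSITIONS = {
    "planned": {"active"},
    "active": {"done", "planned"},
    "done": {"archived"},
    "archived": set(),  # terminal state: nothing may follow
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is a listed transition."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


if __name__ == "__main__":
    print(can_transition("planned", "active"))
    print(can_transition("done", "active"))
```

The point of the table form is that an agent (or a hook) can check a proposed change against it mechanically, which a narrative paragraph does not allow.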

### 3) Layering: source-of-truth vs projection vs enforcement

The doc system for our production system now has three layers:

1. **Truth**: human docs + code
2. **Projection**: curated YAML context (compressed but semantically complete)
3. **Enforcement**: commands + hooks + checklists (behavior shaping)

The more the codebase grows, the more value shifts from (1) to (2)+(3).

### 4) “Archive pressure” is the first scalability lever

The earliest big win is usually:

- move completed work out of active backlog
- keep long histories out of default reads
- link to history instead of embedding it

This is cheaper than introducing automation and usually buys a lot of time.

### 5) Drift becomes the dominant failure mode

Once you have:

- human docs
- token-optimized context copies
- checklists / commands

the biggest risk is **semantic drift** (they disagree). The repo already reflects this risk by making “update AI context” a required part of `docs-update`.
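
A drift check of this kind can be approximated mechanically. The sketch below assumes you already have the list of paths changed in a branch (for example from your VCS tooling) and flags the case where human docs changed but no context YAML did. The path prefixes follow the layout described in this document; the function names are hypothetical.

```python
from typing import List, Optional

# Assumption: these paths mirror the layout listed under "Current State".
HUMAN_DOC_FILES = ("README.md", "docs/backlog.md")
HUMAN_DOC_PREFIXES = ("docs/architecture/", "docs/testing/")
CONTEXT_PREFIX = "docs/context-mgmt/"


def is_human_doc(path: str) -> bool:
    """True for files in the human-readable source-of-truth layer."""
    return path in HUMAN_DOC_FILES or path.startswith(HUMAN_DOC_PREFIXES)


def is_context_file(path: str) -> bool:
    """True for the token-optimized YAML context files."""
    return path.startswith(CONTEXT_PREFIX) and path.endswith(".yaml")


def drift_warning(changed_paths: List[str]) -> Optional[str]:
    """Return a warning if human docs changed without any context update."""
    human = sorted(p for p in changed_paths if is_human_doc(p))
    context = [p for p in changed_paths if is_context_file(p)]
    if human and not context:
        return "Human docs changed without a context update: " + ", ".join(human)
    return None
```

This only detects that the two layers were not touched together; it cannot tell whether they still agree in meaning, so it complements the `docs-update` step rather than replacing it.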

---

## Forward Roadmap: Likely Next Phases (With Triggers)

You said you’re comfortable with ~50–75K tokens of total agentic context. The key is to manage *what is in the default read* vs *what is on-demand*.

### Phase 7 (next) — Context Budgeting + Default Read Slimming

**Trigger signals**
- The 4 YAML context files are treated as “read everything always” and start crowding out code context in large tasks.
- Agents start missing relevant code because context consumes too much of the window.

**What changes**
- Make `context-core.yaml` a true “always load” file; keep it lean.
- Treat other context files as “modules”: load by task type (backend vs frontend vs planning).
- Add a tiny “context index” (1–2 pages) that helps route which modules to load.
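
A minimal sketch of what such a context index might encode, assuming a simple task-type lookup. The file names come from this repo; the task types and the mapping itself are illustrative assumptions, not the current behavior of `read-context`.

```python
from typing import Dict, List

# Always loaded, regardless of task.
ALWAYS_LOAD = ["docs/context-mgmt/context-core.yaml"]

# Hypothetical routing table: which extra modules each task type pulls in.
MODULES_BY_TASK: Dict[str, List[str]] = {
    "backend": [
        "docs/context-mgmt/context-api-patterns.yaml",
        "docs/context-mgmt/context-development.yaml",
    ],
    "frontend": ["docs/context-mgmt/context-development.yaml"],
    "planning": ["docs/context-mgmt/context-backlog.yaml"],
}


def modules_for(task_type: str) -> List[str]:
    """Return the core file plus any modules registered for the task type."""
    return ALWAYS_LOAD + MODULES_BY_TASK.get(task_type, [])
```

Unknown task types fall back to the core file only, which keeps the default read small and makes loading more context an explicit choice.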

**Pruning rule**
- Move “examples” and “long lists” out of default modules; keep only one canonical example per pattern, and link to optional deep dives.

### Phase 8 — Context as a Build Artifact (semi-automated generation)

**Trigger signals**
- Manual sync cost becomes noticeable.
- You see recurring drift bugs (“human doc says X, YAML says Y”).

**What changes**
- Add a simple generator/linter that:
  - reports size (lines/chars) per context module
  - checks for obvious staleness indicators (e.g., referenced files deleted/renamed)
  - optionally extracts structured lists (endpoints/models) from source docs
- Treat YAML context as “compiled output”: it can still be hand-edited, but generation/linting prevents silent drift.
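
The first two linter checks could be sketched roughly as follows. This assumes file references inside the context files are written as backtick-quoted, repo-relative paths (as they are in this document); if the YAML uses a different convention, the pattern would need to change. All function names here are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches backtick-quoted strings that look like file paths with an extension,
# e.g. `docs/architecture/overview.md`. Globs such as `docs/testing/*` are skipped.
PATH_REF = re.compile(r"`([\w./-]+\.[A-Za-z0-9]+)`")


def size_report(text: str) -> Dict[str, int]:
    """Report the size of one context module in lines and characters."""
    return {"lines": len(text.splitlines()), "chars": len(text)}


def stale_references(text: str, repo_root: Path) -> List[str]:
    """Return referenced paths that no longer exist under repo_root."""
    refs = sorted(set(PATH_REF.findall(text)))
    return [ref for ref in refs if not (repo_root / ref).exists()]


def lint_context_dir(context_dir: Path, repo_root: Path) -> Dict[str, dict]:
    """Build a per-module report for every context-*.yaml file in context_dir."""
    report = {}
    for path in sorted(context_dir.glob("context-*.yaml")):
        text = path.read_text(encoding="utf-8")
        report[path.name] = {
            **size_report(text),
            "stale": stale_references(text, repo_root),
        }
    return report
```

A call such as `lint_context_dir(Path("docs/context-mgmt"), Path("."))` would return one entry per module. Extracting structured lists from source docs is deliberately left out of this sketch.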

### Phase 9 — Retrieval-Native Context (on-demand deep loads)

**Trigger signals**
- Even modular YAML context grows beyond the comfortable default-read budget.
- Work is increasingly “localized” (e.g., auth work doesn’t need audio capture details).

**What changes**
- Default read becomes: core invariants + workflow + index.
- Domain context (auth/audio/LLM/calendar/etc.) becomes opt-in modules.
- Agent workflows include a step: “load the module for the subsystem I’m changing”.

This is the point where “don’t load everything” becomes a feature, not a compromise.

---

## Practical Pruning Playbook (What to Remove First)

When you need to shrink context, prune in this order:

1. **Duplication** (same fact in multiple files)
2. **History in active context** (closed milestones, old decisions)
3. **Verbose examples** (keep one canonical example; move the rest to optional docs)
4. **Exhaustive inventories** (lists of every file/function) → replace with entrypoints + search instructions
5. **Narrative prose** that doesn’t change decisions (convert to invariants/checklists/state machines)

When you’re unsure whether to prune a section, ask: *does this change what code an agent would write?* If not, move it out of default context.
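
For the first item in the list, duplication, a rough detector is easy to sketch: normalize each line and report lines that appear in more than one file. This only catches near-verbatim copies, not the same fact phrased differently. The normalization rules and the `min_length` threshold are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def normalize(line: str) -> str:
    """Strip list markers, collapse whitespace, and lowercase the line."""
    return " ".join(line.strip().lstrip("-*").split()).lower()


def duplicated_lines(files: Dict[str, str], min_length: int = 30) -> Dict[str, List[str]]:
    """Map each normalized line found in 2+ files to the files containing it.

    `files` maps a file name to its full text. Lines shorter than
    `min_length` after normalization are ignored to avoid noise from
    short keys and headings.
    """
    seen = defaultdict(set)
    for name, text in files.items():
        for line in text.splitlines():
            key = normalize(line)
            if len(key) >= min_length:
                seen[key].add(name)
    return {key: sorted(names) for key, names in seen.items() if len(names) > 1}
```

Each hit is a candidate for keeping the fact in one module and linking to it from the others.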

---

## Maintenance Loop (Keeping It Healthy Over Time)

The repo already encodes this workflow in `.claude/commands/docs-update.md`. The key additions as complexity grows:

- **Budget check** (monthly/quarterly): measure growth and decide what becomes “on-demand”
- **Drift check**: when updating human docs, update YAML in the same PR/branch
- **Archive cadence**: move “done” items out of active tracking on a predictable schedule
- **Changelog the context system**: when you change how context is structured, write it down here
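
The budget check could be as simple as the sketch below. It assumes roughly 4 characters per token, which is a crude heuristic rather than a real tokenizer, and uses the ~50–75K range mentioned earlier as soft and hard thresholds. The names and status messages are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: about 4 characters per token; a rough heuristic, not a tokenizer.
CHARS_PER_TOKEN = 4
SOFT_BUDGET_TOKENS = 50_000  # lower end of the stated comfort range
HARD_BUDGET_TOKENS = 75_000  # upper end of the stated comfort range


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text from its character length."""
    return len(text) // CHARS_PER_TOKEN


def budget_status(module_texts: Dict[str, str]) -> Tuple[int, str]:
    """Return the estimated total tokens and a coarse status message."""
    total = sum(estimate_tokens(text) for text in module_texts.values())
    if total > HARD_BUDGET_TOKENS:
        return total, "over budget: move modules to on-demand"
    if total > SOFT_BUDGET_TOKENS:
        return total, "approaching budget: review the default read"
    return total, "within budget"
```

Recording the total at each check gives a growth trend, which is more useful for deciding what becomes on-demand than any single measurement.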

---

## Notes on Your Original Description

Your story is broadly accurate in *shape* (human docs → split → add operational docs → backlog bloat → archive → token-optimized agent context → prune examples). Two details differ in this repo: the “agent copy” is **YAML** (not XML), and the timeline shows a distinct later phase where enforcement (hooks + preventive checklists) becomes a primary scaling strategy.