Context Budget is the practice of enforcing strict size limits on documentation files to preserve AI agent context windows. Every line in a file “costs” tokens—waste those tokens, and AI performance suffers.
AI coding agents have limited context windows:
| Agent | Context window |
|---|---|
| Claude 3.5 Sonnet | 200,000 tokens (~150,000 words) |
| GPT-4 Turbo | 128,000 tokens (~96,000 words) |
| GitHub Copilot | 32,000 tokens (~24,000 words) |
The problem: This seems like a lot, but it fills up quickly:

    Your codebase:          50,000 tokens
    Documentation:          30,000 tokens
    Conversation history:   20,000 tokens
    AI's working memory:    10,000 tokens
    -------------------------------------------
    Total:                 110,000 tokens

The solution: Treat context like a budget. Every line of documentation must justify its existence.
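The breakdown above is plain subtraction, which a short shell function can make explicit. This is a minimal sketch: `context_remaining` is a hypothetical helper (not a Cortex TMS command), and the 128,000-token window is the GPT-4 Turbo figure listed earlier.

```shell
# Hypothetical helper: subtract every consumer from the context window.
# Usage: context_remaining WINDOW CONSUMER...
context_remaining() {
  window=$1
  shift
  used=0
  for tokens in "$@"; do
    used=$((used + tokens))
  done
  echo $((window - used))
}

# Figures from the breakdown above, against a 128,000-token window.
context_remaining 128000 50000 30000 20000 10000   # prints 18000
```

With those numbers only 18,000 tokens are left for new work, which is why trimming the documentation share matters.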
Cortex TMS enforces strict size limits on HOT files and relaxed limits on WARM files:
HOT files:

| File | Max Lines | Why This Limit? |
|---|---|---|
| NEXT-TASKS.md | 200 lines | One sprint maximum (1-2 weeks) |
| .github/copilot-instructions.md | 100 lines | Critical rules only, no explanations |
WARM files:

| File | Max Lines | Why This Limit? |
|---|---|---|
| docs/core/PATTERNS.md | 650 lines | Reference manual with quick-reference index |
| docs/core/DOMAIN-LOGIC.md | 400 lines | Core rules + Maintenance Protocol |
| docs/core/ARCHITECTURE.md | 500 lines | System design without implementation details |
| docs/core/GIT-STANDARDS.md | 250 lines | Git conventions + commit standards |
| docs/core/GLOSSARY.md | 200 lines | Terminology definitions |
COLD files: No limits. Archive files can grow indefinitely because AI agents ignore them unless explicitly asked.
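The limits above can also be checked with nothing more than `wc`. The following is a standalone sketch, not the `cortex-tms` implementation: `check_limit` and its output format are assumptions made for illustration, and missing files are skipped rather than treated as errors.

```shell
# Sketch of a standalone limit check (not how cortex-tms does it).
# check_limit FILE MAX_LINES -> exit status 0 if within the limit, 1 if over.
check_limit() {
  file=$1
  max=$2
  if [ ! -f "$file" ]; then
    echo "skip: $file (not found)"
    return 0
  fi
  lines=$(wc -l < "$file")
  lines=$((lines + 0))   # strip the padding some wc implementations add
  if [ "$lines" -gt "$max" ]; then
    echo "over: $file ($lines lines, max $max)"
    return 1
  fi
  echo "ok: $file ($lines lines, max $max)"
}

# Limits copied from the HOT and WARM tables in this guide.
failed=0
check_limit NEXT-TASKS.md 200 || failed=1
check_limit .github/copilot-instructions.md 100 || failed=1
check_limit docs/core/PATTERNS.md 650 || failed=1
check_limit docs/core/DOMAIN-LOGIC.md 400 || failed=1
check_limit docs/core/ARCHITECTURE.md 500 || failed=1
check_limit docs/core/GIT-STANDARDS.md 250 || failed=1
check_limit docs/core/GLOSSARY.md 200 || failed=1
echo "files over limit: $failed"
```

Keeping the limits as plain arguments makes it easy to adjust one file's budget without touching the check itself.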
Let’s compare two projects:

    # NEXT-TASKS.md (850 lines)

    ## Q1 2026 Tasks
    - [ ] 50 tasks planned for January-March...

    ## Q2 2026 Tasks
    - [ ] 60 tasks planned for April-June...

    ## Q3 2026 Tasks
    - [ ] 70 tasks planned for July-September...

    ## Completed Tasks (2025)
    - [x] 100 tasks from last year...

Context cost: ~3,400 tokens read every session
AI behavior:

    # NEXT-TASKS.md (180 lines)

    ## Active Sprint: User Authentication
    **Why this matters**: Mobile app needs secure API access

    - [ ] JWT token generation
    - [ ] Token validation middleware
    - [ ] Refresh token rotation
    ...

Context cost: ~720 tokens
AI behavior:
Savings: 2,680 tokens per session, about 79% less context spent on this one file
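The token figures in this comparison work out to roughly 4 tokens per line (850 lines to ~3,400 tokens, 180 lines to ~720). A rough sketch of that estimate follows; the ratio is inferred from the numbers in this guide, `estimate_tokens` is a hypothetical helper, and real tokenizers will give different counts depending on line length and content.

```shell
# Rough estimate only: about 4 tokens per line, inferred from the figures above.
TOKENS_PER_LINE=4

estimate_tokens() {
  echo $(( $1 * TOKENS_PER_LINE ))
}

bloated=$(estimate_tokens 850)
lean=$(estimate_tokens 180)
echo "bloated: $bloated tokens"                        # bloated: 3400 tokens
echo "lean: $lean tokens"                              # lean: 720 tokens
echo "saved per session: $((bloated - lean)) tokens"   # saved per session: 2680 tokens
```

For a real file, `estimate_tokens "$(wc -l < NEXT-TASKS.md)"` gives the same kind of ballpark figure.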
Before archiving a task, check file size:

    wc -l NEXT-TASKS.md
    # Output: 185 NEXT-TASKS.md

If approaching 200 lines → Archive completed tasks or move backlog items to FUTURE-ENHANCEMENTS.md.
Use the Cortex TMS CLI:

    cortex-tms validate --strict

Output:

    ✓ NEXT-TASKS.md (185 lines) - Under limit
    ✗ copilot-instructions.md (120 lines) - EXCEEDS LIMIT (max 100)

    Recommendations:
    - Move detailed examples to docs/core/PATTERNS.md
    - Keep only critical rules in copilot-instructions.md

Add a git hook to prevent oversized commits:

    #!/bin/bash
    # Git hook: reject the commit while NEXT-TASKS.md is over 200 lines.
    NEXT_TASKS_LINES=$(wc -l < NEXT-TASKS.md)

    if [ "$NEXT_TASKS_LINES" -gt 200 ]; then
      echo "❌ NEXT-TASKS.md exceeds 200 lines ($NEXT_TASKS_LINES lines)"
      echo "Archive completed tasks before committing"
      exit 1
    fi

    echo "✓ NEXT-TASKS.md size OK ($NEXT_TASKS_LINES lines)"

When NEXT-TASKS.md approaches 200 lines:
Archive Completed Tasks

Move finished tasks to docs/archive/sprint-YYYY-MM.md

    # Before: 195 lines
    # After: 120 lines

Move Backlog to FUTURE

Move low-priority tasks to FUTURE-ENHANCEMENTS.md

    # Move "nice to have" tasks out of current sprint

Break Down Large Tasks

Split 50-line epics into smaller increments

    # Instead of 1 huge task, create 5 focused ones

When docs/core/PATTERNS.md exceeds 650 lines:
Option 1: Split into Multiple Files

    docs/core/
    ├── PATTERNS.md (index + core patterns)
    ├── PATTERNS-FRONTEND.md (React/Vue patterns)
    ├── PATTERNS-BACKEND.md (API patterns)
    └── PATTERNS-DATA.md (Database patterns)

Option 2: Create Quick-Reference Index

    ## Quick Reference
    | Pattern | Line | Use When |
    |---------|------|----------|
    | Auth Pattern | 45 | Implementing login/signup |
    | API Pattern | 120 | Building REST endpoints |
    | ...

    ## Detailed Patterns
    ### Auth Pattern
    ...

AI can scan the index, then jump to the relevant section.
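Line numbers in a hand-written index go stale as soon as the file changes, so it can help to regenerate them. The sketch below assumes each pattern is introduced by a `### ` heading, as in the example above; `pattern_index` is a hypothetical helper, and the "Use When" column still has to be written by hand.

```shell
# Sketch: build "| Pattern | Line |" rows from the level-3 headings of a file.
# Note: a "### " line inside a code sample would also be picked up.
pattern_index() {
  echo "| Pattern | Line |"
  echo "|---------|------|"
  grep -n '^### ' "$1" | sed 's/^\([0-9]*\):### *\(.*\)$/| \2 | \1 |/'
}

# Usage (path taken from this guide):
#   pattern_index docs/core/PATTERNS.md
```

The output can be pasted under the Quick Reference heading and extended with the usage notes for each pattern.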
Bad (wastes context):

    ## Button Pattern

    [300 lines of duplicated code from Button.tsx]

Good (saves context):

    ## Button Pattern

    **Canonical Example**: `src/components/Button.tsx:15`

    **Key Rules**:
    - Use `cva` for variant composition
    - Support `size`, `variant`, `disabled` props

Bad:

    ## JWT Authentication

    Here's a complete implementation:
    [500 lines of code]

Good:

    ## JWT Authentication

    **Canonical Implementation**: `src/middleware/auth.ts`

    **Key Points**:
    - Use RS256 (not HS256)
    - Set 15-minute expiry
    - Include user_id and role in payload

    **Example**:
    ```typescript
    const token = jwt.sign({ user_id, role }, privateKey, {
      algorithm: 'RS256',
      expiresIn: '15m'
    });
    ```

### 3. Archive Aggressively
**Rule of thumb**: If you haven't referenced it in 2 months, archive it.
Historical context is valuable, but it's **COLD** context. Keep it in `docs/archive/`, not in files AI reads regularly.
### 4. Prefer Bullet Points Over Prose
**Bad** (verbose):
```markdown
When you are implementing authentication, you should make sure
that you are using the RS256 algorithm instead of HS256 because
RS256 is more secure and uses asymmetric keys which are better
for distributed systems where you can't share a secret safely.
```

**Good** (concise):
```markdown
- Use RS256 (not HS256) for JWT signing
- Why: Asymmetric keys are safer for distributed systems
```

**Savings**: ~60% fewer tokens for the same information.
Use AI agent analytics (if available) to see context usage:

    Session #1234
    - Code files read: 45,000 tokens
    - NEXT-TASKS.md: 800 tokens
    - docs/core/PATTERNS.md: 2,200 tokens
    - Conversation history: 15,000 tokens
    ----------------------------------------
    Total: 63,000 tokens
    Context remaining: 137,000 tokens

Red flags:
If AI repeatedly reads docs/core/ARCHITECTURE.md (5,000 tokens):
Option 1: Summarize it → Reduce to 2,000 tokens
Option 2: Split it → Create focused sub-docs
Option 3: Promote to HOT → Move critical parts to copilot-instructions.md
Mistake: Adding tasks to NEXT-TASKS.md without removing completed ones
Impact: File grows from 180 → 220 → 300 lines over 2 months
Fix: Archive completed tasks immediately after marking ✅ Done
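One way to make that fix routine is a small function that moves checked-off items out of the task file. This is a sketch under assumptions: `archive_done` is a hypothetical helper, completed tasks are assumed to be single `- [x]` checkbox lines, and any notes written under a task are left behind.

```shell
# Sketch: move checked-off items ("- [x] ...") from a task file to an archive.
# Only the checkbox lines themselves are moved.
archive_done() {
  tasks=$1
  archive=$2
  rest=$(mktemp)
  # Append completed items to the archive, keep everything else.
  grep -E '^[[:space:]]*- \[[xX]\]' "$tasks" >> "$archive"
  grep -vE '^[[:space:]]*- \[[xX]\]' "$tasks" > "$rest"
  cat "$rest" > "$tasks"   # overwrite in place, preserving file permissions
  rm -f "$rest"
}

# Usage (archive name follows the sprint-YYYY-MM.md convention from this guide):
#   mkdir -p docs/archive
#   archive_done NEXT-TASKS.md docs/archive/sprint-YYYY-MM.md
```

Running it right after a task is marked done keeps the file from creeping past its limit between sprints.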
Mistake: Copying entire source files into PATTERNS.md “for reference”
Impact: PATTERNS.md grows to 2,000 lines
Fix: Use canonical links. Provide rules, not implementations.
Mistake: Keeping 2 years of changelog in CHANGELOG.md at project root (HOT tier)
Impact: AI reads 5,000 lines of irrelevant history every session
Fix: Keep only current version in HOT. Archive old versions to docs/archive/v1.0-CHANGELOG.md (COLD).
Faster AI Response
Less context to process = faster generation. AI spends time coding, not reading irrelevant docs.
Better AI Memory
AI “remembers” earlier instructions because context isn’t crowded with noise.
More Room for Code
Leaner docs = more tokens available for showing AI your actual codebase.
Enforced Clarity
Size limits force concise writing. No room for fluff or redundancy.
Let’s calculate the context budget for a typical project:
Without a context budget:

    NEXT-TASKS.md (unconstrained):       3,500 tokens
    README.md (kitchen sink):            2,000 tokens
    docs/architecture.md (verbose):      4,500 tokens
    docs/api.md (duplicates code):       3,000 tokens
    CHANGELOG.md (2 years of history):   2,500 tokens
    ----------------------------------------------------
    Documentation total:                15,500 tokens

    Context remaining for code: 84,500 tokens (of 100k)

With a context budget:

    HOT Tier:
      NEXT-TASKS.md (200 lines):            800 tokens
      copilot-instructions.md (100 lines):  400 tokens

    WARM Tier (read on demand):
      docs/core/PATTERNS.md:              2,200 tokens
      docs/core/ARCHITECTURE.md:          1,800 tokens

    COLD Tier (ignored):
      docs/archive/* (not loaded)             0 tokens
    ----------------------------------------------------
    Documentation total:                  5,200 tokens

    Context remaining for code: 94,800 tokens (of 100k)

Savings: 10,300 tokens = 66% less documentation overhead
That’s enough room for 50+ additional source files or 100+ turns of conversation.
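The comparison above can be reproduced with shell arithmetic. This is a minimal sketch using the token figures from this guide; `sum_tokens` is a hypothetical helper, and the percentage uses integer division, so it rounds down.

```shell
# Sum a list of token counts.
sum_tokens() {
  total=0
  for n in "$@"; do
    total=$((total + n))
  done
  echo "$total"
}

before=$(sum_tokens 3500 2000 4500 3000 2500)   # unconstrained documentation
after=$(sum_tokens 800 400 2200 1800)           # HOT + WARM; COLD is not loaded
saved=$((before - after))

echo "before: $before"                                   # before: 15500
echo "after: $after"                                     # after: 5200
echo "saved: $saved ($((saved * 100 / before))%)"        # saved: 10300 (66%)
```

Swapping in a project's own estimates shows quickly whether trimming a given file is worth the effort.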
Before marking a task ✅ Done, verify:
- NEXT-TASKS.md is under 200 lines
- copilot-instructions.md is under 100 lines
- docs/archive/FUTURE-ENHANCEMENTS.md

Tool: Run cortex-tms validate --strict to check automatically.
Now that you understand context budgets: