tldraw, a popular React drawing library, recently announced they’re pausing external contributions due to challenges with AI-generated pull requests.
“Like many other open-source projects on GitHub, tldraw has recently seen a significant increase in contributions generated entirely by AI tools. While some of these pull requests are formally correct, most suffer from incomplete or misleading context, misunderstanding of the codebase, and little to no follow-up engagement from their authors.”
While we don’t have comprehensive data on how widespread this issue is, tldraw’s experience resonates with challenges we’ve observed in our own project and heard about anecdotally from other maintainers.
The “AI PR Tsunami” Problem
Here’s what’s happening:
- Developer uses Claude Code or GitHub Copilot to implement a feature
- AI writes code that compiles and passes basic tests
- PR gets submitted with confidence (“AI said it’s good!”)
- Maintainer reviews and finds:
  - Pattern violations (doesn’t match existing conventions)
  - Architectural drift (solves the problem incorrectly for this codebase)
  - Missing context (no ADR, no documentation update)
  - Incomplete implementation (edge cases ignored)
Sound familiar?
The problem isn’t AI. The problem is AI doesn’t know YOUR architectural decisions.
Why This Happens: The Context Gap
AI coding assistants are phenomenal at:
- ✅ Writing syntactically correct code
- ✅ Implementing well-known algorithms
- ✅ Generating boilerplate
- ✅ Suggesting completions based on immediate context
But they’re terrible at:
- ❌ Understanding your project’s architectural philosophy
- ❌ Following established patterns you’ve codified
- ❌ Remembering decisions from ADRs written 6 months ago
- ❌ Knowing when to break rules (and when not to)
Example: Your team decided to use React Query for data fetching (documented in docs/decisions/003-use-react-query.md). AI suggests a custom useEffect hook because it doesn’t know about that decision.
Result? Architectural drift.
Multiply this by 10 contributors using AI tools, and you get the “tsunami.”
Common Approaches (And Their Challenges)
Teams are trying different approaches:
- More human code review → Can lead to maintainer burnout
- Stricter PR templates → Mixed results with compliance
- Lengthy contribution guides → AI tools may not reference them
- Closing external contributions → Reduces community involvement (tldraw’s choice)
The challenge: Current AI coding tools don’t automatically incorporate project-specific architectural context.
Our Experiment: Pattern-Based Review
We’re testing an idea: what if there was a tool that could:
- Read your documented architectural decisions
- Check code against your documented patterns
- Flag potential violations before submitting PRs
- Reference your project’s unique conventions
That’s what we’re building with Guardian. It’s early, and we’re learning as we go.
Meet Guardian: AI-Powered Code Review
Guardian is Cortex TMS’s new CLI tool that audits code against your project’s documented patterns and architectural decisions.
How It Works
Step 1: Document your patterns (you should already be doing this)
````markdown
## Data Fetching Pattern

❌ Don't: Custom useEffect hooks for API calls
✅ Do: Use React Query for all server state

**Why**: Prevents stale data, reduces boilerplate, standardizes error handling

## Example

```typescript
// ❌ Violates pattern
const [data, setData] = useState(null);
useEffect(() => { fetch('/api/users').then(setData) }, []);

// ✅ Follows pattern
const { data } = useQuery({ queryKey: ['users'], queryFn: fetchUsers });
```
````

Step 2: Run Guardian on AI-generated code
```bash
cortex review src/components/UserList.tsx
```

Step 3: Guardian catches violations

```
🔍 Reviewing src/components/UserList.tsx

❌ Pattern Violation (Line 12-18)
Pattern: Data Fetching
Severity: HIGH

Found: Custom useEffect hook for API call
Expected: React Query (see docs/decisions/003-use-react-query.md)

Suggestion: Replace useEffect with useQuery hook

---

✅ Architecture: Follows component structure pattern
✅ Domain Logic: Proper error boundary usage
❌ Pattern Compliance: 1 violation found

Review complete. Fix violations before committing.
```

Step 4: Fix before committing
No more “oops, I’ll fix that in the next PR.”
Our Early Experience: Dogfooding Guardian
We’ve been testing Guardian on Cortex TMS itself. Here’s what we’ve observed so far:
Important Context: These are preliminary results from our own small project (~20 PRs over 2 weeks, single maintainer). This is not independent verification or proof of effectiveness at scale.
Before Guardian (AI-assisted development, manual review)
- ⏱️ Review time: 20-30 min per PR (our experience)
- 🔄 Back-and-forth cycles: 3-4 rounds (typical for us)
- 🐛 Pattern violations reaching main: ~15% (in our commits)
- 😓 Maintainer frustration: High (subjective)
After Guardian (AI + Guardian pre-review)
- ⏱️ Review time: 7-10 min per PR (our experience, ~66% reduction)
- 🔄 Back-and-forth cycles: 1-2 rounds (our experience)
- 🐛 Pattern violations reaching main: ~3% (in our commits)
- ✅ Clean PRs: ~80% pass first review (our experience)
Our takeaway: In our limited testing, Guardian has helped us catch issues earlier. We can’t yet claim this will work for every project or team.
Our Hypothesis: Pattern-Based Pre-Review
Guardian is our experiment to address these challenges. The approach:
1. Enforcing Documented Patterns
   - AI-generated code must match YOUR conventions
   - Violations flagged immediately with references to docs
2. Preventing Architectural Drift
   - Code audited against ADRs and domain logic
   - Ensures consistency across AI-assisted contributions
3. Reducing Review Burden
   - Maintainers focus on logic, not style/pattern violations
   - ~66% faster review cycles in our own testing
4. Educating Contributors
   - Guardian explains WHY code violates patterns
   - Links to relevant documentation for context
5. Scaling Code Review
   - Same quality bar whether you have 5 or 50 contributors
   - AI-assisted contributions become assets, not liabilities
Real Example: Guardian in Action
Here’s a real commit from Cortex TMS development where Guardian caught an issue:
PR: Add blog infrastructure (TMS-284a)
Guardian Review:
```
$ cortex review website/src/pages/blog/index.astro

❌ Pattern Violation (Line 8)
Pattern: Component Imports
Expected: Import CardGrid from '@astrojs/starlight/components'
Found: Manual grid implementation

Reference: docs/core/PATTERNS.md#starlight-components
```

✅ Fixed before commit

Result: Clean PR, no maintainer back-and-forth, pattern consistency maintained.
Get Started: Three Ways to Use Guardian
1. Pre-Commit Hook (Recommended)
```bash
cortex review $(git diff --staged --name-only --diff-filter=d | grep -E '\.(ts|tsx|js|jsx)$')
```
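If you want to wire that one-liner into an actual Git hook, here is a minimal sketch of a Node/TypeScript helper script. This is our illustration, not part of cortex-tms: the file name, the Husky wiring, and the assumption that `cortex review` exits non-zero when it finds violations are all ours.

```typescript
// scripts/pre-commit-review.ts (hypothetical helper; call it from .husky/pre-commit
// or .git/hooks/pre-commit, e.g. via `npx tsx scripts/pre-commit-review.ts`)
import { execSync } from "node:child_process";

// Collect staged TS/JS files, mirroring the one-liner above
const staged = execSync("git diff --staged --name-only --diff-filter=d", { encoding: "utf8" })
  .split("\n")
  .filter((file) => /\.(ts|tsx|js|jsx)$/.test(file));

if (staged.length === 0) {
  process.exit(0); // nothing for Guardian to review
}

try {
  // Assumes `cortex review` exits non-zero on violations; execSync then throws and the commit is blocked
  execSync(`cortex review ${staged.join(" ")}`, { stdio: "inherit" });
} catch {
  console.error("Guardian flagged pattern violations. Fix or consciously dismiss them before committing.");
  process.exit(1);
}
```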
2. Manual Review Before PR

```bash
# Review your changes before committing
cortex review src/
```

3. CI/CD Integration
```yaml
- name: Guardian Review
  run: cortex review --strict src/
```

Guardian vs. Traditional Linting
ESLint/Prettier: Syntax and formatting (necessary but not sufficient)
Guardian: Architectural patterns and domain logic
| Tool | Scope | Example Check |
|---|---|---|
| ESLint | Syntax | “Missing semicolon” |
| Prettier | Formatting | “Inconsistent indentation” |
| Guardian | Architecture | “Should use React Query, not useEffect (ADR-003)” |
Guardian complements linting—it doesn’t replace it.
What We’re Learning
AI-assisted development is here to stay. The challenge is finding ways to maintain code quality and architectural consistency.
Our approach with Guardian is experimental. We’re testing whether pattern-based pre-review can help:
- Keep AI-accelerated development productive
- Maintain architectural consistency
- Reduce review burden on maintainers
- Scale community contributions effectively
Is it working? Too early to say definitively. Our early internal results are promising, but we need more real-world testing and feedback.
Trade-offs & When Guardian Doesn’t Help
Guardian is an experiment, and like all tools, it has real costs and limitations.
1. False Positive Noise
The risk: Guardian can flag violations that aren’t actually violations (hallucinations).
Impact: Instead of reducing review burden, this creates a NEW type of noise—developers have to evaluate whether Guardian’s feedback is correct.
Our current mitigation: We treat Guardian as a “second opinion,” not ground truth. When Guardian flags something, we manually review and decide whether to fix or dismiss it.
Future work: We’re considering adding // guardian-ignore comments or a CLI flag to suppress specific checks. This would make the “dismiss” decision explicit and portable.
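To make the idea concrete, here is roughly what such a comment could look like. This is purely illustrative: Guardian has no ignore syntax today, and the check name and justification format are invented for this sketch.

```typescript
import { useEffect, useState } from "react";

type User = { id: string; name: string };

// Purely illustrative: Guardian does not support ignore comments yet.
export function useLegacyUsers(): User[] {
  // guardian-ignore: data-fetching -- one-off admin page, no React Query provider available here
  const [users, setUsers] = useState<User[]>([]);
  useEffect(() => {
    fetch("/api/users")
      .then((res) => res.json())
      .then(setUsers);
  }, []);
  return users;
}
```

Requiring a named check plus a written justification keeps the suppression visible during review, which is one way we might keep it from becoming the escape hatch described below.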
Why it’s hard: Building a guardian-ignore is easy; building one that doesn’t just become a way for lazy AI agents to bypass quality checks is the real challenge. We’re thinking carefully about how to prevent ignore comments from becoming an escape hatch for low-quality contributions.
2. Documentation Drift (Architectural Debt)
The risk: Guardian enforces whatever is in PATTERNS.md. If your team changes a pattern but forgets to update the doc, Guardian will enforce outdated rules.
Impact: Developers get frustrated when Guardian enforces patterns the team no longer follows. This creates architectural debt—the gap between documented patterns and actual practices grows.
Our mitigation: We treat PATTERNS.md as “living documentation”—when we change a pattern in code, we update the doc in the same PR. Guardian actually helps here by forcing us to keep docs current.
The upside: This friction surfaces documentation drift immediately, preventing the accumulation of architectural debt caused by lack of shared context.
3. Cost & Latency
The risk: Running an LLM on every pre-commit hook adds latency (5-15 seconds) and API costs ($0.01-0.05 per review).
Impact: Slows down the “inner loop” of development. Frequent commits become more expensive.
Our approach: Two deployment modes
Pre-commit (Education Tool): Helps developers learn patterns in real-time. You catch violations before they’re committed. Best for onboarding and AI-assisted development.
CI/CD (Safety Net): Runs on PR creation/update. Doesn’t slow down local commits. Ensures standards are never breached in the main branch. Best for high-frequency workflows.
Cost transparency (BYOK model):
- Pre-commit: Individual contributor pays (uses their API key)
- CI/CD: Project maintainer/org pays (API key stored in CI secrets)
Cost example: At ~$0.03 per review and 20 reviews/week, that’s ~$0.60/week or ~$30/year per developer. For open source projects, contributors bear the pre-commit cost; maintainers bear the CI cost. Budget accordingly.
Our workflow: We use pre-commit for features (learning mode) and CI for hotfixes (safety net mode).
When You Shouldn’t Use Guardian
Skip Guardian if:
- Your team already has strong pattern adherence (no AI-assisted development)
- Your patterns aren’t clearly documented yet (Guardian will be confused)
- You’re optimizing for commit speed over review quality
- Your team is skeptical of LLM-based tools (adoption matters more than tech)
- You can’t afford the latency (~10 seconds) or the cost (~$0.03 per review)
Remember: Guardian is a bet that ~10 seconds of pre-commit checking saves 10+ minutes of human review. If that math doesn’t work for your team, don’t use it.
Try Guardian (Experimental)
Guardian is part of Cortex TMS v2.7 (released January 2026). It’s new and experimental—we’re still learning what works.
```bash
# Install globally
npm install -g cortex-tms

# Initialize in your project
cortex init

# Review code against your patterns
cortex review src/
```

Requirements:

- docs/core/PATTERNS.md (document your conventions)
- OpenAI or Anthropic API key (BYOK - bring your own key)
Current limitations:
- Early stage tool - expect false positives and false negatives
- Target accuracy: 70%+ on architectural violations (unverified)
- Works best when patterns are clearly documented
- LLM API costs apply (you provide your own key)
We’d love your feedback on whether this approach is useful for your project.
Learn More
- 📖 Guardian CLI Reference - Full command documentation
- 🎯 Use Case: Open Source Maintainers - How Guardian scales OSS
- 🏢 Use Case: Enterprise Teams - Team governance at scale
- 🤝 AI Collaboration Policy - How we build with AI
Join the Conversation
Have you experienced the “AI PR Tsunami” in your projects? How are you handling it?
Built using our own standard. Every feature in Cortex TMS, including Guardian, is dogfooded during development. We experience the problems before you do. See how we build →