Trust

You can’t manage an engineer you can’t predict. Without enforceable conventions, Claude reverts to its training distribution, an average of every codebase on the internet, bad code and all. GAIA’s codebase is what you actually want Claude matching. With GAIA, Claude writes code that follows best practices on day one, and can’t ship code that doesn’t.

How GAIA makes Claude trustworthy

  • Coding principles. GAIA’s coding rules embed Karpathy’s four coding principles (Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution), plus two of GAIA’s own: Always Use TDD and Always Verify Your Work.
  • Best practices baked in. Rules encode the conventions directly instead of hoping Claude infers them from whatever’s already in the repo.
  • Guardrails against technical debt. Rules block debt-accumulating patterns from being written in the first place: untyped exports, untested components, hardcoded strings, a11y gaps.
  • Consistently clean code. 20+ ESLint plugins, strict TypeScript, and Prettier enforce style and correctness on every file Claude touches. No negotiation, no drift.
  • Test-driven development. The bundled tdd skill runs a red-green-refactor loop with tests before code, tailored for Vitest, React Testing Library, Storybook composeStory, and MSW.
  • Code-review audit before every merge. A Claude subagent scans the branch diff for security, performance, code smells, and antipatterns, then blocks the merge until the issues are fixed and committed.
  • Quality gate before commit. Typecheck, lint, tests, and build must all pass. Not "mostly clean", actually clean.

Token economics

Context bloat isn’t just CLAUDE.md sprawl. Instructions get dropped into global memory, forgotten, and accumulate into redundancies and conflicts, an invisible cost that compounds every session. GAIA keeps token usage minimal by design.

How GAIA keeps Claude token-efficient

  • Rules are scoped to activate only when needed. Claude loads the ones that match what it’s editing, nothing else.
  • Obsidian wiki, fetched on demand. Project knowledge lives as focused, linked Markdown pages. Claude opens the one page it needs ("How does dark mode wire through?") instead of preloading the whole manual.
  • Wiki behavior tailored to GAIA. Session hooks keep Obsidian’s workflow (ingest cadence, cache discipline, link hygiene) aligned with the project’s conventions.
  • Periodic knowledge audit. Sweeps memory, wiki, and autoloaded files for duplication, conflicts, and stale instructions before they start costing tokens.
  • Session continuity. /handoff and /pickup replace re-briefing Claude from scratch at every session start.

Agentic design

The canonical taxonomy catalogs 29 agentic design patterns. GAIA implements 12 of those 29 structurally: wired in through hooks, agents, rules, commands, and wiki conventions, so each runs the same way every session, every engineer, every model variant. Not as emergent model behavior on top of a vanilla setup.

Headline patterns

The Stop HookClaude can't ship broken code. Pre-tool-use hooks block destructive git, watch-mode tests, force-pushes to main, and eslint-config edits before they happen. Pre-commit hooks gate every commit on typecheck, lint, tests, and build.
ReflectionTwo blocking review layers before merge. The Quality Gate (typecheck, lint, test, build) loops until clean. The code-review audit tiers findings as Critical, Important, and Suggestions and blocks every merge.
Memory ManagementFive tiers so Claude stops relearning your codebase. Wiki for long-term, hot cache for session start, /handoff and /pickup for episodic, agent memory across reviews, user memory across projects. /audit-knowledge sweeps for duplication and bloat.
Multi-Agent CollaborationA specialist for every concern, dispatched in parallel. The code-review audit dispatches React Patterns, TypeScript and Architecture, Translation, and react-doctor specialists from a single tool call. Extension files inject library-specific rules into the right specialist at runtime.

Tier and isolation

RoutingConstraints route to context, not loaded globally. Path-scoped rules auto-load only when Claude is editing matching files. Conditional Bash hooks route commands to specific scripts based on shape.
Resource-Aware OptimizationCost and quality discipline wired in. /audit-knowledge runs research on Opus with ultrathink and the mechanical apply step on Sonnet. The code-review audit declares model: sonnet for structured review, leaving Opus for harder reasoning.
PlanningPlans are durable artifacts requiring user approval. /orchestrate writes per-task docs, a task graph with phases, an execution playbook, and a kickoff prompt to .claude/plans/. The plan never executes until you say go.
Session IsolationSub-agents run in fresh contexts so shared state can't corrupt their work. Each task doc is self-contained for a fresh-context sub-agent. /orchestrate offers a git-worktree branch for filesystem-level isolation.

Tooling and safety

Tool UseA curated React-specific tool surface. ESLint with 20+ plugins, TypeScript, Vitest with React Testing Library, Playwright, Storybook with Chromatic, MSW, the gh CLI, react-doctor, the Obsidian wiki, and typescript-lsp.
ParallelizationIndependent work runs concurrently. The audit dispatches its specialist subagents in a single tool-call message. The orchestrator runs per-phase sub-agents in parallel where dependencies allow.
Guardrails & SafetyDefense in depth, layered from the filesystem up. Filesystem deny list keeps secrets out (.env, credentials, keys). Block hooks reject debt-accumulating patterns at the source. The audit's security dimension covers XSS, SSRF, IDOR, and dependency vulnerabilities.
Human-in-the-LoopSix structurally enforced checkpoints between Claude's intent and impact. Quality Gate, code-review audit, plan approval, phase gates, destructive-git hook, PR-merge reminder. The bypass paths are blocked at the hook layer.

A second brain for Claude

With an Obsidian wiki, Claude understands what you are actually building, not just the code in front of it. Product features, user flows, design rationale, business decisions, architecture, and dependencies all live as focused markdown pages committed to git, not chat history, not machine-local memory.

GAIA ships with the claude-obsidian integration wired up. The vault structure, the ingestion commands, and the maintenance skills are configured before you write your first page.

On top of the integration, GAIA layers project-specific commands and hooks that turn the wiki into a self-maintaining knowledge base. Duplicates, conflicts, and stale information get pruned, so the vault stays clean as the project grows.

Token costs do not balloon as the wiki gets richer. Claude only reads the specific information a task needs, so a project with 1,000 pages of context is no more expensive to work in than one with 10.

Why Obsidian? A local markdown vault means the project’s knowledge persists, compounds, and stays yours, not held in a vendor’s database or trapped in chat history.

Domain isolation is mandatory. Technical, branding, and business knowledge are kept siloed, and Claude does not cross-load between them unless a task genuinely spans more than one.

A Karpathy-style CLAUDE.md

A Karpathy-style CLAUDE.md is wiki-first, don’t-preload, lazy-fetch. It tells Claude where to look, not to load everything upfront. See Karpathy’s original for the philosophy.

GAIA ships with coding rules that only load when Claude is writing code. The first four come from Karpathy’s CLAUDE.md:

  • Think Before Coding - Surface assumptions, push back on complexity, ask when unclear.
  • Simplicity First - Write the minimum code that solves the problem, no speculative abstractions.
  • Surgical Changes - Touch only what is needed; match existing style; leave unbroken things alone.
  • Goal-Driven Execution - Define verifiable success criteria before starting; loop until verified.

GAIA adds two more principles on top of these:

  • Always Use TDD - Vitest for units, Playwright for user flows, tests before code.
  • Always Verify Your Work - Run the quality gate process; fix all warnings and errors before reporting done.

GAIA’s variant also adds three wiki habits on top of Karpathy’s framing. Domain isolation means technical work fetches only from wiki/app/, brand work from its own domain. wiki/hot.md auto-load is a 200-word recent-context cache loaded at every session start to surface work-in-progress. Memory discipline keeps machine-local memory for personal preferences only. Durable knowledge lives in the wiki or .claude/rules/.

Three concrete benefits

  • Tokens saved. Lazy-fetch only the pages Claude needs. Adding 20 more wiki pages does not bloat every session, since they are not loaded unless asked.
  • Context stays focused. Domain isolation prevents unrelated information from leaking into the context window. Technical questions get technical answers only.
  • The pattern scales. Whether you are building a 2-person startup or a 100-person team, the wiki-first discipline keeps sessions lean and predictable.

The stack

20+ ESLint plugins, four testing layers (unit, integration, E2E, visual) with mocking, i18n, dark mode, forms with validation, and Storybook. All pre-configured and documented for Claude.

GAIA

Spend your time on the product, not the workflow

Pair-programming, not babysitting