
Claude Code Memory System: 5 Practical Layers That Compound

Mar 22, 2026 · 12 min read

Imagine finishing a client task, pressing one button, and 30 seconds later your AI agent permanently knows what you learned — the new contact, the deployment quirk, the technical decision. Not just in this session. In every future session, across every project, on every machine.

That's what a claude code memory system does when it's built right. It stops the daily ritual of re-explaining your stack, your clients, and your conventions to an AI that forgot everything overnight.

I manage 15+ client projects across WordPress, Next.js, and Python backends. After six months of iteration, my claude code memory system has settled into five layers, anchored by a structured GitHub repo as the knowledge base, a routing index that replaces semantic search, and a Notion-driven ingest pipeline that captures knowledge where the work already happens. No extra tools required — this works with Claude Code, Cursor, Windsurf, Gemini CLI, or any agent that can read files.

Here's the exact setup, every file, and the reasoning behind each layer.

Why Every Claude Code Memory System Article Says "Use Obsidian"

Search for "claude code memory system" and you'll find dozens of guides. Almost all of them recommend the same stack: Obsidian vault + smart-connections MCP server + qmd MCP server. The viral thread by @nyk_builderz popularized this as a three-tier architecture — session memory, knowledge graph, ingestion pipeline.

The concept is sound. The tooling choice is a community bias, not a technical requirement.

The "tools for thought" community — Obsidian, Logseq, Roam — has been active on X and YouTube for five years. When AI memory became a topic, they were first to publish because they already had the vaults. That head start created an echo chamber where "persistent memory" became synonymous with "Obsidian."

What Claude Code actually needs to build a claude code memory system:

  • Read — to open a markdown file
  • Glob — to find files by pattern
  • Grep — to search content across files

That's it. These tools work on any directory of markdown files. Obsidian's graph visualization, community plugins, and wikilink syntax add zero value for the AI agent reading your files. Claude doesn't care if your notes are in ~/obsidian-vault or ~/repositories/knowledge-base. It reads the same markdown either way.

The 5 Layers of a Practical Claude Code Memory System

A claude code memory system that compounds needs five layers. Each solves a different failure mode. Skip one, and the others lose leverage.

Layer 1: CLAUDE.md — The Identity File

CLAUDE.md is the first file Claude reads in every session. Most people treat it like a config dump. The teams that get leverage treat it like a teaching document — telling the AI who you are, how you work, and what to never do.

What belongs here:

  • Architecture decisions that don't change weekly (stack, hosting, conventions)
  • Workflow rules (conventional commits, branch naming, PR process)
  • Explicit boundaries (never push to main, never SSH without approval, never log secrets)
  • Tool preferences (which SSH aliases map to which servers, which MCP servers to use)
  • Client routing — a pointer to where detailed client info lives

What does NOT belong: anything that changes often. Put shifting details in topic files (Layer 3) and reference them from CLAUDE.md.

Rule of thumb: If you have to update CLAUDE.md more than once a week, you're putting too much in it.
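A minimal CLAUDE.md following those rules might look like this — the stack details and aliases are illustrative, not a copy of my actual file:

```markdown
# CLAUDE.md

## Stack
- WordPress on managed hosting; Next.js on Vercel; Python backends on a Hetzner VPS

## Workflow
- Conventional commits; feature branches; PRs required for application code

## Never
- Never push to main in application repos
- Never SSH to a server without asking first
- Never log or commit secrets

## Client routing
- Detailed client info lives in the personal-memory repo — read its MEMORY.md first
```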

Layer 2: Auto-Memory — The Session Learner

Claude Code has a built-in auto-memory system at ~/.claude/projects/<hash>/memory/. It persists observations across sessions — patterns it noticed, corrections you made, preferences you expressed.

The key file is MEMORY.md. It's loaded into every conversation automatically. The rule: keep it under 200 lines. It's a routing document, not a dump. Detailed notes go in topic files and get linked from MEMORY.md.

This is the cheapest layer of a claude code memory system to implement. Enable it, let Claude learn from corrections, and it immediately stops re-asking questions you've answered before. But it has a ceiling — it can only hold what fits in the auto-memory directory, and it's scoped to one project path.

That's where Layer 3 changes everything.

Layer 3: A GitHub Repo as Your AI's Brain

This is where my claude code memory system diverges from every Obsidian guide. Instead of a local vault with MCP plugins, I use a plain GitHub repository with structured markdown files.

The repo is called personal-memory and has this structure:

personal-memory/
├── MEMORY.md              # Routing index (under 200 lines)
├── ORG.md                 # Operating rules, review routine
├── clients/
│   ├── _TEMPLATE.md       # Standardized client file template
│   ├── acme-corp.md       # Per-client: hosting, stack, contacts, quirks
│   ├── northwind.md
│   └── ... (15+ clients)
├── engineering/
│   ├── patterns.md        # Reusable patterns (2+ projects)
│   ├── debugging.md       # Hard bugs: symptom → check → fix
│   └── architecture-principles.md
├── playbooks/
│   ├── deploy.md
│   ├── bloggproduksjon.md
│   └── incidents.md
└── policies/
    ├── security.md
    └── access.md

The routing index pattern is critical. MEMORY.md is a table of contents — one line per file, organized by question ("Which client?" → clients/X.md, "How do we deploy?" → playbooks/deploy.md). When Claude starts a session, it reads MEMORY.md, identifies the relevant file, and reads only that. No semantic search needed. No MCP server. Just file paths.

Why GitHub beats Obsidian for a claude code memory system:

| Feature | GitHub Repo | Obsidian Vault |
|---------|-------------|----------------|
| Version control | Native git (diff, blame, revert) | Plugin-dependent |
| Access from VPS/CI | git clone | Requires sync setup |
| Collaboration | PRs, branch review | Not built-in |
| Automation | Webhooks, GitHub Actions | Community plugins |
| Search by AI agent | Glob + Grep (built-in) | MCP servers required |
| Mobile editing | GitHub.com / any editor | Obsidian mobile app |

The clients/_TEMPLATE.md file enforces consistency. Every new client gets the same sections: overview, systems, deploy routine, contacts, notes. Claude knows exactly where to find hosting details or the SSH alias for any client — because the structure is identical across all files.
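A template along these lines matches the sections listed above — the field names here are illustrative, not my exact file:

```markdown
# Client: <name>

## Overview
One paragraph: who they are, what we run for them.

## Systems
- Hosting: <provider>
- Stack: <WordPress / Next.js / Python>
- SSH alias: <alias>

## Deploy routine
1. <step>

## Contacts
- <name> — <role> — <email>

## Notes
- Quirks, gotchas, decisions worth remembering
```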

Layer 4: The Routing Index — How Claude Finds the Right File

The routing index is what makes a claude code memory system work at scale. Without it, Claude has to search across dozens of files. With it, Claude reads one document and navigates directly to the answer.

Here's a simplified version of mine:

# Routing Index

| Question | Document |
|----------|----------|
| Client hosting, stack, contacts? | clients/<name>.md |
| How do we deploy? | playbooks/deploy.md |
| Reusable code patterns? | engineering/patterns.md |
| Hard-to-debug issues? | engineering/debugging.md |
| Security and access policies? | policies/security.md |

When I say "we're deploying acme-corp," Claude reads the routing index, opens clients/acme-corp.md, and immediately knows the hosting provider, SSH alias, deploy procedure, and known quirks — without me explaining anything.

This pattern scales to any number of files. The index stays under 200 lines. The details live in topic files. Claude navigates by path, not by semantic similarity. It's faster, more predictable, and requires zero infrastructure.
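The navigation step is simple enough to sketch. A hypothetical version of what the agent effectively does — parse the routing table, match the question, return one file path:

```javascript
// Parse a MEMORY.md routing table into [question, path] pairs.
function parseRoutingIndex(markdown) {
  return markdown
    .split("\n")
    .filter((line) => line.startsWith("|") && !line.includes("---"))
    .map((line) => line.split("|").map((cell) => cell.trim()))
    .filter((cells) => cells[1] && cells[1] !== "Question")
    .map((cells) => [cells[1], cells[2]]);
}

// Resolve a query to a single document path by naive keyword overlap.
function routeFor(markdown, query) {
  const hit = parseRoutingIndex(markdown).find(([question]) =>
    query
      .toLowerCase()
      .split(/\s+/)
      .some((word) => question.toLowerCase().includes(word))
  );
  return hit ? hit[1] : null;
}
```

The real agent reasons about the table in natural language rather than matching keywords, but the shape is the same: one lookup, one file path, one Read.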

Layer 5: Notion-Driven Ingest — The Closed Loop

This is the layer that makes a claude code memory system compound instead of decay.

Most knowledge dies in the gap between "I learned something useful" and "my AI can find it next session." Ingest closes that gap with a single button press.

The architecture:

  1. I finish a task or meeting in Notion
  2. I press the "Ingest" button on the task or note
  3. Notion fires a webhook to a Fastify server running on my Hetzner VPS
  4. The server fetches the task context — title, description, comments, linked entities (client name, project, repo URL)
  5. It clones personal-memory, runs Claude Code with an ingest prompt
  6. Claude reads the routing index, finds the right file, updates it with non-trivial learnings
  7. Commits and pushes directly to main
  8. Posts a summary back to the Notion task as a comment — which files were updated, commit link, estimated cost
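The core of steps 4–5 is small. A hedged sketch of the job-construction step the Fastify handler would call after fetching the task — the field names are assumptions for illustration, not Notion's actual API shape:

```javascript
// Map fetched Notion task context to an ingest job the runner can execute.
// The runner then clones the memory repo, runs Claude Code with the prompt,
// and pushes the resulting commit.
function buildIngestJob(task) {
  const prompt = [
    "Read MEMORY.md and find the right file(s) for these learnings.",
    `Task: ${task.title}`,
    task.description ? `Context: ${task.description}` : null,
    task.repoUrl ? `Repo: ${task.repoUrl}` : null,
    "Only record non-trivial learnings. Commit with a descriptive message.",
  ]
    .filter(Boolean)
    .join("\n");
  return {
    repo: "personal-memory",
    // Likely target file, if the task links a client entity
    hint: task.client ? `clients/${task.client}.md` : null,
    prompt,
  };
}
```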

A real example: I completed a plugin compatibility task for a client (PR #20, version 1.0.28). Pressed "Ingest." Thirty seconds later, the client's memory file had three new entries: a new contact added as reviewer, a repo reference linked, and a technical decision note about choosing a more reliable detection method over a generic class check. Cost: $0.17. Time: 30 seconds.

Why Notion and not a CLI command? Because the work already happens in Notion. The task has the context — what was done, which client, which repo. A button press is zero friction. A CLI command means context-switching, remembering the right flags, and manually describing what you learned.

But ingest is only half the loop. The same webhook infrastructure supports a query path. I type #claude what do we know about this client? as a Notion comment, and the runner clones personal-memory plus the entity repo (if linked), runs Claude in read-only mode, and posts the answer back as a comment — with full conversation history for follow-ups.

This means I never leave Notion to query my knowledge base. The ingest button writes to memory. The #claude comment reads from it. Same infrastructure, same repos, bidirectional. You can even choose models: #sonnet for quick/cheap answers, #opus for deep analysis, #new to start a fresh session.

How this differs from Claude Code Channels (shipped March 20, 2026): Channels let you trigger Claude Code via Discord, Telegram, or HTTP. My runner predates Channels and is fully self-hosted — no dependency on a running Claude Code process. It spawns sessions on demand, handles its own job queue, and integrates directly with Notion's data model (entity relations, comments API). Channels is great for chat-style interaction; the ingest runner is built for autonomous, fire-and-forget knowledge capture — and now, for querying that captured knowledge without leaving your task manager.

What Makes This a Claude Code Memory System (Not Just Notes)

The difference between notes and a claude code memory system is the closed loop:

  1. Work happens → in any repo, for any client
  2. Knowledge is captured → one button press triggers ingest
  3. Memory is updated → the right file in the right format, pushed to Git
  4. Next session is better → Claude reads the updated file, starts with full context

Every ingest makes the next session more productive. A client's quirky deploy process, a hard-won debugging insight, a new contact's name — captured once, available forever, across every repo and every machine.

The routing index + client templates mean this scales. Adding a new client takes 30 seconds: copy _TEMPLATE.md, fill in the basics, add to the routing index. Claude immediately knows how to work with that client.

5 Lessons After Managing 15+ Client Projects

  1. Keep MEMORY.md under 200 lines. The moment it becomes a dump, Claude starts ignoring parts of it. Treat it like a table of contents, not a document.
  2. Only ingest what surprised you. "Had a meeting" is not worth storing. "Client requires EHF invoicing to municipalities" is. The filter is: would future-you need this to avoid a mistake?
  3. Templates beat freeform. _TEMPLATE.md forces every client file to have the same sections. Claude finds information by position, not by guessing where you wrote it.
  4. Git revert is your safety net. Pushing docs directly to main feels risky until you realize that git revert is one command away. For knowledge files (not code), the speed of direct push outweighs the overhead of PR review.
  5. Separate identity from knowledge. CLAUDE.md is who you are and how you work. The memory repo is what you know. Mixing them creates a file that's too long, changes too often, and loses its purpose.

Frequently Asked Questions

What is a claude code memory system?

A claude code memory system is a set of structured files and processes that give AI coding agents persistent context across sessions. Instead of re-explaining your architecture, conventions, and project details every time, the agent reads from a maintained knowledge base. The most effective systems use a routing index pattern — a short table of contents that points to detailed topic files.

Can I use Notion instead of Obsidian for a claude code memory system?

Yes. Obsidian is popular because the tools-for-thought community published first, not because it’s technically superior. Any AI agent that can read markdown files works with a GitHub repo or local directory. Notion adds value as an orchestration layer — triggering ingest via webhooks when tasks complete — rather than as the knowledge store itself.

How do I set up a webhook runner for Claude Code?

Build a simple HTTP server (Node.js/Fastify works well) that receives a Notion webhook with a page ID, fetches the task context via Notion API, clones your knowledge repo, runs Claude Code with an ingest prompt, and pushes changes to main. The server runs on any VPS. Total infrastructure: one Fastify process, one systemd service, one nginx proxy rule.
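For reference, the "one systemd service" part could look roughly like this — unit name, user, and paths are placeholders, not a published config:

```ini
# /etc/systemd/system/ingest-runner.service (hypothetical)
[Unit]
Description=Notion ingest runner for Claude Code
After=network.target

[Service]
User=deploy
WorkingDirectory=/opt/ingest-runner
ExecStart=/usr/bin/node server.js
Restart=on-failure

[Install]
WantedBy=multi-user.target
```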

What is the difference between Claude Code Channels and a self-hosted runner?

Claude Code Channels (March 2026) let you trigger Claude via Discord, Telegram, or HTTP webhooks and require a running Claude Code process with claude.ai authentication. A self-hosted runner spawns Claude Code sessions on demand, manages its own job queue, and integrates directly with external APIs like Notion. Channels is better for conversational interaction; a runner is better for autonomous, fire-and-forget tasks like knowledge ingest.

How do I manage AI memory across multiple client projects?

Use a single knowledge repository with a routing index and per-client files following a standardized template. The routing index maps questions to file paths. Each client file has identical sections (overview, systems, deploy routine, contacts, notes). This lets the AI agent navigate to the right context in one step, regardless of which project you’re currently working in.


The best claude code memory system is the one you actually maintain. Obsidian, GitHub, a folder on your desktop — the tool matters less than the structure. A routing index under 200 lines, standardized topic files, and an ingest pipeline that connects to where your work already happens.

Start with CLAUDE.md. Add auto-memory. When you outgrow it, create a repo with a routing index. When you want it to compound automatically, build the ingest loop.

Every session gets better. That's the point.

Sindre Fjellestad

Indie maker and developer. Building productivity tools and writing about systems, automation, and the craft of focused work.
