Updated 2026-04-24 · 5 min read

How to give Claude persistent memory

Claude forgets by default. This is how you give it durable, searchable memory across sessions using MCP and a managed knowledge layer.

Claude
Memory
MCP

Claude forgets everything when a conversation ends. That is by design: each session starts with a fresh context window and no access to previous threads. For casual chat this is fine; for a long-running assistant it is crippling. The fix is an external memory store that Claude can read and write through tools. MCP (Model Context Protocol) standardises how those tools are exposed, so the same memory server can work with any MCP-capable client.

The two kinds of 'memory' people mean

Terminology first. 'Memory' in LLM land usually means one of two things:

  • Working memory — the current conversation's tokens. Limited by the context window. Vanishes when you close the tab.
  • Long-term memory — facts the model can retrieve on demand across sessions. Stored outside the model, usually in a database with semantic search.

This article is about the second kind. We will not try to enlarge Claude's context window; we will give it a sidecar.

Architecture in three boxes

  • A memory store — anywhere you can search by meaning. A managed service, a vector DB, or a hybrid keyword+vector index.
  • Two tools exposed through MCP — typically memory_save(content, tags) and memory_search(query, limit).
  • A short rule in Claude's system prompt — 'before answering, search memory; when the user tells you something important, save it.'
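
As a minimal sketch of the first two boxes, the Python below keeps memories in a plain list and ranks them by word overlap. The overlap scoring is only a stand-in for real semantic search, and the MemoryStore class, the Memory record, and its fields are illustrative assumptions; only the tool names memory_save and memory_search and their parameters come from the list above.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Memory:
    id: int
    content: str
    tags: list[str] = field(default_factory=list)


def _words(text: str) -> set[str]:
    # Lowercased word set; a real store would compare embeddings instead.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MemoryStore:
    """Hypothetical in-memory store standing in for a vector or hybrid index."""

    def __init__(self) -> None:
        self._items: list[Memory] = []

    def memory_save(self, content: str, tags: Optional[list[str]] = None) -> int:
        memory = Memory(id=len(self._items) + 1, content=content, tags=list(tags or []))
        self._items.append(memory)
        return memory.id

    def memory_search(self, query: str, limit: int = 5) -> list[Memory]:
        query_words = _words(query)
        # Score each memory by how many query words it shares, drop non-matches.
        scored = [(len(query_words & _words(m.content)), m) for m in self._items]
        hits = [(score, m) for score, m in scored if score > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in hits[:limit]]
```

In a real deployment these two methods would sit behind an MCP server rather than being called directly; the sketch only shows the shape of the data going in and out.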

The protocol Claude follows

Once the tools are wired up, the loop is the model's responsibility, not yours. A well-tuned system prompt pushes Claude to:

  • Call memory_search on any question that might touch prior context.
  • Call memory_save when the user states a preference, a fact, or a long-running goal.
  • Attribute recalled facts back to their source memory when asked.
  • Supersede stale memories with new ones instead of accumulating contradictions.
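
The last point, superseding rather than accumulating, can be sketched as follows. This is one possible design, not a description of any particular product: the SupersedingStore class, the superseded_by field, and the supersedes parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: int
    content: str
    superseded_by: Optional[int] = None  # id of the newer memory, if any


class SupersedingStore:
    """Hypothetical store where a new memory can retire an older one."""

    def __init__(self) -> None:
        self._items: dict[int, Memory] = {}

    def save(self, content: str, supersedes: Optional[int] = None) -> int:
        if supersedes is not None and supersedes not in self._items:
            raise KeyError(f"no memory with id {supersedes}")
        new_id = len(self._items) + 1
        self._items[new_id] = Memory(id=new_id, content=content)
        if supersedes is not None:
            # Mark the old record instead of deleting it, so history stays auditable.
            self._items[supersedes].superseded_by = new_id
        return new_id

    def active(self) -> list[Memory]:
        # Only memories that nothing has superseded are eligible for recall.
        return [m for m in self._items.values() if m.superseded_by is None]
```

Keeping the old record and pointing it at its replacement means recall returns a single current fact, while the trail of earlier values remains available for attribution.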

What to store, and what not to

Good memory is concise, dated, and specific. 'User prefers TypeScript over Python for new services, as of 2026-04-12' is useful. Dumping entire conversations into memory is not — it bloats retrieval and surfaces noise. Treat memory as a notebook, not a transcript.
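
One way to enforce the notebook habit is to normalise every note before it is saved. The helper below is a sketch under assumptions: make_note and MAX_NOTE_CHARS are hypothetical names, and the 280-character cap is an arbitrary illustrative limit, not a recommendation from any library or service.

```python
from datetime import date

MAX_NOTE_CHARS = 280  # arbitrary illustrative cap (assumption)


def make_note(fact: str, as_of: date) -> str:
    """Collapse whitespace, append an ISO date, and reject transcript-sized input."""
    fact = " ".join(fact.split()).rstrip(".")
    note = f"{fact}, as of {as_of.isoformat()}"
    if len(note) > MAX_NOTE_CHARS:
        raise ValueError("too long for a memory note; summarise it first")
    return note
```

Rejecting oversized input, rather than truncating it silently, pushes the caller to summarise before saving.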

Scope matters

Scope memories per user, and within a user, per project. Cross-user leakage is the fastest way to lose trust; cross-project leakage makes recall noisier and less useful. A simple tag-based scheme ({ owner: userId, project: slug }) carries you a long way before you need anything fancier.
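
A tag-based scope check along those lines might look like the sketch below. The owner and project keys follow the scheme above; the function names and the dictionary shape of a memory record are assumptions made for illustration.

```python
def in_scope(tags: dict[str, str], owner: str, project: str) -> bool:
    # Both tags must match; a missing tag never matches.
    return tags.get("owner") == owner and tags.get("project") == project


def scoped(memories: list[dict], owner: str, project: str) -> list[dict]:
    """Return only the memories visible to this user within this project."""
    return [m for m in memories if in_scope(m["tags"], owner, project)]
```

Applying the filter inside the store, before ranking, is the safer placement: results from another user or project then never reach the model at all.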

Privacy and deletion

Persistent memory is a promise you made to the user. Expose a way to list, edit, and delete stored memories, and honour it. Encrypt at rest, scope tightly, and keep memory out of model training. On 3meel, documents and memory are encrypted in transit and stored in access-controlled object storage; they are never used to train models.
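
The list, edit, and delete surface can be small. The sketch below assumes a hypothetical UserMemories class that keys everything by owner, so one user can never touch another user's records; it omits encryption and persistence entirely and does not describe how any particular service implements this.

```python
class UserMemories:
    """Hypothetical per-user view over stored memories."""

    def __init__(self) -> None:
        self._by_owner: dict[str, dict[int, str]] = {}
        self._next_id = 1

    def add(self, owner: str, content: str) -> int:
        memory_id = self._next_id
        self._next_id += 1
        self._by_owner.setdefault(owner, {})[memory_id] = content
        return memory_id

    def list_all(self, owner: str) -> dict[int, str]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._by_owner.get(owner, {}))

    def edit(self, owner: str, memory_id: int, content: str) -> None:
        self._require(owner, memory_id)
        self._by_owner[owner][memory_id] = content

    def delete(self, owner: str, memory_id: int) -> None:
        self._require(owner, memory_id)
        del self._by_owner[owner][memory_id]

    def _require(self, owner: str, memory_id: int) -> None:
        # Fails for unknown ids and for ids that belong to a different owner.
        if memory_id not in self._by_owner.get(owner, {}):
            raise KeyError(f"memory {memory_id} not found for this user")
```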

The 3meel approach

3meel ships a memory surface through MCP out of the box. Connect Claude Desktop with your API key, and Claude gains memory_save, memory_search, memory_resume, and memory_delete tools. Memories are scoped per project and can supersede each other explicitly, so conflicting facts resolve cleanly rather than piling up.

Give Claude a memory in five minutes — sign up, paste the MCP config block, and start saving facts from your first conversation.

