mem0
The most popular off-the-shelf memory system for AI agents in production. If you're running agents for multiple employees or customers and need each one to have their own private, long-term memory — this is the battle-tested default. Thousands of teams already run it; it's well-documented and well-supported.
- Mid-sized companies (50–500) running agents for real users
- Any product that gives each end-user their own agent
- Teams that value 'boring' and 'widely adopted' over 'frontier'
- Apps that need memory across web, mobile, and CLI surfaces
What you'll do
mem0 is the production default for giving your AI agents long-term, multi-user memory. Most teams reach for it once they outgrow a single-user setup. It's mature, well-documented, and has both Python and TypeScript SDKs. Budget 20 minutes.
Before you start
- Python 3.10+ (or Node.js 20+ if you prefer the TS SDK)
- An OpenAI API key (or compatible — mem0 uses an LLM for extraction)
- Optional: a vector database (mem0 ships with a local one that works fine for <100 users)
Step-by-step install
- 1. Install the SDK
Pick Python or TypeScript based on whatever backend you're already running.
```shell
# Python
pip install mem0ai

# OR TypeScript
npm install mem0ai
```
- 2. Set your API key
mem0 uses OpenAI (or any OpenAI-compatible model) to extract memories from conversations. Export your key as an environment variable.
```shell
export OPENAI_API_KEY=sk-...
```
- 3. Try it in a script
Start with a 10-line Python script to verify mem0 is working end-to-end.
```python
from mem0 import Memory

m = Memory()

# Add a memory for a specific user
m.add("I prefer dark mode and pair programming", user_id="amara")

# Retrieve it later
results = m.search(query="amara's preferences", user_id="amara")
print(results)
```

Tip: Every call is scoped by user_id — that's the unit of memory isolation. One scope per end-user, employee, or tenant.
- 4. Wire it into your agent loop
Wherever your agent receives a user message, call m.search() to pull relevant context and prepend it to the prompt. Wherever it generates a response or the user commits a decision, call m.add() to store it. See the mem0 docs for framework integrations (LangChain, LlamaIndex, raw OpenAI).
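That loop can be sketched in a few lines. This is an illustrative shape, not mem0's official integration: the `handle_message` function, the injectable `mem` parameter, and the `llm` callable are assumptions, and the `{"results": [...]}` search return shape should be checked against the mem0 version you run.

```python
def handle_message(mem, llm, user_id: str, message: str) -> str:
    """One turn of an agent loop with mem0-style memory.

    `mem` is anything exposing mem0's search/add shape (Memory() in
    production); `llm` is any prompt -> reply callable.
    """
    # 1. Pull memories relevant to this message, scoped to the user.
    hits = mem.search(query=message, user_id=user_id)
    context = "\n".join(h["memory"] for h in hits.get("results", []))

    # 2. Prepend the retrieved context to the prompt.
    prompt = f"Known about this user:\n{context}\n\nUser: {message}"
    reply = llm(prompt)

    # 3. Store the new exchange so future turns can recall it.
    mem.add(f"User said: {message}", user_id=user_id)
    return reply
```

In production, `mem` would be the `Memory()` instance from step 3 and `llm` your chat-completion call.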
- 5. Or: use the MCP server for Claude Code / Cursor
If you want mem0 behind an MCP-speaking agent instead of your own backend, install the MCP bridge and point Claude at it.
```json
{
  "mcpServers": {
    "mem0": {
      "command": "npx",
      "args": ["-y", "@mem0/mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "MEM0_USER_ID": "your-org-or-user-id"
      }
    }
  }
}
```

- 6. Upgrade to the hosted or self-hosted server when you scale
For production, point mem0 at a persistent Postgres or a managed vector DB (Pinecone, Weaviate, Qdrant). The mem0 docs walk through each backend. For sub-100-user teams, the default local store is fine.
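As a sketch, switching backends is a config change rather than a code change. The key names below (the `vector_store` block, `host`/`port`) assume the mem0 OSS config format and may differ across versions, so verify them against the docs before relying on this:

```python
# Config-only sketch: point the OSS Memory class at a managed Qdrant
# instead of the default local store. Key names are assumptions based
# on the mem0 OSS config format.
config = {
    "vector_store": {
        "provider": "qdrant",     # or "pinecone", "weaviate", ...
        "config": {
            "host": "localhost",  # your Qdrant endpoint
            "port": 6333,
        },
    },
}

# Then, in your app:
#   from mem0 import Memory
#   m = Memory.from_config(config)
```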
Your first 10 minutes
- 1. Decide your scoping strategy: one user_id per employee? per end-customer? per tenant? This is the single most important design decision.
- 2. Wire mem0 into one real workflow end-to-end (e.g. your CX agent) before expanding. Prove retrieval quality on one flow.
- 3. Add retrieval logging — you want to see which memories are being pulled and whether they're relevant.
- 4. Connect Cognition CLO — mem0 stores, CLO models which concepts are decaying per user.
- 5. Set up a weekly review of the memory graph for a sample user. mem0's LLM extraction sometimes drops things you want kept.
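One way to make the scoping decision concrete is a composite user_id that encodes both tenant and person. This is a hypothetical naming convention, not a mem0 API; mem0 only ever sees the final string:

```python
def memory_scope(tenant: str, principal: str) -> str:
    """Compose a user_id that encodes tenant and person.

    Hypothetical convention: the isolation guarantee comes from always
    building user_id through this one function, never by hand.
    """
    if ":" in tenant or ":" in principal:
        raise ValueError("scope parts must not contain ':'")
    return f"{tenant}:{principal}"

# e.g. m.add("prefers dark mode", user_id=memory_scope("acme", "amara"))
```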
Troubleshooting
mem0 is storing too few things — obvious memories are missing.
mem0's default extraction is conservative. Pass custom_instructions to Memory() that describe what kinds of facts to keep. Check the mem0 cookbook for examples.
Retrieval returns irrelevant memories.
Two common fixes: (1) double-check you're searching within the right user_id scope, (2) use a more specific query — mem0's search is semantic, so 'user preferences' returns less than 'amara's notification preferences'.
My mem0 bill is growing fast.
The extraction step is the main cost driver. Route it to a smaller model (e.g. gpt-4o-mini) or batch calls. Consider using mem0's async client so you don't block on extraction.
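Routing extraction to a smaller model is again a config change in the OSS client. The provider/key names below assume mem0's OpenAI config shape; verify against your version:

```python
# Sketch: run mem0's extraction on a cheaper model. Key names are
# assumptions based on the mem0 OSS config format.
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",  # smaller model for extraction
            "temperature": 0.0,
        },
    },
}

#   from mem0 import Memory
#   m = Memory.from_config(config)
```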
mem0 holds the knowledge. Cognition CLO models retention per employee per concept using a Weibull forgetting curve — so you see decay before it becomes a missed SOP or a failed audit.