AI Agent Memory: Short-Term, Long-Term, and Semantic

Why AI Agents Need Memory

A language model on its own retains nothing beyond its context window. Once the window fills up, earlier content has to be truncated or dropped, and nothing carries over to the next session. For agents handling long tasks or returning users, this is a serious limitation. Agent memory systems address it by storing, retrieving, and updating information across sessions.

Three Types of Agent Memory

1. Short-Term Memory (Context Window)

This is the conversation history and current task state within one session. Many current models support context windows of roughly 128K to 1M tokens, which covers most single sessions. The challenge is cost: long prompts add latency and expense on every call, so good agents summarise and compress older content instead of resending it in full.
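The summarise-and-compress idea can be sketched as follows. This is a minimal illustration, not a production design: `Message`, `estimate_tokens`, `summarise`, and `compact` are hypothetical names, the token count is a crude word count, and the summariser is a placeholder for what would normally be a model call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: roughly one token per word. A real agent would use
    # the tokenizer that matches its model.
    return len(text.split())


def summarise(messages: List[Message]) -> str:
    # Placeholder summariser: keeps the first 8 words of each message.
    # In practice this step would usually be delegated to a model.
    return " | ".join(
        f"{m.role}: {' '.join(m.text.split()[:8])}" for m in messages
    )


def compact(history: List[Message], budget: int, keep_recent: int = 2) -> List[Message]:
    """Fold older messages into one summary message when over budget."""
    total = sum(estimate_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = Message("system", "Summary of earlier turns: " + summarise(old))
    # The most recent turns are kept verbatim; everything older is compressed.
    return [summary] + recent
```

Calling `compact(history, budget=4000)` before each model call keeps the latest turns intact while older turns shrink to a single summary message. The budget and the number of turns kept verbatim are tuning choices.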

2. Long-Term Memory (Persistent Storage)

Long-term memory is information that survives session boundaries. It is typically implemented with databases: key-value stores for user preferences, relational databases for structured facts, and file systems for documents. The agent writes to long-term memory explicitly when something is worth remembering.
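As a rough sketch of the key-value case, the example below stores user preferences in SQLite using Python's standard `sqlite3` module. The `PreferenceStore` class, the table name, and the column names are assumptions made for illustration. The default `":memory:"` path keeps the demo self-contained, and a file path would be needed for data to actually survive between sessions.

```python
import sqlite3
from typing import Optional


class PreferenceStore:
    """Hypothetical key-value long-term memory keyed by (user_id, key)."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path (e.g. "agent_memory.db") to persist across sessions.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "user_id TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL, "
            "PRIMARY KEY (user_id, key))"
        )

    def remember(self, user_id: str, key: str, value: str) -> None:
        # Explicit write: the agent calls this when a fact is worth keeping.
        # INSERT OR REPLACE overwrites any earlier value for the same key.
        self.conn.execute(
            "INSERT OR REPLACE INTO memory (user_id, key, value) VALUES (?, ?, ?)",
            (user_id, key, value),
        )
        self.conn.commit()

    def recall(self, user_id: str, key: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT value FROM memory WHERE user_id = ? AND key = ?",
            (user_id, key),
        ).fetchone()
        return row[0] if row else None
```

A new session would open the same database file and call `recall("u1", "language")` to restore what an earlier session stored with `remember`.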

3. Semantic Memory (Vector Databases)

This is the most powerful and nuanced type. Text is converted to embeddings (numerical vectors) and stored in a vector database. When the agent needs relevant information, it converts its query to a vector and retrieves the stored vectors most similar to it, typically measured by cosine similarity or a related distance. This enables Retrieval-Augmented Generation (RAG): grounding the agent’s responses in relevant documents.
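The embed, store, and retrieve loop described above can be sketched in a few lines. Everything here is a stand-in: `embed` is a toy bag-of-words vector, not a learned embedding model, and `VectorMemory` is a hypothetical in-memory list, not a real vector database. Only the shape of the flow is meant to carry over.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: sparse word-count vector. A real system would call
    # an embedding model and get back a dense float vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorMemory:
    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> List[str]:
        # Brute-force scan; vector databases use approximate indexes instead.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, memory: VectorMemory) -> str:
    # RAG step: retrieved passages are placed in the prompt as grounding context.
    context = "\n".join(f"- {doc}" for doc in memory.search(question))
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {question}"
```

After adding a few documents with `memory.add(...)`, `build_prompt(question, memory)` returns a prompt in which the closest matches appear as context ahead of the question.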

Popular Vector Databases for Agent Memory

Pinecone, Weaviate, Qdrant, and ChromaDB are among the most widely used vector database options. For Elasticsearch users, Elastic Edge AI’s `elastic-edge-embed` plugin brings embedding generation and vector search directly into existing Elasticsearch clusters without a separate vector database.

Memory Management Strategies

Good agent memory systems need four things: automatic summarisation of old context; retrieval scored by relevance, not just recency; periodic memory consolidation (merging related memories); and forgetting mechanisms (for example, deletion on request to satisfy GDPR). MiroFish (51.9K stars) is one of the best open-source implementations of agent memory with knowledge graph integration.
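Two of these strategies, relevance-scored retrieval and forgetting, can be illustrated with a small sketch. The names (`MemoryRecord`, `score`, `retrieve`, `forget_user`), the Jaccard word overlap used in place of embedding similarity, and the default weights are all assumptions chosen for the example. Removing records from one list is also only part of a deletion path: a real system would have to cover every store, index, and backup that holds the data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryRecord:
    user_id: str
    text: str
    created_at: float  # seconds since epoch


def similarity(a: str, b: str) -> float:
    # Jaccard word overlap, standing in for embedding similarity.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def score(record: MemoryRecord, query: str, now: float,
          half_life_s: float = 86_400.0, recency_weight: float = 0.2) -> float:
    # Recency decays exponentially: it halves every `half_life_s` seconds.
    age = max(0.0, now - record.created_at)
    recency = 0.5 ** (age / half_life_s)
    # Relevance dominates; recency only nudges the ranking.
    return (1 - recency_weight) * similarity(record.text, query) + recency_weight * recency


def retrieve(records: List[MemoryRecord], query: str, now: float, k: int = 3) -> List[MemoryRecord]:
    return sorted(records, key=lambda r: score(r, query, now), reverse=True)[:k]


def forget_user(records: List[MemoryRecord], user_id: str) -> List[MemoryRecord]:
    # Forgetting: drop every record that belongs to the given user.
    return [r for r in records if r.user_id != user_id]
```

With these defaults, a ten-day-old memory that matches the query still outranks a minute-old memory that does not. Raising `recency_weight` shifts the balance toward newer items.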