Memory Service
The Memory Service provides three types of memory for AEGIS agents: working memory (short-term, per-conversation), episodic memory (long-term, semantic search), and an injection ledger (tracking what context has been injected to prevent duplicates).
Overview
Agent conversations need memory at multiple timescales. The memory service addresses this with three subsystems:
- Working memory stores scratchpad data, extracted entities, and arbitrary key-value pairs for the current conversation session. It lives in Redis with a 24-hour TTL and a 64KB size limit per conversation.
- Episodic memory stores conversation summaries with vector embeddings (using OpenAI’s text-embedding-3-small), enabling semantic retrieval of relevant past interactions filtered by agent and user.
- Injection ledger is a Redis Hash that tracks which skills and context blocks have been injected into a conversation, preventing duplicate injection across multi-turn interactions.
Port & Language
| Property | Value |
|---|---|
| Port | 8002 |
| Language | Python 3.12 |
| Framework | FastAPI |
| Entry point | src/memory/main.py |
Key Endpoints
Working Memory
| Method | Path | Description |
|---|---|---|
| GET | /working-memory/{conversation_id} | Get all working memory fields for a conversation. Refreshes the TTL. |
| PUT | /working-memory/{conversation_id} | Set fields in working memory (merge, not replace). Body accepts an optional ttl_seconds to override the default whole-key TTL (used by R34 HITL pause/resume to hold a gated mutation). Returns 413 if size exceeds 64KB. |
| DELETE | /working-memory/{conversation_id} | Delete specific fields or the entire working memory for a conversation. |
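As a rough illustration of calling these endpoints, the sketch below builds GET and PUT requests with the standard library. The request body shape is not documented here, so the top-level `fields` key is a hypothetical assumption; only `ttl_seconds` comes from the table above. The base URL assumes a local instance on the default port.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8002"  # assumption: local instance on the default port


def get_working_memory_request(conversation_id):
    """Build a GET request for all working memory fields of a conversation."""
    return Request(f"{BASE_URL}/working-memory/{conversation_id}", method="GET")


def put_working_memory_request(conversation_id, fields, ttl_seconds=None):
    """Build a PUT request that merges fields into working memory."""
    body = {"fields": fields}  # hypothetical key name; check schemas.py for the real model
    if ttl_seconds is not None:
        body["ttl_seconds"] = ttl_seconds  # optional whole-key TTL override
    return Request(
        f"{BASE_URL}/working-memory/{conversation_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Either request can be sent with `urllib.request.urlopen(req)`. A 413 response surfaces there as a `urllib.error.HTTPError` whose `code` is 413.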
Episodic Memory
| Method | Path | Description |
|---|---|---|
| POST | /episodic/store | Store a conversation summary with its embedding vector. |
| POST | /episodic/search | Semantic search over episodic memories using cosine similarity. |
| GET | /episodic/{conversation_id} | Get all episodic memories for a specific conversation. |
Injection Ledger
| Method | Path | Description |
|---|---|---|
| GET | /ledger/{conversation_id} | Get all injected items for a conversation. |
| POST | /ledger/{conversation_id}/check | Check if a specific item has been injected. |
| POST | /ledger/{conversation_id}/mark | Mark an item as injected. |
| POST | /ledger/{conversation_id}/evict | Remove an item from the ledger (allow re-injection). |
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check. |
Architecture
Module Breakdown
src/memory/
├── main.py # FastAPI app, all endpoint definitions
├── config.py # Settings from environment variables
├── schemas.py # Pydantic request/response models
├── working.py # WorkingMemory class (Redis-backed)
└── episodic.py # EpisodicMemory class (pgvector-backed)
Working Memory (working.py)
The WorkingMemory class wraps a Redis Hash at key working_memory:{conversation_id}.
Key behaviors:
- TTL refresh: Every get or set operation refreshes the 24-hour TTL.
- Size limit: After every set, the total size is checked against a 64KB limit. If exceeded, a ValueError is raised and the endpoint returns 413 Payload Too Large.
- JSON serialization: Values are stored as JSON strings. On retrieval, the service attempts to parse each value back from JSON.
- Scratchpad: Convenience methods set_scratchpad / get_scratchpad for the common scratchpad field.
- Entities: Convenience methods set_entities / get_entities for the entities list field.
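The behaviors above can be sketched without Redis. The class below is an illustrative stand-in, not the actual WorkingMemory implementation: a dict replaces the Redis Hash, the `clock` parameter is a hypothetical hook for testing, and the way size is measured (field names plus serialized values, in bytes) is an assumption.

```python
import json
import time

MAX_BYTES = 64 * 1024        # 64KB limit per conversation
DEFAULT_TTL = 24 * 60 * 60   # 24-hour TTL, in seconds


def _parse(raw):
    """Try to decode a stored JSON string; fall back to the raw value."""
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        return raw


class WorkingMemorySketch:
    """In-memory stand-in for a Redis Hash with a whole-key TTL."""

    def __init__(self, ttl_seconds=DEFAULT_TTL, clock=time.time):
        self._hashes = {}    # key -> {field: JSON string}
        self._expires = {}   # key -> absolute expiry time
        self._ttl = ttl_seconds
        self._clock = clock

    def _live_hash(self, conversation_id):
        """Return (key, stored hash), dropping the hash first if its TTL elapsed."""
        key = f"working_memory:{conversation_id}"
        if key in self._expires and self._clock() >= self._expires[key]:
            self._hashes.pop(key, None)
            self._expires.pop(key, None)
        return key, self._hashes.get(key, {})

    def set(self, conversation_id, fields, ttl_seconds=None):
        """Merge fields into the hash, enforce the size limit, refresh the TTL."""
        key, current = self._live_hash(conversation_id)
        merged = {**current, **{k: json.dumps(v) for k, v in fields.items()}}
        # Assumption: size = bytes of all field names plus serialized values.
        size = sum(len(k.encode()) + len(v.encode()) for k, v in merged.items())
        if size > MAX_BYTES:
            # This sketch rejects before committing. The service checks after
            # the set; whether it rolls back the write is not stated here.
            raise ValueError(f"working memory is {size} bytes, limit is {MAX_BYTES}")
        ttl = self._ttl if ttl_seconds is None else ttl_seconds
        self._hashes[key] = merged
        self._expires[key] = self._clock() + ttl

    def get(self, conversation_id):
        """Return all fields parsed back from JSON, refreshing the TTL."""
        key, current = self._live_hash(conversation_id)
        if current:
            self._expires[key] = self._clock() + self._ttl
        return {k: _parse(v) for k, v in current.items()}
```

An endpoint built on this would catch the ValueError from `set` and translate it into the 413 response.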
# Redis key structure
working_memory:{conversation_id} # Hash
  scratchpad -> "Current analysis of Rule 37 for well API 42-123-45678..."
  entities -> '[{"type": "Well", "id": "42-123-45678", "name": "Smith #1"}]'
  well_api -> "42-123-45678"
Episodic Memory (episodic.py)
The EpisodicMemory class stores and retrieves conversation summaries with vector embeddings in the episodic_memories PostgreSQL table (using the pgvector extension).
Storage flow:
- Generate an embedding for the summary text using text-embedding-3-small (1536 dimensions)
- Insert into the episodic_memories table with the embedding, metadata (key decisions, entities, tools), and conversation reference
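The storage flow can be sketched as a function that turns a summary into insert parameters. This is a hedged illustration only: the column names and the `build_episodic_row` helper are hypothetical, and `embed` is an injected callable standing in for the OpenAI embedding call. The bracketed text form used for the vector is the input format pgvector accepts.

```python
import json

EMBEDDING_DIMENSIONS = 1536  # output size of text-embedding-3-small


def to_vector_literal(embedding):
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_episodic_row(conversation_id, agent_id, user_id, summary, metadata, embed):
    """Return insert parameters for one episodic memory (hypothetical column names)."""
    embedding = embed(summary)  # step 1: generate the embedding for the summary text
    if len(embedding) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(embedding)}"
        )
    # step 2: assemble the values that would be inserted into episodic_memories
    return {
        "conversation_id": conversation_id,
        "agent_id": agent_id,
        "user_id": user_id,
        "summary": summary,
        "metadata": json.dumps(metadata),  # key decisions, entities, tools
        "embedding": to_vector_literal(embedding),
    }
```

Validating the dimension count before the insert gives a clearer error than a database-side type mismatch would.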
Search flow:
- Generate an embedding for the query text
- Run a nearest-neighbor search using cosine distance (<=> operator)
- Filter by agent_id and optionally user_id
- Return the top-k results with similarity scores
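To make the ranking concrete, here is a pure-Python sketch of the same search over an in-memory list. It is not how the service runs the query (that happens in PostgreSQL with pgvector); the `search` function and the dict-based record shape are assumptions for illustration. It shows why similarity is reported as 1 minus the cosine distance, and it assumes non-zero vectors.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (what pgvector's <=> computes)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(memories, query_embedding, agent_id, user_id=None, top_k=5):
    """Filter by agent (and optionally user), then return the top-k nearest."""
    candidates = [
        m for m in memories
        if m["agent_id"] == agent_id
        and (user_id is None or m["user_id"] == user_id)
    ]
    # Smallest distance first, matching ORDER BY embedding <=> query.
    scored = sorted(
        ((cosine_distance(m["embedding"], query_embedding), m) for m in candidates),
        key=lambda pair: pair[0],
    )
    return [
        {"id": m["id"], "summary": m["summary"], "similarity": 1.0 - distance}
        for distance, m in scored[:top_k]
    ]
```

Sorting by distance and reporting similarity are two views of the same number: the closest memory has the smallest distance and therefore the highest similarity.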
-- Semantic search query
SELECT id, summary, 1 - (embedding <=> $1::vector) AS similarity
FROM episodic_memories
WHERE agent_id = $2
ORDER BY embedding <=> $1::vector
LIMIT $3
Injection Ledger
The injection ledger uses Redis Hash operations via the shared RedisManager:
- Key pattern: skill:ledger:{conversation_id}
- Check: HEXISTS on the hash field
- Mark: HSET with the item key and a value (typically "injected" or "1")
- Evict: HDEL to remove an item, allowing re-injection
- Get all: HGETALL to dump the entire ledger for a conversation
The injection ledger prevents the orchestration engine from re-injecting the same skill content into a conversation. If a skill was already injected in turn 1, it will not be injected again in turn 2, even if the LLM emits the same SKILL_SELECT pattern.
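The check/mark/evict cycle can be sketched with a dict standing in for the Redis Hash. The class, the `inject_once` helper, and the item key format are hypothetical; the comments name the Redis command each method would correspond to.

```python
class InjectionLedgerSketch:
    """In-memory stand-in for the per-conversation ledger hash."""

    def __init__(self):
        self._hashes = {}  # key -> {item: value}

    def _key(self, conversation_id):
        return f"skill:ledger:{conversation_id}"

    def check(self, conversation_id, item):
        """HEXISTS: has this item already been injected?"""
        return item in self._hashes.get(self._key(conversation_id), {})

    def mark(self, conversation_id, item, value="injected"):
        """HSET: record the item as injected."""
        self._hashes.setdefault(self._key(conversation_id), {})[item] = value

    def evict(self, conversation_id, item):
        """HDEL: forget the item so it can be injected again."""
        ledger = self._hashes.get(self._key(conversation_id), {})
        return ledger.pop(item, None) is not None

    def get_all(self, conversation_id):
        """HGETALL: return a copy of the whole ledger."""
        return dict(self._hashes.get(self._key(conversation_id), {}))


def inject_once(ledger, conversation_id, item, inject):
    """Call inject(item) only if the ledger has no entry for it yet."""
    if ledger.check(conversation_id, item):
        return False
    inject(item)
    ledger.mark(conversation_id, item)
    return True
```

Calling `inject_once` twice with the same item injects only on the first call; after `evict`, the next call injects again. Note that check-then-mark is two separate steps here, so concurrent callers would need extra coordination.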
Dependencies
Python Packages
| Package | Version | Purpose |
|---|---|---|
| fastapi | ^0.115 | Web framework |
| uvicorn | ^0.34 | ASGI server |
| openai | ^1.60 | Embedding generation via text-embedding-3-small |
| aegis-shared | local | Shared DB helpers (PostgresPool, RedisManager) |
Infrastructure Dependencies
| Dependency | Purpose |
|---|---|
| Redis 7 | Working memory storage, injection ledger |
| PostgreSQL 15 + pgvector | Episodic memory storage with vector similarity search |
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| MEMORY_HOST | 0.0.0.0 | Bind address |
| MEMORY_PORT | 8002 | Bind port |
| DATABASE_URL | postgresql://aegis:aegis_local@localhost:5432/aegis | PostgreSQL connection string |
| REDIS_URL | redis://localhost:6379 | Redis connection string |
| OPENAI_API_KEY | (empty) | OpenAI API key for embedding generation |
| EMBEDDING_MODEL | text-embedding-3-small | Embedding model to use |
| WORKING_MEMORY_TTL | 86400 | Working memory TTL in seconds (24 hours) |
The OPENAI_API_KEY environment variable is required for episodic memory storage and search. Without it, the store and search endpoints will fail. Working memory and the injection ledger function without an API key.
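One way to read these variables is sketched below with a plain dataclass. It is an assumption-laden illustration rather than the contents of config.py: the `MemorySettings` class, the `load_settings` function, and the `episodic_enabled` property are hypothetical names, while the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MemorySettings:
    host: str
    port: int
    database_url: str
    redis_url: str
    openai_api_key: str
    embedding_model: str
    working_memory_ttl: int

    @property
    def episodic_enabled(self):
        """Episodic store and search need an API key; the rest does not."""
        return bool(self.openai_api_key)


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ) with documented defaults."""
    env = os.environ if env is None else env
    return MemorySettings(
        host=env.get("MEMORY_HOST", "0.0.0.0"),
        port=int(env.get("MEMORY_PORT", "8002")),
        database_url=env.get(
            "DATABASE_URL", "postgresql://aegis:aegis_local@localhost:5432/aegis"
        ),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        working_memory_ttl=int(env.get("WORKING_MEMORY_TTL", "86400")),
    )
```

Passing a mapping instead of reading `os.environ` directly keeps the loader easy to exercise in tests.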
Running Locally
cd services/memory-service
poetry install
poetry run uvicorn memory.main:app --reload --port 8002
Requires both Redis and PostgreSQL (with pgvector extension) to be running:
# Start infrastructure
docker compose up -d