
Memory Service

The Memory Service provides three types of memory for AEGIS agents: working memory (short-term, per-conversation), episodic memory (long-term, semantic search), and an injection ledger (tracking what context has been injected to prevent duplicates).

Overview

Agent conversations need memory at multiple timescales. The memory service addresses this with three subsystems:

  • Working memory stores scratchpad data, extracted entities, and arbitrary key-value pairs for the current conversation session. It lives in Redis with a 24-hour TTL and a 64KB size limit per conversation.
  • Episodic memory stores conversation summaries with vector embeddings (using OpenAI’s text-embedding-3-small), enabling semantic retrieval of relevant past interactions filtered by agent and user.
  • Injection ledger is a Redis Hash that tracks which skills and context blocks have been injected into a conversation, preventing duplicate injection across multi-turn interactions.

Port & Language

| Property | Value |
| --- | --- |
| Port | 8002 |
| Language | Python 3.12 |
| Framework | FastAPI |
| Entry point | src/memory/main.py |

Key Endpoints

Working Memory

| Method | Path | Description |
| --- | --- | --- |
| GET | /working-memory/{conversation_id} | Get all working memory fields for a conversation. Refreshes the TTL. |
| PUT | /working-memory/{conversation_id} | Set fields in working memory (merge, not replace). Body accepts an optional ttl_seconds to override the default whole-key TTL (used by R34 HITL pause/resume to hold a gated mutation). Returns 413 if size exceeds 64KB. |
| DELETE | /working-memory/{conversation_id} | Delete specific fields or the entire working memory for a conversation. |

Episodic Memory

| Method | Path | Description |
| --- | --- | --- |
| POST | /episodic/store | Store a conversation summary with its embedding vector. |
| POST | /episodic/search | Semantic search over episodic memories using cosine similarity. |
| GET | /episodic/{conversation_id} | Get all episodic memories for a specific conversation. |

Injection Ledger

| Method | Path | Description |
| --- | --- | --- |
| GET | /ledger/{conversation_id} | Get all injected items for a conversation. |
| POST | /ledger/{conversation_id}/check | Check if a specific item has been injected. |
| POST | /ledger/{conversation_id}/mark | Mark an item as injected. |
| POST | /ledger/{conversation_id}/evict | Remove an item from the ledger (allow re-injection). |

Health

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check. |

Architecture

Module Breakdown

    src/memory/
    ├── main.py        # FastAPI app, all endpoint definitions
    ├── config.py      # Settings from environment variables
    ├── schemas.py     # Pydantic request/response models
    ├── working.py     # WorkingMemory class (Redis-backed)
    └── episodic.py    # EpisodicMemory class (pgvector-backed)

Working Memory (working.py)

The WorkingMemory class wraps a Redis Hash at key working_memory:{conversation_id}.

Key behaviors:

  • TTL refresh: Every get or set operation refreshes the 24-hour TTL.
  • Size limit: After every set, the total size is checked against a 64KB limit. If exceeded, a ValueError is raised and the endpoint returns 413 Payload Too Large.
  • JSON serialization: Values are stored as JSON strings. On retrieval, the service attempts to parse each value back from JSON.
  • Scratchpad: Convenience methods set_scratchpad / get_scratchpad for the common scratchpad field.
  • Entities: Convenience methods set_entities / get_entities for the entities list field.

    # Redis key structure
    working_memory:{conversation_id}   # Hash
      scratchpad -> "Current analysis of Rule 37 for well API 42-123-45678..."
      entities   -> '[{"type": "Well", "id": "42-123-45678", "name": "Smith #1"}]'
      well_api   -> "42-123-45678"
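
The merge, size-limit, and JSON round-trip behaviors can be sketched without Redis. The class below is an illustrative in-memory stand-in, not the actual WorkingMemory implementation: the class name, the way size is measured (sum of key and value bytes), and 64KB being 64 * 1024 bytes are all assumptions, and TTL handling is left out. It also checks the size before committing the write, whereas the service checks after the set.

```python
import json

MAX_BYTES = 64 * 1024  # ASSUMPTION: "64KB" means 65,536 bytes


class WorkingMemorySketch:
    """In-memory stand-in for the Redis Hash; TTL handling is omitted."""

    def __init__(self) -> None:
        self._hashes: dict[str, dict[str, str]] = {}

    def set(self, conversation_id: str, fields: dict) -> None:
        current = self._hashes.get(conversation_id, {})
        # Merge, not replace: existing fields survive unless overwritten.
        merged = {**current, **{k: json.dumps(v) for k, v in fields.items()}}
        # ASSUMPTION: size is the total of encoded field names and values.
        size = sum(len(k.encode()) + len(v.encode()) for k, v in merged.items())
        if size > MAX_BYTES:
            # The endpoint would translate this into 413 Payload Too Large.
            raise ValueError(f"working memory exceeds {MAX_BYTES} bytes")
        self._hashes[conversation_id] = merged

    def get(self, conversation_id: str) -> dict:
        result = {}
        for key, raw in self._hashes.get(conversation_id, {}).items():
            try:
                result[key] = json.loads(raw)  # parse the stored JSON string
            except json.JSONDecodeError:
                result[key] = raw  # fall back to the raw string
        return result


wm = WorkingMemorySketch()
wm.set("conv-123", {"scratchpad": "Current analysis of Rule 37"})
wm.set("conv-123", {"entities": [{"type": "Well", "id": "42-123-45678"}]})
merged_view = wm.get("conv-123")  # contains both scratchpad and entities
```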

Episodic Memory (episodic.py)

The EpisodicMemory class stores and retrieves conversation summaries with vector embeddings in the episodic_memories PostgreSQL table (using the pgvector extension).

Storage flow:

  1. Generate an embedding for the summary text using text-embedding-3-small (1536 dimensions)
  2. Insert into episodic_memories table with the embedding, metadata (key decisions, entities, tools), and conversation reference
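
Before the insert, the embedding has to be passed to PostgreSQL in a form pgvector accepts. One common option is the text form of a vector, a bracketed comma-separated list. The helper below is a sketch under that assumption; its name and the dimension check are illustrative and not taken from episodic.py.

```python
EMBEDDING_DIM = 1536  # text-embedding-3-small output size, per the storage flow


def to_pgvector_literal(embedding: list[float]) -> str:
    """Format an embedding as pgvector's text form, e.g. "[0.1,0.2,0.3]"."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}"
        )
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# In the service the vector would come from the embedding model;
# a constant vector is used here so the sketch runs offline.
example_vector = [0.0] * EMBEDDING_DIM
example_vector[0] = 0.5
literal = to_pgvector_literal(example_vector)
```

The resulting string can be bound to a parameter that is cast with ::vector in the SQL statement.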

Search flow:

  1. Generate an embedding for the query text
  2. Run a nearest-neighbor search using cosine distance (<=> operator)
  3. Filter by agent_id and optionally user_id
  4. Return the top-k results with similarity scores

    -- Semantic search query
    SELECT id, summary, 1 - (embedding <=> $1::vector) AS similarity
    FROM episodic_memories
    WHERE agent_id = $2
    ORDER BY embedding <=> $1::vector
    LIMIT $3

Injection Ledger

The injection ledger uses Redis Hash operations via the shared RedisManager:

  • Key pattern: skill:ledger:{conversation_id}
  • Check: HEXISTS on the hash field
  • Mark: HSET with the item key and a value (typically "injected" or "1")
  • Evict: HDEL to remove an item, allowing re-injection
  • Get all: HGETALL to dump the entire ledger for a conversation

The injection ledger prevents the orchestration engine from re-injecting the same skill content into a conversation. If a skill was already injected in turn 1, it will not be injected again in turn 2, even if the LLM emits the same SKILL_SELECT pattern.
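
The check-then-mark pattern can be illustrated with a dictionary standing in for the Redis Hash. This is a sketch, not the service code: the class and function names are hypothetical, and each method notes the Redis command it would correspond to.

```python
class InjectionLedgerSketch:
    """In-memory stand-in for the skill:ledger:{conversation_id} hashes."""

    def __init__(self) -> None:
        self._hashes: dict[str, dict[str, str]] = {}

    @staticmethod
    def _key(conversation_id: str) -> str:
        return f"skill:ledger:{conversation_id}"

    def check(self, conversation_id: str, item: str) -> bool:
        # Corresponds to HEXISTS.
        return item in self._hashes.get(self._key(conversation_id), {})

    def mark(self, conversation_id: str, item: str) -> None:
        # Corresponds to HSET with a marker value.
        self._hashes.setdefault(self._key(conversation_id), {})[item] = "injected"

    def evict(self, conversation_id: str, item: str) -> None:
        # Corresponds to HDEL; removing a missing item is a no-op.
        self._hashes.get(self._key(conversation_id), {}).pop(item, None)

    def get_all(self, conversation_id: str) -> dict[str, str]:
        # Corresponds to HGETALL.
        return dict(self._hashes.get(self._key(conversation_id), {}))


def inject_once(ledger: InjectionLedgerSketch, conversation_id: str, skill: str) -> bool:
    """Return True if the skill should be injected on this turn."""
    if ledger.check(conversation_id, skill):
        return False  # already injected on an earlier turn
    ledger.mark(conversation_id, skill)
    return True


ledger = InjectionLedgerSketch()
turn_1 = inject_once(ledger, "conv-123", "rule-37-skill")  # first request
turn_2 = inject_once(ledger, "conv-123", "rule-37-skill")  # repeat request
```

Against a real Redis instance, a separate check followed by a mark is not atomic. If concurrent turns are possible, HSETNX is one way to combine both steps into a single command.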

Dependencies

Python Packages

| Package | Version | Purpose |
| --- | --- | --- |
| fastapi | ^0.115 | Web framework |
| uvicorn | ^0.34 | ASGI server |
| openai | ^1.60 | Embedding generation via text-embedding-3-small |
| aegis-shared | local | Shared DB helpers (PostgresPool, RedisManager) |

Infrastructure Dependencies

| Dependency | Purpose |
| --- | --- |
| Redis 7 | Working memory storage, injection ledger |
| PostgreSQL 15 + pgvector | Episodic memory storage with vector similarity search |

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| MEMORY_HOST | 0.0.0.0 | Bind address |
| MEMORY_PORT | 8002 | Bind port |
| DATABASE_URL | postgresql://aegis:aegis_local@localhost:5432/aegis | PostgreSQL connection string |
| REDIS_URL | redis://localhost:6379 | Redis connection string |
| OPENAI_API_KEY | (empty) | OpenAI API key for embedding generation |
| EMBEDDING_MODEL | text-embedding-3-small | Embedding model to use |
| WORKING_MEMORY_TTL | 86400 | Working memory TTL in seconds (24 hours) |

The OPENAI_API_KEY environment variable is required for episodic memory storage and search. Without it, the store and search endpoints will fail. Working memory and the injection ledger function without an API key.

Running Locally

    cd services/memory-service
    poetry install
    poetry run uvicorn memory.main:app --reload --port 8002

The service requires both Redis and PostgreSQL (with the pgvector extension) to be running:

    # Start infrastructure
    docker compose up -d