Docker Problems
AEGIS uses Docker Compose to run three infrastructure services: PostgreSQL (with Apache AGE and pgvector), Redis, and Kafka. This page covers common Docker-related issues and how to resolve them.
Docker Not Running
Symptoms
- Cannot connect to the Docker daemon
- docker compose up fails immediately
- Error response from daemon: dial unix /var/run/docker.sock: connect: connection refused
Solution
Start Docker Desktop (macOS/Windows) or the Docker daemon (Linux):
# macOS: open Docker Desktop
open -a Docker
# Linux
sudo systemctl start docker
Verify Docker is running:
docker info
docker compose version
Containers Won’t Start
Symptoms
- docker compose up -d completes but containers are not running
- Container status shows Restarting or Exit 1
Diagnosis
# Check container status
docker compose ps
# Check logs for the failing container
docker compose logs postgres
docker compose logs redis
docker compose logs kafka
Common Causes
Ports already in use:
# Check if something else is using the port
lsof -i :5432
lsof -i :6379
lsof -i :9092
See the Port Conflicts page for solutions.
Previous container state:
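If a container was stopped uncleanly, it may be left in a bad state. If the problem persists after this step, reset that service's data volume as described in the service-specific sections below. Start by recreating the containers: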
# Remove containers and recreate
docker compose down
docker compose up -d
Insufficient resources:
Docker Desktop has memory and CPU limits. AEGIS needs at least 4 GB of memory allocated to Docker. Check Docker Desktop preferences.
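The memory limit can also be checked from the command line. The following is a minimal sketch, assuming the 4 GB minimum is read as 4 * 1024^3 bytes; `meets_min_memory` and `check_docker_memory` are hypothetical helper names, not part of AEGIS. It relies on `docker info --format '{{.MemTotal}}'`, which prints the memory visible to the Docker engine in bytes.

```shell
#!/bin/sh
# Assumption: "4 GB" from the text above is taken as 4 * 1024^3 bytes.
MIN_BYTES=$((4 * 1024 * 1024 * 1024))

# Succeeds when the given byte count meets the minimum.
meets_min_memory() {
    [ "$1" -ge "$MIN_BYTES" ]
}

# Hypothetical helper: read the engine's total memory and compare it.
check_docker_memory() {
    total=$(docker info --format '{{.MemTotal}}') || return 2
    if meets_min_memory "$total"; then
        echo "OK: Docker reports $total bytes of memory"
    else
        echo "WARNING: Docker reports $total bytes; minimum is $MIN_BYTES"
        return 1
    fi
}
```

Call `check_docker_memory` before `docker compose up -d`. The value Docker reports may be slightly below the figure configured in Docker Desktop, so a lower threshold may be more practical than an exact 4 GiB comparison.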
PostgreSQL Init Failures
Symptoms
- aegis-postgres container exits during startup
- Logs show FATAL: database "aegis" does not exist or init script errors
- Tables or extensions are missing after startup
Diagnosis
docker compose logs postgres 2>&1 | tail -50
How Init Scripts Work
The PostgreSQL container mounts several SQL files into /docker-entrypoint-initdb.d/:
volumes:
  - ./infrastructure/docker/postgres/init.sql:/docker-entrypoint-initdb.d/001_init.sql
  - ./infrastructure/docker/postgres/002_checklist_compliance_tables.sql:/docker-entrypoint-initdb.d/002_checklist_compliance_tables.sql
  - ./infrastructure/docker/postgres/007_entity_type_definitions.sql:/docker-entrypoint-initdb.d/007_entity_type_definitions.sql
  - ./infrastructure/docker/postgres/00-create-extension-age.sql:/docker-entrypoint-initdb.d/00-create-extension-age.sql
Init scripts only run on first container creation (when the data volume is empty). If you change an init script, you must delete the volume to re-run it.
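Whether the init scripts will run on the next start can be checked by looking at the data directory. This is a minimal sketch, assuming the PostgreSQL data lives in docker-volumes/postgres as used elsewhere on this page; `data_dir_is_empty` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds when the directory is missing or contains no entries,
# which is the condition under which init scripts run.
data_dir_is_empty() {
    dir=$1
    if [ ! -d "$dir" ]; then
        return 0
    fi
    # ls -A lists every entry except "." and ".."
    [ -z "$(ls -A "$dir")" ]
}

# Example use (path taken from this page):
#   if data_dir_is_empty docker-volumes/postgres; then
#       echo "Init scripts will run on next start"
#   else
#       echo "Existing data found; init scripts will be skipped"
#   fi
```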
Solutions
Init scripts not running (volume already exists):
docker compose down
rm -rf docker-volumes/postgres
docker compose up -d postgres
SQL syntax error in init script:
Check the postgres logs for the specific error:
docker compose logs postgres 2>&1 | grep -i error
Fix the SQL file and recreate the volume:
docker compose down
rm -rf docker-volumes/postgres
docker compose up -d postgres
Apache AGE extension fails to load:
The AEGIS PostgreSQL image is based on apache/age:release_PG15_1.6.0 with pgvector compiled on top. The 00-create-extension-age.sql script creates the extension with IF NOT EXISTS to prevent crashes on fresh init.
If AGE is not loading, check that the custom Dockerfile built successfully:
docker compose build postgres
docker compose up -d postgres
Verifying the database is ready:
# Connect to PostgreSQL
psql -h localhost -U aegis -d aegis
-- At the psql prompt: check AGE extension
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
SELECT * FROM ag_catalog.ag_graph;
-- Check pgvector extension
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Check tables exist
\dt
Redis Connectivity Issues
Symptoms
- aegis-redis container is running but services cannot connect
- redis-cli ping returns an error or times out
Diagnosis
# Check container status and health
docker compose ps redis
# Container health check
docker compose exec redis redis-cli ping
# Expected: PONG
# Check from host
redis-cli -h localhost -p 6379 ping
Solutions
Container health check failing:
The Redis container has a health check configured:
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 5s
  retries: 5
If the health check is failing, the container may be starting slowly. Wait a few seconds and check again.
Redis data corruption:
docker compose stop redis
rm -rf docker-volumes/redis
docker compose up -d redis
Memory issues:
With its default settings, Redis can run out of memory if large amounts of working memory or ledger data accumulate. This is rarely an issue in local development, but if it happens you can flush all data (this deletes every key in every Redis database):
redis-cli FLUSHALL
Kafka Connectivity Issues
Symptoms
- aegis-kafka container is Restarting in a loop
- Services report KafkaError{code=_TRANSPORT} or NoBrokersAvailable
- Topics cannot be created
Diagnosis
# Check container status
docker compose ps kafka
# Check logs
docker compose logs kafka 2>&1 | tail -30
Solutions
Cluster ID mismatch:
Kafka in AEGIS uses KRaft mode (no ZooKeeper) with a hardcoded cluster ID. If the data volume has stale metadata from a previous cluster, Kafka will fail to start:
docker compose stop kafka
rm -rf docker-volumes/kafka
docker compose up -d kafka
Broker not reachable from services:
The Kafka advertised listener is configured as localhost:9092:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
This works for services running on the host machine. If you run AEGIS services inside Docker containers, you need to change this to the container network address.
Kafka takes too long to start:
Kafka KRaft mode takes a few seconds to elect a controller and become ready. Services that start before Kafka is ready may fail to connect. The start-all.sh script adds a sleep after infrastructure startup to handle this.
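If the fixed sleep turns out to be too short or too long on your machine, polling for readiness is an alternative. The following is a minimal sketch, not what start-all.sh does; `wait_until` is a hypothetical helper that runs a probe command up to a given number of times with a delay between attempts.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example use. The probe below is an assumption: it lists topics with the
# kafka-topics CLI shipped in the cp-kafka image.
#   wait_until 15 2 docker compose exec -T kafka \
#       kafka-topics --bootstrap-server localhost:9092 --list \
#       || echo "Kafka did not become ready" >&2
```

The same helper works for the other services, for example with `docker compose exec -T redis redis-cli ping` as the probe.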
Rebuilding Everything from Scratch
When all else fails, a clean rebuild resolves most Docker issues:
# Stop all services
./infrastructure/scripts/start-all.sh stop
# Stop and remove all containers
docker compose down
# Remove all data volumes
rm -rf docker-volumes/postgres docker-volumes/redis docker-volumes/kafka
# Rebuild the custom PostgreSQL image
docker compose build
# Start fresh
docker compose up -d
Verify everything is healthy:
docker compose ps
Expected output:
NAME IMAGE STATUS PORTS
aegis-kafka confluentinc/cp-kafka:7.6.0 Up X seconds 0.0.0.0:9092->9092/tcp
aegis-postgres aegis-postgres Up X seconds (healthy) 0.0.0.0:5432->5432/tcp
aegis-redis redis:7-alpine Up X seconds (healthy) 0.0.0.0:6379->6379/tcp
Rebuilding from scratch deletes all local data including knowledge graph entities, agent definitions, episodic memories, and any seeded data. You will need to re-seed after rebuilding.
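Because the rebuild is destructive, it can help to wrap the steps in a script that asks for confirmation first. This is a minimal sketch using only the commands listed in this section; `confirm_rebuild` and `rebuild_all` are hypothetical names, and the script is assumed to run from the repository root.

```shell
#!/bin/sh
# Asks for an explicit "yes" on standard input; anything else aborts.
confirm_rebuild() {
    printf 'This deletes all local AEGIS data. Type "yes" to continue: ' >&2
    read -r answer || return 1
    [ "$answer" = "yes" ]
}

# Runs the rebuild steps in order, stopping at the first failure.
rebuild_all() {
    if ! confirm_rebuild; then
        echo "Aborted." >&2
        return 1
    fi
    ./infrastructure/scripts/start-all.sh stop &&
    docker compose down &&
    rm -rf docker-volumes/postgres docker-volumes/redis docker-volumes/kafka &&
    docker compose build &&
    docker compose up -d
}
```

Run `rebuild_all` interactively; remember that re-seeding is still a separate step afterwards.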
Checking Container Logs
Quick reference for viewing logs:
# All containers
docker compose logs
# Specific container
docker compose logs postgres
# Follow logs (stream in real time)
docker compose logs -f postgres
# Last 50 lines
docker compose logs --tail 50 postgres
# Logs with timestamps
docker compose logs -t postgres
Docker Compose Reference
The full docker-compose.yml defines three services:
| Service | Image | Data Volume | Health Check |
|---|---|---|---|
| postgres | Custom (apache/age + pgvector) | docker-volumes/postgres | pg_isready -U aegis |
| redis | redis:7-alpine | docker-volumes/redis | redis-cli ping |
| kafka | confluentinc/cp-kafka:7.6.0 | docker-volumes/kafka | None (no built-in health check) |
All services are on the aegis-network Docker network. The PostgreSQL container builds from a custom Dockerfile at infrastructure/docker/postgres/Dockerfile which extends the Apache AGE image with pgvector support.
Docker Resource Requirements
Minimum and recommended Docker Desktop settings for AEGIS:
| Resource | Minimum | Recommended |
|---|---|---|
| Memory | 4 GB | 6 GB |
| CPU | 2 cores | 4 cores |
| Disk | 5 GB | 10 GB |
PostgreSQL with AGE and pgvector is the most resource-intensive container. If you experience slow query performance or out-of-memory errors, increase Docker’s memory allocation.