Architecture Overview
AEGIS is a microservices platform with 9 backend services, a Next.js frontend, and three infrastructure dependencies. All client traffic flows through a single Go API gateway that handles authentication, rate limiting, and CORS.
System Diagram
+--------------------+
| Frontend |
| Next.js 14 :3000 |
+--------+-----------+
|
| HTTP/SSE
v
+--------------------+
| API Gateway |
| Go :8000 |
| JWT auth |
| Rate limit 100/m |
| CORS middleware |
+--------+-----------+
|
+-------+-----------+----------+--------+
| | | | |
v v v v v
+------+--+ +--+------+ +-+-------+ ++------+ +--+------+
| Orch. | | Approval| | KG | | Ingest| | Auth |
| Engine | | Service | | Service | | Svc | | Service |
| :8001 | | :8004 | | :8003 | | :8005 | | :8009 |
+----+----+ +---------+ +----+----+ +---+---+ +---------+
| | |
+-------+-------+ | |
| | | | |
v v v | v
+----+--+ +--+---+ ++------++ +---+---+
| Memory| | KG | |Approval| | Kafka |
| :8002 | | :8003| | :8004 | +-------+
+---+---+ +--+---+ +--------+
| |
v v
+---+---+ +---+---+
| Redis | | Postgres|
| 7 | | 15+AGE |
| | | pgvec |
+-------+ +--------+
+----------+ +----------+
| Complnce | | Flaring |
| Monitor | | Monitor |
| :8006 | | :8007 |
+-----+----+ +-----+---+
| |
v v
+-----+----+ +-----+---+
| KG :8003 | | Postgres |
+----------+ +----------+
Service Inventory
AEGIS runs 9 services plus a frontend:
| Service | Port | Language | Responsibility |
|---|---|---|---|
| API Gateway | 8000 | Go | Reverse proxy, JWT authentication (header or aegis_token cookie), rate limiting (100 req/min), CORS. Single entry point for all client traffic. |
| Orchestration Engine | 8001 | Python | LangGraph StateGraph execution, tool calling, skill injection, SSE streaming, compliance engine, checklist CRUD, workspace management. The largest and most complex service. |
| Memory Service | 8002 | Python | Working memory (Redis Hash per conversation), episodic memory (pgvector semantic search), injection ledger (Redis Hash for dedup). |
| Knowledge Graph Service | 8003 | Python | Apache AGE graph CRUD (openCypher), entity types, context assembly, impact propagation, entity management, detection engine. |
| Approval Service | 8004 | Python | HITL approval requests, decision recording, append-only audit trail with HMAC signatures. |
| Ingestion Service | 8005 | Python | RRC data scrapers (wells, leases, fields, permits, production, flaring authorizations), CSV import, entity extraction, Kafka event publishing. |
| Compliance Monitor | 8006 | Python | Deadline tracking, rule change detection, risk scoring. |
| Flaring Monitor | 8007 | Python | Flaring volume tracking, R-32 validation, emissions calculation, operational events. |
| Auth Service | 8009 | Python | Email/password login (bcrypt), JWT token generation (HS256, 24-hour expiry), users table in PostgreSQL. Dev login: admin@aegis.local / aegis-dev-admin. |
| Frontend | 3000 | TypeScript | Next.js 14 App Router dashboard. Conversations, compliance matrix, filings, graph explorer. |
Infrastructure
| Component | Version | Purpose |
|---|---|---|
| PostgreSQL | 15 | Primary database. Hosts relational tables (audit_logs, agents, skills, conversations, etc.), pgvector for embeddings, and Apache AGE for the knowledge graph. Single instance with multiple extensions. |
| Redis | 7 (Alpine) | Working memory storage and injection ledger. Each conversation gets a Redis Hash with 24-hour TTL and 64KB max size. |
| Kafka | Confluent 7.6.0 | Async event bus. Currently used by the ingestion service to publish entity extraction events to the entity-extraction-worker topic. |
All infrastructure runs via Docker Compose on the aegis-network Docker network.
Key Architectural Decisions
Single API Gateway
All frontend requests go through the Go API gateway at port 8000. The gateway:
- Validates the JWT (from the Authorization header or aegis_token cookie) on every request (except health checks and public entity type endpoints)
- Applies per-user rate limiting (100 requests/minute, burst of 10)
- Strips CORS headers from backend responses and applies its own
- Routes requests to the appropriate backend service by URL prefix
Backend services trust the gateway — they do not re-validate authentication.
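The per-user limit of 100 requests/minute with a burst of 10 reads like a token bucket, so the sketch below illustrates that idea. It is written in Python for consistency with the other examples on this page and is not the gateway's Go code. The class names, the injectable clock, and the reading of "burst" as bucket capacity are all assumptions.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """One bucket per user: holds at most `burst` tokens, refilled continuously."""

    def __init__(self, rate_per_min: float = 100.0, burst: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate_per_sec = rate_per_min / 60.0
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)   # start full so a new user can burst immediately
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst size.
        self.tokens = min(float(self.burst),
                          self.tokens + (now - self.last) * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerUserLimiter:
    """Keeps an independent bucket for each user ID (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, user_id: str) -> bool:
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = self._buckets[user_id] = TokenBucket(clock=self._clock)
        return bucket.allow()
```

Under this reading, a user can send 10 requests back to back, after which requests are admitted at roughly one every 0.6 seconds. A rejected request would typically map to an HTTP 429 response, though the gateway's actual behavior is not described here.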
LangGraph for Agent Execution
Agent execution follows a compiled LangGraph StateGraph with a defined node pipeline. This provides:
- Deterministic control flow with conditional branching
- State persistence across nodes (the GraphState TypedDict flows through the entire graph)
- Budget enforcement (token count and dollar cost limits per execution)
- Tool-calling loops that can execute multiple rounds before proceeding
See LangGraph Pipeline for the full pipeline documentation.
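To make the budget and tool-loop properties concrete, here is a minimal plain-Python sketch. It deliberately avoids the LangGraph API and only models the idea of a typed state passed between steps, with a budget check after each tool round. The GraphState field names, the limit values, the status strings, and the `step` callable are all hypothetical. The real schema is in the pipeline documentation.

```python
from typing import Callable, List, Tuple, TypedDict


class GraphState(TypedDict):
    # Illustrative fields only; the real GraphState has its own schema.
    messages: List[str]
    tokens_used: int
    cost_usd: float
    status: str


MAX_TOKENS = 8_000      # hypothetical per-execution token limit
MAX_COST_USD = 0.50     # hypothetical per-execution dollar limit
MAX_TOOL_ROUNDS = 5     # hypothetical cap on tool-calling rounds

# A step returns (reply, tokens spent, dollars spent, wants another tool round).
ModelStep = Callable[[GraphState], Tuple[str, int, float, bool]]


def over_budget(state: GraphState) -> bool:
    return state["tokens_used"] > MAX_TOKENS or state["cost_usd"] > MAX_COST_USD


def tool_loop(state: GraphState, step: ModelStep) -> GraphState:
    """Run tool rounds until the model stops asking, a budget trips, or rounds run out."""
    for _ in range(MAX_TOOL_ROUNDS):
        reply, tokens, cost, wants_tool = step(state)
        # Build a new state rather than mutating the caller's copy.
        state = GraphState(
            messages=state["messages"] + [reply],
            tokens_used=state["tokens_used"] + tokens,
            cost_usd=state["cost_usd"] + cost,
            status=state["status"],
        )
        if over_budget(state):
            state["status"] = "budget_exceeded"
            return state
        if not wants_tool:
            break
    state["status"] = "completed"
    return state
```

In a compiled graph the same check would more naturally live in a conditional edge that routes to an error node. The loop form is used here only to keep the example self-contained.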
Three-Layer Memory
The memory system provides context at three levels:
- Working memory (Redis): fast key-value store for the current conversation’s scratchpad, entities, and state
- Episodic memory (pgvector): long-term semantic search over past conversation summaries
- Injection ledger (Redis Hash): tracks what has been injected into the current conversation to prevent duplicates
See Memory System for details.
Knowledge Graph over Relational Queries
Entity relationships are modeled as a property graph in Apache AGE using openCypher queries. This allows:
- Natural modeling of oil and gas entity relationships (wells on leases, operated by operators, located in fields)
- Multi-hop traversals for impact analysis (e.g., “which wells are affected if this compressor goes down?”)
- Context assembly by walking the graph from a seed entity
See Knowledge Graph for the schema.
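A multi-hop impact query computes something like the following breadth-first traversal. The sketch runs over a plain Python dictionary rather than Apache AGE, and the entity IDs and edge direction ("a failure of X affects Y") are hypothetical. In the service itself this would be expressed as an openCypher variable-length match against the graph.

```python
from collections import deque
from typing import Dict, List

# Hypothetical edges: key affects each entity in its list.
EDGES: Dict[str, List[str]] = {
    "compressor-1": ["well-A", "well-B"],
    "well-A": ["lease-10"],
    "well-B": ["lease-10"],
    "lease-10": [],
}


def affected(graph: Dict[str, List[str]], seed: str, max_hops: int) -> Dict[str, int]:
    """Map every entity reachable within max_hops of seed to its hop distance."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[seed]  # the seed itself is not "affected"
    return distance


if __name__ == "__main__":
    print(affected(EDGES, "compressor-1", max_hops=2))
```

Context assembly from a seed entity follows the same pattern: the hop limit bounds how much of the graph is pulled into the prompt.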
Mandatory HITL for All Filings
No regulatory filing can be submitted without human approval. The approval node in the LangGraph pipeline checks for HITL requirements and pauses execution with status: awaiting_hitl until a human reviewer approves, rejects, or modifies the output.
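The pause-and-resume behavior can be sketched as a small state machine. Only the awaiting_hitl status comes from the description above. The other status names, the FilingRun type, and the choice to treat a reviewer's modification as an approval of the edited draft are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilingRun:
    draft: str
    status: str = "running"


def approval_node(run: FilingRun) -> FilingRun:
    # Every filing requires review, so the node always pauses here.
    run.status = "awaiting_hitl"
    return run


def record_decision(run: FilingRun, decision: str,
                    modified_draft: Optional[str] = None) -> FilingRun:
    """Apply a reviewer's decision: 'approve', 'reject', or 'modify'."""
    if run.status != "awaiting_hitl":
        raise RuntimeError("run is not waiting for a reviewer")
    if decision == "approve":
        run.status = "approved"
    elif decision == "reject":
        run.status = "rejected"
    elif decision == "modify":
        if modified_draft is None:
            raise ValueError("a modified draft is required for 'modify'")
        run.draft = modified_draft
        run.status = "approved"  # assumed: the edited draft counts as approved
    else:
        raise ValueError(f"unknown decision: {decision}")
    return run


def submit(run: FilingRun) -> str:
    # The gate: nothing leaves the system unless a human approved it.
    if run.status != "approved":
        raise PermissionError("filing has not been approved by a human reviewer")
    return run.draft
```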
Append-Only Audit Trail
The audit_logs table uses PostgreSQL triggers to prevent UPDATE and DELETE operations. Every audit entry includes an HMAC signature for tamper detection.
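A minimal sketch of how an HMAC signature makes tampering detectable is shown below. The hash function (SHA-256), the canonical JSON serialization, the field names, and the function names are assumptions. The actual signing scheme and key management are not described in this overview.

```python
import hashlib
import hmac
import json


def sign_entry(entry: dict, key: bytes) -> str:
    """Return a hex HMAC over a canonical JSON encoding of the entry."""
    # Sorted keys and fixed separators make the byte encoding deterministic.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_entry(entry: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_entry(entry, key), signature)


if __name__ == "__main__":
    key = b"example-key-not-for-production"
    entry = {"actor": "reviewer-1", "action": "approve", "request_id": 42}
    signature = sign_entry(entry, key)
    print(verify_entry(entry, signature, key))                             # True
    print(verify_entry({**entry, "action": "reject"}, signature, key))     # False
```

The triggers stop rows from being changed through normal SQL, while the signature lets a verifier detect a row that was altered by someone who bypassed the triggers but does not hold the key.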