Writing Tests
This guide covers the patterns and conventions used across AEGIS test suites, with concrete examples from the codebase.
File and Directory Structure
Tests live in services/{service}/tests/ alongside the service source code:
services/memory-service/
├── src/memory/ # Service source code
├── tests/
│ ├── __init__.py # Required for pytest discovery
│ ├── conftest.py # Shared fixtures
│ ├── test_working_memory.py
│ ├── test_episodic_memory.py
│ └── test_ledger.py
└── pyproject.toml
Naming Conventions
- Test files: test_{module}.py or test_{feature}.py
- Test classes: TestFeatureName (e.g., TestCreateApproval, TestVertexCrud)
- Test methods: test_{behavior} (e.g., test_create_named_individual, test_get_vertex_not_found)
- Every test directory must contain an __init__.py file
The conftest.py Pattern
Each service with complex test infrastructure defines shared fixtures in conftest.py. These fixtures provide mock infrastructure (databases, Redis, HTTP clients) and service instances that tests consume via dependency injection.
Pattern 1: Mock Redis Client (memory-service)
The memory-service conftest builds a mock Redis client backed by an in-memory dict, simulating Hash operations:
# services/memory-service/tests/conftest.py
from unittest.mock import AsyncMock

import pytest


@pytest.fixture
def mock_redis():
    # In-memory stand-in for Redis hashes: key -> {field: value}
    store: dict[str, dict[str, str]] = {}
    client = AsyncMock()

    async def hgetall(key):
        # Return a copy so callers cannot mutate the store by accident
        return dict(store.get(key, {}))

    async def hset(key, field, value):
        store.setdefault(key, {})[field] = value

    client.hgetall = AsyncMock(side_effect=hgetall)
    client.hset = AsyncMock(side_effect=hset)
    # ... other operations
    return client, store
The fixture returns both the mock client and the backing store, so tests can inspect stored data directly.
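A test that consumes this fixture can unpack the tuple and assert on both the client's behavior and the raw store. The sketch below rebuilds the mock outside pytest so that it runs standalone; build_mock_redis and the key wm:conv-1 are illustrative names, not taken from the codebase.

```python
import asyncio
from unittest.mock import AsyncMock


def build_mock_redis():
    """Standalone copy of the fixture body (hypothetical helper name)."""
    store: dict[str, dict[str, str]] = {}
    client = AsyncMock()

    async def hgetall(key):
        return dict(store.get(key, {}))

    async def hset(key, field, value):
        store.setdefault(key, {})[field] = value

    client.hgetall = AsyncMock(side_effect=hgetall)
    client.hset = AsyncMock(side_effect=hset)
    return client, store


async def exercise():
    client, store = build_mock_redis()
    await client.hset("wm:conv-1", "key1", "value1")
    # Inspect the backing dict directly, bypassing the client.
    assert store == {"wm:conv-1": {"key1": "value1"}}
    # The AsyncMock wrapper still records awaits for call assertions.
    client.hset.assert_awaited_once_with("wm:conv-1", "key1", "value1")
    return await client.hgetall("wm:conv-1")


result = asyncio.run(exercise())
```

Wrapping each coroutine in AsyncMock with side_effect keeps the fake behavior while still allowing call assertions such as assert_awaited_once_with.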
Pattern 2: Fake PostgreSQL Pool (approval-service)
The approval-service builds a FakePostgresPool class that maintains in-memory dicts for approvals and audit_logs, and interprets basic SQL patterns:
class FakePostgresPool:
    def __init__(self):
        self.approvals: dict[str, dict[str, Any]] = {}
        self.audit_logs: list[dict[str, Any]] = []

    async def execute(self, query: str, *args) -> str:
        q = query.strip().upper()
        if "INSERT INTO" in q and "APPROVAL_REQUESTS" in q:
            self._insert_approval(args)
        elif "UPDATE" in q and "APPROVAL_REQUESTS" in q:
            self._update_approval(query, args)
        return "OK"

    async def fetch(self, query: str, *args) -> list[dict[str, Any]]:
        ...
Pattern 3: Fake Graph Store (knowledge-graph-service)
The knowledge-graph-service builds a FakeAgeGraph that simulates Apache AGE Cypher operations with vertex/edge CRUD and a regex-based Cypher interpreter:
class FakeAgeGraph:
    def __init__(self):
        self._vertices: dict[int, dict[str, Any]] = {}
        self._edges: dict[int, dict[str, Any]] = {}

    def create_vertex(self, label: str, props: dict) -> dict:
        vid = self._alloc_id()
        vertex = {"id": vid, "label": label, "properties": dict(props)}
        self._vertices[vid] = vertex
        return vertex

    def match_vertices(self, label: str, match_props=None, limit=50):
        ...
A companion _interpret_cypher() function uses regex to parse CREATE, MATCH, SET, and DETACH DELETE patterns and dispatch them to the fake graph.
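The body of _interpret_cypher() is not shown in this guide. As a rough illustration of regex-based dispatch, the sketch below handles only a simplified CREATE and MATCH form. TinyGraph, interpret, and all three patterns are hypothetical and far narrower than real Cypher or the actual helper.

```python
import re
from typing import Any

# Simplified patterns; real Cypher is far richer than this sketch handles.
CREATE_RE = re.compile(r"CREATE\s+\(\w*:(\w+)\s*\{(.*?)\}\)", re.IGNORECASE)
MATCH_RE = re.compile(r"MATCH\s+\(\w*:(\w+)\)\s+RETURN", re.IGNORECASE)
PROP_RE = re.compile(r"(\w+)\s*:\s*'([^']*)'")


class TinyGraph:
    """Hypothetical stand-in for FakeAgeGraph with only two operations."""

    def __init__(self) -> None:
        self._vertices: dict[int, dict[str, Any]] = {}
        self._next_id = 1

    def create_vertex(self, label: str, props: dict) -> dict:
        vid = self._next_id
        self._next_id += 1
        vertex = {"id": vid, "label": label, "properties": dict(props)}
        self._vertices[vid] = vertex
        return vertex

    def match_vertices(self, label: str) -> list[dict]:
        return [v for v in self._vertices.values() if v["label"] == label]


def interpret(graph: TinyGraph, query: str) -> list[dict]:
    """Dispatch a query string based on the first pattern that matches."""
    if m := CREATE_RE.search(query):
        props = dict(PROP_RE.findall(m.group(2)))
        return [graph.create_vertex(m.group(1), props)]
    if m := MATCH_RE.search(query):
        return graph.match_vertices(m.group(1))
    raise ValueError(f"Unsupported query: {query!r}")


graph = TinyGraph()
interpret(graph, "CREATE (w:Well {entity_id: 'w-1', name: 'Alpha'})")
found = interpret(graph, "MATCH (w:Well) RETURN w")
```

Raising on unrecognized queries, rather than silently returning an empty result, makes it obvious when a test exercises Cypher that the fake does not yet support.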
Pattern 4: Real Database with Cleanup (flaring-monitor)
The flaring-monitor uses real database connections with careful cleanup:
@pytest_asyncio.fixture
async def pg():
    pool = PostgresPool(dsn=DATABASE_URL, min_size=1, max_size=5)
    await pool.connect()
    # Cleanup before yielding
    await pool.execute("DELETE FROM event_type_definitions WHERE tenant_id = $1", TEST_TENANT)
    yield pool
    # Cleanup after test
    await pool.execute("DELETE FROM event_type_definitions WHERE tenant_id = $1", TEST_TENANT)
    await pool.disconnect()
When using real database fixtures, always use a fixed test tenant ID and clean up both before and after the test to ensure isolation regardless of previous test failures.
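To see why the cleanup before the test matters, the sketch below reproduces the pattern with the standard-library sqlite3 module standing in for PostgreSQL and a context manager standing in for the pytest fixture. The TEST_TENANT value, the two-column table layout, and tenant_scope are assumptions made for illustration.

```python
import sqlite3
from contextlib import contextmanager

TEST_TENANT = "test-tenant"  # hypothetical fixed tenant ID


@contextmanager
def tenant_scope(conn: sqlite3.Connection):
    """Delete the test tenant's rows before and after the body runs."""
    cleanup = "DELETE FROM event_type_definitions WHERE tenant_id = ?"
    conn.execute(cleanup, (TEST_TENANT,))
    try:
        yield conn
    finally:
        conn.execute(cleanup, (TEST_TENANT,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_type_definitions (tenant_id TEXT, name TEXT)")
# Simulate a row left behind by an earlier run that died before teardown.
conn.execute("INSERT INTO event_type_definitions VALUES (?, ?)", (TEST_TENANT, "stale"))
# A row owned by another tenant must survive the cleanup.
conn.execute("INSERT INTO event_type_definitions VALUES (?, ?)", ("other", "keep"))

count_sql = "SELECT COUNT(*) FROM event_type_definitions WHERE tenant_id = ?"
with tenant_scope(conn) as db:
    seen_at_start = db.execute(count_sql, (TEST_TENANT,)).fetchone()[0]
    db.execute("INSERT INTO event_type_definitions VALUES (?, ?)", (TEST_TENANT, "fresh"))

left_after = conn.execute(count_sql, (TEST_TENANT,)).fetchone()[0]
other_rows = conn.execute(count_sql, ("other",)).fetchone()[0]
```

Scoping every DELETE to the test tenant is what lets the suite share a database with other data. The context manager needs try/finally to guarantee the second cleanup, whereas a pytest yield fixture runs the code after yield whether or not the test passes.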
Writing Unit Tests
Async Unit Tests
With asyncio_mode = "auto", write async test methods directly in test classes:
class TestWorkingMemoryUnit:
    async def test_set_and_get(self, working_mem):
        await working_mem.set("conv-1", {"key1": "value1", "key2": 42})
        result = await working_mem.get("conv-1")
        assert result["key1"] == "value1"
        assert result["key2"] == 42

    async def test_get_empty(self, working_mem):
        result = await working_mem.get("nonexistent")
        assert result == {}
No decorators are needed. The working_mem fixture is injected by pytest from conftest.
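The working_mem fixture itself is not reproduced in this guide. One plausible shape, sketched below, layers a service object on top of the mock Redis client. The WorkingMemory class, its JSON encoding, and make_working_mem are hypothetical; in a real conftest.py the factory would be a function decorated with @pytest.fixture that takes mock_redis as a parameter.

```python
import asyncio
import json
from unittest.mock import AsyncMock


class WorkingMemory:
    """Hypothetical service class: one Redis hash per conversation."""

    def __init__(self, redis):
        self._redis = redis

    async def set(self, conversation_id: str, data: dict) -> None:
        for field, value in data.items():
            # JSON-encode so non-string values such as 42 round-trip.
            await self._redis.hset(conversation_id, field, json.dumps(value))

    async def get(self, conversation_id: str) -> dict:
        raw = await self._redis.hgetall(conversation_id)
        return {field: json.loads(value) for field, value in raw.items()}


def make_working_mem() -> WorkingMemory:
    """Stand-in for the fixture body, runnable without pytest."""
    store: dict[str, dict[str, str]] = {}
    redis = AsyncMock()

    async def hset(key, field, value):
        store.setdefault(key, {})[field] = value

    async def hgetall(key):
        return dict(store.get(key, {}))

    redis.hset = AsyncMock(side_effect=hset)
    redis.hgetall = AsyncMock(side_effect=hgetall)
    return WorkingMemory(redis)


async def scenario():
    mem = make_working_mem()
    await mem.set("conv-1", {"key1": "value1", "key2": 42})
    return await mem.get("conv-1"), await mem.get("nonexistent")


stored, empty = asyncio.run(scenario())
```

Because the service receives its Redis client through the constructor, the test never needs to patch module globals to swap in the fake.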
Synchronous API Tests
Tests that exercise FastAPI endpoints via TestClient are synchronous, because TestClient handles the async event loop internally:
class TestWorkingMemoryAPI:
    def test_get_empty(self, app_client):
        r = app_client.get("/working-memory/conv-new")
        assert r.status_code == 200
        body = r.json()
        assert body["conversation_id"] == "conv-new"
        assert body["data"] == {}

    def test_put_and_get(self, app_client):
        r = app_client.put(
            "/working-memory/conv-1",
            json={"data": {"scratchpad": "notes", "count": 5}},
        )
        assert r.status_code == 200
Testing Error Cases
Always test error paths and edge cases alongside the happy path:
class TestCreateApproval:
    def test_create_invalid_strategy(self, app_client):
        r = app_client.post("/approvals", json={
            "execution_id": "exec-3",
            "agent_id": "agent-1",
            "checkpoint_type": "pre_filing",
            "state_snapshot": {},
            "reviewer_strategy": "invalid",
        })
        assert r.status_code == 400

    def test_create_named_individual_without_reviewer_id(self, app_client):
        r = app_client.post("/approvals", json={
            "execution_id": "exec-4",
            "agent_id": "agent-1",
            "checkpoint_type": "pre_filing",
            "state_snapshot": {},
            "reviewer_strategy": "named_individual",
            # Missing reviewer_id
        })
        assert r.status_code == 400
Testing with pytest.raises
For functions that raise exceptions, use pytest.raises:
async def test_create_invalid_label(self, crud):
    with pytest.raises(ValueError, match="Unknown vertex label"):
        await crud.create_vertex("InvalidLabel", {"entity_id": "x"})
Writing API Integration Tests
API tests follow a setup-act-assert pattern, often creating prerequisite data before testing the target endpoint:
class TestDecideApproval:
    def _create_pending(self, app_client):
        """Helper to create a pending approval for testing decisions."""
        r = app_client.post("/approvals", json={
            "execution_id": "e1",
            "agent_id": "a1",
            "checkpoint_type": "pre_filing",
            "state_snapshot": {"messages": [{"role": "assistant", "content": "draft"}]},
            "reviewer_strategy": "named_individual",
            "reviewer_id": "rev-1",
        })
        return r.json()["id"]

    def test_approve(self, app_client):
        aid = self._create_pending(app_client)
        r = app_client.post(f"/approvals/{aid}/decide", json={
            "decision": "approved",
            "reviewer_id": "rev-1",
            "reviewer_comments": "Looks good",
        })
        assert r.status_code == 200
        body = r.json()
        assert body["status"] == "approved"
        assert body["decided_at"] is not None

    def test_decide_already_decided(self, app_client):
        aid = self._create_pending(app_client)
        # First decision succeeds
        app_client.post(f"/approvals/{aid}/decide", json={
            "decision": "approved", "reviewer_id": "rev-1",
        })
        # Second decision should fail with 409 Conflict
        r = app_client.post(f"/approvals/{aid}/decide", json={
            "decision": "rejected", "reviewer_id": "rev-2",
        })
        assert r.status_code == 409
Using Seeded Test Data
The orchestration-engine conftest provides helper functions and seeded fixtures for complex test scenarios:
# conftest.py helpers
import uuid

import pytest


def make_compliance_row(entity_id="well-1", status="compliant", domain="rule_37", **extra):
    return {
        "id": str(uuid.uuid4()),
        "tenant_id": "default",
        "entity_id": entity_id,
        "entity_type": "Well",
        "compliance_domain": domain,
        "status": status,
        # ...
    }


@pytest.fixture
def seeded_pg(fake_pg):
    """FakePostgresPool pre-seeded with templates and rule versions."""
    template = make_template_row()
    fake_pg.seed_table("checklist_templates", [template])
    fake_pg.seed_table("rule_versions", [
        make_rule_version_row(rule_domain="spacing", rule_identifier="SWR_37"),
    ])
    return fake_pg
Tests then use these fixtures to verify behavior against known data:
class TestComplianceSummary:
    def test_with_overdue(self, app_client, fake_pg):
        fake_pg.seed_compliance_status([
            make_compliance_row(entity_id="w-1", status="overdue"),
            make_compliance_row(entity_id="w-2", status="overdue"),
        ])
        resp = app_client.get("/compliance/summary")
        assert resp.json()["overdue_count"] >= 2
Testing Pure Functions
Some services have pure business logic functions that need no mocking at all:
from datetime import date

from compliance.deadlines import scan_deadlines
from compliance.risk_scoring import score_well_risk


class TestDeadlines:
    def test_critical_deadline(self):
        # Pin the reference date so the result does not depend on the real clock
        today = date(2025, 4, 10)
        permits = [{"permit_number": "P-1", "expiration_date": "2025-04-14"}]
        result = scan_deadlines(permits, [], reference_date=today)
        assert result["critical_count"] == 1


class TestRiskScoring:
    def test_high_risk(self):
        result = score_well_risk(
            well={"well_name": "W-1", "api_number": "42-1"},
            deadlines=[{"days_remaining": -5}],
            production_issues=["GOR exceeded", "Zero production"],
            violation_count=2,
            flaring_exposure={"pct_of_max": 105},
        )
        assert result["risk_level"] == "HIGH"
        assert result["risk_score"] >= 70
These tests are straightforward and fast: they need no fixtures, only the function under test.
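Passing reference_date explicitly, instead of reading the clock inside the function, is what keeps such logic pure and its tests deterministic. The sketch below shows the idea with a table of cases. classify_deadline and the 7-day threshold are invented for illustration and do not describe how scan_deadlines works.

```python
from datetime import date

CRITICAL_WINDOW_DAYS = 7  # assumed threshold, for illustration only


def classify_deadline(expiration: str, reference_date: date) -> str:
    """Hypothetical pure helper: the result depends only on the arguments."""
    days_remaining = (date.fromisoformat(expiration) - reference_date).days
    if days_remaining < 0:
        return "overdue"
    if days_remaining <= CRITICAL_WINDOW_DAYS:
        return "critical"
    return "upcoming"


today = date(2025, 4, 10)
# Each case pairs an expiration date with the label expected on `today`.
cases = [
    ("2025-04-09", "overdue"),
    ("2025-04-14", "critical"),
    ("2025-06-01", "upcoming"),
]
results = [classify_deadline(expiration, today) for expiration, _ in cases]
expected = [label for _, label in cases]
assert results == expected
```

In a pytest suite the same table would typically feed pytest.mark.parametrize, giving one reported test per row.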
Adding a New Test
Follow this checklist when adding tests to an existing service:
- Create or extend a test file in services/{service}/tests/test_{feature}.py
- Add an __init__.py if the tests directory does not already have one
- Add fixtures to conftest.py if you need shared mock infrastructure
- Organize tests in classes grouped by feature or endpoint
- Use async for business logic tests and sync for TestClient API tests
- Test the happy path, error cases, and edge cases
- Run the tests to verify they pass:
  cd services/{service-name}
  poetry run pytest tests/test_{feature}.py -v
Adding Tests for a New Service
When creating tests for a brand-new service:
- Create the tests/ directory with __init__.py
- Create conftest.py with the appropriate mock infrastructure:
  - PostgreSQL-backed service: build a FakePostgresPool
  - Redis-backed service: build a mock Redis client with AsyncMock and side_effect
  - Graph-backed service: build a FakeAgeGraph
- Create an app_client fixture that patches the service’s global dependencies and replaces the lifespan:
  @pytest.fixture
  def app_client(fake_pg):
      import my_service.main as main_module
      main_module.pg = fake_pg

      @asynccontextmanager
      async def noop_lifespan(app):
          yield

      main_module.app.router.lifespan_context = noop_lifespan
      return TestClient(main_module.app)
- Add [tool.pytest.ini_options] with asyncio_mode = "auto" in pyproject.toml
- Add dev dependencies for pytest, pytest-asyncio, and httpx:
  [tool.poetry.group.dev.dependencies]
  pytest = "^8.0"
  pytest-asyncio = "^0.23"
  httpx = "^0.28"