Writing Tests

This guide covers the patterns and conventions used across AEGIS test suites, with concrete examples from the codebase.

File and Directory Structure

Tests live in services/{service}/tests/ alongside the service source code:

    services/memory-service/
    ├── src/memory/              # Service source code
    ├── tests/
    │   ├── __init__.py          # Required for pytest discovery
    │   ├── conftest.py          # Shared fixtures
    │   ├── test_working_memory.py
    │   ├── test_episodic_memory.py
    │   └── test_ledger.py
    └── pyproject.toml

Naming Conventions

  • Test files: test_{module}.py or test_{feature}.py
  • Test classes: TestFeatureName (e.g., TestCreateApproval, TestVertexCrud)
  • Test methods: test_{behavior} (e.g., test_create_named_individual, test_get_vertex_not_found)
  • Every test directory must contain an __init__.py file

The conftest.py Pattern

Each service with complex test infrastructure defines shared fixtures in conftest.py. These fixtures provide mock infrastructure (databases, Redis, HTTP clients) and service instances that tests consume via dependency injection.

Pattern 1: Mock Redis Client (memory-service)

The memory-service conftest builds a mock Redis client backed by an in-memory dict, simulating Hash operations:

    # services/memory-service/tests/conftest.py
    @pytest.fixture
    def mock_redis():
        store: dict[str, dict[str, str]] = {}
        client = AsyncMock()

        async def hgetall(key):
            return dict(store.get(key, {}))

        async def hset(key, field, value):
            store.setdefault(key, {})[field] = value

        client.hgetall = AsyncMock(side_effect=hgetall)
        client.hset = AsyncMock(side_effect=hset)
        # ... other operations
        return client, store

The fixture returns both the mock client and the backing store, so tests can inspect stored data directly.
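
As a rough illustration of how a test might consume this pair, the sketch below rebuilds the same dict-backed mock as a plain function so it runs without pytest. The names `make_mock_redis` and `write_then_read` are assumptions made for this example and do not come from the memory-service codebase.

```python
import asyncio
from unittest.mock import AsyncMock


def make_mock_redis():
    """Plain-function stand-in for the mock_redis fixture (hypothetical name)."""
    store: dict[str, dict[str, str]] = {}
    client = AsyncMock()

    async def hgetall(key):
        return dict(store.get(key, {}))

    async def hset(key, field, value):
        store.setdefault(key, {})[field] = value

    client.hgetall = AsyncMock(side_effect=hgetall)
    client.hset = AsyncMock(side_effect=hset)
    return client, store


async def write_then_read():
    client, store = make_mock_redis()
    await client.hset("conv-1", "key1", "value1")
    fetched = await client.hgetall("conv-1")
    return client, store, fetched


client, store, fetched = asyncio.run(write_then_read())

# What the code under test would see through the client.
assert fetched == {"key1": "value1"}
# Direct inspection of the backing dict, bypassing the client.
assert store == {"conv-1": {"key1": "value1"}}
# The mock still records calls, so call assertions remain available.
client.hset.assert_awaited_once_with("conv-1", "key1", "value1")
```

Because `hgetall` returns a copy, a test that mutates the returned dict cannot accidentally change the stored state.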

Pattern 2: Fake PostgreSQL Pool (approval-service)

The approval-service builds a FakePostgresPool class that maintains in-memory dicts for approvals and audit_logs, and interprets basic SQL patterns:

    class FakePostgresPool:
        def __init__(self):
            self.approvals: dict[str, dict[str, Any]] = {}
            self.audit_logs: list[dict[str, Any]] = []

        async def execute(self, query: str, *args) -> str:
            q = query.strip().upper()
            if "INSERT INTO" in q and "APPROVAL_REQUESTS" in q:
                self._insert_approval(args)
            elif "UPDATE" in q and "APPROVAL_REQUESTS" in q:
                self._update_approval(query, args)
            return "OK"

        async def fetch(self, query: str, *args) -> list[dict[str, Any]]:
            ...
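
The keyword-dispatch idea can be tried in isolation with a much smaller fake. The following is only a sketch: `TinyFakePool`, its two-column schema, and the parameter order are assumptions for illustration, not the approval-service implementation.

```python
import asyncio
from typing import Any


class TinyFakePool:
    """Hypothetical, much-reduced fake pool; not the approval-service class."""

    def __init__(self) -> None:
        self.rows: dict[str, dict[str, Any]] = {}

    async def execute(self, query: str, *args: Any) -> str:
        q = query.strip().upper()
        if "INSERT INTO" in q and "APPROVAL_REQUESTS" in q:
            # Assumed parameter order: (id, status)
            row_id, status = args
            self.rows[row_id] = {"id": row_id, "status": status}
        elif "UPDATE" in q and "APPROVAL_REQUESTS" in q:
            # Assumed parameter order: (status, id)
            status, row_id = args
            self.rows[row_id]["status"] = status
        return "OK"

    async def fetch(self, query: str, *args: Any) -> list[dict[str, Any]]:
        # Ignores the SQL text and returns copies of every stored row.
        return [dict(row) for row in self.rows.values()]


async def demo() -> list[dict[str, Any]]:
    pool = TinyFakePool()
    await pool.execute(
        "INSERT INTO approval_requests (id, status) VALUES ($1, $2)", "a1", "pending"
    )
    await pool.execute(
        "UPDATE approval_requests SET status = $1 WHERE id = $2", "approved", "a1"
    )
    return await pool.fetch("SELECT * FROM approval_requests")


rows = asyncio.run(demo())
assert rows == [{"id": "a1", "status": "approved"}]
```

A fake like this only understands the statements it was written for, so a query that matches neither branch silently returns "OK"; raising on unrecognized SQL may be a safer default in some suites.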

Pattern 3: Fake Graph Store (knowledge-graph-service)

The knowledge-graph-service builds a FakeAgeGraph that simulates Apache AGE Cypher operations with vertex/edge CRUD and a regex-based Cypher interpreter:

    class FakeAgeGraph:
        def __init__(self):
            self._vertices: dict[int, dict[str, Any]] = {}
            self._edges: dict[int, dict[str, Any]] = {}

        def create_vertex(self, label: str, props: dict) -> dict:
            vid = self._alloc_id()
            vertex = {"id": vid, "label": label, "properties": dict(props)}
            self._vertices[vid] = vertex
            return vertex

        def match_vertices(self, label: str, match_props=None, limit=50):
            ...

A companion _interpret_cypher() function uses regex to parse CREATE, MATCH, SET, and DETACH DELETE patterns and dispatch them to the fake graph.
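
To give a feel for regex-based dispatch, here is a deliberately small sketch that handles only two statement shapes (a single-vertex CREATE with quoted string properties, and a label-only MATCH). The patterns, `MiniGraph`, and `interpret` are assumptions for this example; the real `_interpret_cypher()` covers more cases and may be structured differently.

```python
import re

# Hypothetical patterns; they cover only the two shapes used below.
CREATE_RE = re.compile(r"CREATE\s+\(\w+:(\w+)\s*\{(.*?)\}\)", re.IGNORECASE | re.DOTALL)
MATCH_RE = re.compile(r"MATCH\s+\(\w+:(\w+)\)\s+RETURN", re.IGNORECASE)
PROP_RE = re.compile(r"(\w+)\s*:\s*'([^']*)'")  # key: 'string value' pairs only


class MiniGraph:
    """Hypothetical in-memory vertex store."""

    def __init__(self):
        self.vertices = {}
        self._next_id = 1

    def create_vertex(self, label, props):
        vid = self._next_id
        self._next_id += 1
        vertex = {"id": vid, "label": label, "properties": dict(props)}
        self.vertices[vid] = vertex
        return vertex

    def match_vertices(self, label):
        return [v for v in self.vertices.values() if v["label"] == label]


def interpret(graph, cypher):
    """Parse one statement with regex and dispatch it to the fake graph."""
    m = CREATE_RE.search(cypher)
    if m:
        label, raw_props = m.groups()
        props = dict(PROP_RE.findall(raw_props))
        return [graph.create_vertex(label, props)]
    m = MATCH_RE.search(cypher)
    if m:
        return graph.match_vertices(m.group(1))
    raise ValueError(f"Unsupported Cypher: {cypher!r}")


graph = MiniGraph()
created = interpret(graph, "CREATE (n:Well {entity_id: 'w-1', name: 'Alpha'}) RETURN n")
found = interpret(graph, "MATCH (n:Well) RETURN n")
assert found == created
```

Raising on anything the interpreter does not recognize keeps a test from passing by accident when the service starts emitting a new query shape.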

Pattern 4: Real Database with Cleanup (flaring-monitor)

The flaring-monitor uses real database connections with careful cleanup:

    @pytest_asyncio.fixture
    async def pg():
        pool = PostgresPool(dsn=DATABASE_URL, min_size=1, max_size=5)
        await pool.connect()
        # Cleanup before yielding
        await pool.execute("DELETE FROM event_type_definitions WHERE tenant_id = $1", TEST_TENANT)
        yield pool
        # Cleanup after test
        await pool.execute("DELETE FROM event_type_definitions WHERE tenant_id = $1", TEST_TENANT)
        await pool.disconnect()

When using real database fixtures, always use a fixed test tenant ID and clean up both before and after the test to ensure isolation regardless of previous test failures.
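
The value of cleaning up on both sides can be seen in a small stand-alone sketch. It swaps PostgreSQL for the standard library's in-memory SQLite so it runs anywhere; the `TEST_TENANT` value, the two-column table, and the `clean_tenant` helper are assumptions for this example only.

```python
import sqlite3
from contextlib import contextmanager

TEST_TENANT = "test-tenant"  # assumed value; any fixed ID reserved for tests works


def _purge(conn):
    conn.execute("DELETE FROM event_type_definitions WHERE tenant_id = ?", (TEST_TENANT,))
    conn.commit()


@contextmanager
def clean_tenant(conn):
    _purge(conn)  # remove leftovers from an earlier interrupted run
    try:
        yield conn
    finally:
        _purge(conn)  # leave nothing behind for the next test


def count(conn, tenant):
    cur = conn.execute(
        "SELECT COUNT(*) FROM event_type_definitions WHERE tenant_id = ?", (tenant,)
    )
    return cur.fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_type_definitions (tenant_id TEXT, name TEXT)")
# Simulate a stale row from a crashed test, plus another tenant's data.
conn.execute("INSERT INTO event_type_definitions VALUES (?, ?)", (TEST_TENANT, "stale"))
conn.execute("INSERT INTO event_type_definitions VALUES (?, ?)", ("other-tenant", "keep"))
conn.commit()

with clean_tenant(conn):
    seen_at_start = count(conn, TEST_TENANT)  # stale row is already gone
    conn.execute("INSERT INTO event_type_definitions VALUES (?, ?)", (TEST_TENANT, "fresh"))
    conn.commit()

assert seen_at_start == 0
assert count(conn, TEST_TENANT) == 0
assert count(conn, "other-tenant") == 1
```

Scoping every DELETE to the test tenant is what keeps the cleanup from touching other tenants' rows in a shared database.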

Writing Unit Tests

Async Unit Tests

With asyncio_mode = "auto", write async test methods directly in test classes:

    class TestWorkingMemoryUnit:
        async def test_set_and_get(self, working_mem):
            await working_mem.set("conv-1", {"key1": "value1", "key2": 42})
            result = await working_mem.get("conv-1")
            assert result["key1"] == "value1"
            assert result["key2"] == 42

        async def test_get_empty(self, working_mem):
            result = await working_mem.get("nonexistent")
            assert result == {}

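No @pytest.mark.asyncio decorator is needed, because auto mode treats every async test function as an asyncio test. The working_mem fixture is defined in conftest.py and injected by pytest.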

Synchronous API Tests

Tests that exercise FastAPI endpoints via TestClient are synchronous, because TestClient handles the async event loop internally:

    class TestWorkingMemoryAPI:
        def test_get_empty(self, app_client):
            r = app_client.get("/working-memory/conv-new")
            assert r.status_code == 200
            body = r.json()
            assert body["conversation_id"] == "conv-new"
            assert body["data"] == {}

        def test_put_and_get(self, app_client):
            r = app_client.put(
                "/working-memory/conv-1",
                json={"data": {"scratchpad": "notes", "count": 5}},
            )
            assert r.status_code == 200

Testing Error Cases

Always test error paths and edge cases alongside the happy path:

    class TestCreateApproval:
        def test_create_invalid_strategy(self, app_client):
            r = app_client.post("/approvals", json={
                "execution_id": "exec-3",
                "agent_id": "agent-1",
                "checkpoint_type": "pre_filing",
                "state_snapshot": {},
                "reviewer_strategy": "invalid",
            })
            assert r.status_code == 400

        def test_create_named_individual_without_reviewer_id(self, app_client):
            r = app_client.post("/approvals", json={
                "execution_id": "exec-4",
                "agent_id": "agent-1",
                "checkpoint_type": "pre_filing",
                "state_snapshot": {},
                "reviewer_strategy": "named_individual",
                # Missing reviewer_id
            })
            assert r.status_code == 400

Testing with pytest.raises

For functions that raise exceptions, use pytest.raises:

    async def test_create_invalid_label(self, crud):
        with pytest.raises(ValueError, match="Unknown vertex label"):
            await crud.create_vertex("InvalidLabel", {"entity_id": "x"})

Writing API Integration Tests

API tests follow a setup-act-assert pattern, often creating prerequisite data before testing the target endpoint:

    class TestDecideApproval:
        def _create_pending(self, app_client):
            """Helper to create a pending approval for testing decisions."""
            r = app_client.post("/approvals", json={
                "execution_id": "e1",
                "agent_id": "a1",
                "checkpoint_type": "pre_filing",
                "state_snapshot": {"messages": [{"role": "assistant", "content": "draft"}]},
                "reviewer_strategy": "named_individual",
                "reviewer_id": "rev-1",
            })
            return r.json()["id"]

        def test_approve(self, app_client):
            aid = self._create_pending(app_client)
            r = app_client.post(f"/approvals/{aid}/decide", json={
                "decision": "approved",
                "reviewer_id": "rev-1",
                "reviewer_comments": "Looks good",
            })
            assert r.status_code == 200
            body = r.json()
            assert body["status"] == "approved"
            assert body["decided_at"] is not None

        def test_decide_already_decided(self, app_client):
            aid = self._create_pending(app_client)
            # First decision succeeds
            app_client.post(f"/approvals/{aid}/decide", json={
                "decision": "approved",
                "reviewer_id": "rev-1",
            })
            # Second decision should fail with 409 Conflict
            r = app_client.post(f"/approvals/{aid}/decide", json={
                "decision": "rejected",
                "reviewer_id": "rev-2",
            })
            assert r.status_code == 409

Using Seeded Test Data

The orchestration-engine conftest provides helper functions and seeded fixtures for complex test scenarios:

    # conftest.py helpers
    def make_compliance_row(entity_id="well-1", status="compliant", domain="rule_37", **extra):
        return {
            "id": str(uuid.uuid4()),
            "tenant_id": "default",
            "entity_id": entity_id,
            "entity_type": "Well",
            "compliance_domain": domain,
            "status": status,
            ...
        }


    @pytest.fixture
    def seeded_pg(fake_pg):
        """FakePostgresPool pre-seeded with templates and rule versions."""
        template = make_template_row()
        fake_pg.seed_table("checklist_templates", [template])
        fake_pg.seed_table("rule_versions", [
            make_rule_version_row(rule_domain="spacing", rule_identifier="SWR_37"),
        ])
        return fake_pg

Tests then use these fixtures to verify behavior against known data:

    class TestComplianceSummary:
        def test_with_overdue(self, app_client, fake_pg):
            fake_pg.seed_compliance_status([
                make_compliance_row(entity_id="w-1", status="overdue"),
                make_compliance_row(entity_id="w-2", status="overdue"),
            ])
            resp = app_client.get("/compliance/summary")
            assert resp.json()["overdue_count"] >= 2
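
The factory-plus-seeding approach can be sketched independently of the orchestration-engine code. In the example below, `make_row`, `SeedableFake`, and `count_where` are hypothetical names, and this `seed_table` is an assumed implementation rather than the one in the real FakePostgresPool.

```python
import uuid
from typing import Any


def make_row(entity_id: str = "well-1", status: str = "compliant", **extra: Any) -> dict[str, Any]:
    """Hypothetical row factory: sensible defaults, overridable per test."""
    row = {
        "id": str(uuid.uuid4()),
        "tenant_id": "default",
        "entity_id": entity_id,
        "status": status,
    }
    row.update(extra)  # extra keyword arguments become additional columns
    return row


class SeedableFake:
    """Hypothetical fake store keyed by table name."""

    def __init__(self) -> None:
        self.tables: dict[str, list[dict[str, Any]]] = {}

    def seed_table(self, name: str, rows: list[dict[str, Any]]) -> None:
        # Store copies so later mutation of the inputs cannot leak into the fake.
        self.tables.setdefault(name, []).extend(dict(r) for r in rows)

    def count_where(self, name: str, **criteria: Any) -> int:
        return sum(
            all(row.get(k) == v for k, v in criteria.items())
            for row in self.tables.get(name, [])
        )


fake = SeedableFake()
fake.seed_table("compliance_status", [
    make_row(entity_id="w-1", status="overdue"),
    make_row(entity_id="w-2", status="overdue"),
    make_row(),  # defaults: well-1, compliant
])
assert fake.count_where("compliance_status", status="overdue") == 2
```

Each test states only the fields it cares about, which keeps the intent of the scenario visible while the factory fills in the rest.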

Testing Pure Functions

Some services have pure business logic functions that need no mocking at all:

    from datetime import date

    from compliance.deadlines import scan_deadlines
    from compliance.risk_scoring import score_well_risk


    class TestDeadlines:
        def test_critical_deadline(self):
            today = date(2025, 4, 10)
            permits = [{"permit_number": "P-1", "expiration_date": "2025-04-14"}]
            result = scan_deadlines(permits, [], reference_date=today)
            assert result["critical_count"] == 1


    class TestRiskScoring:
        def test_high_risk(self):
            result = score_well_risk(
                well={"well_name": "W-1", "api_number": "42-1"},
                deadlines=[{"days_remaining": -5}],
                production_issues=["GOR exceeded", "Zero production"],
                violation_count=2,
                flaring_exposure={"pct_of_max": 105},
            )
            assert result["risk_level"] == "HIGH"
            assert result["risk_score"] >= 70

These tests are straightforward and fast: they need no fixtures, only the function under test and literal inputs.
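
One reason such functions are easy to test is that the current date is passed in rather than read from the clock. The sketch below illustrates that idea with a hypothetical `count_critical` function and an assumed 7-day threshold; it is not the real scan_deadlines logic.

```python
from datetime import date

CRITICAL_WINDOW_DAYS = 7  # assumed threshold, chosen only for this example


def count_critical(permits, reference_date):
    """Count permits expiring within the window, relative to reference_date."""
    critical = 0
    for permit in permits:
        expires = date.fromisoformat(permit["expiration_date"])
        days_remaining = (expires - reference_date).days
        if 0 <= days_remaining <= CRITICAL_WINDOW_DAYS:
            critical += 1
    return critical


permits = [
    {"permit_number": "P-1", "expiration_date": "2025-04-14"},
    {"permit_number": "P-2", "expiration_date": "2025-06-01"},
]

# Four days before P-1 expires: only P-1 falls inside the window.
assert count_critical(permits, date(2025, 4, 10)) == 1
# Months earlier: nothing is critical yet.
assert count_critical(permits, date(2025, 1, 1)) == 0
```

Because the result depends only on the arguments, the same assertions hold no matter when the suite runs.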

Adding a New Test

Follow this checklist when adding tests to an existing service:

  1. Create or extend a test file in services/{service}/tests/test_{feature}.py
  2. Add an __init__.py if the tests directory does not already have one
  3. Add fixtures to conftest.py if you need shared mock infrastructure
  4. Organize tests in classes grouped by feature or endpoint
  5. Use async for business logic tests and sync for TestClient API tests
  6. Test the happy path, error cases, and edge cases
  7. Run the tests to verify they pass:
    cd services/{service-name}
    poetry run pytest tests/test_{feature}.py -v

Adding Tests for a New Service

When creating tests for a brand-new service:

  1. Create the tests/ directory with __init__.py
  2. Create conftest.py with the appropriate mock infrastructure:
    • PostgreSQL-backed service: build a FakePostgresPool
    • Redis-backed service: build a mock Redis client with AsyncMock and side_effect
    • Graph-backed service: build a FakeAgeGraph
  3. Create an app_client fixture that patches the service’s global dependencies and replaces the lifespan:
    @pytest.fixture
    def app_client(fake_pg):
        import my_service.main as main_module
        main_module.pg = fake_pg

        @asynccontextmanager
        async def noop_lifespan(app):
            yield

        main_module.app.router.lifespan_context = noop_lifespan
        return TestClient(main_module.app)
  4. Add [tool.pytest.ini_options] with asyncio_mode = "auto" in pyproject.toml
  5. Add dev dependencies for pytest, pytest-asyncio, and httpx:
    [tool.poetry.group.dev.dependencies]
    pytest = "^8.0"
    pytest-asyncio = "^0.23"
    httpx = "^0.28"