Knowledge Graph Service
The Knowledge Graph Service manages the Apache AGE graph database that stores oil and gas entities (wells, leases, fields, operators, permits, infrastructure) and their relationships. It provides entity CRUD, whitelisted graph-query templates (tenant-scoped, injection-safe; legacy raw Cypher deprecated), context assembly for agent skill injection, impact propagation analysis, and a managed entity system with schema-driven validation.
Overview
AEGIS uses a property graph to represent the oil and gas domain. The graph is stored in PostgreSQL using the Apache AGE extension, which adds openCypher query support. This service provides:
- Entity CRUD — Create, read, update, and delete vertices (wells, leases, fields, operators, permits) and edges (OPERATED_BY, LOCATED_IN, etc.)
- Context assembly — Assemble rich entity context for agent skill injection (Tier 3.5), combining entity properties, relationships, and domain-specific data
- Impact propagation — Traverse the graph bidirectionally from a seed entity to analyze production and financial exposure across connected entities
- Managed entities — A schema-driven entity system where entity types, fields, and validation rules are defined in admin tables and enforced at runtime
- Detection engine — Rule-based event detection that evaluates conditions against entity data on cron schedules
- Infrastructure tree — Full infrastructure hierarchy from projects down to wells with status rollup
Port & Language
| Property | Value |
|---|---|
| Port | 8003 |
| Language | Python 3.12 |
| Framework | FastAPI |
| Entry point | src/knowledge_graph/main.py |
Key Endpoints
Entity CRUD (Legacy Graph)
| Method | Path | Description |
|---|---|---|
| POST | /entities | Create a vertex in the knowledge graph. |
| GET | /entities/{label}/{entity_id} | Get a vertex by label and entity ID. |
| GET | /entities/{label} | List vertices of a given label (max 500). |
| PUT | /entities/{label}/{entity_id} | Update properties on a vertex. |
| DELETE | /entities/{label}/{entity_id} | Delete a vertex and its edges. |
| PUT | /entities/Well/{entity_id}/production | Update production data on a Well vertex from Form PR. |
Edge CRUD
| Method | Path | Description |
|---|---|---|
| POST | /edges | Create an edge between two vertices. |
| GET | /edges/{label}/{entity_id} | Get edges for a vertex (filterable by edge label and direction). |
Graph Query
| Method | Path | Description |
|---|---|---|
| POST | /query | Run a whitelisted graph-query template: {template_id, params} is resolved against graph_query_templates and executed through a TenantScopedConnection (tenant-scoped, injection-safe). Optional tenant_id query param (default dev tenant). The legacy raw {query, columns} body is still accepted but deprecated (logs a warning); removing it, along with migrating the orchestration/skills query layer onto templates, is work deferred from R34 Phase 1 to R35. |
| POST | /entity/resolve | Resolve a natural-language entity reference (R34 P3). Body {query, entity_types?, context_entity_ids?, max_proximity_hops?, limit?} plus an optional tenant_id query param. Returns an EntityResolveResult; see Entity Resolution below. |
Five templates are registered (R34 P1): operator_wells, operator_permits,
operator_flaring_auths, operator_flaring_auths_via_lease, edge_usage. Value params
(e.g. operator_id) are escaped; label params (e.g. edge_label) are whitelist-validated
against EDGE_LABELS / VERTEX_LABELS. See the
API reference for the full template table.
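A request body for POST /query might be assembled as in the sketch below. Only the five template ids and the {template_id, params} shape come from the text above; the helper name build_template_query, the client-side check, and the sample operator_id value are assumptions made for illustration. The service remains the authority on which templates exist.

```python
import json

# Template ids listed above (R34 P1). This client-side copy is only an
# illustrative early check; the service resolves templates itself.
REGISTERED_TEMPLATES = frozenset({
    "operator_wells",
    "operator_permits",
    "operator_flaring_auths",
    "operator_flaring_auths_via_lease",
    "edge_usage",
})


def build_template_query(template_id: str, params: dict) -> str:
    """Return a JSON body for POST /query (hypothetical helper)."""
    if template_id not in REGISTERED_TEMPLATES:
        raise ValueError(f"unknown template: {template_id!r}")
    return json.dumps({"template_id": template_id, "params": params}, sort_keys=True)


# "OP-001" is a made-up operator id used only for the example.
body = build_template_query("operator_wells", {"operator_id": "OP-001"})
print(body)
```

Rejecting unknown ids before the request is sent is a convenience only; it does not replace the server-side whitelist.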
Context Assembly (Tier 3.5)
| Method | Path | Description |
|---|---|---|
| GET | /context/assemble/managed/{entity_id} | Assemble context for a managed entity (R27). Requires entity_type_key query parameter. |
| GET | /context/assemble/{well_api} | Assemble context for a well by API number (legacy). |
Impact Propagation
| Method | Path | Description |
|---|---|---|
| GET | /impact/{label}/{entity_id} | Bidirectional impact analysis from a seed entity. |
| GET | /impact/outward/{label}/{entity_id} | Outward impact: walk downstream through infrastructure. |
| GET | /impact/inward/{label}/{entity_id} | Inward impact: walk upstream from infrastructure to wells. |
| GET | /impact/tree | Full infrastructure tree from project roots to wells. |
Managed Entity CRUD (R25)
These endpoints are served via the entity_manager_routes router:
| Method | Path | Description |
|---|---|---|
| GET | /managed-entities | List managed entities with filtering and pagination. |
| POST | /managed-entities | Create a managed entity with schema validation. |
| GET | /managed-entities/{id} | Get a managed entity by ID. |
| PUT | /managed-entities/{id} | Update a managed entity. |
| DELETE | /managed-entities/{id} | Delete a managed entity. |
Admin Routes (Entity Type Schema)
| Method | Path | Description |
|---|---|---|
| GET | /admin/entity-types | List entity type definitions. |
| POST | /admin/entity-types | Create an entity type with field definitions. |
| PUT | /admin/entity-types/{id} | Update an entity type. |
| POST | /admin/entity-types/import | Import entity types from a schema definition. |
| GET | /admin/relationship-types | List relationship type definitions. |
| POST | /admin/relationship-types | Create a relationship type. |
Public Routes
| Method | Path | Description |
|---|---|---|
| GET | /entity-types | List entity types (public, no auth required). |
| GET | /entity-types/{id} | Get entity type details. |
| GET | /entity-types/by-key/{key} | Get entity type by key. |
| GET | /relationship-types | List relationship types (public). |
| GET | /relationship-rules | List relationship rules. |
| GET | /event-detection-rules | List detection rules. |
Seeding
| Method | Path | Description |
|---|---|---|
| POST | /seed | Load sample RRC data for development. |
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check. |
Architecture
Module Breakdown
src/knowledge_graph/
├── main.py # FastAPI app, endpoint definitions, lifespan
├── config.py # Settings from environment variables
├── schemas.py # Pydantic request/response models
├── age_connection.py # AgePool — asyncpg pool with AGE setup
├── normalization.py # Shared trigram normalizer + uuid/label helpers (R34 0b/2)
├── connections/ # TenantScopedConnection — structural tenant boundary (R34 P1)
├── queries/ # graph_query_templates + entity_queries (R34 P1/P2)
├── providers/ # EntitySearchProvider + Postgres impl (R34 P2)
├── crud.py # GraphCrud — vertex/edge CRUD via openCypher
├── context.py # ContextAssembler — Tier 3.5 context for agents
├── impact.py # ImpactTraverser — propagation analysis
├── seed.py # Sample data seeder
├── admin_routes.py # Admin entity type management routes
├── public_routes.py # Public entity type read routes
├── entity_manager_routes.py # Managed entity CRUD routes
├── entity_crud_service.py # Schema-validated entity CRUD logic
├── entity_type_crud.py # Entity type admin CRUD
├── entity_type_schemas.py # Entity type Pydantic schemas
├── entity_validator.py # Field-level validation against entity type schema
├── entity_mover.py # Move entities between types
├── entity_relationship_service.py # Relationship CRUD for managed entities
├── relationship_type_crud.py # Relationship type admin CRUD
├── relationship_type_schemas.py # Relationship type Pydantic schemas
├── detection_engine.py # Rule-based event detection engine
├── event_detection_crud.py # Detection rule CRUD
├── event_detection_schemas.py # Detection rule schemas
├── formula_evaluator.py # Formula evaluation for computed fields
├── formula_validator.py # Formula syntax validation
├── rrc_lookup.py # RRC data lookup service
├── schema_importer.py # Bulk entity type import
├── script_context.py # Script execution context
└── script_executor.py # Safe script execution for detection rules
Apache AGE Integration
The AgePool class wraps asyncpg with the setup commands required for Apache AGE:
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
These statements must run on every new connection before any Cypher query. The AgePool runs them automatically during connection initialization; direct psql connections must run them manually.
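The per-connection setup can be sketched with asyncpg's pool init hook, which asyncpg calls once for each connection it opens. This is a minimal illustration and not the actual AgePool code: the names init_age_connection and create_age_pool are assumptions, and the recording stub exists only to show the order of the statements without needing a database.

```python
import asyncio

AGE_SETUP_STATEMENTS = (
    "LOAD 'age';",
    'SET search_path = ag_catalog, "$user", public;',
)


async def init_age_connection(conn) -> None:
    """Run the AGE setup statements on a freshly opened connection."""
    for statement in AGE_SETUP_STATEMENTS:
        await conn.execute(statement)


async def create_age_pool(dsn: str):
    """Create an asyncpg pool whose connections are AGE-ready (hypothetical helper)."""
    import asyncpg  # imported lazily so the sketch loads without the driver

    # asyncpg invokes `init` once for every new connection in the pool.
    return await asyncpg.create_pool(dsn, init=init_age_connection)


class _RecordingConnection:
    """Stand-in for an asyncpg connection, used only to show the call order."""

    def __init__(self) -> None:
        self.statements: list[str] = []

    async def execute(self, statement: str) -> None:
        self.statements.append(statement)


demo = _RecordingConnection()
asyncio.run(init_age_connection(demo))
print(demo.statements)
```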
Cypher Query Patterns
All graph queries use openCypher syntax via Apache AGE. The GraphCrud class wraps queries in the AGE SQL function:
SELECT * FROM cypher('oilgas', $$
MATCH (w:Well {api_number: '42-123-45678'})
RETURN w
$$) AS (v agtype);
Entity Resolution (R34)
Agents and skills refer to entities by the names a human would type — "Mitchell Ranch 1H", the SCADA tag "MR1H", a typo like "Mitchel Ranch". The entity-resolution layer maps those strings to canonical graph entities. It is built in phases on top of a thin relational index that mirrors the AGE graph:
- kg_entity_name_index: one row per entity, with entity_id (the AGE vertex’s uuid property), tenant_id, label, display_name, and a display_name_normalized column. Canonical properties stay in the graph; this table exists only to make name search fast and fuzzy.
- kg_entity_aliases: external-system aliases (SCADA tags, PI points, SAP IDs) pointing at an entity_id, each with a normalized form and source_system.
Both normalized columns use the same normalize_name() (lowercase, strip - _ . / \ + whitespace) at write time (the 0b sync hooks) and at read time (the provider) — divergence would silently break matching, so the function lives once in normalization.py.
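A normalizer matching that description could look like the sketch below. It reads "+ whitespace" as "and whitespace", so a literal plus sign is left in place; that reading is an assumption, and normalization.py remains the source of truth.

```python
import re

# Characters removed by the normalizer: hyphen, underscore, dot, slash,
# backslash, and any whitespace.
_STRIP_PATTERN = re.compile(r"[-_./\\\s]+")


def normalize_name(name: str) -> str:
    """Lowercase the name and drop separator characters."""
    return _STRIP_PATTERN.sub("", name.lower())


for raw in ("Mitchell Ranch 1H", "MR-1H", "mr_1h"):
    print(raw, "->", normalize_name(raw))
```

Because the SCADA-style tag and its underscore variant collapse to the same string, an exact comparison on the normalized column is enough to match them.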
Fuzzy matching uses PostgreSQL’s pg_trgm extension (% operator + similarity()), backed by GIN trigram indexes on the normalized columns. The % operator’s recall is governed by pg_trgm.similarity_threshold (default 0.3, which is also R34’s no-match floor).
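To build intuition for the scores involved, the sketch below approximates pg_trgm's similarity() in plain Python for a single already-normalized word: the word is padded with two leading spaces and one trailing space, split into three-character windows, and scored as shared trigrams over the union. It is an approximation for illustration only (pg_trgm also splits on non-alphanumeric characters, which normalized input no longer contains), and the real matching happens in PostgreSQL.

```python
def trigrams(text: str) -> set[str]:
    """Trigram set for one normalized word, using pg_trgm-style padding."""
    if not text:
        return set()
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by the union of both trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


score = trigram_similarity("mitchellranch1h", "mitchelranch1h")
print(round(score, 2))
```

Under this approximation the single-letter typo shares 14 of 17 distinct trigrams with the canonical name, about 0.82, which is well above the 0.3 floor.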
The Phase 2 read path — EntitySearchProvider (providers/) — is the search-layer abstraction Phase 3’s orchestration tool programs against. The Postgres implementation runs three relational templates (queries/entity_queries.py) through a TenantScopedConnection:
| Method | Behavior |
|---|---|
| exact_match("entity_id", …) | Canonical id lookup → matched_on="exact_id", confidence 1.0. |
| exact_match(<other key>, …) | Normalized exact alias hit → matched_on="exact_alias", confidence 1.0 (an exact alias is as strong as an exact id, R34 Q4). |
| search(query, entity_types, limit) | Combined trigram name UNION ALL alias search (a single query, because one asyncpg connection is not concurrency-safe), deduped per entity, ordered by similarity. |
Each result is a shared ResolvedEntity (aegis_shared.models) carrying entity_id, type_key, the canonical display_name (never the matched alias — the alias rides separately in matched_alias), matched_on, a similarity-or-exact confidence, and properties hydrated from AGE in a single batched MATCH (v) WHERE v.uuid IN [...] lookup.
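The batched hydration lookup could be built roughly as follows. The function name build_hydration_sql is hypothetical, and the sketch assumes the uuid property holds canonical UUID strings; under that assumption, parsing every id with uuid.UUID before quoting it keeps arbitrary text out of the Cypher body.

```python
import uuid


def build_hydration_sql(entity_ids: list[str], graph_name: str = "oilgas") -> str:
    """Wrap one batched uuid lookup in AGE's cypher() SQL function."""
    if not entity_ids:
        raise ValueError("entity_ids must not be empty")
    if not graph_name.isidentifier():
        raise ValueError(f"invalid graph name: {graph_name!r}")
    # uuid.UUID raises ValueError for anything that is not a UUID, so the
    # quoted literals below cannot carry Cypher syntax.
    literals = ", ".join(f"'{uuid.UUID(entity_id)}'" for entity_id in entity_ids)
    return (
        f"SELECT * FROM cypher('{graph_name}', $$\n"
        f"  MATCH (v) WHERE v.uuid IN [{literals}]\n"
        f"  RETURN v\n"
        f"$$) AS (v agtype);"
    )


print(build_hydration_sql(["123e4567-e89b-12d3-a456-426614174000"]))
```

One query for all candidates keeps hydration to a single round trip regardless of how many entities the search returned.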
Two guarantees are worth naming explicitly:
- By construction: tenant scope (the fetch_sql CTE pre-filters both tables by the bound tenant_id, so a query physically cannot see another tenant’s rows) and the always-canonical display_name.
- By test: trigram recall, dedupe precedence, and zero cross-tenant leakage, proven in the @pytest.mark.integration real-Postgres lane (tests/test_entity_search_integration.py), since the mock suite cannot reproduce pg_trgm or the tenant CTE. Run it with poetry run pytest -m integration against the docker-compose Postgres+AGE.
The Phase 3 decision — ResolutionPipeline (entity_resolution.py), behind POST /entity/resolve — turns those scored candidates into a decision:
- Search: an exact pass (exact_match) ∪ the trigram search, merged exact-first (trigram alone never yields exact_* provenance). Context-broadened recall: when nothing matched strongly by name and the conversation has context, a short partial query ("1H") is matched against the 1-hop graph neighbors of the in-context entities, so "1H" after “pull up the Mitchell Ranch lease” resolves to that lease’s 1H well. Lenient matching (word_similarity / containment) is safe because the candidate set is bounded to context neighbors and the entity_types scope; a uniquely-recalled candidate auto-selects, two (Mitchell Ranch 1H + Delaware 1H on the same lease) → ask-user.
- Proximity: graph distance from each candidate to the nearest context_entity_id, computed in one batched Cypher (MATCH p = (ctx)-[*1..h]-(cand) RETURN min(length(p))); 1 hop → 1.0, 2 → 0.5, 3 → 0.25, unconnected → 0. AGE’s shortestPath is unsupported, so a bound variable-length path is the equivalent.
- Composite: 0.6·name + 0.3·proximity + 0.1·type_hint (weights from a config dict, the Phase 6 calibration seam). This ranks; it does not decide.
- Decide (Q2 three-outcome): auto-select iff the top is an exact id/alias or its name similarity ≥ 0.90 with no other ≥ 0.70; below 0.30 → no-match; everything else (including proximity-broken ties) → ask-user.
The decision keys on name similarity, never the composite score (D6). Proximity ranks candidates but must never trigger an auto-select — otherwise two same-named wells differing only by graph proximity would be silently auto-picked, the precision bug v5’s Q2 reframe fixed.
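The scoring and decision rules above can be sketched as follows. This is one reading of the rules and not the ResolutionPipeline code: the Candidate type, the constant names, and the choice to apply the 0.30 floor to the best name similarity in the set are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

WEIGHTS = {"name": 0.6, "proximity": 0.3, "type_hint": 0.1}
AUTO_SELECT_MIN = 0.90   # top candidate's name similarity
RUNNER_UP_MAX = 0.70     # no other candidate may reach this
NO_MATCH_FLOOR = 0.30


@dataclass(frozen=True)
class Candidate:
    entity_id: str
    name_similarity: float
    hops: Optional[int] = None  # distance to nearest context entity, None if unconnected
    type_hint: float = 0.0
    exact: bool = False         # exact id or exact alias hit


def proximity_score(hops: Optional[int], max_hops: int = 3) -> float:
    """1 hop -> 1.0, 2 -> 0.5, 3 -> 0.25, unconnected or out of range -> 0."""
    if hops is None or hops < 1 or hops > max_hops:
        return 0.0
    return 0.5 ** (hops - 1)


def composite(c: Candidate) -> float:
    """Ranking score only; the decision below never reads it."""
    return (WEIGHTS["name"] * c.name_similarity
            + WEIGHTS["proximity"] * proximity_score(c.hops)
            + WEIGHTS["type_hint"] * c.type_hint)


def decide(candidates: list[Candidate]) -> tuple[str, Optional[Candidate]]:
    if not candidates:
        return "no_match", None
    ranked = sorted(candidates, key=lambda c: (c.exact, composite(c)), reverse=True)
    top, rest = ranked[0], ranked[1:]
    if top.exact:
        return "auto_select", top
    if max(c.name_similarity for c in ranked) < NO_MATCH_FLOOR:
        return "no_match", None
    # Thresholds key on name similarity, so proximity can rank but not decide.
    if top.name_similarity >= AUTO_SELECT_MIN and all(
        c.name_similarity < RUNNER_UP_MAX for c in rest
    ):
        return "auto_select", top
    return "ask_user", None


near = Candidate("well-a", 0.95, hops=1)
far = Candidate("well-b", 0.95, hops=3)
print(decide([near, far])[0])
```

In the usage lines, the two same-named wells differ only in proximity. The nearer one ranks first, but the outcome is ask_user because the runner-up's name similarity is above 0.70.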
The result is an EntityResolveResult (aegis_shared.models): matches, auto_selected, total_above_threshold, disambiguating_fields (property keys that vary across matches — feed the agent’s “which 1H?” prompt), and exact_match. Proximity hop-scores + cross-tenant isolation are proven in tests/test_entity_resolution_integration.py.
It also carries a compact resolution_trace block (R34 P6) — span fuel for the orchestration-side Langfuse tiers (entity_resolve.*). It surfaces the per-stage facts the final result collapses away: exact_match_found, candidates_returned, the top candidate’s top_name_similarity / top_proximity_score / top_composite_score + weighted composite_components, and the outcome. It is observability-only — additive, never seen by the LLM (the tool result exposes matches/auto_selected). See the Orchestration Engine tracing section and docs/specs/phases/R34-phase-6-calibration-and-demo-runbook.md.
On the orchestration side, entity_resolve is the first capability-backed LLM tool: it exposes only query + entity_types; the orchestrator injects context_entity_ids (from recent_entity_context), the tenant, and limits server-side, so the LLM can’t reach them. Auto-selected entities feed back into recent_entity_context (FIFO-10, Redis + GraphState) to boost later mentions, and each auto-select fires the on_entity_resolve audit rule. See the Orchestration Engine docs.
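The FIFO-10 behaviour of recent_entity_context can be pictured with a bounded deque. The sketch below is an in-memory stand-in, not the Redis + GraphState implementation; the class name is hypothetical, and skipping an id that is already present is an assumption about how repeats are handled.

```python
from collections import deque


class RecentEntityContext:
    """In-memory stand-in for the FIFO-10 context buffer (hypothetical)."""

    def __init__(self, capacity: int = 10) -> None:
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self._ids: deque[str] = deque(maxlen=capacity)

    def record(self, entity_id: str) -> None:
        """Remember an auto-selected entity; repeats keep their original slot."""
        if entity_id not in self._ids:
            self._ids.append(entity_id)

    def context_entity_ids(self) -> list[str]:
        """Ids the orchestrator would inject server-side, oldest first."""
        return list(self._ids)


ctx = RecentEntityContext()
for n in range(12):
    ctx.record(f"entity-{n}")
print(ctx.context_entity_ids())
```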
Context Assembly
R34 Phase 4 rebuilt context assembly as the context_assemble capability
(context_assembly.py): one tenant-scoped neighborhood walk
(TenantScopedConnection.run_cypher) that returns each neighbor’s properties
inline, yielding two views:
- structured (EntityContext): complete. Every field, and relationships grouped by first-hop edge label up to a 1000-per-relationship sanity ceiling. The rules/governance layer reads this; the text caps below never touch it.
- context_text: truncated for the LLM. A per-relationship cap (default 25) with explicit "…and N more … not shown" overflow markers, an 8K-token backstop (tiktoken cl100k_base proxy), and the domain-filtered significant/plain field split.
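The per-relationship cap on context_text can be illustrated as below. The function name and the line format are assumptions; the marker text is copied from the description above, including its elided middle, which is kept as-is. The token backstop and the field split are not shown.

```python
DEFAULT_PER_RELATIONSHIP_CAP = 25


def render_relationship(label: str, neighbors: list[str],
                        cap: int = DEFAULT_PER_RELATIONSHIP_CAP) -> list[str]:
    """Render at most `cap` neighbors, then one explicit overflow marker."""
    lines = [f"{label}: {name}" for name in neighbors[:cap]]
    hidden = len(neighbors) - cap
    if hidden > 0:
        # Marker wording follows the description above; its middle is elided there.
        lines.append(f"…and {hidden} more … not shown")
    return lines


# Thirty made-up well names under one relationship label.
wells = [f"Well {n}" for n in range(1, 31)]
rendered = render_relationship("OPERATED_BY", wells)
print(rendered[-1])
```

The explicit marker tells the model that the list was cut, so it does not treat 25 visible neighbors as the full set.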
The walk anchors on either the AGE vertex uuid (what entity_resolve returns) or
the business entity_id, so the new entity_resolve → context_assemble handoff and
the legacy skill-injection path hit one code path. On the orchestration side
context_assemble is a state-aware LLM tool (sibling to entity_resolve) exposing
only entity_id + entity_type; the orchestrator injects domain_filter from the
active skills’ domain_tags server-side. It is a pure read — it does not update
recent_entity_context.
The legacy well-API ContextAssembler is gone — GET /context/assemble/{well_api}
is now a thin adapter (assemble_well_sections) over the same walk, mapping the
result back into the legacy sections shape the hardcoded Rule 37/32 tools read
(R35 deletes the adapter + those tools). Caps/ordering proven in
tests/test_context_assembly.py; the real walk + tenant isolation in
tests/test_context_assembly_integration.py.
Detection Engine
The DetectionEngine evaluates detection rules on a 60-second cron loop:
- Load all active detection rules from the database
- For each rule, evaluate its conditions against entity data
- If triggered, generate an event and optionally run a script action
- Log results for monitoring
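The steps above can be sketched as a single evaluation cycle plus a loop that repeats it every 60 seconds. The DetectionRule shape, the callable condition, and the event dictionary are assumptions for illustration; the real engine loads rules from the database and evaluates conditions defined there.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DetectionRule:
    name: str
    condition: Callable[[dict], bool]             # hypothetical rule representation
    action: Optional[Callable[[dict], None]] = None
    active: bool = True


def run_detection_cycle(rules: list[DetectionRule], entities: list[dict]) -> list[dict]:
    """Evaluate every active rule against every entity and collect events."""
    events = []
    for rule in rules:
        if not rule.active:
            continue
        for entity in entities:
            if rule.condition(entity):
                event = {"rule": rule.name, "entity_id": entity["entity_id"]}
                events.append(event)
                if rule.action is not None:
                    rule.action(event)  # optional script action
    return events


async def detection_loop(load_rules, load_entities, interval_seconds: float = 60.0) -> None:
    """Repeat the cycle forever, logging a summary after each pass."""
    while True:
        events = run_detection_cycle(load_rules(), load_entities())
        print(f"detection cycle produced {len(events)} event(s)")
        await asyncio.sleep(interval_seconds)


rules = [DetectionRule("zero_oil", lambda e: e.get("oil_bbl", 0) == 0)]
entities = [{"entity_id": "w1", "oil_bbl": 0}, {"entity_id": "w2", "oil_bbl": 120}]
print(run_detection_cycle(rules, entities))
```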
Impact Propagation
The ImpactTraverser performs graph traversal to calculate exposure:
- Bidirectional: Walk outward and inward from a seed entity up to a configurable max depth
- Production exposure: Sum oil (BBL) and gas (MCF) production across affected wells
- Dollar exposure: Calculate financial impact using WTI spot and Henry Hub prices
- Infrastructure tree: Build a nested tree from InfrastructureProject roots down to wells with status rollup
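The exposure arithmetic can be illustrated with a short sketch. The property names oil_bbl and gas_mcf are hypothetical, and the sketch assumes the Henry Hub price is applied directly per MCF with no MMBtu conversion; the ImpactTraverser may compute this differently.

```python
def dollar_exposure(wells: list[dict],
                    wti_usd_per_bbl: float = 72.0,
                    henry_hub_usd_per_mcf: float = 2.50) -> dict:
    """Sum production across affected wells and price it."""
    oil_bbl = sum(w.get("oil_bbl", 0.0) for w in wells)
    gas_mcf = sum(w.get("gas_mcf", 0.0) for w in wells)
    usd = oil_bbl * wti_usd_per_bbl + gas_mcf * henry_hub_usd_per_mcf
    return {"oil_bbl": oil_bbl, "gas_mcf": gas_mcf, "usd": usd}


# Made-up production figures for two affected wells.
affected = [{"oil_bbl": 1000, "gas_mcf": 2000}, {"oil_bbl": 500}]
print(dollar_exposure(affected))
```

With the default prices, 1,500 BBL and 2,000 MCF come to 1,500 × 72 + 2,000 × 2.50 = 113,000 USD.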
Dependencies
Python Packages
| Package | Version | Purpose |
|---|---|---|
| fastapi | ^0.115 | Web framework |
| uvicorn | ^0.34 | ASGI server |
| asyncpg | ^0.29 | PostgreSQL async driver (AGE queries) |
| anthropic | ^0.52 | Claude API (used for some context operations) |
| httpx | ^0.28 | HTTP client |
| aegis-shared | local | Shared models and DB helpers |
Infrastructure Dependencies
| Dependency | Purpose |
|---|---|
| PostgreSQL 15 + Apache AGE | Graph database storage |
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| KNOWLEDGE_GRAPH_HOST | 0.0.0.0 | Bind address |
| KNOWLEDGE_GRAPH_PORT | 8003 | Bind port |
| DATABASE_URL | postgresql://aegis:aegis_local@localhost:5432/aegis | PostgreSQL connection |
| AGE_GRAPH_NAME | oilgas | Name of the Apache AGE graph |
| WTI_SPOT_USD | 72.0 | WTI crude oil price for impact dollar calculations |
| HENRY_HUB_USD | 2.50 | Henry Hub natural gas price for impact dollar calculations |
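Reading these variables with their defaults might look like the sketch below. It is a standard-library stand-in, not the contents of config.py; the Settings field names and the load_settings helper are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    database_url: str
    age_graph_name: str
    wti_spot_usd: float
    henry_hub_usd: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from the environment, using the defaults in the table."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("KNOWLEDGE_GRAPH_HOST", "0.0.0.0"),
        port=int(env.get("KNOWLEDGE_GRAPH_PORT", "8003")),
        database_url=env.get(
            "DATABASE_URL", "postgresql://aegis:aegis_local@localhost:5432/aegis"
        ),
        age_graph_name=env.get("AGE_GRAPH_NAME", "oilgas"),
        wti_spot_usd=float(env.get("WTI_SPOT_USD", "72.0")),
        henry_hub_usd=float(env.get("HENRY_HUB_USD", "2.50")),
    )


defaults = load_settings({})
print(defaults.port, defaults.age_graph_name)
```

Passing a mapping instead of reading os.environ directly makes the defaults easy to check in a test.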
Running Locally
cd services/knowledge-graph-service
poetry install
poetry run uvicorn knowledge_graph.main:app --reload --port 8003
Seeding the Graph
After starting the service, seed it with sample data:
curl -X POST http://localhost:8003/seed
This populates the graph with sample wells, leases, fields, operators, and permits from the data/rrc-samples/ CSV files.
The knowledge graph service requires PostgreSQL with the Apache AGE extension installed. The docker-compose.yml file configures PostgreSQL with AGE automatically. The infrastructure/docker/postgres/init.sql script creates the oilgas graph and all required tables.