Knowledge Graph Service

The Knowledge Graph Service manages the Apache AGE graph database that stores oil and gas entities (wells, leases, fields, operators, permits, infrastructure) and their relationships. It provides entity CRUD, whitelisted graph-query templates (tenant-scoped, injection-safe; legacy raw Cypher deprecated), context assembly for agent skill injection, impact propagation analysis, and a managed entity system with schema-driven validation.

Overview

AEGIS uses a property graph to represent the oil and gas domain. The graph is stored in PostgreSQL using the Apache AGE extension, which adds openCypher query support. This service provides:

Entity CRUD — Create, read, update, and delete vertices (wells, leases, fields, operators, permits) and edges (OPERATED_BY, LOCATED_IN, etc.)
Context assembly — Assemble rich entity context for agent skill injection (Tier 3.5), combining entity properties, relationships, and domain-specific data
Impact propagation — Traverse the graph bidirectionally from a seed entity to analyze production and financial exposure across connected entities
Managed entities — A schema-driven entity system where entity types, fields, and validation rules are defined in admin tables and enforced at runtime
Detection engine — Rule-based event detection that evaluates conditions against entity data on cron schedules
Infrastructure tree — Full infrastructure hierarchy from projects down to wells with status rollup

Port & Language

Property	Value
Port	`8003`
Language	Python 3.12
Framework	FastAPI
Entry point	`src/knowledge_graph/main.py`

Key Endpoints

Entity CRUD (Legacy Graph)

Method	Path	Description
`POST`	`/entities`	Create a vertex in the knowledge graph.
`GET`	`/entities/{label}/{entity_id}`	Get a vertex by label and entity ID.
`GET`	`/entities/{label}`	List vertices of a given label (max 500).
`PUT`	`/entities/{label}/{entity_id}`	Update properties on a vertex.
`DELETE`	`/entities/{label}/{entity_id}`	Delete a vertex and its edges.
`PUT`	`/entities/Well/{entity_id}/production`	Update production data on a Well vertex from Form PR.

Edge CRUD

Method	Path	Description
`POST`	`/edges`	Create an edge between two vertices.
`GET`	`/edges/{label}/{entity_id}`	Get edges for a vertex (filterable by edge label and direction).

Graph Query

Method	Path	Description
`POST`	`/query`	Run a whitelisted graph-query template: `{template_id, params}` is resolved against `graph_query_templates` and executed through a `TenantScopedConnection` (tenant-scoped, injection-safe). Optional `tenant_id` query param (default dev tenant). The legacy raw `{query, columns}` body is still accepted but deprecated (it logs a warning). Its removal, along with migrating the orchestration/skills query layer onto templates, was deferred from R34 Phase 1 to R35.
`POST`	`/entity/resolve`	Resolve a natural-language entity reference (R34 P3). Body `{query, entity_types?, context_entity_ids?, max_proximity_hops?, limit?}` + optional `tenant_id` query param. Returns an `EntityResolveResult` — see Entity Resolution below.

Five templates are registered (R34 P1): operator_wells, operator_permits, operator_flaring_auths, operator_flaring_auths_via_lease, edge_usage. Value params (e.g. operator_id) are escaped; label params (e.g. edge_label) are whitelist-validated against EDGE_LABELS / VERTEX_LABELS. See the API reference for the full template table.
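
The split between escaped value params and whitelisted label params can be sketched as follows. This is a minimal illustration, not the service's code: the `EDGE_LABELS` contents, the Cypher text of the two templates, the `entity_id` property, the backslash escaping rule, and the helper names `escape_value`, `validate_label`, and `render_template` are all assumptions.

```python
# Hypothetical whitelist; the real EDGE_LABELS is defined by the service.
EDGE_LABELS = {"OPERATED_BY", "LOCATED_IN"}


def escape_value(value: str) -> str:
    """Quote a value as a Cypher string literal (assumes backslash escapes)."""
    if "$$" in value:
        # "$$" would terminate the dollar-quoted Cypher body in the AGE wrapper.
        raise ValueError("value must not contain '$$'")
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def validate_label(label: str, allowed: set[str]) -> str:
    """Labels cannot be quoted, so they are checked against a whitelist."""
    if label not in allowed:
        raise ValueError(f"label not allowed: {label!r}")
    return label


def render_template(template_id: str, params: dict[str, str]) -> str:
    """Render one of two illustrative templates; unknown ids are rejected."""
    if template_id == "edge_usage":
        label = validate_label(params["edge_label"], EDGE_LABELS)
        return f"MATCH ()-[e:{label}]->() RETURN count(e)"
    if template_id == "operator_wells":
        return (
            "MATCH (w:Well)-[:OPERATED_BY]->(o:Operator {entity_id: "
            + escape_value(params["operator_id"])
            + "}) RETURN w"
        )
    raise ValueError(f"unknown template: {template_id!r}")


print(render_template("edge_usage", {"edge_label": "OPERATED_BY"}))
```

Because callers pass only a `template_id` and parameters, the Cypher text itself never comes from the request body.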

Context Assembly (Tier 3.5)

Method	Path	Description
`GET`	`/context/assemble/managed/{entity_id}`	Assemble context for a managed entity (R27). Requires `entity_type_key` query parameter.
`GET`	`/context/assemble/{well_api}`	Assemble context for a well by API number (legacy).

Impact Propagation

Method	Path	Description
`GET`	`/impact/{label}/{entity_id}`	Bidirectional impact analysis from a seed entity.
`GET`	`/impact/outward/{label}/{entity_id}`	Outward impact: walk downstream through infrastructure.
`GET`	`/impact/inward/{label}/{entity_id}`	Inward impact: walk upstream from infrastructure to wells.
`GET`	`/impact/tree`	Full infrastructure tree from project roots to wells.

Managed Entity CRUD (R25)

These endpoints are served via the entity_manager_routes router:

Method	Path	Description
`GET`	`/managed-entities`	List managed entities with filtering and pagination.
`POST`	`/managed-entities`	Create a managed entity with schema validation.
`GET`	`/managed-entities/{id}`	Get a managed entity by ID.
`PUT`	`/managed-entities/{id}`	Update a managed entity.
`DELETE`	`/managed-entities/{id}`	Delete a managed entity.

Admin Routes (Entity Type Schema)

Method	Path	Description
`GET`	`/admin/entity-types`	List entity type definitions.
`POST`	`/admin/entity-types`	Create an entity type with field definitions.
`PUT`	`/admin/entity-types/{id}`	Update an entity type.
`POST`	`/admin/entity-types/import`	Import entity types from a schema definition.
`GET`	`/admin/relationship-types`	List relationship type definitions.
`POST`	`/admin/relationship-types`	Create a relationship type.

Public Routes

Method	Path	Description
`GET`	`/entity-types`	List entity types (public, no auth required).
`GET`	`/entity-types/{id}`	Get entity type details.
`GET`	`/entity-types/by-key/{key}`	Get entity type by key.
`GET`	`/relationship-types`	List relationship types (public).
`GET`	`/relationship-rules`	List relationship rules.
`GET`	`/event-detection-rules`	List detection rules.

Seeding

Method	Path	Description
`POST`	`/seed`	Load sample RRC data for development.

Health

Method	Path	Description
`GET`	`/health`	Health check.

Architecture

Module Breakdown


src/knowledge_graph/
├── main.py                      # FastAPI app, endpoint definitions, lifespan
├── config.py                    # Settings from environment variables
├── schemas.py                   # Pydantic request/response models
├── age_connection.py            # AgePool — asyncpg pool with AGE setup
├── normalization.py             # Shared trigram normalizer + uuid/label helpers (R34 0b/2)
├── connections/                 # TenantScopedConnection — structural tenant boundary (R34 P1)
├── queries/                     # graph_query_templates + entity_queries (R34 P1/P2)
├── providers/                   # EntitySearchProvider + Postgres impl (R34 P2)
├── crud.py                      # GraphCrud — vertex/edge CRUD via openCypher
├── context.py                   # ContextAssembler — Tier 3.5 context for agents
├── impact.py                    # ImpactTraverser — propagation analysis
├── seed.py                      # Sample data seeder
├── admin_routes.py              # Admin entity type management routes
├── public_routes.py             # Public entity type read routes
├── entity_manager_routes.py     # Managed entity CRUD routes
├── entity_crud_service.py       # Schema-validated entity CRUD logic
├── entity_type_crud.py          # Entity type admin CRUD
├── entity_type_schemas.py       # Entity type Pydantic schemas
├── entity_validator.py          # Field-level validation against entity type schema
├── entity_mover.py              # Move entities between types
├── entity_relationship_service.py  # Relationship CRUD for managed entities
├── relationship_type_crud.py    # Relationship type admin CRUD
├── relationship_type_schemas.py # Relationship type Pydantic schemas
├── detection_engine.py          # Rule-based event detection engine
├── event_detection_crud.py      # Detection rule CRUD
├── event_detection_schemas.py   # Detection rule schemas
├── formula_evaluator.py         # Formula evaluation for computed fields
├── formula_validator.py         # Formula syntax validation
├── rrc_lookup.py                # RRC data lookup service
├── schema_importer.py           # Bulk entity type import
├── script_context.py            # Script execution context
└── script_executor.py           # Safe script execution for detection rules

Apache AGE Integration

The AgePool class wraps asyncpg with the setup commands required for Apache AGE:


LOAD 'age';
SET search_path = ag_catalog, "$user", public;

These statements must run on every new connection before any Cypher query. The AgePool runs them automatically during connection initialization; direct psql sessions must run LOAD 'age' and set the search path to include ag_catalog manually.
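
A per-connection setup hook of this kind might look like the sketch below. The function name `init_age_connection` and the `RecordingConnection` stand-in are hypothetical; the stand-in exists only so the example runs without a database. With asyncpg, a coroutine like this could be passed as the `init=` argument of `asyncpg.create_pool`, though whether AgePool does exactly that is an assumption.

```python
import asyncio

# The two setup statements quoted above.
AGE_SETUP_STATEMENTS = (
    "LOAD 'age';",
    'SET search_path = ag_catalog, "$user", public;',
)


async def init_age_connection(conn) -> None:
    """Run the AGE setup statements on a freshly opened connection."""
    for statement in AGE_SETUP_STATEMENTS:
        await conn.execute(statement)


class RecordingConnection:
    """Hypothetical stand-in that records SQL instead of executing it."""

    def __init__(self) -> None:
        self.executed: list[str] = []

    async def execute(self, sql: str) -> None:
        self.executed.append(sql)


conn = RecordingConnection()
asyncio.run(init_age_connection(conn))
print(conn.executed)
```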

Cypher Query Patterns

All graph queries use openCypher syntax via Apache AGE. The GraphCrud class wraps queries in the AGE SQL function:


SELECT * FROM cypher('oilgas', $$
  MATCH (w:Well {api_number: '42-123-45678'})
  RETURN w
$$) AS (v agtype);
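
As a rough sketch of what such a wrapper does, the helper below builds the SQL string around a Cypher body. The name `wrap_cypher` and its validation rules are assumptions for illustration, not the GraphCrud implementation.

```python
def wrap_cypher(graph: str, cypher: str, columns: list[str]) -> str:
    """Wrap a Cypher query in the AGE cypher() SQL function call."""
    if not graph.isidentifier():
        raise ValueError(f"invalid graph name: {graph!r}")
    if "$$" in cypher:
        # "$$" would close the dollar-quoted body early.
        raise ValueError("Cypher text must not contain '$$'")
    if not columns or not all(c.isidentifier() for c in columns):
        raise ValueError("column names must be non-empty identifiers")
    # Every returned column is typed as agtype in the AS (...) clause.
    column_list = ", ".join(f"{c} agtype" for c in columns)
    return f"SELECT * FROM cypher('{graph}', $$\n  {cypher}\n$$) AS ({column_list});"


sql = wrap_cypher(
    "oilgas",
    "MATCH (w:Well {api_number: '42-123-45678'}) RETURN w",
    ["v"],
)
print(sql)
```

The AS clause needs one agtype column per value in the Cypher RETURN, which is why the column names are an explicit argument.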

Entity Resolution (R34)

Agents and skills refer to entities by the names a human would type — "Mitchell Ranch 1H", the SCADA tag "MR1H", a typo like "Mitchel Ranch". The entity-resolution layer maps those strings to canonical graph entities. It is built in phases on top of a thin relational index that mirrors the AGE graph:

kg_entity_name_index — one row per entity: entity_id (the AGE vertex’s uuid property), tenant_id, label, display_name, and a display_name_normalized column. Canonical properties stay in the graph; this table exists only to make name search fast and fuzzy.
kg_entity_aliases — external-system aliases (SCADA tags, PI points, SAP IDs) pointing at an entity_id, each with a normalized form and source_system.

Both normalized columns use the same normalize_name() (lowercase, strip - _ . / \ + whitespace) at write time (the 0b sync hooks) and at read time (the provider) — divergence would silently break matching, so the function lives once in normalization.py.
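
A minimal sketch of such a normalizer, assuming the description means "lowercase, then remove the characters - _ . / \ and all whitespace" (the real function in normalization.py may differ in detail):

```python
import re

# Separator characters to drop: hyphen, underscore, dot, slash, backslash, whitespace.
_SEPARATORS = re.compile(r"[-_./\\\s]+")


def normalize_name(name: str) -> str:
    """Lowercase the name and strip separator characters and whitespace."""
    return _SEPARATORS.sub("", name.lower())


print(normalize_name("Mitchell Ranch 1H"))
print(normalize_name("MR-1H"))
```

Using one function on both the write and the read path means "MR-1H", "mr_1h", and "MR 1H" all collapse to the same key.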

Fuzzy matching uses PostgreSQL’s pg_trgm extension (% operator + similarity()), backed by GIN trigram indexes on the normalized columns. The % operator’s recall is governed by pg_trgm.similarity_threshold (default 0.3, which is also R34’s no-match floor).
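
To give a feel for the scores involved, here is a pure-Python approximation of trigram similarity for a single, already-normalized word. It mimics the pg_trgm convention of padding a word with two leading spaces and one trailing space and dividing shared trigrams by total distinct trigrams; it is an illustration only and is not guaranteed to match PostgreSQL's output in every case.

```python
def trigrams(word: str) -> set[str]:
    """Return the set of 3-character windows over the padded, lowercased word."""
    if not word:
        return set()
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


NO_MATCH_FLOOR = 0.3  # the threshold quoted above

typo_score = similarity("mitchelranch1h", "mitchellranch1h")
print(round(typo_score, 2), typo_score >= NO_MATCH_FLOOR)
```

Under this approximation the one-letter typo scores well above the 0.3 floor, while unrelated names such as "mr1h" and "delaware" share no trigrams at all.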

The Phase 2 read path — EntitySearchProvider (providers/) — is the search-layer abstraction Phase 3’s orchestration tool programs against. The Postgres implementation runs three relational templates (queries/entity_queries.py) through a TenantScopedConnection:

Method	Behavior
`exact_match("entity_id", …)`	Canonical id lookup → `matched_on="exact_id"`, confidence `1.0`.
`exact_match(<other key>, …)`	Normalized exact alias hit → `matched_on="exact_alias"`, confidence `1.0` (an exact alias is as strong as an exact id, R34 Q4).
`search(query, entity_types, limit)`	Combined trigram name `UNION ALL` alias search (one query — a single asyncpg connection is not concurrency-safe), deduped per entity, ordered by similarity.

Each result is a shared ResolvedEntity (aegis_shared.models) carrying entity_id, type_key, the canonical display_name (never the matched alias — the alias rides separately in matched_alias), matched_on, a similarity-or-exact confidence, and properties hydrated from AGE in a single batched MATCH (v) WHERE v.uuid IN [...] lookup.

Two guarantees are worth naming explicitly:

By construction: tenant scope (the fetch_sql CTE pre-filters both tables by the bound tenant_id, so a query physically cannot see another tenant’s rows) and the always-canonical display_name.
By test: trigram recall, dedupe precedence, and zero cross-tenant leakage — proven in the @pytest.mark.integration real-Postgres lane (tests/test_entity_search_integration.py), since the mock suite cannot reproduce pg_trgm or the tenant CTE. Run it with poetry run pytest -m integration against the docker-compose Postgres+AGE.

The Phase 3 decision — ResolutionPipeline (entity_resolution.py), behind POST /entity/resolve — turns those scored candidates into a decision:

Search: an exact pass (exact_match) ∪ the trigram search, merged exact-first (trigram alone never yields exact_* provenance). Context-broadened recall: when nothing matched strongly by name and the conversation has context, a short partial query ("1H") is matched against the 1-hop graph neighbors of the in-context entities, so "1H" after "pull up the Mitchell Ranch lease" resolves to that lease's 1H well. Lenient matching (word_similarity / containment) is safe here because the candidate set is bounded to context neighbors and the entity_types scope. A uniquely recalled candidate auto-selects; two candidates (Mitchell Ranch 1H and Delaware 1H on the same lease) produce ask-user.
Proximity: graph distance from each candidate to the nearest context_entity_id, computed in one batched Cypher query (MATCH p = (ctx)-[*1..h]-(cand) RETURN min(length(p))). Scores: 1 hop → 1.0, 2 → 0.5, 3 → 0.25, unconnected → 0. AGE does not support shortestPath, so a bounded variable-length path serves as the equivalent.
Composite: 0.6·name + 0.3·proximity + 0.1·type_hint (weights from a config dict — Phase 6 calibration seam). This ranks; it does not decide.
Decide (Q2 three-outcome): auto-select iff the top is an exact id/alias or its name similarity ≥ 0.90 with no other ≥ 0.70; below 0.30 → no-match; everything else (including proximity-broken ties) → ask-user.

The decision keys on name similarity, never the composite score (D6). Proximity ranks candidates but must never trigger an auto-select — otherwise two same-named wells differing only by graph proximity would be silently auto-picked, the precision bug v5’s Q2 reframe fixed.
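
The three-outcome rule can be illustrated with a small sketch. The `Candidate` shape, the outcome strings, and the `"trigram"` provenance value are hypothetical; the thresholds are the ones quoted above. The input list is assumed to be already ranked by composite score, and the function deliberately never reads that score.

```python
from dataclasses import dataclass

AUTO_SELECT_MIN = 0.90  # top name similarity needed to auto-select
RUNNER_UP_MAX = 0.70    # no other candidate may reach this
NO_MATCH_FLOOR = 0.30   # below this the top candidate is not a match
EXACT_PROVENANCE = {"exact_id", "exact_alias"}


@dataclass
class Candidate:
    entity_id: str
    name_similarity: float
    matched_on: str  # "exact_id", "exact_alias", or hypothetical "trigram"


def decide(ranked: list[Candidate]) -> str:
    """Return "auto_select", "no_match", or "ask_user" from name similarity only."""
    if not ranked:
        return "no_match"
    top, others = ranked[0], ranked[1:]
    if top.matched_on in EXACT_PROVENANCE:
        return "auto_select"
    if top.name_similarity < NO_MATCH_FLOOR:
        return "no_match"
    clear_winner = all(o.name_similarity < RUNNER_UP_MAX for o in others)
    if top.name_similarity >= AUTO_SELECT_MIN and clear_winner:
        return "auto_select"
    return "ask_user"


# Two same-named wells: proximity may rank one first, but the outcome is still ask-user.
twins = [Candidate("a", 0.95, "trigram"), Candidate("b", 0.95, "trigram")]
print(decide(twins))
```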

The result is an EntityResolveResult (aegis_shared.models): matches, auto_selected, total_above_threshold, disambiguating_fields (property keys that vary across matches — feed the agent’s “which 1H?” prompt), and exact_match. Proximity hop-scores + cross-tenant isolation are proven in tests/test_entity_resolution_integration.py.
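
One plausible way to derive disambiguating_fields is to collect the property keys whose values differ across the matches. The sketch below assumes each match is reduced to a plain property dict and that a key missing from one match counts as varying; the real implementation may apply other rules.

```python
def disambiguating_fields(matches: list[dict]) -> list[str]:
    """Return the sorted property keys whose values are not identical across matches."""
    keys = sorted({key for match in matches for key in match})
    varying = []
    for key in keys:
        # repr() lets unhashable values (lists, dicts) be compared in a set.
        distinct = {repr(match.get(key)) for match in matches}
        if len(distinct) > 1:
            varying.append(key)
    return varying


matches = [
    {"name": "1H", "lease": "Mitchell Ranch", "status": "active"},
    {"name": "1H", "lease": "Delaware", "status": "active"},
]
print(disambiguating_fields(matches))
```

Here only the lease differs, so that is the field an agent would surface when asking "which 1H?".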

It also carries a compact resolution_trace block (R34 P6) — span fuel for the orchestration-side Langfuse tiers (entity_resolve.*). It surfaces the per-stage facts the final result collapses away: exact_match_found, candidates_returned, the top candidate’s top_name_similarity / top_proximity_score / top_composite_score + weighted composite_components, and the outcome. It is observability-only — additive, never seen by the LLM (the tool result exposes matches/auto_selected). See the Orchestration Engine tracing section and docs/specs/phases/R34-phase-6-calibration-and-demo-runbook.md.

On the orchestration side, entity_resolve is the first capability-backed LLM tool: it exposes only query + entity_types; the orchestrator injects context_entity_ids (from recent_entity_context), the tenant, and limits server-side, so the LLM can’t reach them. Auto-selected entities feed back into recent_entity_context (FIFO-10, Redis + GraphState) to boost later mentions, and each auto-select fires the on_entity_resolve audit rule. See the Orchestration Engine docs.
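
The FIFO-10 behavior of recent_entity_context can be pictured with a bounded deque. This in-memory sketch ignores the Redis and GraphState storage mentioned above, and the helper name `remember_entity` is hypothetical. Whether the real list de-duplicates repeated entities is not stated, so the sketch does not.

```python
from collections import deque

RECENT_CONTEXT_SIZE = 10  # "FIFO-10"

# Oldest entries fall off the left once the deque is full.
recent_entity_context: deque[str] = deque(maxlen=RECENT_CONTEXT_SIZE)


def remember_entity(context: deque[str], entity_id: str) -> None:
    """Record an auto-selected entity so later mentions can use it as context."""
    context.append(entity_id)


for i in range(12):
    remember_entity(recent_entity_context, f"entity-{i}")

print(list(recent_entity_context))
```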

Context Assembly

R34 Phase 4 rebuilt context assembly as the context_assemble capability (context_assembly.py): one tenant-scoped neighborhood walk (TenantScopedConnection.run_cypher) that returns each neighbor’s properties inline, yielding two views:

structured (EntityContext) — complete: every field, and relationships grouped by first-hop edge label up to a 1000-per-relationship sanity ceiling. The rules/governance layer reads this; the text caps below never touch it.
context_text — truncated for the LLM: a per-relationship cap (default 25) with explicit "…and N more … not shown" overflow markers, an 8K-token backstop (tiktoken cl100k_base proxy), and the domain-filtered significant/plain field split.
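
The per-relationship cap can be sketched as below. The function name `cap_relationships` and the input shape (neighbor names grouped by edge label) are assumptions. The sketch returns the overflow count rather than rendering a marker, because the exact marker wording is not reproduced here.

```python
DEFAULT_CAP = 25  # default per-relationship cap quoted above


def cap_relationships(
    grouped: dict[str, list[str]], cap: int = DEFAULT_CAP
) -> dict[str, tuple[list[str], int]]:
    """For each edge label, keep the first `cap` neighbors and count the rest."""
    capped = {}
    for label, neighbors in grouped.items():
        shown = neighbors[:cap]
        hidden = len(neighbors) - len(shown)
        capped[label] = (shown, hidden)
    return capped


grouped = {
    "OPERATED_BY": ["Operator A"],
    "LOCATED_IN": [f"Well {i}" for i in range(30)],
}
result = cap_relationships(grouped)
print({label: hidden for label, (_, hidden) in result.items()})
```

A renderer would then emit an overflow marker only for labels whose hidden count is greater than zero, leaving the structured view untouched.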

The walk anchors on either the AGE vertex uuid (what entity_resolve returns) or the business entity_id, so the new entity_resolve → context_assemble handoff and the legacy skill-injection path hit one code path. On the orchestration side context_assemble is a state-aware LLM tool (sibling to entity_resolve) exposing only entity_id + entity_type; the orchestrator injects domain_filter from the active skills’ domain_tags server-side. It is a pure read — it does not update recent_entity_context.

The legacy well-API ContextAssembler is gone: GET /context/assemble/{well_api} is now a thin adapter (assemble_well_sections) over the same walk, mapping the result back into the legacy sections shape that the hardcoded Rule 37/32 tools read (R35 deletes the adapter and those tools). Caps and ordering are proven in tests/test_context_assembly.py; the real walk and tenant isolation are proven in tests/test_context_assembly_integration.py.

Detection Engine

The DetectionEngine evaluates detection rules on a 60-second cron loop:

Load all active detection rules from the database
For each rule, evaluate its conditions against entity data
If triggered, generate an event and optionally run a script action
Log results for monitoring
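
The steps above can be sketched as a single evaluation pass plus a timed loop. Everything here is illustrative: the `Rule` shape, the callable condition, the event dict, and the loader callbacks are hypothetical, since real rules are stored in the database and may run script actions.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # hypothetical in-memory condition


def run_detection_pass(rules: list[Rule], entities: list[dict]) -> list[dict]:
    """Evaluate every rule against every entity and collect triggered events."""
    events = []
    for rule in rules:
        for entity in entities:
            if rule.condition(entity):
                events.append({"rule": rule.name, "entity_id": entity["entity_id"]})
    return events


async def detection_loop(load_rules, load_entities, interval_seconds: float = 60.0):
    """Repeat the pass on a fixed interval (not started in this example)."""
    while True:
        events = run_detection_pass(load_rules(), load_entities())
        print(f"detection pass produced {len(events)} event(s)")
        await asyncio.sleep(interval_seconds)


rules = [Rule("low_oil", lambda e: e.get("oil_bbl", 0) < 10)]
entities = [
    {"entity_id": "well-1", "oil_bbl": 5},
    {"entity_id": "well-2", "oil_bbl": 500},
]
events = run_detection_pass(rules, entities)
print(events)
```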

Impact Propagation

The ImpactTraverser performs graph traversal to calculate exposure:

Bidirectional: Walk outward and inward from a seed entity up to a configurable max depth
Production exposure: Sum oil (BBL) and gas (MCF) production across affected wells
Dollar exposure: Calculate financial impact using WTI spot and Henry Hub prices
Infrastructure tree: Build a nested tree from InfrastructureProject roots down to wells with status rollup
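
A simplified sketch of the traversal and exposure arithmetic follows. The edge list, the property names `oil_bbl` and `gas_mcf`, and the function names are hypothetical. It also assumes the Henry Hub price is applied directly per MCF, which the source does not spell out.

```python
from collections import deque


def reachable(seed: str, edges: list[tuple[str, str]], max_depth: int) -> set[str]:
    """Breadth-first walk in both edge directions, up to max_depth hops from seed."""
    neighbors: dict[str, set[str]] = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    seen.discard(seed)
    return seen


def exposure(wells: list[dict], wti_usd: float = 72.0, henry_hub_usd: float = 2.50) -> dict:
    """Sum production across wells and price it with the configured defaults."""
    oil = sum(w.get("oil_bbl", 0.0) for w in wells)
    gas = sum(w.get("gas_mcf", 0.0) for w in wells)
    return {"oil_bbl": oil, "gas_mcf": gas, "usd": oil * wti_usd + gas * henry_hub_usd}


edges = [("well-1", "lease"), ("lease", "well-2"), ("well-2", "far-node")]
print(sorted(reachable("lease", edges, max_depth=1)))

wells = [{"oil_bbl": 100.0, "gas_mcf": 200.0}, {"oil_bbl": 50.0, "gas_mcf": 0.0}]
print(exposure(wells))
```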

Dependencies

Python Packages

Package	Version	Purpose
`fastapi`	^0.115	Web framework
`uvicorn`	^0.34	ASGI server
`asyncpg`	^0.29	PostgreSQL async driver (AGE queries)
`anthropic`	^0.52	Claude API (used for some context operations)
`httpx`	^0.28	HTTP client
`aegis-shared`	local	Shared models and DB helpers

Infrastructure Dependencies

Dependency	Purpose
PostgreSQL 15 + Apache AGE	Graph database storage

Configuration

Environment Variable	Default	Description
`KNOWLEDGE_GRAPH_HOST`	`0.0.0.0`	Bind address
`KNOWLEDGE_GRAPH_PORT`	`8003`	Bind port
`DATABASE_URL`	`postgresql://aegis:aegis_local@localhost:5432/aegis`	PostgreSQL connection
`AGE_GRAPH_NAME`	`oilgas`	Name of the Apache AGE graph
`WTI_SPOT_USD`	`72.0`	WTI crude oil price for impact dollar calculations
`HENRY_HUB_USD`	`2.50`	Henry Hub natural gas price for impact dollar calculations
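
Reading these variables with their documented defaults could look like the sketch below. The `Settings` dataclass and `load_settings` helper are hypothetical and are not necessarily how config.py is written.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    database_url: str
    age_graph_name: str
    wti_spot_usd: float
    henry_hub_usd: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, falling back to the table's defaults."""
    return Settings(
        host=env.get("KNOWLEDGE_GRAPH_HOST", "0.0.0.0"),
        port=int(env.get("KNOWLEDGE_GRAPH_PORT", "8003")),
        database_url=env.get(
            "DATABASE_URL", "postgresql://aegis:aegis_local@localhost:5432/aegis"
        ),
        age_graph_name=env.get("AGE_GRAPH_NAME", "oilgas"),
        wti_spot_usd=float(env.get("WTI_SPOT_USD", "72.0")),
        henry_hub_usd=float(env.get("HENRY_HUB_USD", "2.50")),
    )


defaults = load_settings({})
print(defaults.port, defaults.age_graph_name)
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in tests.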

Running Locally


cd services/knowledge-graph-service
poetry install
poetry run uvicorn knowledge_graph.main:app --reload --port 8003

Seeding the Graph

After starting the service, seed it with sample data:


curl -X POST http://localhost:8003/seed

This populates the graph with sample wells, leases, fields, operators, and permits from the data/rrc-samples/ CSV files.

The knowledge graph service requires PostgreSQL with the Apache AGE extension installed. The docker-compose.yml file configures PostgreSQL with AGE automatically. The infrastructure/docker/postgres/init.sql script creates the oilgas graph and all required tables.