Graph Schema
AEGIS uses Apache AGE (A Graph Extension) on PostgreSQL to maintain a knowledge graph of oil and gas entities and their relationships. The graph is named oilgas and uses openCypher query syntax.
Prerequisites
Every session that queries the graph must first run:
LOAD 'age';
SET search_path = ag_catalog, "$user", public;Forgetting these SET commands will cause all Cypher queries to fail with “function ag_catalog.cypher does not exist” errors.
Vertex Labels
The graph supports 20 vertex labels:
| Label | Description | Key Properties |
|---|---|---|
Well | Individual wellbore | entity_id, api_number, well_name, operator, status, latitude, longitude |
Lease | Legal lease boundary | entity_id, lease_number, lease_name, operator |
Field | RRC-defined field area | entity_id, field_number, field_name, district |
Formation | Geological target formation | entity_id, formation_name |
Operator | Company that operates wells | entity_id, operator_name, operator_number |
Permit | Drilling/operating permit | entity_id, permit_number, permit_type, status |
FlaringAuthorization | R-32 flaring exception | entity_id, auth_number, max_volume, expiration_date |
FlaringEvent | Individual flaring occurrence | entity_id, volume_mcf, date, disposition_code |
InfrastructureProject | Pipeline/facility construction | entity_id, project_name, status, completion_date |
Regulation | RRC/EPA rule reference | entity_id, rule_number, title, effective_date |
Wellpad | Surface location grouping | entity_id, pad_name |
Facility | Processing/gathering facility | entity_id, facility_name, facility_type |
PipelineRoute | Pipeline segment | entity_id, route_name, diameter, length_miles |
WellEvent | Lifecycle event for a well | entity_id, event_type, event_date |
ComplianceDeadline | Regulatory deadline | entity_id, deadline_date, rule_reference |
FilingPackage | Assembled filing documents | entity_id, filing_type, status |
WellConversation | Agent conversation linked to well | entity_id, conversation_id |
Project | Generic project entity | entity_id, project_name |
Equipment | Field equipment | entity_id, equipment_type |
Pipeline | Pipeline entity | entity_id, pipeline_name |
Edge Labels
16 relationship types connect entities:
| Edge Label | From → To | Description |
|---|---|---|
LOCATED_IN | Well → Field, Well → Lease | Spatial containment |
LOCATED_ON | Well → Wellpad | Surface grouping |
COMPLETED_IN | Well → Formation | Target formation |
OPERATED_BY | Well/Lease/Wellpad/Facility/PipelineRoute → Operator | Operational control |
OFFSET_TO | Well → Well | Nearby wells for spacing analysis |
GOVERNED_BY | Well → Regulation, Permit → Regulation | Regulatory applicability |
FLARES_AT | FlaringEvent → Well or Lease | Flaring location |
AUTHORIZED_UNDER | FlaringEvent → FlaringAuthorization | Flaring permission |
PRODUCES_TO | Well → Facility | Production flow |
FEEDS_INTO | Facility → PipelineRoute | Gathering flow |
CONNECTS_TO | InfrastructureProject → Lease, PipelineRoute → InfrastructureProject | Infrastructure links |
HAS_EVENT | Well → WellEvent | Lifecycle events |
HAS_DEADLINE | Well → ComplianceDeadline | Regulatory deadlines |
HAS_FILING | Well → FilingPackage | Filing documents |
HAS_CONVERSATION | Well → WellConversation | Agent conversations |
PART_OF | Any → Any | Hierarchical parent-child |
Example Queries
Find all wells on a lease
SELECT * FROM cypher('oilgas', $$
MATCH (w:Well)-[:LOCATED_IN]->(l:Lease {lease_name: 'Smith Ranch'})
RETURN w.well_name, w.api_number, w.status
$$) AS (well_name agtype, api_number agtype, status agtype);Find offset wells within spacing distance
SELECT * FROM cypher('oilgas', $$
MATCH (w:Well {api_number: '42-123-45678'})-[:OFFSET_TO]->(offset:Well)
RETURN offset.well_name, offset.api_number, offset.operator
$$) AS (well_name agtype, api_number agtype, operator agtype);Trace production flow from well to pipeline
SELECT * FROM cypher('oilgas', $$
MATCH (w:Well {api_number: '42-123-45678'})-[:PRODUCES_TO]->(f:Facility)-[:FEEDS_INTO]->(p:PipelineRoute)
RETURN w.well_name, f.facility_name, p.route_name
$$) AS (well_name agtype, facility_name agtype, route_name agtype);Find wells with expiring flaring authorizations
SELECT * FROM cypher('oilgas', $$
MATCH (fe:FlaringEvent)-[:AUTHORIZED_UNDER]->(fa:FlaringAuthorization)
WHERE fa.expiration_date < '2026-05-01'
MATCH (fe)-[:FLARES_AT]->(w:Well)
RETURN w.well_name, fa.auth_number, fa.expiration_date
$$) AS (well_name agtype, auth_number agtype, expiration_date agtype);Get all entities operated by a specific company
SELECT * FROM cypher('oilgas', $$
MATCH (entity)-[:OPERATED_BY]->(op:Operator {operator_name: 'Permian Basin Energy LLC'})
RETURN labels(entity) AS type, entity.entity_id, entity
$$) AS (type agtype, entity_id agtype, entity agtype);Find all compliance deadlines for a well
SELECT * FROM cypher('oilgas', $$
MATCH (w:Well {api_number: '42-383-40121'})-[:HAS_DEADLINE]->(d:ComplianceDeadline)
RETURN d.deadline_date, d.rule_reference
ORDER BY d.deadline_date
$$) AS (deadline_date agtype, rule_reference agtype);Context Assembly
The knowledge-graph-service provides a context assembly endpoint that collects all graph data related to an entity for agent consumption. It traverses relationships to build a comprehensive context document including:
- Entity properties
- Direct relationships (1-hop neighbors)
- Related compliance data (deadlines, filings)
- Operational connections (facilities, pipelines)
GET /context/assemble/managed/{entity_id}
GET /context/assemble/{well_api}The assembled context is injected into the agent’s prompt as structured data, enabling informed decision-making without the agent needing to query the graph directly.
Querying from Python
The knowledge-graph-service uses the AgePool class from shared/src/aegis_shared/db/postgres.py, which automatically handles the LOAD 'age' and SET search_path commands on each connection. When writing Cypher queries in Python code, use the helper method:
async def run_cypher(pool, query: str, graph: str = "oilgas"):
async with pool.acquire() as conn:
await conn.execute("LOAD 'age'")
await conn.execute('SET search_path = ag_catalog, "$user", public')
result = await conn.fetch(
f"SELECT * FROM cypher('{graph}', $$ {query} $$) AS (result agtype)"
)
return resultAll graph queries should go through the knowledge-graph-service API rather than directly querying PostgreSQL from other services. This ensures consistent search path configuration and query logging.