
Graph Schema

AEGIS uses Apache AGE (A Graph Extension) on PostgreSQL to maintain a knowledge graph of oil and gas entities and their relationships. The graph is named oilgas and uses openCypher query syntax.

Prerequisites

Every session that queries the graph must first run:

LOAD 'age';
SET search_path = ag_catalog, "$user", public;

Omitting either command (the LOAD or the SET search_path) will cause all Cypher queries to fail with “function ag_catalog.cypher does not exist” errors.
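Bundling the two statements into one helper makes it harder for a code path to forget one of them. The following is a minimal sketch, assuming a cursor-like object that exposes a synchronous `execute(sql)` method. The names `AGE_SESSION_SETUP` and `prepare_age_session` are hypothetical and not part of AEGIS.

```python
# Hypothetical helper for illustration; not part of the AEGIS codebase.
AGE_SESSION_SETUP = (
    "LOAD 'age'",
    'SET search_path = ag_catalog, "$user", public',
)


def prepare_age_session(cursor) -> None:
    """Run the AGE prerequisites on any object with an execute(sql) method."""
    for statement in AGE_SESSION_SETUP:
        cursor.execute(statement)
```

Call it once per session, before the first `cypher()` call.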

Vertex Labels

The graph supports 20 vertex labels:

| Label | Description | Key Properties |
| --- | --- | --- |
| Well | Individual wellbore | entity_id, api_number, well_name, operator, status, latitude, longitude |
| Lease | Legal lease boundary | entity_id, lease_number, lease_name, operator |
| Field | RRC-defined field area | entity_id, field_number, field_name, district |
| Formation | Geological target formation | entity_id, formation_name |
| Operator | Company that operates wells | entity_id, operator_name, operator_number |
| Permit | Drilling/operating permit | entity_id, permit_number, permit_type, status |
| FlaringAuthorization | R-32 flaring exception | entity_id, auth_number, max_volume, expiration_date |
| FlaringEvent | Individual flaring occurrence | entity_id, volume_mcf, date, disposition_code |
| InfrastructureProject | Pipeline/facility construction | entity_id, project_name, status, completion_date |
| Regulation | RRC/EPA rule reference | entity_id, rule_number, title, effective_date |
| Wellpad | Surface location grouping | entity_id, pad_name |
| Facility | Processing/gathering facility | entity_id, facility_name, facility_type |
| PipelineRoute | Pipeline segment | entity_id, route_name, diameter, length_miles |
| WellEvent | Lifecycle event for a well | entity_id, event_type, event_date |
| ComplianceDeadline | Regulatory deadline | entity_id, deadline_date, rule_reference |
| FilingPackage | Assembled filing documents | entity_id, filing_type, status |
| WellConversation | Agent conversation linked to well | entity_id, conversation_id |
| Project | Generic project entity | entity_id, project_name |
| Equipment | Field equipment | entity_id, equipment_type |
| Pipeline | Pipeline entity | entity_id, pipeline_name |
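The label table can also be kept as data, for example to check a vertex before it is written. This is a minimal sketch transcribed from the table above. Treating every key property as required is an assumption made here, not something the schema states, and `KEY_PROPERTIES` and `missing_properties` are hypothetical names.

```python
# Key properties per vertex label, transcribed from the table above.
KEY_PROPERTIES = {
    "Well": ["entity_id", "api_number", "well_name", "operator", "status",
             "latitude", "longitude"],
    "Lease": ["entity_id", "lease_number", "lease_name", "operator"],
    "Field": ["entity_id", "field_number", "field_name", "district"],
    "Formation": ["entity_id", "formation_name"],
    "Operator": ["entity_id", "operator_name", "operator_number"],
    "Permit": ["entity_id", "permit_number", "permit_type", "status"],
    "FlaringAuthorization": ["entity_id", "auth_number", "max_volume",
                             "expiration_date"],
    "FlaringEvent": ["entity_id", "volume_mcf", "date", "disposition_code"],
    "InfrastructureProject": ["entity_id", "project_name", "status",
                              "completion_date"],
    "Regulation": ["entity_id", "rule_number", "title", "effective_date"],
    "Wellpad": ["entity_id", "pad_name"],
    "Facility": ["entity_id", "facility_name", "facility_type"],
    "PipelineRoute": ["entity_id", "route_name", "diameter", "length_miles"],
    "WellEvent": ["entity_id", "event_type", "event_date"],
    "ComplianceDeadline": ["entity_id", "deadline_date", "rule_reference"],
    "FilingPackage": ["entity_id", "filing_type", "status"],
    "WellConversation": ["entity_id", "conversation_id"],
    "Project": ["entity_id", "project_name"],
    "Equipment": ["entity_id", "equipment_type"],
    "Pipeline": ["entity_id", "pipeline_name"],
}


def missing_properties(label: str, properties: dict) -> list:
    """Return the key properties of `label` that are absent from `properties`."""
    if label not in KEY_PROPERTIES:
        raise ValueError(f"unknown vertex label: {label}")
    return [name for name in KEY_PROPERTIES[label] if name not in properties]
```

For instance, `missing_properties("Lease", {"entity_id": "l1", "lease_name": "Smith Ranch"})` reports `lease_number` and `operator` as absent.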

Edge Labels

16 relationship types connect entities:

| Edge Label | From → To | Description |
| --- | --- | --- |
| LOCATED_IN | Well → Field, Well → Lease | Spatial containment |
| LOCATED_ON | Well → Wellpad | Surface grouping |
| COMPLETED_IN | Well → Formation | Target formation |
| OPERATED_BY | Well/Lease/Wellpad/Facility/PipelineRoute → Operator | Operational control |
| OFFSET_TO | Well → Well | Nearby wells for spacing analysis |
| GOVERNED_BY | Well → Regulation, Permit → Regulation | Regulatory applicability |
| FLARES_AT | FlaringEvent → Well or Lease | Flaring location |
| AUTHORIZED_UNDER | FlaringEvent → FlaringAuthorization | Flaring permission |
| PRODUCES_TO | Well → Facility | Production flow |
| FEEDS_INTO | Facility → PipelineRoute | Gathering flow |
| CONNECTS_TO | InfrastructureProject → Lease, PipelineRoute → InfrastructureProject | Infrastructure links |
| HAS_EVENT | Well → WellEvent | Lifecycle events |
| HAS_DEADLINE | Well → ComplianceDeadline | Regulatory deadlines |
| HAS_FILING | Well → FilingPackage | Filing documents |
| HAS_CONVERSATION | Well → WellConversation | Agent conversations |
| PART_OF | Any → Any | Hierarchical parent-child |
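The From → To column amounts to a set of allowed endpoint pairs per edge label, which can be checked before an edge is created. The following is a minimal sketch built from the table above. `EDGE_ENDPOINTS` and `is_valid_edge` are hypothetical names, and nothing here implies the graph itself enforces these constraints.

```python
# Allowed (source label, target label) pairs per edge label, from the table above.
# None means the edge accepts any pair of labels (PART_OF: Any -> Any).
EDGE_ENDPOINTS = {
    "LOCATED_IN": {("Well", "Field"), ("Well", "Lease")},
    "LOCATED_ON": {("Well", "Wellpad")},
    "COMPLETED_IN": {("Well", "Formation")},
    "OPERATED_BY": {
        (source, "Operator")
        for source in ("Well", "Lease", "Wellpad", "Facility", "PipelineRoute")
    },
    "OFFSET_TO": {("Well", "Well")},
    "GOVERNED_BY": {("Well", "Regulation"), ("Permit", "Regulation")},
    "FLARES_AT": {("FlaringEvent", "Well"), ("FlaringEvent", "Lease")},
    "AUTHORIZED_UNDER": {("FlaringEvent", "FlaringAuthorization")},
    "PRODUCES_TO": {("Well", "Facility")},
    "FEEDS_INTO": {("Facility", "PipelineRoute")},
    "CONNECTS_TO": {
        ("InfrastructureProject", "Lease"),
        ("PipelineRoute", "InfrastructureProject"),
    },
    "HAS_EVENT": {("Well", "WellEvent")},
    "HAS_DEADLINE": {("Well", "ComplianceDeadline")},
    "HAS_FILING": {("Well", "FilingPackage")},
    "HAS_CONVERSATION": {("Well", "WellConversation")},
    "PART_OF": None,
}


def is_valid_edge(edge_label: str, source_label: str, target_label: str) -> bool:
    """Check whether an edge label may connect the given source and target labels."""
    if edge_label not in EDGE_ENDPOINTS:
        return False
    allowed = EDGE_ENDPOINTS[edge_label]
    return allowed is None or (source_label, target_label) in allowed
```

Direction matters: `is_valid_edge("LOCATED_IN", "Well", "Lease")` holds, while the reversed pair does not.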

Example Queries

Find all wells on a lease

SELECT * FROM cypher('oilgas', $$
  MATCH (w:Well)-[:LOCATED_IN]->(l:Lease {lease_name: 'Smith Ranch'})
  RETURN w.well_name, w.api_number, w.status
$$) AS (well_name agtype, api_number agtype, status agtype);

Find offset wells within spacing distance

SELECT * FROM cypher('oilgas', $$
  MATCH (w:Well {api_number: '42-123-45678'})-[:OFFSET_TO]->(offset:Well)
  RETURN offset.well_name, offset.api_number, offset.operator
$$) AS (well_name agtype, api_number agtype, operator agtype);

Trace production flow from well to pipeline

SELECT * FROM cypher('oilgas', $$
  MATCH (w:Well {api_number: '42-123-45678'})-[:PRODUCES_TO]->(f:Facility)-[:FEEDS_INTO]->(p:PipelineRoute)
  RETURN w.well_name, f.facility_name, p.route_name
$$) AS (well_name agtype, facility_name agtype, route_name agtype);

Find wells with expiring flaring authorizations

SELECT * FROM cypher('oilgas', $$
  MATCH (fe:FlaringEvent)-[:AUTHORIZED_UNDER]->(fa:FlaringAuthorization)
  WHERE fa.expiration_date < '2026-05-01'
  MATCH (fe)-[:FLARES_AT]->(w:Well)
  RETURN w.well_name, fa.auth_number, fa.expiration_date
$$) AS (well_name agtype, auth_number agtype, expiration_date agtype);

Get all entities operated by a specific company

SELECT * FROM cypher('oilgas', $$
  MATCH (entity)-[:OPERATED_BY]->(op:Operator {operator_name: 'Permian Basin Energy LLC'})
  RETURN labels(entity) AS type, entity.entity_id, entity
$$) AS (type agtype, entity_id agtype, entity agtype);

Find all compliance deadlines for a well

SELECT * FROM cypher('oilgas', $$
  MATCH (w:Well {api_number: '42-383-40121'})-[:HAS_DEADLINE]->(d:ComplianceDeadline)
  RETURN d.deadline_date, d.rule_reference
  ORDER BY d.deadline_date
$$) AS (deadline_date agtype, rule_reference agtype);
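Every example above has the same shape: a Cypher body wrapped in `cypher('oilgas', $$ ... $$)`, followed by an `AS (...)` list that declares one `agtype` column per RETURN item. A small builder can generate that wrapper. This is a sketch only: `cypher_sql` is a hypothetical name, and it assumes the caller supplies column names in the same order as the RETURN items.

```python
import re

# Plain SQL identifiers only, so nothing unexpected lands in the generated SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def cypher_sql(query: str, columns: list, graph: str = "oilgas") -> str:
    """Wrap a Cypher query in the SELECT ... FROM cypher(...) AS (...) form."""
    if not columns:
        raise ValueError("at least one result column is required")
    for name in [graph, *columns]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    # The query sits inside $$ ... $$, so it must not contain that delimiter.
    if "$$" in query:
        raise ValueError("query must not contain the $$ delimiter")
    column_list = ", ".join(f"{name} agtype" for name in columns)
    return f"SELECT * FROM cypher('{graph}', $$ {query} $$) AS ({column_list});"
```

The number of names in `columns` has to match the number of RETURN items, since PostgreSQL rejects a column definition list that does not line up with what the function returns.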

Context Assembly

The knowledge-graph-service provides a context assembly endpoint that collects all graph data related to an entity for agent consumption. It traverses relationships to build a comprehensive context document including:

  • Entity properties
  • Direct relationships (1-hop neighbors)
  • Related compliance data (deadlines, filings)
  • Operational connections (facilities, pipelines)

GET /context/assemble/managed/{entity_id}
GET /context/assemble/{well_api}

The assembled context is injected into the agent’s prompt as structured data, enabling informed decision-making without the agent needing to query the graph directly.
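A client needs only the two paths listed above. The sketch below builds the request URL for either one. The base URL is an assumption, since the service's host and port are not given here, and `context_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def context_url(base_url: str, *, entity_id=None, well_api=None) -> str:
    """Build a context assembly URL from either an entity ID or a well API number."""
    if (entity_id is None) == (well_api is None):
        raise ValueError("pass exactly one of entity_id or well_api")
    root = base_url.rstrip("/")
    if entity_id is not None:
        # GET /context/assemble/managed/{entity_id}
        return f"{root}/context/assemble/managed/{quote(entity_id, safe='')}"
    # GET /context/assemble/{well_api}
    return f"{root}/context/assemble/{quote(well_api, safe='')}"
```

With a placeholder base URL, `context_url("http://kg.example", well_api="42-123-45678")` yields `http://kg.example/context/assemble/42-123-45678`.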

Querying from Python

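The knowledge-graph-service uses the AgePool class from shared/src/aegis_shared/db/postgres.py, which automatically runs LOAD 'age' and SET search_path on each connection. When writing Cypher queries in Python code, use the helper below. It repeats both commands, which is harmless on a connection that AgePool has already prepared. Because it declares a single result column, the Cypher query must RETURN exactly one value.

async def run_cypher(pool, query: str, graph: str = "oilgas"):
    """Run a single-column Cypher query and return the fetched rows."""
    # The query is wrapped in $$ ... $$, so it must not contain that delimiter.
    if "$$" in query:
        raise ValueError("Cypher query must not contain '$$'")
    async with pool.acquire() as conn:
        # Required once per session before any cypher() call.
        await conn.execute("LOAD 'age'")
        await conn.execute('SET search_path = ag_catalog, "$user", public')
        # One agtype column is declared, so RETURN must yield exactly one value.
        return await conn.fetch(
            f"SELECT * FROM cypher('{graph}', $$ {query} $$) AS (result agtype)"
        )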
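Rows returned by the helper carry agtype values, and those usually need converting before use in Python. AGE's text form of agtype is JSON-like, with a trailing annotation such as `::vertex` on graph objects. The sketch below assumes the driver hands each value back as text, which depends on how the connection is configured. It handles only a single top-level annotation, not paths or lists of vertices, and `parse_agtype` is a hypothetical name.

```python
import json
import re

# Trailing annotation AGE appends to some values, e.g. {...}::vertex
_ANNOTATION = re.compile(r"::(vertex|edge|numeric)\s*$")


def parse_agtype(text: str):
    """Convert the text form of a simple agtype value into a Python object."""
    stripped = _ANNOTATION.sub("", text.strip())
    return json.loads(stripped)
```

A vertex parses into a dict with `id`, `label`, and `properties` keys, so a well name would be read as `parse_agtype(value)["properties"]["well_name"]`.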

All graph queries should go through the knowledge-graph-service API rather than directly querying PostgreSQL from other services. This ensures consistent search path configuration and query logging.
