
Knowledge Graph

AEGIS stores oil and gas entity relationships in an Apache AGE property graph running inside PostgreSQL 15. All graph queries use openCypher syntax.

Technology Stack

Component | Details
Graph engine | Apache AGE (A Graph Extension for PostgreSQL)
Query language | openCypher
Host database | PostgreSQL 15
Graph name | oilgas
Service | Knowledge Graph Service (port 8003)

Setup Requirements

Every connection to Apache AGE must execute two preparatory statements before running Cypher queries:

LOAD 'age';
SET search_path = ag_catalog, "$user", public;

The AgePool connection helper in knowledge_graph/age_connection.py handles this automatically for every acquired connection.
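
As a rough illustration of the per-connection setup described above, the sketch below runs the two statements on any DB-API 2.0 connection (for example one from psycopg2). The function name prepare_age_connection is hypothetical and is not taken from age_connection.py; how AgePool actually does this is not shown in this document.

```python
# Sketch only: prepare_age_connection is a hypothetical helper,
# not the actual AgePool implementation.
AGE_SETUP_STATEMENTS = (
    "LOAD 'age';",
    'SET search_path = ag_catalog, "$user", public;',
)


def prepare_age_connection(conn):
    """Run the two AGE setup statements on a DB-API 2.0 connection."""
    cur = conn.cursor()
    try:
        for statement in AGE_SETUP_STATEMENTS:
            cur.execute(statement)
    finally:
        # Always release the cursor, even if a statement fails.
        cur.close()
    return conn
```

A pool would typically call a helper like this once each time a connection is handed out, before any Cypher is executed on it.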

Graph Schema

Vertex Labels

The graph supports the following vertex types:

Label | Description | Key Properties
Well | Oil or gas well | api_number, name, status, well_type, spud_date, surface_location, production fields
Lease | Mineral lease | entity_id, name, rrc_lease_id, county, district
Field | Oil/gas field | entity_id, name, rrc_field_id, district, county, state
Formation | Geological formation | entity_id, name, type, depth_range
Operator | Operating company | entity_id, name, rrc_operator_id, p5_number, contact_name, contact_email
Permit | Drilling or operating permit | entity_id, permit_number, status, type
FlaringAuthorization | R-32 flaring authorization | entity_id, auth_number, authorized_mcf, expiry_date
FlaringEvent | Flaring/venting occurrence | entity_id, disposition_code, volume_mcf, date
Wellpad | Surface grouping of wells | entity_id, name, wells_count
Facility | Processing/storage facility | entity_id, name, type (tank_battery, compressor, separator, cpf)
PipelineRoute | Pipeline segment | entity_id, name, diameter, capacity
InfrastructureProject | Large infrastructure project | entity_id, name, status, type
Regulation | Regulatory rule | entity_id, name, rule_number, source
WellEvent | Event on a well | entity_id, event_type, date
ComplianceDeadline | Regulatory deadline | entity_id, deadline_date, type
FilingPackage | Assembled filing | entity_id, filing_type, status
WellConversation | Chat session about a well | entity_id, conversation_id
Project | General project | entity_id, name, status
Equipment | Field equipment | entity_id, name, type
Pipeline | Pipeline entity | entity_id, name

Edge Labels

Edges define relationships between vertices:

Edge Label | From | To | Semantics
LOCATED_IN | Well | Field, Lease | Well is located in a field or on a lease
LOCATED_ON | Well | Wellpad | Well is physically on a wellpad
COMPLETED_IN | Well | Formation | Well produces from a geological formation
OPERATED_BY | Well, Lease, Wellpad, Facility, PipelineRoute | Operator | Entity is operated by a company
OFFSET_TO | Well | Well | Wells are within regulatory spacing distance
GOVERNED_BY | Well, Permit | Regulation | Entity is subject to a regulation
FLARES_AT | FlaringEvent | Well, Lease | Flaring event occurs at a well or lease
AUTHORIZED_UNDER | FlaringEvent | FlaringAuthorization | Event is covered by an authorization
PRODUCES_TO | Well | Facility | Well’s production flows to a facility
FEEDS_INTO | Facility | PipelineRoute | Facility feeds into a pipeline
CONNECTS_TO | InfrastructureProject, PipelineRoute | Lease, InfrastructureProject | Infrastructure connects to entities
HAS_EVENT | Well | WellEvent | Well has an associated event
HAS_DEADLINE | Well | ComplianceDeadline | Well has a compliance deadline
HAS_FILING | Well | FilingPackage | Well has a filing package
HAS_CONVERSATION | Well | WellConversation | Well has a chat session
PART_OF | (any) | (any) | Generic hierarchical parent-child relationship

Sample Cypher Queries

Find a well and its operator

MATCH (w:Well {api_number: '42-329-12345'})-[:OPERATED_BY]->(op:Operator) RETURN w, op

Find all offset wells for a given well

MATCH (w:Well {api_number: '42-329-12345'})-[:OFFSET_TO]->(offset:Well) RETURN offset.api_number, offset.name, offset.status

Trace production flow from well to pipeline

MATCH (w:Well {api_number: '42-329-12345'})-[:PRODUCES_TO]->(f:Facility)-[:FEEDS_INTO]->(p:PipelineRoute) RETURN w.name, f.name, f.type, p.name

Find all wells on a lease

MATCH (w:Well)-[:LOCATED_IN]->(l:Lease {entity_id: 'lease-mitchell-ranch'}) RETURN w.api_number, w.name, w.status

Impact propagation (multi-hop)

MATCH path = (start:Facility {entity_id: 'fac-tank-battery-a'})<-[:PRODUCES_TO]-(w:Well) RETURN w.api_number, w.name, w.production_oil_bbls, w.production_gas_mcf
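
The queries above are plain openCypher. When run directly against PostgreSQL, Apache AGE expects them to be passed through its cypher() SQL function, with each returned column declared as agtype in an AS clause. The sketch below shows one way to build that wrapper; wrap_cypher is a hypothetical helper name, and whether the Knowledge Graph Service builds its SQL this way is an assumption.

```python
import re

# Plain SQL identifiers only, so names can be interpolated safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def wrap_cypher(graph_name, cypher_query, columns):
    """Build the SQL that runs an openCypher query through AGE's cypher()."""
    if not _IDENT.match(graph_name):
        raise ValueError(f"invalid graph name: {graph_name!r}")
    if not columns:
        raise ValueError("at least one result column is required")
    for col in columns:
        if not _IDENT.match(col):
            raise ValueError(f"invalid column name: {col!r}")
    # The query is dollar-quoted, so it must not contain the delimiter itself.
    if "$$" in cypher_query:
        raise ValueError("query must not contain the $$ delimiter")
    column_defs = ", ".join(f"{col} agtype" for col in columns)
    return (
        f"SELECT * FROM cypher('{graph_name}', $$ {cypher_query} $$) "
        f"AS ({column_defs});"
    )


sql = wrap_cypher(
    "oilgas",
    "MATCH (w:Well {api_number: '42-329-12345'})-[:OPERATED_BY]->(op:Operator) RETURN w, op",
    ["w", "op"],
)
```

The number of declared columns has to match the number of expressions in the RETURN clause, which is why the column list is passed separately from the query text.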

Context Assembly (Tier 3.5)

The ContextAssembler class in knowledge_graph/context.py gathers entity context for injection into the agent pipeline. Given a well API number and optional domain tags, it assembles:

  1. The well and its properties
  2. Its lease, field, and operator
  3. Active flaring authorizations
  4. Offset wells (via OFFSET_TO edges)
  5. Applicable regulations (via GOVERNED_BY edges)
  6. Connected infrastructure (wellpad, facilities, pipeline routes)

Domain Tag Filtering

Domain tags control which sections are included in the assembled context:

Tag | Sections Included
spacing / rule_37 | well, lease, field, operator, offsets, regulations
flaring | well, lease, field, operator, flaring_auths, infrastructure, regulations, wellpad, facilities, pipeline_routes
rule_32 | well, lease, field, operator, flaring_auths, infrastructure, regulations
compliance | well, lease, field, operator, offsets, flaring_auths, regulations, wellpad, facilities
drilling | well, lease, field, operator, regulations
production | well, lease, field, operator, wellpad, facilities
impact | All sections

If no domain tags are specified, all sections are included.
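
The tag table can be read as a lookup from tag to section names. The sketch below encodes it that way. Several points are assumptions rather than documented behavior: that "spacing / rule_37" means two tags sharing one row, that multiple tags combine as a union, that unknown tags are ignored, and that "all sections" is the set of every section name appearing in the table. The names TAG_SECTIONS and sections_for_tags are hypothetical and not taken from context.py.

```python
_CORE = {"well", "lease", "field", "operator"}

# Assumption: "all sections" is every section name that appears in the table.
ALL_SECTIONS = frozenset(_CORE | {
    "offsets", "flaring_auths", "infrastructure", "regulations",
    "wellpad", "facilities", "pipeline_routes",
})

TAG_SECTIONS = {
    "spacing": _CORE | {"offsets", "regulations"},
    "rule_37": _CORE | {"offsets", "regulations"},
    "flaring": _CORE | {"flaring_auths", "infrastructure", "regulations",
                        "wellpad", "facilities", "pipeline_routes"},
    "rule_32": _CORE | {"flaring_auths", "infrastructure", "regulations"},
    "compliance": _CORE | {"offsets", "flaring_auths", "regulations",
                           "wellpad", "facilities"},
    "drilling": _CORE | {"regulations"},
    "production": _CORE | {"wellpad", "facilities"},
}


def sections_for_tags(tags=None):
    """Return the set of context sections selected by the given domain tags."""
    if not tags:
        # No tags: every section is included.
        return set(ALL_SECTIONS)
    selected = set()
    for tag in tags:
        if tag == "impact":
            return set(ALL_SECTIONS)
        # Assumption: unknown tags contribute nothing.
        selected |= TAG_SECTIONS.get(tag, set())
    return selected
```

Under the union assumption, sections_for_tags(["drilling", "production"]) selects the four core sections plus regulations, wellpad, and facilities.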

Managed Entity Context

For the managed entity system (entity type definitions), context assembly is available via a separate endpoint that works with any entity type, not just wells:

GET /context/assemble/managed/{entity_id}?entity_type_key=well&domain_tags=spacing
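
A small sketch of building this request URL from Python follows. The host (localhost) is an assumption, the port comes from the Technology Stack table, and the entity id well-001 is a made-up placeholder. Joining multiple tags with commas is also an assumption, since the example above shows only a single tag.

```python
from urllib.parse import quote, urlencode

# Assumption: the service is reachable on localhost; 8003 is the documented port.
BASE_URL = "http://localhost:8003"


def managed_context_url(entity_id, entity_type_key, domain_tags=None,
                        base_url=BASE_URL):
    """Build the URL for GET /context/assemble/managed/{entity_id}."""
    params = {"entity_type_key": entity_type_key}
    if domain_tags:
        # Assumption: multiple tags are sent as one comma-separated value.
        params["domain_tags"] = ",".join(domain_tags)
    path = f"/context/assemble/managed/{quote(entity_id, safe='')}"
    return f"{base_url}{path}?{urlencode(params)}"


# "well-001" is a hypothetical entity id used only for illustration.
url = managed_context_url("well-001", "well", ["spacing"])
```

The resulting string can be passed to any HTTP client, for example urllib.request.urlopen(url).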

API Endpoints

Entity CRUD

Method | Path | Description | Details
POST | /entities | Create a vertex | Body: {label, properties}
GET | /entities/{label}/{id} | Get a vertex | By label and entity_id
GET | /entities/{label} | List vertices | By label (limit param)
PUT | /entities/{label}/{id} | Update vertex properties | Body: {properties}
DELETE | /entities/{label}/{id} | Delete vertex and edges |

Edge CRUD

Method | Path | Description | Details
POST | /edges | Create an edge | Body: {from_label, from_id, edge_label, to_label, to_id, properties}
GET | /edges/{label}/{id} | Get edges for a vertex | Query: edge_label, direction (out/in/both)
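
To make the POST /edges body concrete, the sketch below assembles one using the field names listed in the table. The helper name edge_payload and the operator id op-example are hypothetical; the lease id reuses the one from the sample queries, and Lease to Operator via OPERATED_BY is one of the pairs listed under Edge Labels.

```python
import json


def edge_payload(from_label, from_id, edge_label, to_label, to_id,
                 properties=None):
    """Assemble a request body for POST /edges using the documented field names."""
    return {
        "from_label": from_label,
        "from_id": from_id,
        "edge_label": edge_label,
        "to_label": to_label,
        "to_id": to_id,
        # Send an empty object when the edge carries no properties.
        "properties": properties or {},
    }


# "op-example" is a placeholder operator id, not real seed data.
body = json.dumps(
    edge_payload("Lease", "lease-mitchell-ranch", "OPERATED_BY",
                 "Operator", "op-example")
)
```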

Raw Cypher

Method | Path | Description | Details
POST | /query | Execute raw openCypher | Body: {query, columns}

Context Assembly

Method | Path | Description | Details
GET | /context/assemble/{well_api} | Legacy well-centric context | Query: domain_tags
GET | /context/assemble/managed/{entity_id} | Managed entity context | Query: entity_type_key, domain_tags

Impact Analysis

Method | Path | Description | Details
GET | /impact/{label}/{entity_id} | Bidirectional impact propagation | Query: max_depth, wti_price, henry_hub_price
GET | /impact/outward/{label}/{entity_id} | Outward (downstream) impact |
GET | /impact/inward/{label}/{entity_id} | Inward (upstream) impact |
GET | /impact/tree | Full infrastructure tree |

Seed

Method | Path | Description | Details
POST | /seed | Load sample Permian Basin data | Creates ~40 vertices and ~50 edges

Apache AGE stores all property values as strings internally. When reading properties back, you may need to parse numeric or JSON values from strings. The CRUD layer handles basic type coercion, but complex nested structures (like production_history_12m) are stored as JSON strings.
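
For client code that receives string-valued properties, one possible way to recover numbers and nested JSON is sketched below. coerce_property is a hypothetical helper, not part of the CRUD layer, and the shape of the production_history_12m value shown is invented for illustration.

```python
import json


def coerce_property(value):
    """Parse a string property as JSON if possible, else return it unchanged."""
    if not isinstance(value, str):
        return value
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON (e.g. "Producing" or "42-329-12345"): keep the string.
        return value


# Hypothetical raw properties as they might come back from the graph.
raw = {
    "name": "Example Well 1H",
    "api_number": "42-329-12345",
    "production_oil_bbls": "1250.5",
    "production_history_12m": '[{"month": "2024-01", "oil_bbls": 900}]',
}
parsed = {key: coerce_property(val) for key, val in raw.items()}
```

One caveat with this approach: any all-digit identifier would be turned into a number, so fields that must stay strings are better excluded by name before coercion.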
