Knowledge Graph
AEGIS stores oil and gas entity relationships in an Apache AGE property graph running inside PostgreSQL 15. All graph queries use openCypher syntax.
Technology Stack
| Component | Details |
|---|---|
| Graph engine | Apache AGE (A Graph Extension for PostgreSQL) |
| Query language | openCypher |
| Host database | PostgreSQL 15 |
| Graph name | oilgas |
| Service | Knowledge Graph Service (port 8003) |
Setup Requirements
Every connection to Apache AGE must execute two preparatory statements before running Cypher queries:
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
The AgePool connection helper in knowledge_graph/age_connection.py handles this automatically for every acquired connection.
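
The setup step can be sketched as a small function that runs both statements on a freshly acquired connection. This is only an illustration of the idea, not the actual AgePool code: `prepare_age_connection`, `CursorLike`, and `RecordingCursor` are hypothetical names, and the stand-in cursor exists so the example runs without a database.

```python
from typing import Protocol


class CursorLike(Protocol):
    """Anything with a DB-API style execute() method (for example a psycopg2 cursor)."""

    def execute(self, sql: str) -> object: ...


# The two statements this section says every AGE connection needs.
AGE_SETUP_STATEMENTS = (
    "LOAD 'age';",
    'SET search_path = ag_catalog, "$user", public;',
)


def prepare_age_connection(cursor: CursorLike) -> None:
    """Run the AGE setup statements, in order, before any Cypher query."""
    for statement in AGE_SETUP_STATEMENTS:
        cursor.execute(statement)


class RecordingCursor:
    """Stand-in cursor (hypothetical) that records statements instead of running them."""

    def __init__(self) -> None:
        self.executed: list[str] = []

    def execute(self, sql: str) -> None:
        self.executed.append(sql)


cursor = RecordingCursor()
prepare_age_connection(cursor)
print(cursor.executed)
```

With a real driver, the same function would be called once per connection, right after it is handed out by the pool.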
Graph Schema
Vertex Labels
The graph supports the following vertex types:
| Label | Description | Key Properties |
|---|---|---|
| Well | Oil or gas well | api_number, name, status, well_type, spud_date, surface_location, production fields |
| Lease | Mineral lease | entity_id, name, rrc_lease_id, county, district |
| Field | Oil/gas field | entity_id, name, rrc_field_id, district, county, state |
| Formation | Geological formation | entity_id, name, type, depth_range |
| Operator | Operating company | entity_id, name, rrc_operator_id, p5_number, contact_name, contact_email |
| Permit | Drilling or operating permit | entity_id, permit_number, status, type |
| FlaringAuthorization | R-32 flaring authorization | entity_id, auth_number, authorized_mcf, expiry_date |
| FlaringEvent | Flaring/venting occurrence | entity_id, disposition_code, volume_mcf, date |
| Wellpad | Surface grouping of wells | entity_id, name, wells_count |
| Facility | Processing/storage facility | entity_id, name, type (tank_battery, compressor, separator, cpf) |
| PipelineRoute | Pipeline segment | entity_id, name, diameter, capacity |
| InfrastructureProject | Large infrastructure project | entity_id, name, status, type |
| Regulation | Regulatory rule | entity_id, name, rule_number, source |
| WellEvent | Event on a well | entity_id, event_type, date |
| ComplianceDeadline | Regulatory deadline | entity_id, deadline_date, type |
| FilingPackage | Assembled filing | entity_id, filing_type, status |
| WellConversation | Chat session about a well | entity_id, conversation_id |
| Project | General project | entity_id, name, status |
| Equipment | Field equipment | entity_id, name, type |
| Pipeline | Pipeline entity | entity_id, name |
Edge Labels
Edges define relationships between vertices:
| Edge Label | From | To | Semantics |
|---|---|---|---|
| LOCATED_IN | Well | Field, Lease | Well is located in a field or on a lease |
| LOCATED_ON | Well | Wellpad | Well is physically on a wellpad |
| COMPLETED_IN | Well | Formation | Well produces from a geological formation |
| OPERATED_BY | Well, Lease, Wellpad, Facility, PipelineRoute | Operator | Entity is operated by a company |
| OFFSET_TO | Well | Well | Wells are within regulatory spacing distance |
| GOVERNED_BY | Well, Permit | Regulation | Entity is subject to a regulation |
| FLARES_AT | FlaringEvent | Well, Lease | Flaring event occurs at a well or lease |
| AUTHORIZED_UNDER | FlaringEvent | FlaringAuthorization | Event is covered by an authorization |
| PRODUCES_TO | Well | Facility | Well’s production flows to a facility |
| FEEDS_INTO | Facility | PipelineRoute | Facility feeds into a pipeline |
| CONNECTS_TO | InfrastructureProject, PipelineRoute | Lease, InfrastructureProject | Infrastructure connects to entities |
| HAS_EVENT | Well | WellEvent | Well has an associated event |
| HAS_DEADLINE | Well | ComplianceDeadline | Well has a compliance deadline |
| HAS_FILING | Well | FilingPackage | Well has a filing package |
| HAS_CONVERSATION | Well | WellConversation | Well has a chat session |
| PART_OF | (any) | (any) | Generic hierarchical parent-child relationship |
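
The edge table can be read as a set of allowed (from, to) label pairs per edge label. The sketch below encodes it that way for client-side checks before creating an edge. It assumes any listed source may pair with any listed target, and it is not a statement about what the service itself enforces; `EDGE_RULES` and `is_valid_edge` are hypothetical names.

```python
ANY = "*"  # wildcard for PART_OF, which the table lists as (any) -> (any)

# edge label -> (allowed source labels, allowed target labels), copied from the table
EDGE_RULES = {
    "LOCATED_IN": ({"Well"}, {"Field", "Lease"}),
    "LOCATED_ON": ({"Well"}, {"Wellpad"}),
    "COMPLETED_IN": ({"Well"}, {"Formation"}),
    "OPERATED_BY": ({"Well", "Lease", "Wellpad", "Facility", "PipelineRoute"}, {"Operator"}),
    "OFFSET_TO": ({"Well"}, {"Well"}),
    "GOVERNED_BY": ({"Well", "Permit"}, {"Regulation"}),
    "FLARES_AT": ({"FlaringEvent"}, {"Well", "Lease"}),
    "AUTHORIZED_UNDER": ({"FlaringEvent"}, {"FlaringAuthorization"}),
    "PRODUCES_TO": ({"Well"}, {"Facility"}),
    "FEEDS_INTO": ({"Facility"}, {"PipelineRoute"}),
    "CONNECTS_TO": ({"InfrastructureProject", "PipelineRoute"}, {"Lease", "InfrastructureProject"}),
    "HAS_EVENT": ({"Well"}, {"WellEvent"}),
    "HAS_DEADLINE": ({"Well"}, {"ComplianceDeadline"}),
    "HAS_FILING": ({"Well"}, {"FilingPackage"}),
    "HAS_CONVERSATION": ({"Well"}, {"WellConversation"}),
    "PART_OF": ({ANY}, {ANY}),
}


def is_valid_edge(from_label: str, edge_label: str, to_label: str) -> bool:
    """Return True if the table allows this edge label between the two vertex labels."""
    rule = EDGE_RULES.get(edge_label)
    if rule is None:
        return False  # unknown edge label
    sources, targets = rule
    source_ok = ANY in sources or from_label in sources
    target_ok = ANY in targets or to_label in targets
    return source_ok and target_ok


print(is_valid_edge("Well", "PRODUCES_TO", "Facility"))
print(is_valid_edge("Facility", "PRODUCES_TO", "Well"))
```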
Sample Cypher Queries
Find a well and its operator
MATCH (w:Well {api_number: '42-329-12345'})-[:OPERATED_BY]->(op:Operator)
RETURN w, op
Find all offset wells for a given well
MATCH (w:Well {api_number: '42-329-12345'})-[:OFFSET_TO]->(offset:Well)
RETURN offset.api_number, offset.name, offset.status
Trace production flow from well to pipeline
MATCH (w:Well {api_number: '42-329-12345'})-[:PRODUCES_TO]->(f:Facility)-[:FEEDS_INTO]->(p:PipelineRoute)
RETURN w.name, f.name, f.type, p.name
Find all wells on a lease
MATCH (w:Well)-[:LOCATED_IN]->(l:Lease {entity_id: 'lease-mitchell-ranch'})
RETURN w.api_number, w.name, w.status
Impact propagation (multi-hop)
MATCH path = (start:Facility {entity_id: 'fac-tank-battery-a'})<-[:PRODUCES_TO]-(w:Well)
RETURN w.api_number, w.name, w.production_oil_bbls, w.production_gas_mcf
Context Assembly (Tier 3.5)
The ContextAssembler class in knowledge_graph/context.py gathers entity context for injection into the agent pipeline. Given a well API number and optional domain tags, it assembles:
- The well and its properties
- Its lease, field, and operator
- Active flaring authorizations
- Offset wells (via OFFSET_TO edges)
- Applicable regulations (via GOVERNED_BY edges)
- Connected infrastructure (wellpad, facilities, pipeline routes)
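
Each item in the list above corresponds to a short traversal over the edge labels in the graph schema. The sketch below builds one openCypher query per section for a given well. It illustrates how such lookups could look and is not the ContextAssembler implementation: the function name, the section keys, and the API number format check (inferred from the sample value 42-329-12345) are assumptions, and the "active" filter on flaring authorizations is left out.

```python
import re

# Assumed format, inferred from the sample API number used in this document.
API_NUMBER = re.compile(r"^\d{2}-\d{3}-\d{5}$")


def section_queries(api_number: str) -> dict[str, str]:
    """Return one openCypher query per context section for the given well."""
    if not API_NUMBER.match(api_number):
        # Reject anything unexpected before interpolating it into query text.
        raise ValueError(f"unexpected API number format: {api_number!r}")
    well = f"(w:Well {{api_number: '{api_number}'}})"
    return {
        "well": f"MATCH {well} RETURN w",
        "lease": f"MATCH {well}-[:LOCATED_IN]->(x:Lease) RETURN x",
        "field": f"MATCH {well}-[:LOCATED_IN]->(x:Field) RETURN x",
        "operator": f"MATCH {well}-[:OPERATED_BY]->(x:Operator) RETURN x",
        "offsets": f"MATCH {well}-[:OFFSET_TO]->(x:Well) RETURN x",
        "regulations": f"MATCH {well}-[:GOVERNED_BY]->(x:Regulation) RETURN x",
        "wellpad": f"MATCH {well}-[:LOCATED_ON]->(x:Wellpad) RETURN x",
        "facilities": f"MATCH {well}-[:PRODUCES_TO]->(x:Facility) RETURN x",
        "pipeline_routes": (
            f"MATCH {well}-[:PRODUCES_TO]->(:Facility)"
            "-[:FEEDS_INTO]->(x:PipelineRoute) RETURN x"
        ),
        # Authorizations reached through flaring events at this well.
        "flaring_auths": (
            f"MATCH {well}<-[:FLARES_AT]-(:FlaringEvent)"
            "-[:AUTHORIZED_UNDER]->(x:FlaringAuthorization) RETURN x"
        ),
    }


for section, query in section_queries("42-329-12345").items():
    print(f"{section}: {query}")
```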
Domain Tag Filtering
Domain tags control which sections are included in the assembled context:
| Tag | Sections Included |
|---|---|
| spacing / rule_37 | well, lease, field, operator, offsets, regulations |
| flaring | well, lease, field, operator, flaring_auths, infrastructure, regulations, wellpad, facilities, pipeline_routes |
| rule_32 | well, lease, field, operator, flaring_auths, infrastructure, regulations |
| compliance | well, lease, field, operator, offsets, flaring_auths, regulations, wellpad, facilities |
| drilling | well, lease, field, operator, regulations |
| production | well, lease, field, operator, wellpad, facilities |
| impact | All sections |
If no domain tags are specified, all sections are included.
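
The tag table amounts to a lookup from tag to section names. The sketch below encodes it and resolves a list of tags to the sections to include. Three things are assumptions rather than documented behavior: that several tags combine as a union, that unknown tags are ignored, and that "All sections" means the eleven section names appearing in the table. `TAG_SECTIONS` and `sections_for` are hypothetical names.

```python
# Every section name that appears in the tag table, in a fixed output order.
ALL_SECTIONS = [
    "well", "lease", "field", "operator", "offsets", "flaring_auths",
    "infrastructure", "regulations", "wellpad", "facilities", "pipeline_routes",
]

BASE = ["well", "lease", "field", "operator"]  # shared by every row of the table

TAG_SECTIONS = {
    "spacing": BASE + ["offsets", "regulations"],
    "rule_37": BASE + ["offsets", "regulations"],
    "flaring": BASE + ["flaring_auths", "infrastructure", "regulations",
                       "wellpad", "facilities", "pipeline_routes"],
    "rule_32": BASE + ["flaring_auths", "infrastructure", "regulations"],
    "compliance": BASE + ["offsets", "flaring_auths", "regulations",
                          "wellpad", "facilities"],
    "drilling": BASE + ["regulations"],
    "production": BASE + ["wellpad", "facilities"],
    "impact": ALL_SECTIONS,
}


def sections_for(tags: list[str]) -> list[str]:
    """Resolve domain tags to section names; no tags means every section."""
    if not tags:
        return list(ALL_SECTIONS)
    wanted: set[str] = set()
    for tag in tags:
        wanted.update(TAG_SECTIONS.get(tag, ()))  # assumption: unknown tags are skipped
    return [section for section in ALL_SECTIONS if section in wanted]


print(sections_for(["spacing"]))
print(sections_for(["drilling", "production"]))
```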
Managed Entity Context
For the managed entity system (entity type definitions), context assembly is available via a separate endpoint that works with any entity type, not just wells:
GET /context/assemble/managed/{entity_id}?entity_type_key=well&domain_tags=spacing
API Endpoints
Entity CRUD
| Method | Path | Description |
|---|---|---|
| POST | /entities | Create a vertex. Body: {label, properties} |
| GET | /entities/{label}/{id} | Get a vertex by label and entity_id |
| GET | /entities/{label} | List vertices by label. Query: limit |
| PUT | /entities/{label}/{id} | Update vertex properties. Body: {properties} |
| DELETE | /entities/{label}/{id} | Delete a vertex and its edges |
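
As an illustration of the create call, the sketch below builds (but does not send) a POST /entities request with the standard library. The host name, the JSON content type, and the sample operator values are assumptions; only the path, the port, and the body keys come from this document.

```python
import json
import urllib.request

# Assumed host; port 8003 is the Knowledge Graph Service port listed above.
BASE_URL = "http://localhost:8003"


def build_create_entity_request(label: str, properties: dict) -> urllib.request.Request:
    """Build a POST /entities request whose body is {label, properties}."""
    body = json.dumps({"label": label, "properties": properties}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/entities",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


# Hypothetical sample vertex; the property names follow the Operator row of the schema.
request = build_create_entity_request(
    "Operator", {"entity_id": "op-example", "name": "Example Operating LLC"}
)
print(request.get_method(), request.full_url)
# To send it against a running service: urllib.request.urlopen(request)
```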
Edge CRUD
| Method | Path | Description |
|---|---|---|
| POST | /edges | Create an edge. Body: {from_label, from_id, edge_label, to_label, to_id, properties} |
| GET | /edges/{label}/{id} | Get edges for a vertex. Query: edge_label, direction (out/in/both) |
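
The edge endpoints can be modelled with a small typed payload and a path builder that validates the direction value. This is a sketch under assumptions: `EdgeRequest` and `edges_path` are hypothetical helpers, and the operator ID is made up for the example.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional
from urllib.parse import quote, urlencode

DIRECTIONS = ("out", "in", "both")  # values listed for the direction query parameter


@dataclass
class EdgeRequest:
    """Body of POST /edges, with the field names given in the table."""

    from_label: str
    from_id: str
    edge_label: str
    to_label: str
    to_id: str
    properties: dict = field(default_factory=dict)


def edges_path(label: str, entity_id: str, direction: str,
               edge_label: Optional[str] = None) -> str:
    """Build the path and query string for GET /edges/{label}/{id}."""
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}")
    params = [("direction", direction)]
    if edge_label is not None:
        params.append(("edge_label", edge_label))
    return f"/edges/{quote(label, safe='')}/{quote(entity_id, safe='')}?{urlencode(params)}"


# 'op-example' is a hypothetical operator ID.
edge = EdgeRequest("Lease", "lease-mitchell-ranch", "OPERATED_BY", "Operator", "op-example")
print(asdict(edge))
print(edges_path("Lease", "lease-mitchell-ranch", "out", edge_label="OPERATED_BY"))
```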
Raw Cypher
| Method | Path | Description |
|---|---|---|
| POST | /query | Execute raw openCypher. Body: {query, columns} |
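
When AGE is queried through SQL, the Cypher text is passed to the cypher() function and every returned column must be declared as agtype in the AS clause, which is presumably why the request body carries columns alongside query. The sketch below shows that wrapping. The function name and validation rules are assumptions and not the service's implementation.

```python
import re

# Conservative identifier check for the graph name and the result column names.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def wrap_cypher(query: str, columns: list[str], graph: str = "oilgas") -> str:
    """Wrap an openCypher query in the SQL form that Apache AGE expects."""
    if "$$" in query:
        # The query is embedded in a $$ dollar-quoted string, so it cannot contain $$.
        raise ValueError("query must not contain the $$ delimiter")
    if not columns:
        raise ValueError("at least one result column is required")
    for name in [graph, *columns]:
        if not IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    column_defs = ", ".join(f"{name} agtype" for name in columns)
    return f"SELECT * FROM cypher('{graph}', $$ {query.strip()} $$) AS ({column_defs});"


print(wrap_cypher(
    "MATCH (w:Well)-[:OPERATED_BY]->(op:Operator) RETURN w.name, op.name",
    ["well_name", "operator_name"],
))
```

The number of declared columns has to match the number of expressions in the RETURN clause.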
Context Assembly
| Method | Path | Description |
|---|---|---|
| GET | /context/assemble/{well_api} | Legacy well-centric context. Query: domain_tags |
| GET | /context/assemble/managed/{entity_id} | Managed entity context. Query: entity_type_key, domain_tags |
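
The two context endpoints differ only in their path and in the extra entity_type_key parameter. A minimal sketch of building both request paths follows. The helper names and the sample entity ID are hypothetical, and domain_tags is passed through as a single string because this document does not say how several tags are encoded.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def well_context_path(well_api: str, domain_tags: Optional[str] = None) -> str:
    """Path for the legacy well-centric endpoint."""
    path = f"/context/assemble/{quote(well_api, safe='')}"
    if domain_tags:
        path += "?" + urlencode({"domain_tags": domain_tags})
    return path


def managed_context_path(entity_id: str, entity_type_key: str,
                         domain_tags: Optional[str] = None) -> str:
    """Path for the managed-entity endpoint, which works with any entity type."""
    params = {"entity_type_key": entity_type_key}
    if domain_tags:
        params["domain_tags"] = domain_tags
    return f"/context/assemble/managed/{quote(entity_id, safe='')}?{urlencode(params)}"


print(well_context_path("42-329-12345", "flaring"))
# 'example-entity-id' is a placeholder, not a real ID from the seed data.
print(managed_context_path("example-entity-id", "well", "spacing"))
```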
Impact Analysis
| Method | Path | Description |
|---|---|---|
| GET | /impact/{label}/{entity_id} | Bidirectional impact propagation. Query: max_depth, wti_price, henry_hub_price |
| GET | /impact/outward/{label}/{entity_id} | Outward (downstream) impact |
| GET | /impact/inward/{label}/{entity_id} | Inward (upstream) impact |
| GET | /impact/tree | Full infrastructure tree |
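
Outward and inward propagation can be pictured as depth-limited breadth-first walks along or against edge direction. The sketch below runs on a tiny in-memory edge list. It assumes that "outward" means following edges from source to target and that the bidirectional result is simply both walks side by side. The node IDs other than fac-tank-battery-a are made up, and the price parameters (wti_price, henry_hub_price) are not modelled.

```python
from collections import deque

# (source, edge label, target) triples: a hypothetical slice of the graph.
EDGES = [
    ("well-1", "PRODUCES_TO", "fac-tank-battery-a"),
    ("well-2", "PRODUCES_TO", "fac-tank-battery-a"),
    ("fac-tank-battery-a", "FEEDS_INTO", "route-1"),
]


def walk(start: str, direction: str, max_depth: int) -> dict[str, int]:
    """Breadth-first walk; returns {node: hop count} for every node reached."""
    if direction not in ("outward", "inward"):
        raise ValueError("direction must be 'outward' or 'inward'")
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] >= max_depth:
            continue  # do not expand beyond the depth limit
        for source, _label, target in EDGES:
            if direction == "outward" and source == node:
                neighbour = target
            elif direction == "inward" and target == node:
                neighbour = source
            else:
                continue
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    del depths[start]  # report only the affected nodes, not the starting point
    return depths


def bidirectional(start: str, max_depth: int) -> dict[str, dict[str, int]]:
    """Run both walks from the same starting node."""
    return {
        "outward": walk(start, "outward", max_depth),
        "inward": walk(start, "inward", max_depth),
    }


print(walk("well-1", "outward", 2))
print(bidirectional("fac-tank-battery-a", 1))
```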
Seed
| Method | Path | Description |
|---|---|---|
POST /seed | Load sample Permian Basin data | Creates ~40 vertices and ~50 edges |
Apache AGE stores all property values as strings internally. When reading properties back, you may need to parse numeric or JSON values from strings. The CRUD layer handles basic type coercion, but complex nested structures (like production_history_12m) are stored as JSON strings.
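
For values that come back as strings, one possible approach is to try JSON decoding and fall back to the raw string. The sketch below illustrates that idea and is not the CRUD layer's actual coercion logic: `parse_property` is a hypothetical helper, and the sample production_history_12m payload is invented.

```python
import json
from typing import Any


def parse_property(value: Any) -> Any:
    """Decode a property read back as a string; leave anything else untouched."""
    if not isinstance(value, str):
        return value
    try:
        return json.loads(value)  # numbers, booleans, lists, and objects
    except json.JSONDecodeError:
        return value  # plain text such as a name or an API number


print(parse_property("1500"))
print(parse_property("42-329-12345"))
# Invented example payload for a nested property stored as a JSON string.
print(parse_property('[{"month": "2024-01", "oil_bbls": 1200}]'))
```

A caveat of this approach is that identifier-like strings made only of digits would also be converted to numbers, so in practice the decoding is safer when applied per known property name.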