
Ingestion Endpoints

Data import and ingestion endpoints. The ingestion-service runs on port 8005 and is proxied through the API gateway at http://localhost:8000.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /ingest/rrc-scrape | Trigger RRC data scrape |
| POST | /ingest/operator/csv | Import operator data from CSV |
| POST | /ingest/scada | Ingest SCADA field data |
| POST | /extract | Extract entities from text |

POST /ingest/rrc-scrape

Trigger a scrape of RRC (Railroad Commission of Texas) data. Supports multiple data types.

curl -X POST http://localhost:8000/ingest/rrc-scrape \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "data_types": ["wells", "permits", "production"],
    "district": "08",
    "date_range": { "from": "2026-01-01", "to": "2026-03-31" }
  }'

| Field | Type | Description |
| --- | --- | --- |
| data_types | string[] | Types to scrape: wells, permits, production, flaring, operators, leases |
| district | string | RRC district number (optional) |
| date_range | object | Date range filter (optional) |

In the current development environment, the RRC scraper uses mock data. Production will connect to actual RRC data sources.
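The request body above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_rrc_scrape_request` and the client-side check against the documented `data_types` values are assumptions added for illustration, and whether the service itself rejects unknown values is not stated here. The sketch only builds the request; sending it is left as a comment.

```python
import json
import urllib.request

# Allowed values copied from the field table for POST /ingest/rrc-scrape.
ALLOWED_DATA_TYPES = {"wells", "permits", "production", "flaring", "operators", "leases"}


def build_rrc_scrape_request(token, data_types, district=None, date_range=None,
                             base_url="http://localhost:8000"):
    """Build (but do not send) a POST /ingest/rrc-scrape request.

    Hypothetical helper: the name and the local validation are illustrative.
    """
    unknown = sorted(set(data_types) - ALLOWED_DATA_TYPES)
    if unknown:
        raise ValueError(f"unsupported data_types: {unknown}")

    payload = {"data_types": list(data_types)}
    # district and date_range are documented as optional, so include them
    # only when the caller supplies them.
    if district is not None:
        payload["district"] = district
    if date_range is not None:
        payload["date_range"] = date_range

    return urllib.request.Request(
        url=f"{base_url}/ingest/rrc-scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rrc_scrape_request(
    token="example-token",  # placeholder, not a real credential
    data_types=["wells", "permits"],
    district="08",
    date_range={"from": "2026-01-01", "to": "2026-03-31"},
)
# To send it against a running gateway: urllib.request.urlopen(request)
```

Validating `data_types` locally is a convenience that catches typos before a network round trip; it does not replace whatever validation the service performs.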


POST /ingest/operator/csv

Import operator data from a CSV file. The service normalizes column names and maps data to the knowledge graph schema.

curl -X POST http://localhost:8000/ingest/operator/csv \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@wells_data.csv"

The CSV should include columns for well identification (API number, well name), location (latitude, longitude), and operational data (operator, status, field, lease).
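Because the service normalizes column names, it can help to check a file's header row before uploading. The sketch below is one possible pre-flight check in Python. The canonical column names in `EXPECTED_COLUMNS` and the normalization rule (lowercase, non-alphanumeric runs collapsed to underscores) are assumptions made for illustration; the service's actual mapping may differ. The latitude and longitude in the sample row are made-up values.

```python
import csv
import io
import re

# Hypothetical canonical names derived from the column list above;
# the service's real schema may use different identifiers.
EXPECTED_COLUMNS = {
    "api_number", "well_name", "latitude", "longitude",
    "operator", "status", "field", "lease",
}


def normalize_column(name):
    """Lowercase a header and collapse runs of non-alphanumerics to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def missing_columns(csv_text):
    """Return the expected columns absent from the CSV header row, sorted."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {normalize_column(col) for col in header}
    return sorted(EXPECTED_COLUMNS - present)


# Illustrative file content with two columns left out on purpose.
sample = (
    "API Number,Well Name,Latitude,Longitude,Operator,Status\n"
    "42-383-12345,Smith Ranch #1,31.5,-101.9,Permian Energy Inc.,Active\n"
)
print(missing_columns(sample))  # ['field', 'lease']
```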


POST /ingest/scada

Ingest real-time SCADA (Supervisory Control and Data Acquisition) field data.

curl -X POST http://localhost:8000/ingest/scada \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "readings": [
      {
        "well_api": "42-383-12345",
        "timestamp": "2026-04-10T10:00:00Z",
        "metrics": {
          "pressure_psi": 2400,
          "flow_rate_mcfd": 850,
          "temperature_f": 165
        }
      }
    ]
  }'
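
A client that collects readings in code needs to serialize them into the shape shown above. The following Python sketch shows one way to do that. The `Reading` class and `build_scada_payload` function are hypothetical names, and the choice to require timezone-aware timestamps and convert them to UTC is an assumption based on the trailing `Z` in the example, not a documented requirement.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Reading:
    """One SCADA reading; field names mirror the example request body."""
    well_api: str
    timestamp: datetime
    metrics: dict

    def to_dict(self):
        if self.timestamp.tzinfo is None:
            # A naive datetime would be silently treated as local time.
            raise ValueError("timestamp must be timezone-aware")
        # Convert to UTC and emit the trailing-"Z" form used in the example.
        ts = self.timestamp.astimezone(timezone.utc)
        return {
            "well_api": self.well_api,
            "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "metrics": dict(self.metrics),
        }


def build_scada_payload(readings):
    """Wrap readings in the {"readings": [...]} envelope."""
    return {"readings": [r.to_dict() for r in readings]}


reading = Reading(
    well_api="42-383-12345",
    timestamp=datetime(2026, 4, 10, 10, 0, tzinfo=timezone.utc),
    metrics={"pressure_psi": 2400, "flow_rate_mcfd": 850, "temperature_f": 165},
)
payload = build_scada_payload([reading])
print(json.dumps(payload, indent=2))
```

Since `readings` is an array, several readings can be batched into one request.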

POST /extract

Extract entities from unstructured text. Identifies wells, operators, leases, fields, and other entity types mentioned in text documents.

curl -X POST http://localhost:8000/extract \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The Smith Ranch #1 well (API 42-383-12345) operated by Permian Energy Inc. in the Spraberry field..."
  }'

Response (200):

{
  "entities": [
    {"type": "Well", "name": "Smith Ranch #1", "api_number": "42-383-12345"},
    {"type": "Operator", "name": "Permian Energy Inc."},
    {"type": "Field", "name": "Spraberry"}
  ],
  "edges": [
    {"from": "Smith Ranch #1", "to": "Permian Energy Inc.", "type": "OPERATED_BY"},
    {"from": "Smith Ranch #1", "to": "Spraberry", "type": "LOCATED_IN"}
  ]
}
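
In the sample response, edges refer to entities by name. A consumer may want to join each edge back to its full entity records. The Python sketch below does this for the sample response only; `index_entities` and `resolve_edges` are hypothetical helper names, and the assumption that entity names are unique within one response is inferred from the example rather than guaranteed.

```python
import json

# Sample response copied from the POST /extract example.
response_body = """
{
  "entities": [
    {"type": "Well", "name": "Smith Ranch #1", "api_number": "42-383-12345"},
    {"type": "Operator", "name": "Permian Energy Inc."},
    {"type": "Field", "name": "Spraberry"}
  ],
  "edges": [
    {"from": "Smith Ranch #1", "to": "Permian Energy Inc.", "type": "OPERATED_BY"},
    {"from": "Smith Ranch #1", "to": "Spraberry", "type": "LOCATED_IN"}
  ]
}
"""


def index_entities(result):
    """Map entity name -> entity dict (assumes names are unique per response)."""
    return {entity["name"]: entity for entity in result["entities"]}


def resolve_edges(result):
    """Return (source, edge_type, target) triples, skipping dangling edges."""
    by_name = index_entities(result)
    triples = []
    for edge in result["edges"]:
        src = by_name.get(edge["from"])
        dst = by_name.get(edge["to"])
        if src is None or dst is None:
            continue  # edge points at an entity not present in this response
        triples.append((src, edge["type"], dst))
    return triples


result = json.loads(response_body)
for src, rel, dst in resolve_edges(result):
    print(f'{src["type"]}:{src["name"]} -[{rel}]-> {dst["type"]}:{dst["name"]}')
```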

After data is ingested, the service publishes events to Kafka for downstream processing by other services.
