Skill Injection

Skills are specialized knowledge packages that augment agent capabilities during execution. AEGIS uses a tiered injection architecture that progressively loads more context as needed, with an injection ledger to prevent duplicate content.

Tier Architecture

The skill system has four tiers, each adding more context at increasing token cost:


Tier 1 (~50 tokens)      Tier 2 (~200-800 tokens)     Tier 3 (variable)         Tier 3.5 (variable)
+------------------+     +---------------------+      +------------------+       +------------------+
| Manifest         |     | Full Definition     |      | Artifacts        |       | Graph Context    |
|                  |     |                     |      |                  |       |                  |
| - skill_id       | --> | - description       | -->  | - Reference docs | -->   | - Well data      |
| - name           |     | - steps (ordered)   |      | - Templates      |       | - Lease/field    |
| - description    |     | - requirements      |      | - Regulatory text|       | - Operator       |
| - triggers       |     | - output_format     |      | - Examples       |       | - Offsets        |
| - domain_tags    |     | - hitl_checkpoints  |      | - Token estimate |       | - Regulations    |
+------------------+     +---------------------+      +------------------+       +------------------+

Always loaded in         Loaded when LLM emits       Loaded alongside         Loaded from knowledge
system prompt (compact)  SKILL_SELECT:{id}           Tier 2 if artifacts      graph if entity context
                                                     exist for the skill      is available

Tier 1: Manifest

The manifest is a compact summary (~50 tokens) that is always available in the system prompt. It tells the LLM what skills exist and when to request them.


{
  "skill_id": "spacing-calculation",
  "name": "Spacing Calculation",
  "description": "Calculate distances from proposed well to lease lines and offset wells",
  "triggers": ["spacing", "distance", "lease line", "offset distance", "467 feet"],
  "domain_tags": ["spacing", "rule_37", "drilling"]
}

Tier 2: Full Definition

Loaded on demand when the LLM requests the skill. Contains the complete specification:


{
  "description": "Calculates the distance from a proposed well surface location...",
  "steps": [
    "Retrieve the subject well location from the knowledge graph",
    "Identify the lease boundaries and compute distance to each lease line",
    "Find all wells on the same lease and compute well-to-well distances",
    "..."
  ],
  "requirements": [
    "Well must have surface_location coordinates in the knowledge graph"
  ],
  "output_format": "Spacing Summary:\n- Distance to nearest lease line: X ft..."
}

Tier 3: Artifacts

Reference documents stored in the skill_artifacts table. These are loaded alongside the Tier 2 definition when available. Examples include regulatory text excerpts, form templates, calculation examples, and filing precedent.

Each artifact has:

name: Human-readable label
content: Full text content
content_hash: SHA-256 hash for change detection
token_estimate: Approximate token count for budget tracking

Tier 3.5: Graph Context

Entity-specific context assembled from the knowledge graph. Based on the well API number or entity ID in working memory, the context assembler queries the graph for:

The entity and its properties
Related entities (lease, field, operator)
Offset wells within regulatory distance
Active flaring authorizations
Applicable regulations
Connected infrastructure (wellpad, facilities, pipelines)

The sections included depend on the skill’s domain_tags. See Knowledge Graph - Context Assembly for the domain-to-section mapping.

Injection Flow

The injection happens across two LangGraph nodes:

skill_select_node

Scans the latest assistant message for SKILL_SELECT:{skill_id} patterns using regex:


SKILL_SELECT_PATTERN = re.compile(r"SKILL_SELECT:([a-zA-Z0-9_-]+)")

For example, if the LLM outputs:


I'll analyze the spacing for this well. SKILL_SELECT:spacing-calculation

The node extracts spacing-calculation and stores it in selected_skill_ids, excluding any skills already in injected_skill_ids.

skill_inject_node

For each selected skill, performs the following steps:


For each skill_id in selected_skill_ids:
  1. Check injection ledger     --> Skip if already injected
  2. Load Tier 2 definition     --> From skills table (PostgreSQL)
  3. Load Tier 3 artifacts      --> From skill_artifacts table
  4. Load Tier 3.5 graph ctx    --> From knowledge graph service
  5. Format and inject          --> Append as system message
  6. Mark in ledger             --> Prevent future re-injection

The injected content is formatted as a structured system message:


=== Skill Activated: Spacing Calculation ===

## Description
Calculates the distance from a proposed well surface location to all
lease boundary lines and to every existing well within the regulatory
spacing radius.

## Steps
  1. Retrieve the subject well location from the knowledge graph
  2. Identify the lease boundaries and compute distance to each lease line
  3. Find all wells on the same lease and compute well-to-well distances
  4. Find all wells on adjacent leases within 1,200 ft
  5. Determine if standard spacing is met or exception is required
  6. Return a spacing summary table

## Requirements
  - Well must have surface_location coordinates in the knowledge graph
  - Lease must have boundary information or known distances

## Output Format
Spacing Summary:
- Distance to nearest lease line: X ft (PASS/FAIL vs 1,200 ft / 467 ft)
- Distance to nearest well (same lease): X ft (PASS/FAIL vs 467 ft)
- Distance to nearest well (offset lease): X ft
- Exception required: YES/NO
- Exception type: Regular / Density / No-objection

## Entity Context
Well: Mitchell Ranch 1H (API: 42-329-12345)
  Status: active, Type: horizontal
  Lease: Mitchell Ranch Lease
  Field: Spraberry (Trend Area), District 08
  Operator: Permian Basin Energy LLC (P-5: 683214)

Skill Registry

Skills are stored in the PostgreSQL skills table:


CREATE TABLE skills (
    id VARCHAR(100) PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    tier1_manifest JSONB NOT NULL,
    tier2_definition JSONB NOT NULL,
    tier3_artifact_refs JSONB,
    domain_tags VARCHAR(100)[],
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

Artifacts are in a separate table:


CREATE TABLE skill_artifacts (
    id VARCHAR(100) PRIMARY KEY,
    skill_id VARCHAR(100) REFERENCES skills(id),
    name VARCHAR(200) NOT NULL,
    content TEXT NOT NULL,
    content_hash VARCHAR(64) NOT NULL,
    token_estimate INT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

Registered Skills

Rule 37 Skills

ID	Name	Domain Tags	HITL Checkpoints
`spacing-calculation`	Spacing Calculation	spacing, rule_37, drilling	—
`offset-well-analysis`	Offset Well Analysis	spacing, rule_37	—
`rule37-filing-assembly`	Rule 37 Filing Assembly	spacing, rule_37, drilling	`pre_filing`
`good-cause-narrative`	Good Cause Narrative	spacing, rule_37	`good_cause_review`

Rule 32 Skills

ID	Name	Domain Tags	HITL Checkpoints
`flaring-volume-calc`	Flaring Volume Calculation	flaring, rule_32, compliance	—
`gas-analysis`	Gas Composition Analysis	flaring, rule_32	—
`rule32-filing-assembly`	Rule 32 Filing Assembly	flaring, rule_32	`pre_filing`
`emissions-estimate`	Emissions Estimate	flaring, rule_32, compliance	—

Checklist-Item Skills (R2)

In addition to the core skills above, 35 checklist-item skills are registered for the redesigned workflow-driven execution. These map 1:1 to specific checklist steps:

Rule 37: 11 skills (r37-exception-type, r37-field-rules, r37-offset-identification, r37-service-list, r37-waiver-tracking, r37-form-w1, r37-good-cause, r37-plat-prep, r37-pre-filing-review, r37-filing-assembly, r37-post-filing-tracking)
Rule 32: 10 skills
Form PR: 8 skills
Flaring Monitor: 6 skills

Each checklist-item skill specifies which structured SSE events it emits (e.g., DATA_TABLE_UPDATE, FORM_FIELD_UPDATE, ARTIFACT_GENERATED).

Injection Ledger Deduplication

The injection ledger is a Redis Hash at skill:ledger:{conversation_id}:


skill:ledger:conv-001
  "skill:spacing-calculation" -> "injected"
  "skill:offset-well-analysis" -> "injected"

Before injecting any skill, the node calls ledger_check() to see if the key already exists. This prevents the same skill definition from being injected twice even if the LLM requests it multiple times.

The ledger can also be used to track entity context injections and other deduplicated content. The evict endpoint allows removing a ledger entry to force re-injection if the underlying data has changed.

Seeding

To populate the skill registry for development:


cd services/orchestration-engine
poetry run python -m orchestration.seed_skills

This script inserts all skill definitions and agent configurations into the database. It handles upserts (insert or update on conflict) so it is safe to run multiple times.