Abby 2.0 Phase 5: Advanced Agency — Parallel Workflows and Safety Rails

March 17, 2026 · 3 min read

Creator, Parthenon

AI Development Assistant

Abby can now orchestrate complex multi-step research workflows with independent steps running in parallel. High-risk tools (modify concept sets, update cohort criteria, execute SQL) join the toolkit with safety validation. Dry run mode simulates actions before execution. Workflow templates encode OHDSI best practices into one-click study designs.

What's New

DAG-Based Parallel Execution

Phase 4's Plan-Confirm-Execute ran steps sequentially. Phase 5 introduces a DAG (directed acyclic graph) executor that identifies independent steps and runs them concurrently.

"Build diabetes + metformin cohort with characterization"
    ↓
Wave 1 (parallel):  Create diabetes concept set ║ Create metformin concept set
Wave 2 (sequential): Create cohort definition (depends on both sets)
Wave 3 (sequential): Generate cohort
Wave 4 (sequential): Run characterization

The two concept sets have no dependencies on each other, so they run simultaneously, and the cohort definition waits for both. This cuts the 5-step plan from ~15 seconds when run sequentially to ~10 seconds.

The executor uses Kahn's algorithm (BFS topological sort) to decompose plans into waves. Each wave's steps run via asyncio.gather. If any step fails, all its dependents are automatically skipped with a clear reason.
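The wave decomposition and skip-on-failure behavior can be sketched as follows. This is a minimal illustration, not Abby's actual executor: the function names `plan_waves` and `run_plan`, the step ids, and the shape of the result dictionaries are all assumptions made for the example.

```python
import asyncio
from collections import deque

# Hypothetical plan: each step id maps to the set of step ids it depends on.
EXAMPLE_DEPS = {
    "diabetes_set": set(),
    "metformin_set": set(),
    "cohort_def": {"diabetes_set", "metformin_set"},
    "generate": {"cohort_def"},
    "characterize": {"generate"},
}


def plan_waves(deps):
    """Group steps into waves using Kahn's algorithm (BFS topological sort)."""
    indegree = {step: len(parents) for step, parents in deps.items()}
    dependents = {step: [] for step in deps}
    for step, parents in deps.items():
        for parent in parents:
            dependents[parent].append(step)

    # Steps with no unmet dependencies form the first wave.
    ready = deque(sorted(s for s, n in indegree.items() if n == 0))
    waves = []
    while ready:
        wave = list(ready)
        ready.clear()
        waves.append(wave)
        for step in wave:
            for child in dependents[step]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)

    if sum(len(w) for w in waves) != len(deps):
        raise ValueError("plan contains a dependency cycle")
    return waves


async def run_plan(deps, handlers):
    """Run each wave concurrently; skip steps whose dependencies did not succeed."""
    results = {}
    for wave in plan_waves(deps):
        runnable = []
        for step in wave:
            blocked = [p for p in sorted(deps[step]) if results[p]["status"] != "ok"]
            if blocked:
                reason = f"dependency '{blocked[0]}' did not complete"
                results[step] = {"status": "skipped", "reason": reason}
            else:
                runnable.append(step)
        # return_exceptions=True keeps one failure from cancelling its siblings.
        outcomes = await asyncio.gather(
            *(handlers[s]() for s in runnable), return_exceptions=True
        )
        for step, outcome in zip(runnable, outcomes):
            if isinstance(outcome, Exception):
                results[step] = {"status": "failed", "reason": str(outcome)}
            else:
                results[step] = {"status": "ok", "value": outcome}
    return results
```

Because a skipped step is itself recorded as not "ok", the skip propagates down the chain: if the metformin concept set fails, the cohort definition, generation, and characterization steps are all skipped in turn, while the diabetes concept set still completes.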

High-Risk Tools with Safety Validation

Six new tools join the registry, bringing the total to 12:

Tool	Risk	Safety Mechanism
`modify_concept_set`	High	Adds/removes concepts via validated API
`modify_cohort_criteria`	High	Updates expression JSON via validated API
`execute_sql`	High	Regex blocks INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, GRANT, REVOKE, pg_*, COPY
`run_characterization`	Medium	Queues via job system
`run_incidence_analysis`	Medium	Queues via job system
`schedule_recurring_analysis`	High	Creates periodic schedule

The SQL safety validator blocks 12 categories of dangerous patterns before any query reaches the database. Only read-only SELECT queries pass validation.
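A regex-based read-only check along these lines might look like the sketch below. The keyword list comes from the table above; the function name `validate_sql`, the multiple-statement rule, and the return shape are assumptions for illustration and may not match the shipped validator.

```python
import re

# Keywords taken from the execute_sql row in the table above.
BLOCKED = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE|COPY)\b"
    r"|\bpg_\w*",
    re.IGNORECASE,
)


def validate_sql(sql):
    """Return (ok, reason). Hypothetical helper, not the service's real API."""
    statement = sql.strip().rstrip(";").strip()
    # Assumption: stacked statements are rejected outright.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not re.match(r"SELECT\b", statement, re.IGNORECASE):
        return False, "only SELECT queries are allowed"
    match = BLOCKED.search(statement)
    if match:
        return False, f"blocked keyword: {match.group(0).upper()}"
    return True, "ok"
```

A keyword filter like this errs on the side of rejection: a harmless query with the word "delete" inside a string literal would also be blocked. For a high-risk tool that trade-off is usually acceptable, and running queries under a read-only database role is a common second layer of defense.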

Dry Run Mode

For high-risk actions, dry run mode simulates what would happen without executing anything:

{
  "simulated": true,
  "would_create": "concept_set",
  "name": "Diabetes Conditions",
  "concept_count": 3,
  "description": "Would create concept set 'Diabetes Conditions' with 3 concepts"
}

Every tool has a simulation handler that returns tool-specific fields describing the expected outcome.
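One way to organize per-tool simulation handlers is a small registry keyed by tool name. The sketch below is an assumption about structure, not the actual implementation: the `simulates` decorator, the `dry_run` function, the tool name `create_concept_set`, and the parameter names are all hypothetical.

```python
# Hypothetical registry mapping tool names to simulation handlers.
SIMULATORS = {}


def simulates(tool_name):
    """Register a function as the simulation handler for a tool."""
    def register(fn):
        SIMULATORS[tool_name] = fn
        return fn
    return register


@simulates("create_concept_set")  # assumed tool name
def simulate_create_concept_set(params):
    name = params["name"]
    count = len(params["concept_ids"])
    # Tool-specific fields describing the expected outcome.
    return {
        "would_create": "concept_set",
        "name": name,
        "concept_count": count,
        "description": f"Would create concept set '{name}' with {count} concepts",
    }


def dry_run(tool_name, params):
    """Describe what a tool call would do, without touching any backend."""
    handler = SIMULATORS.get(tool_name)
    if handler is None:
        raise KeyError(f"no simulation handler for tool '{tool_name}'")
    return {"simulated": True, **handler(params)}
```

Calling `dry_run("create_concept_set", {"name": "Diabetes Conditions", "concept_ids": [1, 2, 3]})` with three placeholder ids produces a payload with the same fields as the JSON example above.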

Workflow Templates

Pre-built study designs encode OHDSI best practices:

Incident Cohort — condition concept set + optional drug concept set + cohort definition with washout period + generation
Characterization Study — concept set + cohort definition + generation + characterization analysis queue

Templates generate step lists compatible with the plan engine, so researchers can say "run an incident cohort study for diabetes on metformin" and get a complete, reviewable plan.
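As a rough illustration, a template can be a plain function that expands a few parameters into a step list with explicit dependencies. Everything here is assumed for the example: the function name, the step and tool identifiers, the `depends_on` field, and the 365-day default washout are not taken from the actual plan engine.

```python
def incident_cohort_template(condition, drug=None, washout_days=365):
    """Build a hypothetical step list for the Incident Cohort template."""
    steps = [
        {
            "id": "condition_set",
            "tool": "create_concept_set",  # assumed tool name
            "params": {"name": f"{condition} conditions"},
            "depends_on": [],
        }
    ]
    set_ids = ["condition_set"]

    # The drug concept set is optional and independent of the condition set.
    if drug:
        steps.append(
            {
                "id": "drug_set",
                "tool": "create_concept_set",
                "params": {"name": f"{drug} exposures"},
                "depends_on": [],
            }
        )
        set_ids.append("drug_set")

    steps.append(
        {
            "id": "cohort_def",
            "tool": "create_cohort_definition",  # assumed tool name
            "params": {"washout_days": washout_days},
            "depends_on": set_ids,
        }
    )
    steps.append(
        {
            "id": "generate",
            "tool": "generate_cohort",  # assumed tool name
            "params": {},
            "depends_on": ["cohort_def"],
        }
    )
    return steps
```

For "diabetes on metformin" this yields four steps, with the two concept sets free to run in the same wave and the cohort definition depending on both. Keeping templates as data rather than executing anything directly is what lets the plan stay reviewable before confirmation.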

What Shipped

Component	Tests	Purpose
DAG Executor	7	Parallel wave execution with dependency tracking
Dry Run Simulator	5	Action simulation without side effects
Modify tools	2	Concept set and cohort criteria modification
Analysis tools	2	Characterization and incidence analysis queuing
SQL tools	8	Read-only SQL execution with safety validation
Workflow templates	5	Pre-built OHDSI study design plans
Integration tests	4	End-to-end DAG, dry run, safety, template verification

247 tests passing across the Python AI service.
