
Abby 2.0 Phase 5: Advanced Agency — Parallel Workflows and Safety Rails

Creator, Parthenon
AI Development Assistant

Abby can now orchestrate complex multi-step research workflows with independent steps running in parallel. High-risk tools (modify concept sets, update cohort criteria, execute SQL) join the toolkit with safety validation. Dry run mode simulates actions before execution. Workflow templates encode OHDSI best practices into one-click study designs.


What's New

DAG-Based Parallel Execution

Phase 4's Plan-Confirm-Execute ran steps sequentially. Phase 5 introduces a DAG (directed acyclic graph) executor that identifies independent steps and runs them concurrently.

"Build diabetes + metformin cohort with characterization"

Wave 1 (parallel): Create diabetes concept set ║ Create metformin concept set
Wave 2 (sequential): Create cohort definition (depends on both sets)
Wave 3 (sequential): Generate cohort
Wave 4 (sequential): Run characterization

The two concept sets have no dependencies on each other, so they run simultaneously, and the cohort definition waits for both. Running them concurrently cuts this 5-step plan from ~15 seconds (fully sequential) to ~10 seconds.

The executor uses Kahn's algorithm (BFS topological sort) to decompose plans into waves. Each wave's steps run via asyncio.gather. If any step fails, all its dependents are automatically skipped with a clear reason.
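The wave decomposition described above can be sketched roughly as follows. This is a minimal illustration, not Abby's actual executor: the function names `plan_waves` and `run_plan`, the `deps` mapping, and the shape of the result dictionaries are all assumptions made for the example.

```python
import asyncio


def plan_waves(deps):
    """Group steps into waves using Kahn's algorithm.

    `deps` (hypothetical format) maps each step name to the set of
    step names it depends on.
    """
    indegree = {step: len(parents) for step, parents in deps.items()}
    dependents = {step: [] for step in deps}
    for step, parents in deps.items():
        for parent in parents:
            dependents[parent].append(step)

    # Steps with no unmet dependencies form the first wave.
    ready = [step for step, count in indegree.items() if count == 0]
    waves = []
    while ready:
        waves.append(ready)
        next_ready = []
        for step in ready:
            for child in dependents[step]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = next_ready

    # Any step never reached must sit on a cycle.
    if sum(len(wave) for wave in waves) != len(deps):
        raise ValueError("plan contains a dependency cycle")
    return waves


async def run_plan(deps, handlers):
    """Run each wave concurrently; skip steps whose dependencies did not succeed."""
    results = {}
    for wave in plan_waves(deps):
        runnable = []
        for step in wave:
            blocked = sorted(p for p in deps[step] if results[p]["status"] != "ok")
            if blocked:
                results[step] = {
                    "status": "skipped",
                    "reason": f"dependency '{blocked[0]}' did not succeed",
                }
            else:
                runnable.append(step)
        # return_exceptions=True returns exceptions as results instead of
        # raising the first one, so every step in the wave gets recorded.
        outcomes = await asyncio.gather(
            *(handlers[step]() for step in runnable), return_exceptions=True
        )
        for step, outcome in zip(runnable, outcomes):
            if isinstance(outcome, Exception):
                results[step] = {"status": "failed", "reason": str(outcome)}
            else:
                results[step] = {"status": "ok", "value": outcome}
    return results
```

Because a skipped step is itself recorded with a non-"ok" status, the skip propagates transitively: if a concept set fails, the cohort definition, generation, and characterization steps are all skipped in turn.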

High-Risk Tools with Safety Validation

Six new tools join the registry, bringing the total to 12:

| Tool | Risk | Safety Mechanism |
| --- | --- | --- |
| modify_concept_set | High | Adds/removes concepts via validated API |
| modify_cohort_criteria | High | Updates expression JSON via validated API |
| execute_sql | High | Regex blocks INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, GRANT, REVOKE, pg_*, COPY |
| run_characterization | Medium | Queues via job system |
| run_incidence_analysis | Medium | Queues via job system |
| schedule_recurring_analysis | High | Creates periodic schedule |

The SQL safety validator checks every query against 12 categories of dangerous patterns, including those listed above, before it reaches the database. Only read-only SELECT queries pass validation.
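A regex-based check of this kind might look like the sketch below. It covers only the patterns named in the table, not the validator's full list. The function name `validate_sql`, the `(ok, reason)` return shape, and the multiple-statement check are assumptions made for illustration.

```python
import re

# Patterns taken from the execute_sql row of the table above.
BLOCKED = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE|COPY)\b"
    r"|\bpg_\w*",
    re.IGNORECASE,
)


def validate_sql(query):
    """Return (ok, reason) for a candidate query (hypothetical interface)."""
    stripped = query.strip().rstrip(";").strip()
    # Assumption: reject stacked statements such as "SELECT 1; DROP TABLE x".
    if ";" in stripped:
        return False, "multiple statements are not allowed"
    if not re.match(r"SELECT\b", stripped, re.IGNORECASE):
        return False, "only SELECT queries are allowed"
    match = BLOCKED.search(stripped)
    if match:
        return False, f"blocked pattern: {match.group(0)}"
    return True, "ok"
```

The word-boundary anchors mean a column such as `update_date` passes, while a column literally named `create` would be rejected. A blocklist like this errs on the side of refusing legitimate queries, which is usually the right trade-off for a tool marked high-risk.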

Dry Run Mode

For high-risk actions, dry run mode simulates what would happen without executing anything:

{
  "simulated": true,
  "would_create": "concept_set",
  "name": "Diabetes Conditions",
  "concept_count": 3,
  "description": "Would create concept set 'Diabetes Conditions' with 3 concepts"
}

Every tool has a simulation handler that returns tool-specific fields describing the expected outcome.
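One way to wire per-tool simulation handlers is a registry keyed by tool name, sketched below. The tool name `create_concept_set`, the `SIMULATORS` registry, the `run_tool` dispatcher, and the parameter names are hypothetical. Only the output fields come from the example above.

```python
def simulate_create_concept_set(params):
    """Describe the concept set that would be created, without creating it."""
    name = params["name"]
    count = len(params["concept_ids"])
    return {
        "simulated": True,
        "would_create": "concept_set",
        "name": name,
        "concept_count": count,
        "description": f"Would create concept set '{name}' with {count} concepts",
    }


# Hypothetical registry: one simulation handler per tool name.
SIMULATORS = {"create_concept_set": simulate_create_concept_set}


def run_tool(tool, params, executors, dry_run=False):
    """Dispatch to the simulator in dry run mode, otherwise to the real executor."""
    if dry_run:
        if tool not in SIMULATORS:
            raise KeyError(f"no simulation handler registered for '{tool}'")
        return SIMULATORS[tool](params)
    return executors[tool](params)
```

Calling `run_tool("create_concept_set", {"name": "Diabetes Conditions", "concept_ids": [1, 2, 3]}, executors={}, dry_run=True)` with placeholder concept IDs returns a dictionary matching the JSON shown above, and the real executor is never touched.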

Workflow Templates

Pre-built study designs encode OHDSI best practices:

  • Incident Cohort — condition concept set + optional drug concept set + cohort definition with washout period + generation
  • Characterization Study — concept set + cohort definition + generation + characterization analysis queue

Templates generate step lists compatible with the plan engine, so researchers can say "run an incident cohort study for diabetes on metformin" and get a complete, reviewable plan.
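As a rough illustration of how a template could expand into a step list, here is a sketch of the Incident Cohort template. The step schema (`id`, `tool`, `params`, `depends_on`), the tool names, and the 365-day default washout are assumptions for the example, not the plan engine's actual format.

```python
def incident_cohort_template(condition, drug=None, washout_days=365):
    """Expand the Incident Cohort template into an ordered list of plan steps."""
    steps = [
        {
            "id": "condition_set",
            "tool": "create_concept_set",
            "params": {"name": f"{condition} conditions"},
            "depends_on": [],
        }
    ]
    cohort_deps = ["condition_set"]

    # The drug concept set is optional and independent of the condition set.
    if drug is not None:
        steps.append(
            {
                "id": "drug_set",
                "tool": "create_concept_set",
                "params": {"name": f"{drug} exposures"},
                "depends_on": [],
            }
        )
        cohort_deps.append("drug_set")

    steps.append(
        {
            "id": "cohort_definition",
            "tool": "create_cohort_definition",
            "params": {"washout_days": washout_days},
            "depends_on": cohort_deps,
        }
    )
    steps.append(
        {
            "id": "generate",
            "tool": "generate_cohort",
            "params": {},
            "depends_on": ["cohort_definition"],
        }
    )
    return steps
```

Because each step declares its dependencies explicitly, a list like this can be handed to a DAG executor unchanged, and the two concept set steps fall into the same parallel wave.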


What Shipped

| Component | Tests | Purpose |
| --- | --- | --- |
| DAG Executor | 7 | Parallel wave execution with dependency tracking |
| Dry Run Simulator | 5 | Action simulation without side effects |
| Modify tools | 2 | Concept set and cohort criteria modification |
| Analysis tools | 2 | Characterization and incidence analysis queuing |
| SQL tools | 8 | Read-only SQL execution with safety validation |
| Workflow templates | 5 | Pre-built OHDSI study design plans |
| Integration tests | 4 | End-to-end DAG, dry run, safety, template verification |

247 tests passing across the Python AI service.