CI Green at Last: Codebase Hardening, AtlanticHealth Synthesis, and a 147-Test Renaissance
After months of a perpetually red CI pipeline, today marks a turning point for Parthenon: 92 commits, a full-spectrum codebase review, a complete AtlanticHealth patient synthesis pipeline, and — most satisfying of all — every CI job green. Here's how we got there.
The CI Pipeline Was Never Green (Until Today)
The most impactful work today was a ~6-hour, five-phase codebase hardening sprint that touched virtually every layer of the stack. The starting state was grim: CI failing on every push, 6% test coverage, and four files well past their size limits. The ending state: all six CI jobs passing, 147 new tests written, and a documented methodology for keeping things that way.
The failure modes were stacking and masking each other, which made the pipeline feel intractable. Once we untangled them, the root causes were addressable one by one:
- 37 TypeScript errors in the investigation module: mostly Lucide icon casting issues, incorrect property access on PaginatedResponse (.data vs .items), and useRef strict mode violations. Fixed with proper LucideProps typing and a pass to remove dead code.
- 80+ Pint code style violations: Pint 1.29 quietly introduced the fully_qualified_strict_types rule. We resolved these by running auto-format through a Docker Pint container pinned to the same version as CI, ensuring parity. The final straggler, a single_quote and unary_operator violation in MorpheusPatientService, was cleaned up in commit 7ad77af.
- 11 PHPStan errors outside the baseline: caused by the strict_types changes shuffling what PHPStan was tracking. Regenerated the baseline (33 → 31 known errors) and committed it cleanly.
- 6 Python test failures: the FastAPI app was still using the deprecated @app.on_event("startup") pattern. Migrated to the modern lifespan context manager.
- CI database schema mismatches: the CI environment was still referencing legacy schema names (vocab, cdm, achilles_results) instead of the current ones (omop, results, gis). A PostGIS extension failure was also aborting migration transactions mid-run.
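For readers who haven't made the on_event to lifespan move yet, here is a minimal sketch of the shape of that migration. It is not the actual Parthenon code: the resources dict and simulate_app_run helper are hypothetical stand-ins, and the FastAPI wiring is left as a comment so the block runs with only the standard library.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical shared state; the real app's startup work is not described here.
resources: dict[str, object] = {}


@asynccontextmanager
async def lifespan(app):
    # Code before `yield` takes the place of an @app.on_event("startup") handler.
    resources["ready"] = True
    try:
        yield
    finally:
        # Code after `yield` takes the place of an @app.on_event("shutdown") handler.
        resources.clear()


# With FastAPI installed, the wiring would look like:
#   from fastapi import FastAPI
#   app = FastAPI(lifespan=lifespan)


async def simulate_app_run() -> list[bool]:
    """Enter and exit the lifespan the way a server would, recording state."""
    seen = []
    async with lifespan(app=None):
        seen.append(bool(resources.get("ready", False)))  # during the app's life
    seen.append(bool(resources.get("ready", False)))      # after shutdown
    return seen
```

Keeping startup and shutdown in one function means setup and teardown sit next to each other, which is part of why tests that spin the app up and down tend to behave more predictably after the switch.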
The fix methodology is now codified as an internal ADR so future contributors have a clear playbook when CI goes red.
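The schema-name drift described above is the kind of failure a small pre-migration check could surface early. The sketch below is an assumption about what such a guard might look like, not something taken from the ADR: find_legacy_schema_refs and the sample SQL are hypothetical, and only the legacy schema names come from the CI failures themselves.

```python
import re

# Legacy schema names that CI was still referencing; the helper is hypothetical.
LEGACY_SCHEMAS = ("vocab", "cdm", "achilles_results")

# Match schema-qualified references such as "cdm.person".
_LEGACY_REF = re.compile(r"\b(" + "|".join(LEGACY_SCHEMAS) + r")\.\w+")


def find_legacy_schema_refs(text: str) -> list[tuple[int, str]]:
    """Return (line_number, reference) pairs for schema-qualified legacy names."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _LEGACY_REF.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical SQL used only to exercise the helper.
sample = (
    "SELECT * FROM cdm.person p\n"
    "JOIN omop.concept c ON c.concept_id = p.gender_concept_id\n"
)
```

Run over migration files or CI config, a non-empty result would be a reason to fail fast before the database step rather than partway through a migration.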
AtlanticHealth Synthesis Pipeline: 3,250 Patients, MIMIC-Standard
On the data generation side, we shipped a complete AtlanticHealth synthesis pipeline today. The headline: 3,250 synthetic patients with full MIMIC-standard data, generated end-to-end through a multi-phase pipeline.
Phases 4–7 were added to cover the full clinical picture: procedure events, microbiology results, and input/output events. Earlier phases handle the patient cohort, admissions, and diagnoses. The result is a realistic, MIMIC-schema-compatible dataset sourced from AtlanticHealth's structure — which required adapting the labevents, chartevents, and transfers queries to match AtlanticHealth's actual schema (commit c5f05e83).
We also cleaned up \N bulk-import artifacts left over from PostgreSQL COPY operations on AtlanticHealth source data (commit 37b871063). These null-sentinel strings were leaking into text fields and causing downstream parsing issues, a subtle bug that would have been painful to debug later in the OMOP conversion layer.
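To make the failure mode concrete: PostgreSQL's COPY text format writes NULL as the literal two-character string \N by default, so a loader that treats every field as text ends up storing that string. The sketch below shows one way such values could be normalized. It is an illustration only, not the code from the commit above; clean_copy_null and clean_row are hypothetical names.

```python
from typing import Optional

# Default NULL marker in PostgreSQL COPY text format: backslash followed by N.
COPY_NULL_SENTINEL = r"\N"


def clean_copy_null(value: Optional[str]) -> Optional[str]:
    """Map a literal \\N sentinel (ignoring surrounding whitespace) to None."""
    if value is None:
        return None
    return None if value.strip() == COPY_NULL_SENTINEL else value


def clean_row(row: dict) -> dict:
    """Apply the cleanup to every string field, leaving other types untouched."""
    return {
        key: clean_copy_null(val) if isinstance(val, str) else val
        for key, val in row.items()
    }
```

For example, clean_row({"note": r"\N", "text": "ok"}) returns {"note": None, "text": "ok"}, so downstream parsers see a real null instead of a two-character string.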
This synthetic dataset is foundational: it gives us a realistic, large-scale cohort for testing the Morpheus ETL pipeline without touching any real patient data.
Morpheus UX: Dataset Parameter Persistence
A smaller but user-facing fix worth calling out: the dataset query parameter was being dropped when users switched tabs or clicked breadcrumb navigation inside Morpheus. This meant the UI would silently lose context, forcing users to re-select their dataset. The fix ensures the parameter is persisted through tab switches and breadcrumb navigation — a subtle but frustrating UX regression that's now resolved (commit 36222e5).
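The underlying idea is independent of the frontend framework: whenever a navigation target is built, carry the dataset parameter over from the current URL unless the target already sets it. Here is a sketch of that logic in Python. The carry_param helper and the example routes are hypothetical, and the real fix lives in the Morpheus frontend rather than in code like this.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def carry_param(current_url: str, target_url: str, name: str = "dataset") -> str:
    """Copy `name` from current_url's query string onto target_url if absent there."""
    current_params = dict(parse_qsl(urlsplit(current_url).query))
    if name not in current_params:
        return target_url  # nothing to carry over

    target = urlsplit(target_url)
    target_params = parse_qsl(target.query)
    if any(key == name for key, _ in target_params):
        return target_url  # an explicit value on the target wins

    target_params.append((name, current_params[name]))
    return urlunsplit(target._replace(query=urlencode(target_params)))
```

Routing every tab and breadcrumb link through one helper like this keeps the behavior in a single place, so a newly added link cannot quietly drop the parameter.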
Codebase Architecture: ADRs, Docs, and Decomposition
Part of the hardening sprint involved structural improvements that won't show up in feature metrics but matter enormously for maintainability:
- 8 Architecture Decision Records (ADRs) written, covering decisions that were previously implicit or tribal knowledge.
- 11 new documentation pages across five previously underdocumented modules.
- 4 oversized files decomposed — each was more than 3× the project's file size guideline. Breaking these apart improves testability and makes the codebase easier to navigate.
- Docker hardening — the development and CI Docker configurations were reviewed and tightened.
Going from zero ADRs to eight in a single session captures a lot of knowledge that previously lived only in people's heads. These documents will pay dividends the next time someone asks "why does it work this way?"
Dependency Updates
We also rolled forward several key dependencies today:
- Vite 8 and plugin-react 6 — keeping the frontend build toolchain current.
- Ollama 0.6 and LangChain 1 — AI integration libraries bumped to latest stable.
- sentence-transformers and transformers — Python AI requirements updated.
- laravel/tinker 3.0.0 — bumped from 2.11.1.
None of these are risky upgrades in isolation, but doing them together while CI is green (rather than red) makes it much easier to catch any regressions they introduce.
What's Next
With CI green and a solid synthetic dataset in hand, the immediate priorities are:
- OMOP ETL validation — run the AtlanticHealth synthetic cohort through the Morpheus OMOP conversion pipeline and validate concept mapping coverage.
- Test coverage growth — 147 new tests is a great start from 6%, but we want to reach a meaningful floor (targeting 40%+) before the next major feature push.
- PHPStan baseline reduction — the 31 known errors in the baseline are technical debt. Now that CI is stable, we can chip away at these systematically.
- Investigation module hardening — the TypeScript fixes today were correctness patches; a deeper review of the investigation module's data flow is warranted.
Today was a grind in the best sense — the kind of session where you clear out months of accumulated friction and leave the codebase meaningfully better for everyone who touches it next.