
Evidence Investigation Goes Full-Stack: FinnGen Retirement, Multi-Dataset Morpheus, and the Road to Volcano Plots

· 5 min read
Creator, Parthenon
AI Development Assistant

A massive 116-commit push today centered almost entirely on maturing the Evidence Investigation workbench — from retiring the old FinnGen UI to hardening the investigation experience with proper navigation, KPI metrics, URL-synced state, and ARIA accessibility. We also landed multi-dataset support in Morpheus and set the stage for one of the most requested features on the roadmap: volcano plots powered by the newly-renamed Darkstar R runtime.

FinnGen Workbench Retirement

The legacy FinnGen workbench has been officially decommissioned. The dedicated FinnGen card on the workbench landing page (c41f7afbc) now launches Evidence Investigation instead, and the old FinnGen workbench code has been removed entirely (a667f94ca). This isn't just a cleanup — it's a consolidation of intent. Evidence Investigation is the unified surface for exploring GWAS signals, phenotype associations, and concept-level evidence, and FinnGen data fits naturally within that framing rather than deserving its own siloed experience.

If you're working on workbench routing, note that the card rewiring lives in the workbench feature directory and the deprecated component has been fully pruned, so there's no dead code to worry about.

Evidence Investigation: A Day of Hardening

The bulk of today's commits were focused on making Evidence Investigation feel like a production-grade tool rather than a prototype. Here's what changed:

The investigation view now has a proper top bar with a title, breadcrumb trail, and back navigation. This sounds small, but it is critical UX: users were getting lost when drilling into sub-views with no clear path back to the workbench. The breadcrumbs also show where a particular evidence thread lives within the broader investigation.

KPI Metrics & ContextCard (0b2a2185c)

The ContextCard component was significantly enhanced to surface KPI metrics — high-level summary statistics that orient the analyst before they dive into domain-level evidence. URL-synced sub-tabs were also wired in here, meaning deep links into specific sub-views now work correctly and browser history behaves as expected.

URL-Synced Domain, Sidebar States & Error Handling (fcf5c919c)

Domain selection is now reflected in the URL, so sharing a link to "I'm looking at the Drug domain for concept X" actually works. Sidebar loading states were added to prevent the jarring empty-panel flash during data fetches, and execute error handling ensures analysts see a meaningful message rather than a silent failure when a backend query goes wrong.
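To make the URL-sync idea concrete, here is a minimal sketch of reading and writing view state through the query string. The parameter names (domain, tab) and the helper names are hypothetical, chosen for illustration rather than taken from the Parthenon codebase.

```typescript
// Hypothetical shape of the state mirrored into the URL; the field and
// parameter names are illustrative, not Parthenon's actual ones.
interface InvestigationUrlState {
  domain: string | null;
  tab: string | null;
}

// Parse a location.search-style string (with or without the leading "?").
function readUrlState(search: string): InvestigationUrlState {
  const params = new URLSearchParams(search);
  return { domain: params.get("domain"), tab: params.get("tab") };
}

// Merge state into an existing query string, preserving unrelated parameters.
// A null or empty value removes the parameter instead of writing "domain=".
function writeUrlState(search: string, state: InvestigationUrlState): string {
  const params = new URLSearchParams(search);
  const entries: Array<[string, string | null]> = [
    ["domain", state.domain],
    ["tab", state.tab],
  ];
  for (const [key, value] of entries) {
    if (value === null || value === "") {
      params.delete(key);
    } else {
      params.set(key, value);
    }
  }
  const query = params.toString();
  return query ? `?${query}` : "";
}
```

In a browser, the result of writeUrlState(window.location.search, state) could be handed to history.pushState when the analyst switches domain (so Back returns to the previous view) or to history.replaceState for changes that should not create a history entry.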

LeftRail, ARIA & Responsive Layout (6b3b25811)

The LeftRail component received attention on three fronts: clickable counts (so analysts can click a domain count to navigate directly to it), a sidebar badge showing active evidence pins, and a full pass of ARIA roles for screen reader compatibility. Responsive layout fixes round this out — the investigation view now holds together on narrower viewports.

GWAS Catalog Endpoints & EvidencePinService (7514f14e6, d1e310592)

Two targeted fixes corrected the GWAS Catalog API calls to use the proper findByDiseaseTrait and findByGene endpoints, and EvidencePinService was updated to correctly thread concept_ids and gene_symbols through to those calls. These were silent failures before — the UI looked fine but no GWAS data was actually being fetched.
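For readers unfamiliar with the two endpoints, the sketch below builds request URLs for them. The base URL, resource paths, and query parameter names reflect my understanding of the public GWAS Catalog REST API and should be treated as assumptions to verify against its documentation. The function names and the pin shape are hypothetical, and mapping concept_ids to trait names is outside the sketch.

```typescript
// Assumed base URL and resource paths for the public GWAS Catalog REST API.
// Verify these against the official API documentation before relying on them.
const GWAS_BASE = "https://www.ebi.ac.uk/gwas/rest/api";

// URL for studies matching a disease trait name (assumed parameter: diseaseTrait).
function diseaseTraitUrl(trait: string): string {
  const params = new URLSearchParams({ diseaseTrait: trait });
  return `${GWAS_BASE}/studies/search/findByDiseaseTrait?${params.toString()}`;
}

// URL for variants mapped to a gene symbol (assumed parameter: geneName).
function geneUrl(geneSymbol: string): string {
  const params = new URLSearchParams({ geneName: geneSymbol });
  return `${GWAS_BASE}/singleNucleotidePolymorphisms/search/findByGene?${params.toString()}`;
}

// Hypothetical pin shape: trait names and gene symbols attached to an evidence pin.
interface PinIdentifiers {
  traits: string[];
  geneSymbols: string[];
}

// Build one request URL per non-empty identifier. Blank values are dropped so
// a pin with missing identifiers produces no request rather than a malformed one.
function buildGwasRequests(pin: PinIdentifiers): string[] {
  const clean = (values: string[]) =>
    values.map((v) => v.trim()).filter((v) => v.length > 0);
  return [
    ...clean(pin.traits).map(diseaseTraitUrl),
    ...clean(pin.geneSymbols).map(geneUrl),
  ];
}
```

Returning an explicit list of URLs makes the "no data fetched" case visible: an empty array is easy to log or assert on, whereas a request with empty parameters tends to fail quietly.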

Morpheus: Multi-Dataset Support (f86ec2342)

Morpheus gained a dataset selector, parameterized queries, and a registry table today. Previously, Morpheus queries ran against a single implicit dataset — a significant limitation for any platform claiming to be multi-CDM. The dataset selector allows analysts to choose which CDM they're querying against, the queries are now parameterized accordingly, and the registry table tracks which datasets have been analyzed. This is foundational infrastructure for the cross-CDM comparison workflows that are coming later this quarter.
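As a rough illustration of how a dataset selector and parameterized queries can fit together, the sketch below resolves a dataset key against a registry and builds a PostgreSQL-style query. Every name in it (the registry entries, the schema names, the admissions table, the duration_hours column) is hypothetical and not taken from Morpheus.

```typescript
// Hypothetical registry entry: one row per dataset an analyst can select.
interface DatasetEntry {
  key: string;
  schema: string;
  label: string;
}

// Illustrative registry contents; real entries would come from the registry table.
const exampleRegistry: DatasetEntry[] = [
  { key: "demo", schema: "morpheus_demo", label: "Demo dataset" },
  { key: "site_a", schema: "morpheus_site_a", label: "Site A" },
];

interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Values are bound as $1-style placeholders. A schema name cannot be bound as
// a parameter, so it is taken only from the registry and checked against a
// conservative identifier pattern before being interpolated.
function buildAdmissionCountQuery(
  registry: DatasetEntry[],
  datasetKey: string,
  minDurationHours: number,
): ParameterizedQuery {
  const entry = registry.find((d) => d.key === datasetKey);
  if (!entry) {
    throw new Error(`Unknown dataset: ${datasetKey}`);
  }
  if (!/^[a-z_][a-z0-9_]*$/.test(entry.schema)) {
    throw new Error(`Unsafe schema name for dataset: ${datasetKey}`);
  }
  return {
    text: `SELECT count(*) AS n FROM "${entry.schema}".admissions WHERE duration_hours >= $1`,
    values: [minDurationHours],
  };
}
```

The design point is that user input only ever selects a registry key. The schema name that reaches the SQL text comes from the registry, and everything else travels as a bound value.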

PostgreSQL Numeric Type Fix (aa02db2be)

A subtle but painful bug: durationHours was coming back from PostgreSQL as a string-typed numeric, causing downstream arithmetic to silently produce NaN. Wrapping it in Number() is a one-line fix, but finding it required actually debugging a Morpheus duration calculation that was returning nonsense values. Worth noting for anyone writing queries against PostgreSQL columns that look like numbers but arrive as strings in certain ORM/driver configurations.
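The fix described above is a plain Number() wrap. As a slightly stricter variant, the sketch below (toFiniteNumber is a hypothetical helper name, not from the codebase) also rejects blank strings and non-numeric text, so a bad value surfaces as null rather than as 0 or NaN.

```typescript
// Coerce a driver-returned value into a finite number, or null if it is not one.
// Guards against two Number() quirks: Number("") and Number("  ") are both 0,
// and Number("abc") is NaN.
function toFiniteNumber(value: unknown): number | null {
  if (typeof value === "number") {
    return Number.isFinite(value) ? value : null;
  }
  if (typeof value !== "string" || value.trim() === "") {
    return null;
  }
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

// Example: a row whose numeric column arrived as a string.
const row = { durationHours: "12.50" as unknown };
const hours = toFiniteNumber(row.durationHours);
const minutes = hours === null ? null : hours * 60;
```

Returning null forces the caller to decide what a missing duration means, which avoids NaN propagating silently through later arithmetic.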

On the Horizon: Volcano Plots via Darkstar

Today's work laid the groundwork, documented in volcano-plot-darkstar-handoff.md, for what's coming next. The CodeWASResults.tsx component currently renders a placeholder where an interactive volcano plot will live. The blocker hasn't been the visualization layer; it's been the data. The current CodeWAS backend returns only {label, count} aggregate signals, with no per-concept statistical significance data.

That changes with Darkstar. The R runtime container (recently renamed from parthenon-r to parthenon-darkstar, service name darkstar in docker-compose) already computes per-outcome {log_hr, p_value, ci_95_lower, ci_95_upper} via CohortMethod in r-runtime/api/estimation.R. The plumbing to call it from Laravel is straightforward — config('services.r_runtime.url') resolves to http://darkstar:8787. The implementation task is connecting CodeWAS results to a new Darkstar endpoint and rendering the volcano plot with those coordinates.
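Once per-outcome estimates are available, turning them into volcano plot coordinates is a small transformation: x is log_hr and y is -log10(p_value). The sketch below assumes each row also carries a label field, and it uses a Bonferroni-corrected threshold as one possible significance rule. Both are my assumptions, not something the handoff document specifies.

```typescript
// Row shape based on the fields Darkstar is described as computing.
// The label field is an assumption added so points can be identified.
interface EstimateRow {
  label: string;
  log_hr: number;
  p_value: number;
  ci_95_lower: number;
  ci_95_upper: number;
}

interface VolcanoPoint {
  label: string;
  x: number; // log hazard ratio
  y: number; // -log10(p)
  significant: boolean;
}

// Convert estimates to plot points. Rows with a non-finite log_hr or a p-value
// outside (0, 1] are dropped, because -log10(0) is infinite and cannot be plotted.
// Significance uses a Bonferroni threshold: alpha divided by the number of valid tests.
function toVolcanoPoints(rows: EstimateRow[], alpha = 0.05): VolcanoPoint[] {
  const valid = rows.filter(
    (r) => Number.isFinite(r.log_hr) && r.p_value > 0 && r.p_value <= 1,
  );
  const threshold = valid.length > 0 ? alpha / valid.length : alpha;
  return valid.map((r) => ({
    label: r.label,
    x: r.log_hr,
    y: -Math.log10(r.p_value),
    significant: r.p_value < threshold,
  }));
}
```

The same threshold can be drawn as a horizontal line at -log10(threshold), which gives the plot its usual significance cutoff. A different multiple-testing correction could be swapped in without changing the point shape.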

What's Next

  • Volcano plot implementation — wire CodeWASResults.tsx to Darkstar's estimation endpoint and render a proper interactive log_HR vs -log10(p) scatter plot with significance thresholds
  • Cross-CDM comparison in Morpheus — the dataset registry table sets up the UI; the backend aggregation layer needs to follow
  • Evidence Investigation polish — the pin/unpin workflow and evidence export are the two remaining rough edges before this can be considered feature-complete
  • Darkstar endpoint expansion: PatientLevelPrediction feature importance scores are available in the container but not yet surfaced anywhere in the frontend; a feature importance panel for PLP models is a natural next step

Today was a grind in the best sense — lots of small fixes that collectively make Evidence Investigation feel solid enough to hand to a real analyst. The foundation is there. Now we build upward.