5 posts tagged with "patient-similarity"

Parthenon v1.0.6 — FinnGen Workbench, SSO, and Light Mode

April 16, 2026 · 7 min read

Creator, Parthenon

v1.0.6 is the biggest feature release in the v1.0.x arc. After two back-to-back stabilization releases (v1.0.4 test coverage, v1.0.5 data quality), the platform was ready for net-new modules. This release lands four of them at once: the FinnGen Cohort Workbench, Authentik SSO, first-class light mode, and a substantially reworked Patient Similarity explorer — plus a doubled care-bundle library, a project-management handoff to Acumenus Data Room, and a long list of installer and CI hardening fixes.

From Jaccard to Network Fusion: How Parthenon's Patient Similarity Engine Became Research-Grade

April 10, 2026 · 22 min read

Sanjay M. Udoshi MD

Creator, Parthenon

Claude (AI Pair Programmer)

AI Development Assistant

Eight days ago, we shipped the Patient Similarity Engine — a multi-modal system that scores patients across six clinical dimensions using weighted Jaccard, z-scored lab distances, and pathogenicity-tiered genomic matching. Two days later, we generated embeddings for a million patients. The engine worked. Researchers could find patients like a seed patient, compare cohorts, and export results.

But it wasn't research-grade. The Jaccard similarity was binary: a patient with Type 1 DM and a patient with Type 2 DM got zero credit for the match, even though both concepts share the ancestor "Diabetes mellitus" in the SNOMED hierarchy. The cohort comparison showed a radar chart with divergence percentages, but couldn't tell you which covariates were driving the imbalance or how the distributions actually differed. There was no propensity scoring, no temporal analysis, no phenotype discovery, and no way to fuse multiple data modalities into a single principled similarity measure.
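To make the Type 1 / Type 2 example concrete, here is a minimal sketch of the difference between flat Jaccard and a hierarchy-aware variant. The three-node `PARENT` map is a toy stand-in rather than the real SNOMED graph, and expanding each concept set with its ancestors is only one simple way to give partial credit; it is not necessarily how Parthenon's engine does it.

```python
from typing import Dict, Iterable, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Plain Jaccard: |A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy hierarchy (child -> parent). A real system would read
# ancestors from the vocabulary tables instead of a hard-coded dict.
PARENT: Dict[str, str] = {
    "Type 1 DM": "Diabetes mellitus",
    "Type 2 DM": "Diabetes mellitus",
    "Diabetes mellitus": "Toy root concept",
}


def with_ancestors(concepts: Iterable[str]) -> Set[str]:
    """Return the concepts plus every ancestor reachable through PARENT."""
    expanded: Set[str] = set()
    for concept in concepts:
        current: Optional[str] = concept
        while current is not None and current not in expanded:
            expanded.add(current)
            current = PARENT.get(current)
    return expanded


patient_a = {"Type 1 DM"}
patient_b = {"Type 2 DM"}

flat = jaccard(patient_a, patient_b)
hierarchical = jaccard(with_ancestors(patient_a), with_ancestors(patient_b))
print(flat, hierarchical)
```

With flat Jaccard the two patients share nothing. After ancestor expansion they share two of four concepts, so the score rises above zero.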

Tonight, in a single session, we shipped eight interconnected upgrades that transform the Patient Similarity Engine from a useful clinical tool into a research platform that exceeds the analytical capabilities of OHDSI Atlas, Oracle Healthcare's "Patients Like Mine," and every open-source OMOP similarity system we've been able to find.

This is the story of what we built, why each piece matters, and how they work together.

From Five Disconnected Tabs to a Research Workspace: Redesigning the Patient Similarity UI

April 10, 2026 · 17 min read

Sanjay M. Udoshi MD

Creator, Parthenon

Claude (AI Pair Programmer)

AI Development Assistant

We shipped eight analytical upgrades to the Patient Similarity Engine last week — hierarchical concept similarity, Love plots, distributional divergence, propensity score matching, UMAP projections, temporal DTW, consensus clustering, and similarity network fusion. The engine is now, arguably, more analytically capable than anything in the OHDSI ecosystem for cohort-level comparison.

But the UI was still the original five-tab layout we built in the first sprint. And no amount of analytical horsepower matters if a researcher opens the page, sees five tabs without context, and doesn't understand the order of operations.

Tonight we replaced it entirely.

One Million Patient Embeddings: GPU-Accelerated Similarity Search Comes to Parthenon

April 4, 2026 · 20 min read

Sanjay M. Udoshi MD

Creator, Parthenon

Claude (AI Pair Programmer)

AI Development Assistant

Two days ago, we shipped the Patient Similarity Engine — a multi-modal system that scores patients across six clinical dimensions on OMOP CDM. The architecture was sound. The algorithms worked. But there was a problem hiding in plain sight: none of our patients had embeddings.

The embedding pipeline had been silently failing since day one. Three type mismatches between our PHP backend and Python AI service meant that every embedding request returned a validation error, was caught by a try/catch block, and logged as a warning that nobody read. The feature vectors were all there — conditions, drugs, measurements, procedures — but the 512-dimensional dense vectors that would make similarity search fast at scale? Zero. For every source. For every patient.

Tonight, we fixed all three bugs, refactored the embedding pipeline from CPU-only SapBERT to GPU-accelerated Ollama, upgraded from 512 to 768 dimensions, introduced batch deduplication that delivered a 123x throughput improvement, and generated embeddings for 1,007,007 patients across three CDM sources. This is the story of what broke, what we built, and what it unlocks.
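The excerpt does not say what exactly is deduplicated, so the following is only a sketch of the general idea, assuming the repeated unit is the input text sent to the embedding model. The names `embed_deduplicated` and `fake_embed_batch` are hypothetical; the fake embedder stands in for whatever service call the real pipeline makes.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def embed_deduplicated(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
) -> List[Vector]:
    """Embed each distinct text once, then fan the vectors back out in input order."""
    unique = list(dict.fromkeys(texts))  # de-duplicate, keeping first-seen order
    vectors = embed_batch(unique)
    if len(vectors) != len(unique):
        raise ValueError("embedder returned a different number of vectors than inputs")
    lookup = dict(zip(unique, vectors))
    return [lookup[text] for text in texts]


# Stand-in embedder (assumption): records what it was asked to embed and
# returns a one-dimensional "vector" so the example runs without a model.
calls: List[List[str]] = []


def fake_embed_batch(batch: List[str]) -> List[Vector]:
    calls.append(list(batch))
    return [[float(len(text))] for text in batch]


texts = ["hypertension", "metformin", "hypertension", "hypertension"]
result = embed_deduplicated(texts, fake_embed_batch)
print(len(texts), "inputs,", len(calls[0]), "texts actually embedded")
```

The gain from this pattern depends entirely on how often inputs repeat, so the 123x figure above should be read as a property of that particular dataset and pipeline, not of the technique in general.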

Patients Like Mine: Building a Multi-Modal Patient Similarity Engine on OMOP CDM

April 2, 2026 · 18 min read

Sanjay M. Udoshi MD

Creator, Parthenon

Claude (AI Pair Programmer)

AI Development Assistant

For twenty years, the question "which patients are most like this one?" has haunted clinical informatics. Molecular tumor boards want to know: of the 300 patients in our pancreatic cancer corpus, which ones had the same pathogenic variants, the same comorbidity profile, the same treatment history, and what happened to them? Population health researchers want to seed cohort definitions not from abstract inclusion criteria but from a concrete index patient. And every clinician who has ever stared at a complex case has wished for a button that says "show me others like this."

Today, Parthenon ships that button. The Patient Similarity Engine is a multi-modal matching system that scores patients across six clinical dimensions — demographics, conditions, measurements, drugs, procedures, and genomic variants — with user-adjustable weights, dual algorithmic modes, bidirectional cohort integration, and tiered privacy controls. It works across any OMOP CDM source in the platform, from the 361-patient Pancreatic Cancer Corpus to the million-patient Acumenus CDM.
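As a rough illustration of how per-dimension scores and user-adjustable weights could combine into one number, here is a small sketch. The weighted-average formula, the choice to skip dimensions with no score, and the name `overall_similarity` are all assumptions made for the example; the excerpt does not specify the engine's actual aggregation.

```python
from typing import Dict, Optional

# The six dimensions named in the post.
DIMENSIONS = (
    "demographics",
    "conditions",
    "measurements",
    "drugs",
    "procedures",
    "genomics",
)


def overall_similarity(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-dimension scores in [0, 1].

    Dimensions with no score (missing or None) are skipped, so a patient
    without genomic data is not penalized for it (an assumption of this sketch).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for dim in DIMENSIONS:
        score = scores.get(dim)
        if score is None:
            continue
        weight = weights.get(dim, 0.0)
        weighted_sum += weight * score
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else 0.0


scores = {"conditions": 1.0, "drugs": 0.5, "genomics": None}
weights = {"conditions": 2.0, "drugs": 2.0, "genomics": 5.0}
combined = overall_similarity(scores, weights)
print(combined)
```

Normalizing by the total weight of the dimensions that were actually scored keeps the result in [0, 1] regardless of how the sliders are set.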

This post tells the story of why it was needed, what we studied before building it, how it works under the hood, and what we learned along the way.
