
From Jaccard to Network Fusion: How Parthenon's Patient Similarity Engine Became Research-Grade

April 10, 2026 · about 22 min read

Creator, Parthenon

AI Development Assistant

Eight days ago, we shipped the Patient Similarity Engine — a multi-modal system that scores patients across six clinical dimensions using weighted Jaccard, z-scored lab distances, and pathogenicity-tiered genomic matching. Two days later, we generated embeddings for a million patients. The engine worked. Researchers could find patients like a seed patient, compare cohorts, and export results.

But it wasn't research-grade. The Jaccard similarity matched concepts all-or-nothing: a patient with Type 1 DM and a patient with Type 2 DM got zero credit for those diagnoses, even though both concepts share the ancestor "Diabetes mellitus" in the SNOMED hierarchy. The cohort comparison showed a radar chart with divergence percentages, but couldn't tell you which covariates were driving the imbalance or how the distributions actually differed. There was no propensity scoring, no temporal analysis, no phenotype discovery, and no way to fuse multiple data modalities into a single principled similarity measure.
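To make the exact-match limitation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Parthenon's implementation: the `PARENTS` map is a hypothetical three-concept stand-in for the SNOMED hierarchy, and the helper names `jaccard` and `with_ancestors` are invented for this example. Expanding each concept set with its ancestors is one simple way to give related diagnoses partial credit.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy (child -> parents), standing in for SNOMED.
PARENTS: Dict[str, Set[str]] = {
    "Type 1 DM": {"Diabetes mellitus"},
    "Type 2 DM": {"Diabetes mellitus"},
    "Diabetes mellitus": set(),
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Plain set Jaccard: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def with_ancestors(concepts: Set[str]) -> Set[str]:
    """Return the concepts plus every ancestor reachable through PARENTS."""
    expanded: Set[str] = set()
    stack = list(concepts)
    while stack:
        concept = stack.pop()
        if concept not in expanded:
            expanded.add(concept)
            stack.extend(PARENTS.get(concept, ()))
    return expanded


patient_a = {"Type 1 DM"}
patient_b = {"Type 2 DM"}

# Exact concept matching: the two sets are disjoint, so the score is zero.
exact = jaccard(patient_a, patient_b)

# Ancestor-expanded matching: both sets now contain "Diabetes mellitus",
# so the overlap is 1 concept out of 3 in the union.
hierarchical = jaccard(with_ancestors(patient_a), with_ancestors(patient_b))

print(f"exact={exact:.3f} hierarchical={hierarchical:.3f}")
```

With exact matching the pair scores 0; after ancestor expansion they share one of three concepts and score 1/3. A real system would likely also weight ancestors by depth or specificity so that very general concepts do not dominate the score.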

Tonight, in a single session, we shipped eight interconnected upgrades that transform the Patient Similarity Engine from a useful clinical tool into a research platform that exceeds the analytical capabilities of OHDSI Atlas, Oracle Healthcare's "Patients Like Mine," and every open-source OMOP similarity system we've been able to find.

This is the story of what we built, why each piece matters, and how they work together.