
Parthenon Database Architecture

Reading time: ~15 minutes
Audience: Developers, data engineers, and stakeholders onboarding to the Parthenon platform
Last updated: 2026-03-12


1. The Two-Database Pattern

Parthenon splits data across two PostgreSQL instances. This is intentional — not technical debt.

Docker PG 16 (parthenon database, port 5480)

  • Portable application metadata: users, roles, cohort definitions, studies, analyses, concept sets
  • Created from scratch by docker compose up + Laravel migrations
  • Lightweight, disposable, re-seedable
  • ~104 tables in the app schema

External PG 17 (ohdsi database, port 5432)

  • Operational data store: the real OMOP CDM v5.4
  • 1M+ patients, 1.8B+ clinical observations, 7.2M vocabulary concepts
  • Achilles characterization results, GIS extension tables
  • Hundreds of GB, ETL'd from claims/EHR sources — too large to containerize

This mirrors the standard OHDSI deployment pattern. Atlas/WebAPI uses the same split: app tier separate from CDM tier. Parthenon makes it explicit with named Laravel connections.

What Happens After docker compose up

Docker PG gets:

  • app schema with all Laravel migration tables (users, roles, sources, etc.)
  • Empty vocab, cdm, achilles_results, eunomia schemas (created by docker/postgres/init.sql)
  • The empty OMOP schemas are expected — they're populated only when you run parthenon:load-eunomia or connect to an external PG

External PG (if configured) has:

  • Full CDM + vocabulary in the omop schema
  • Achilles results in achilles_results schema
  • GIS data in the gis schema
  • A mirror of the app schema (kept in sync via db:sync)

2. Connection Topology

Laravel routes queries through 7 named database connections. Which physical database each connection targets depends on the deployment profile.

Docker-Only Profile (New Installs)

All connections point to Docker PG. CDM/vocab/results use the eunomia schema (2,694-patient GiBleed demo dataset).

Acumenus Profile (Production)

pgsql points to External PG (not Docker). CDM/vocab/results target the omop and achilles_results schemas on External PG.

Key

On Acumenus, pgsql points to External PG (ohdsi database) — NOT Docker PG. The docker_pg connection always targets Docker PG regardless of profile. On Docker-only installs, pgsql and docker_pg point to the same database.
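
The profile rules above can be summarized as a small lookup table. The Python sketch below is illustrative only: the connection names and their targets are taken from this document, while the dictionary layout and the `resolve_target` helper are hypothetical and not part of Parthenon.

```python
# Illustrative only: targets follow the profile descriptions in this document;
# TOPOLOGY and resolve_target() are hypothetical helpers, not Parthenon code.
DOCKER_PG = "Docker PG 16 (parthenon)"
EXTERNAL_PG = "External PG 17 (ohdsi)"

CONNECTIONS = ("pgsql", "cdm", "vocab", "results", "gis", "eunomia", "docker_pg")

TOPOLOGY = {
    # Docker-only: every connection targets Docker PG.
    "docker-only": {name: DOCKER_PG for name in CONNECTIONS},
    # Acumenus: only eunomia and docker_pg stay on Docker PG.
    "acumenus": {
        "pgsql": EXTERNAL_PG,
        "cdm": EXTERNAL_PG,
        "vocab": EXTERNAL_PG,
        "results": EXTERNAL_PG,
        "gis": EXTERNAL_PG,
        "eunomia": DOCKER_PG,
        "docker_pg": DOCKER_PG,
    },
}


def resolve_target(profile, connection):
    """Return the physical database a named connection targets under a profile."""
    try:
        return TOPOLOGY[profile][connection]
    except KeyError:
        raise ValueError(f"unknown profile/connection: {profile}/{connection}") from None


if __name__ == "__main__":
    for profile in TOPOLOGY:
        print(profile, "pgsql ->", resolve_target(profile, "pgsql"))
```

Keeping the mapping in one place makes the main pitfall easy to see: `pgsql` changes target between profiles, while `docker_pg` never does.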


3. Schema Inventory

External PG 17 (ohdsi) — 15 schemas

| Schema | Tables | Purpose | Row Scale |
|---|---|---|---|
| omop | 48 | Combined CDM v5.4 + vocabulary (Atlas/ETL convention) | 1.8B+ |
| achilles_results | 8 | Achilles characterization + DQD results | 1.8M |
| app | 112 | Application tables (mirror of Docker + 8 extra tables) | Thousands |
| gis | 5 | GIS extension (geographic_location, external_exposure, hospitals) | Thousands |
| eunomia | 20 | GiBleed demo CDM (2,694 patients) | 343K |
| eunomia_results | 4 | Achilles results for Eunomia demo | 64K |
| webapi | 105 | Legacy Atlas WebAPI tables (read-only, historical) | ~0 |
| basicauth | 8 | Legacy Atlas auth (unused) | ~0 |
| vocab | 11 | Empty; created by migrations, unused on Acumenus | 0 |
| cdm | 24 | Empty; created by migrations, unused on Acumenus | 0 |
| staging | 1 | ETL staging area | Variable |
| topology | | PostGIS topology extension | System |
| vocabulary | | Legacy schema alias | 0 |
| results | | Legacy schema from Atlas era | 0 |
| public | | Default PostgreSQL schema | System |

Note

The app schema on External PG has 8 more tables than Docker PG (112 vs 104). These include genomic_variants (17.8M rows) and gis_admin_boundaries (51K) — tables created by ETL scripts, not managed by Laravel migrations.
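
The 112 vs 104 gap is a set difference between the two app schemas. The sketch below shows that comparison in Python. The table lists are abbreviated and hypothetical, and `extra_tables` is an illustrative helper, not a Parthenon function. A real check would read the table names from `information_schema.tables` on each connection.

```python
def extra_tables(external, docker):
    """Tables present in the External PG app schema but absent from Docker PG."""
    return sorted(set(external) - set(docker))


# Abbreviated, hypothetical table lists for illustration only.
docker_app = {"users", "roles", "sources", "cohort_definitions"}
external_app = docker_app | {"genomic_variants", "gis_admin_boundaries"}

print(extra_tables(external_app, docker_app))
# prints ['genomic_variants', 'gis_admin_boundaries']
```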

Docker PG 16 (parthenon) — 7 schemas

| Schema | Tables | Purpose |
|---|---|---|
| app | 104 | Application tables from Laravel migrations |
| vocab | 0 | Created by init.sql, empty (populated only via Eunomia seeder) |
| cdm | 0 | Created by init.sql, empty (populated only via Eunomia seeder) |
| achilles_results | 0 | Created by init.sql, empty |
| eunomia | 0 | Created by init.sql, populated by parthenon:load-eunomia |
| public | 1 | Laravel migrations tracking |
| topology | 2 | PostGIS topology extension |

4. The omop Schema Explained

The omop schema on External PG is a combined CDM + vocabulary schema. This is the standard Atlas/ETL convention — both clinical tables (person, visit_occurrence, condition_occurrence) and vocabulary tables (concept, concept_relationship, vocabulary) live in a single schema.

Parthenon's separate cdm and vocab Laravel connections both point here via search_path overrides in .env:

CDM_DB_SEARCH_PATH=omop,public
VOCAB_DB_SEARCH_PATH=omop,public
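
To see why two separate connections can share one schema, it helps to recall how PostgreSQL resolves an unqualified table name: it checks each schema on the search_path in order and uses the first one that contains the table. The Python sketch below simulates that lookup. The catalog is abbreviated and `resolve_table` is a hypothetical helper, not PostgreSQL or Parthenon code.

```python
def resolve_table(table, search_path, catalog):
    """Return 'schema.table' for the first schema on the path that has the table."""
    for schema in (part.strip() for part in search_path.split(",")):
        if table in catalog.get(schema, set()):
            return f"{schema}.{table}"
    return None  # PostgreSQL would raise "relation does not exist"


# Abbreviated catalog using table names mentioned in this section.
catalog = {
    "omop": {
        "person", "visit_occurrence", "condition_occurrence",
        "concept", "concept_relationship", "vocabulary",
    },
    "public": set(),
}

cdm_search_path = "omop,public"    # value of CDM_DB_SEARCH_PATH
vocab_search_path = "omop,public"  # value of VOCAB_DB_SEARCH_PATH

print(resolve_table("person", cdm_search_path, catalog))     # omop.person
print(resolve_table("concept", vocab_search_path, catalog))  # omop.concept
```

Both connections resolve into `omop` because both paths list it first. Clinical and vocabulary models can therefore stay on separate Laravel connections without the data being split.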

The omop schema also contains ETL-specific tables not in the standard OMOP CDM specification:

  • claims, claims_transactions — Insurance claims data
  • states_map — Geographic state mapping
  • concept_embeddings — pgvector 768-dim embeddings for AI semantic search

5. Laravel Connection Reference

| Connection | Env Prefix | Default Host | Default DB | Default search_path | Used By |
|---|---|---|---|---|---|
| pgsql | DB_* | 127.0.0.1 | ohdsi (Acumenus) / parthenon (Docker) | app,public | All App models (User, Source, CohortDefinition, Study, etc.) |
| cdm | CDM_DB_* | 127.0.0.1 | parthenon | eunomia,public | CdmModel subclasses (Person, VisitOccurrence, Condition, etc.) |
| vocab | DB_VOCAB_* | 127.0.0.1 | parthenon | eunomia,public | VocabularyModel subclasses (Concept, ConceptRelationship, etc.) |
| results | RESULTS_DB_* | 127.0.0.1 | parthenon | eunomia_results,public | AchillesResultReaderService (overrides search_path per-request) |
| gis | GIS_DB_* | 127.0.0.1 | ohdsi | gis,omop,public,app | GIS services (SviAnalysisService, etc.) |
| eunomia | DB_* (shared with pgsql) | postgres | parthenon | eunomia,public | Eunomia demo dataset access |
| docker_pg | DOCKER_DB_* | postgres | parthenon | app,public | db:sync, db:audit comparison |

Gotcha

Env var naming is inconsistent across connections. CDM uses CDM_DB_*, vocab uses DB_VOCAB_*, results uses RESULTS_DB_*, GIS uses GIS_DB_*. Check backend/config/database.php for the authoritative list.
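
Because the prefixes do not follow one pattern, a naive "starts with DB_" filter would also pick up the vocab variables. The Python sketch below groups environment variables by connection using the prefixes from the table above. The helper and the sample variable names other than `CDM_DB_SEARCH_PATH` and `DB_DATABASE` are assumptions for illustration. `backend/config/database.php` remains the authoritative source.

```python
import os

# Prefixes as listed in the connection reference table.
ENV_PREFIX = {
    "pgsql": "DB_",
    "cdm": "CDM_DB_",
    "vocab": "DB_VOCAB_",
    "results": "RESULTS_DB_",
    "gis": "GIS_DB_",
    "eunomia": "DB_",  # shared with pgsql
    "docker_pg": "DOCKER_DB_",
}


def connection_env(connection, environ=None):
    """Collect the variables that belong to one connection's prefix."""
    env = os.environ if environ is None else environ
    prefix = ENV_PREFIX[connection]
    # Longer prefixes that begin with this one (DB_VOCAB_ vs DB_) must be excluded.
    shadowing = [p for p in set(ENV_PREFIX.values()) if p != prefix and p.startswith(prefix)]
    return {
        key: value
        for key, value in env.items()
        if key.startswith(prefix) and not any(key.startswith(p) for p in shadowing)
    }


# Hypothetical sample environment for illustration.
sample = {
    "DB_DATABASE": "ohdsi",
    "DB_VOCAB_HOST": "127.0.0.1",
    "CDM_DB_SEARCH_PATH": "omop,public",
}
print(connection_env("pgsql", sample))  # {'DB_DATABASE': 'ohdsi'}
print(connection_env("vocab", sample))  # {'DB_VOCAB_HOST': '127.0.0.1'}
```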

Gotcha

On Acumenus, the pgsql connection defaults to External PG (DB_DATABASE=ohdsi), not Docker PG. Only docker_pg always targets Docker PG.


6. Common Gotchas

| Gotcha | Explanation |
|---|---|
| Docker PG has empty OMOP schemas | Expected behavior. vocab, cdm, achilles_results schemas are created by init.sql but only populated via Eunomia seeder or external PG. |
| docker compose restart doesn't reload env vars | Container must be recreated: docker compose up -d. restart reuses the same container with stale env. |
| Sources/cohorts gone after migrate:fresh | Re-seed with php artisan admin:seed + php artisan eunomia:seed-source. |
| ETL tables in omop schema | claims, claims_transactions, states_map are ETL artifacts, not standard OMOP CDM. |
| Legacy webapi/basicauth schemas | Read-only artifacts from the Atlas to Parthenon migration. Safe to ignore. |
| 112 vs 104 app tables | External PG has 8 extra tables (genomic_variants, gis_admin_boundaries, etc.) created by ETL scripts outside Laravel migrations. |

7. Solr Acceleration Layer

Parthenon runs Solr 9.7 as a read-optimized search layer alongside PostgreSQL. Solr does not replace PG — it mirrors subsets of data into purpose-built indices for sub-200ms full-text search, faceted filtering, and typeahead.

The 9 Cores

| Core | Accelerates | Source Data | Scale |
|---|---|---|---|
| vocabulary | Concept search + typeahead + facets | omop.concept + synonyms | 7.2M docs |
| cohorts | Cohort/study discovery | app.cohort_definitions + studies | Thousands |
| analyses | Cross-source analysis search | Analysis metadata | Hundreds |
| mappings | ETL mapping review | app.concept_mappings | Variable |
| clinical | Patient timeline search | 7 CDM event tables | 710M+ events |
| imaging | DICOM study discovery | app.imaging_studies | Thousands |
| claims | Billing record search | omop.claims + transactions | Variable |
| gis_spatial | Choropleth disease maps | Condition-county aggregates | 500+ pairs |
| vector_explorer | 3D embedding visualization | ChromaDB projections | 43K+ points |

Data Flow

PostgreSQL → Artisan indexer / Horizon queue → Solr → API search endpoint → Frontend
  • Batch indexing: 9 solr:index-* Artisan commands with --fresh option for full reindex
  • Real-time sync: Eloquent observers on CohortDefinition and Study dispatch queue jobs to update Solr
  • Python bypass: The AI service writes directly to gis_spatial and vector_explorer cores (bypasses Laravel)

Fallback & Resilience

  • Gated: SOLR_ENABLED=true in .env — when false, all search services fall back to PostgreSQL ILIKE
  • Circuit breaker: After 5 failures, Solr queries fail-fast for 30s (tracked in Redis via SolrClientWrapper)
  • Admin UI: Solr Admin page shows per-core doc counts, reindex triggers, and health status
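
The gate and circuit breaker described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the SolrClientWrapper implementation. It makes three assumptions the text does not confirm: state lives in process memory rather than Redis, the 5 failures are counted consecutively (a success resets the count), and an open breaker falls back to the PostgreSQL path. All names are hypothetical.

```python
import time


class SolrCircuitBreaker:
    """In-memory stand-in for the Redis-tracked breaker (assumption)."""

    def __init__(self, failure_threshold=5, open_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.open_seconds:
            # Cool-down elapsed: close the breaker so Solr is tried again.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()

    def record_success(self):
        # Assumption: failures are counted consecutively.
        self.failures = 0


def search(term, solr_enabled, breaker, solr_search, pg_search):
    """Use Solr when enabled and healthy; otherwise use the PostgreSQL path."""
    if not solr_enabled or breaker.is_open():
        return pg_search(term)  # fail fast without touching Solr
    try:
        result = solr_search(term)
    except Exception:
        breaker.record_failure()
        return pg_search(term)
    breaker.record_success()
    return result
```

Injecting the clock keeps the sketch testable without sleeping for 30 seconds. A shared store such as Redis would serve the same role across multiple PHP workers.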

Performance

| Path | Response Time |
|---|---|
| Solr (cached query) | ~168ms |
| PostgreSQL ILIKE (fallback) | ~2-5s |
| Live UMAP computation | ~8s |

8. Deployment Profiles

Docker-Only (New Installs)

  • All PG connections point to Docker PG
  • Eunomia demo dataset for CDM/vocab/results (2,694 patients)
  • Solr optional (SOLR_ENABLED=false by default) — search falls back to PG ILIKE
  • Sufficient for development and testing

Acumenus (Production)

  • pgsql → External PG 17 (ohdsi DB, app schema)
  • docker_pg → Docker PG 16 (parthenon DB) — audit/comparison only
  • CDM/vocab/results/GIS → External PG 17 (various schemas)
  • eunomia → Docker PG 16 (demo data still available)
  • Solr enabled with all 9 cores indexed
  • 1M patients, full Athena vocabulary, real Achilles results

Verifying Your Setup

Run the audit command to check all connections:

php artisan db:audit

This will show table counts, row counts, and flag any connections that are down or schemas that are unexpectedly empty.


Domain Entity-Relationship Diagrams

The following ERDs show the key tables and relationships in each domain. They are not exhaustive — they focus on the relationships that matter for understanding data flow.

ERD 1: App Core (Docker PG, app schema)

ERD 2: Research Pipeline (Docker PG, app schema)

ERD 3: OMOP CDM v5.4 (External PG, omop schema)

ERD 4: OMOP Vocabulary (External PG, omop schema)

ERD 5: Extensions (Multi-Database)

Extension tables span multiple schemas and databases. Labels indicate location.

GIS (External PG, gis schema)

Genomics (Docker PG, app schema)

Imaging (Docker PG, app schema)

HEOR (Docker PG, app schema)


Live Audit

Run the database audit command to see a real-time snapshot of all connections:

php artisan db:audit
+------------+------------------+--------+---------------+-------------------+
| Connection | Schema           | Tables | Total Rows    | Status            |
+------------+------------------+--------+---------------+-------------------+
| pgsql      | app              | 112    | 17,935,875    | OK                |
| cdm        | omop             | 48     | 1,882,597,503 | OK                |
| vocab      | omop             | 48     | 1,882,597,503 | OK                |
| results    | achilles_results | 8      | 2,005,137     | OK                |
| gis        | gis              | 5      | 4,466,880     | OK                |
| eunomia    | eunomia          | 20     | 343,279       | OK                |
| docker_pg  | app              | 104    | 789           | OK                |
| Solr       | vocabulary       | 1 core | 7,194,924     | OK                |
| Solr       | cohorts          | 1 core | 46            | OK                |
| Solr       | analyses         | 1 core | 190           | OK                |
| Solr       | mappings         | 1 core | 0             | WARN: 0 documents |
| Solr       | clinical         | 1 core | 500           | OK                |
| Solr       | imaging          | 1 core | 635           | OK                |
| Solr       | claims           | 1 core | 50,000        | OK                |
| Solr       | gis_spatial      | 1 core | 67            | OK                |
| Solr       | vector_explorer  | 1 core | 5,001         | OK                |
+------------+------------------+--------+---------------+-------------------+

Use --json for machine-readable output or --connection=NAME to audit a single connection.
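
For quick scripting against the table output shown above, the pipe-delimited rows can be parsed directly. The Python sketch below does this and lists the rows whose status is not OK. The structure of the `--json` output is not documented here, so this example deliberately parses the text table. Both helper functions are hypothetical.

```python
def parse_audit_table(text):
    """Parse pipe-delimited audit rows into dictionaries keyed by the header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def non_ok(rows):
    """Return the rows whose Status column is anything other than OK."""
    return [row for row in rows if row["Status"] != "OK"]


# Excerpt of the output shown above.
sample = """\
+------------+----------+--------+------------+-------------------+
| Connection | Schema | Tables | Total Rows | Status |
+------------+----------+--------+------------+-------------------+
| pgsql | app | 112 | 17,935,875 | OK |
| Solr | mappings | 1 core | 0 | WARN: 0 documents |
+------------+----------+--------+------------+-------------------+
"""

for row in non_ok(parse_audit_table(sample)):
    print(row["Connection"], row["Schema"], row["Status"])
# prints: Solr mappings WARN: 0 documents
```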