Data Quality Dashboard

The Data Quality Dashboard (DQD) evaluates the conformance, completeness, and plausibility of your OMOP CDM data against the OHDSI Data Quality Dashboard specification. It runs approximately 3,000 data quality checks and summarizes results by category, CDM table, and concept domain. Data quality assessment is a critical prerequisite before using any CDM for research -- it helps identify ETL errors, source data issues, and potential biases.

DQD Check Categories

The DQD framework organizes checks into three categories, each targeting a different dimension of data quality:

| Category | Description | Approximate Count | Examples |
|---|---|---|---|
| Conformance | Data conforms to CDM structural requirements | ~1,200 | Valid concept IDs, correct data types, referential integrity, date ranges within bounds |
| Completeness | Expected columns and records are populated | ~800 | Non-null required fields, expected record counts per domain, observation period coverage |
| Plausibility | Values are clinically reasonable | ~1,000 | Age at death within lifespan, drug duration within expected range, lab values within physiological limits |

Each check returns:

  • Pass/Fail status -- based on the configured failure threshold
  • Failure count -- number of records violating the check
  • Failure rate -- percentage of applicable records that fail
  • Violation sample -- up to 10 example records for investigation

The overall DQD score is the weighted percentage of passing checks, giving a single number that summarizes database quality.
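The score computation can be sketched as follows. This is a minimal illustration, assuming per-check pass/fail results and per-category weights; the equal weights used here are placeholders, not the official DQD weighting.

```python
# Sketch: overall DQD score as a weighted percentage of passing checks.
# Category weights are illustrative, not the official DQD weighting.

def dqd_score(checks, weights=None):
    """checks: list of dicts with 'category' and 'status' ('PASS'/'FAIL')."""
    weights = weights or {"Conformance": 1.0, "Completeness": 1.0, "Plausibility": 1.0}
    total = passed = 0.0
    for check in checks:
        w = weights.get(check["category"], 1.0)
        total += w
        if check["status"] == "PASS":
            passed += w
    return 100.0 * passed / total if total else 0.0

checks = [
    {"category": "Conformance", "status": "PASS"},
    {"category": "Completeness", "status": "FAIL"},
    {"category": "Plausibility", "status": "PASS"},
]
score = dqd_score(checks)  # 2 of 3 equally weighted checks pass
```

With equal weights this reduces to the plain pass rate; raising the weight of one category makes its failures pull the score down faster.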

Viewing DQD Results

  1. Navigate to Data Explorer and select a Data Source from the dropdown.
  2. Click the Data Quality tab (4th tab).
  3. The summary panel shows:
    • Overall pass rate with a color-coded gauge (green > 90%, yellow 70-90%, red < 70%)
    • Category breakdown -- pass rate for Conformance, Completeness, and Plausibility independently
    • CDM table breakdown -- pass rates grouped by clinical table

Check Detail Table

Below the summary, the full check list displays every DQD check with:

| Column | Description |
|---|---|
| Check Name | Descriptive name of the quality check |
| Description | What the check evaluates |
| Category | Conformance / Completeness / Plausibility |
| CDM Table | The clinical table being checked (e.g., condition_occurrence) |
| CDM Column | The specific column (e.g., condition_concept_id) |
| Threshold | Configurable failure rate limit (default: 5%) |
| Failure Count | Number of records failing this check |
| Failure Rate | Percentage of applicable records failing |
| Status | Pass / Fail icon |

Click any check row to expand the violation detail panel, showing example records and suggested remediation steps.

Filtering and Searching

The filter bar provides multiple dimensions for narrowing the check list:

  • Category -- Conformance / Completeness / Plausibility (toggle buttons)
  • Status -- Passing / Failing / All (dropdown)
  • CDM Table -- select a specific table (e.g., measurement, drug_exposure)
  • Concept Domain -- filter by OMOP domain (Condition, Drug, Measurement, etc.)
  • Search -- free-text search across check names and descriptions

Prioritization

Start by filtering to Failing checks sorted by Failure Count descending. This surfaces the highest-impact data quality issues first. A single check with 100,000 failures is more urgent than 10 checks with 5 failures each.
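The triage rule above can be expressed as a short sketch: keep only failing checks, then sort by failure count descending. The field names are illustrative, matching the check detail table.

```python
# Sketch: surface the highest-impact data quality issues first.
# Field names ('status', 'failure_count') are illustrative.

def prioritize(checks):
    """Return failing checks ordered by failure count, largest first."""
    failing = [c for c in checks if c["status"] == "FAIL"]
    return sorted(failing, key=lambda c: c["failure_count"], reverse=True)

checks = [
    {"name": "plausible_value_high", "status": "FAIL", "failure_count": 5},
    {"name": "standard_concept_completeness", "status": "FAIL", "failure_count": 100_000},
    {"name": "is_required", "status": "PASS", "failure_count": 0},
]
worst_first = prioritize(checks)
```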

Achilles Heel Checks

The Achilles Heel tab (5th tab in Data Explorer) shows rule-based quality notifications generated as part of the Achilles analysis run. These are simpler, faster checks compared to the full DQD:

| Severity | Icon | Description | Examples |
|---|---|---|---|
| ERROR | Red circle | Critical data quality issues requiring immediate attention | Future dates in observation periods, negative drug exposure durations, orphaned records |
| WARNING | Yellow triangle | Potential issues that may affect analysis validity | Birth year after death year, extremely long observation periods, unusual gender distributions |
| NOTIFICATION | Blue info | Informational items about data characteristics | Low record counts in certain domains, single-value columns, vocabulary coverage gaps |

Heel checks are stored in the achilles_heel_results table and update automatically whenever Achilles is re-run. They are significantly faster than a full DQD execution (seconds vs. minutes).
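Because Heel results live in a plain results table, they can also be queried directly. The sketch below uses an in-memory SQLite stand-in; the column names (analysis_id, achilles_heel_warning, record_count) follow the standard Achilles Heel schema, but verify them against your own results schema before relying on them.

```python
import sqlite3

# Toy stand-in for the achilles_heel_results table. Column names follow
# the standard Achilles Heel schema; confirm against your results schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE achilles_heel_results (
    analysis_id INTEGER,
    achilles_heel_warning TEXT,
    record_count INTEGER)""")
conn.executemany(
    "INSERT INTO achilles_heel_results VALUES (?, ?, ?)",
    [(413, "ERROR: death event outside observation period", 42),
     (211, "WARNING: unusually long observation period", 7)])

# Pull ERROR-severity notifications, most prevalent first.
errors = conn.execute(
    "SELECT analysis_id, record_count FROM achilles_heel_results "
    "WHERE achilles_heel_warning LIKE 'ERROR%' "
    "ORDER BY record_count DESC").fetchall()
```

The severity prefix convention (messages starting with ERROR, WARNING, or NOTIFICATION) is what the severity filter in the Heel tab keys on.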

Heel Check Table

The Heel tab displays checks in a sortable, filterable table:

  • Filter by severity (Error / Warning / Notification)
  • Search by message text
  • Sort by record count to find the most prevalent issues
  • Click any check to see the underlying Achilles analysis that triggered it

Running a Full DQD Check

  1. Navigate to Admin > System > DQD Jobs (requires admin role).
  2. Select a Data Source from the dropdown.
  3. Optionally configure:
    • Failure threshold -- percentage above which a check is considered failing (default: 5%)
    • Check categories -- run all categories or select specific ones
    • CDM tables -- restrict checks to specific tables (useful for targeted re-evaluation after ETL fixes)
  4. Click Run DQD.
  5. The DQD executes as a background job via Laravel Horizon. Typical execution times:
    • Small CDM (< 10K patients): 5-15 minutes
    • Medium CDM (10K-1M patients): 15-45 minutes
    • Large CDM (> 1M patients): 30-90 minutes

Failure thresholds

The default failure threshold is 5% -- checks where more than 5% of applicable records fail are marked as "failing." This threshold should be tuned based on your data quality standards:

  • Research networks (OHDSI, PCORnet): Typically use 1-5% thresholds
  • Regulatory submissions: May require 0% tolerance for certain checks
  • Exploratory analysis: 10% may be acceptable for initial data assessment

Adjust thresholds in Admin > System Configuration > Data Quality Settings.
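The threshold rule itself is simple: a check fails when its failure rate exceeds the configured percentage. A minimal sketch, assuming raw failure and applicable-record counts per check:

```python
# Sketch: threshold-based pass/fail status for a single DQD check.
# A check fails when its failure rate exceeds the threshold (default 5%).

def check_status(failure_count, applicable_records, threshold_pct=5.0):
    if applicable_records == 0:
        return "NA"  # no applicable records, nothing to evaluate
    rate = 100.0 * failure_count / applicable_records
    return "FAIL" if rate > threshold_pct else "PASS"
```

Note that a 0% threshold (as in some regulatory contexts) means any single violating record marks the check as failing.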

DQD Results History

Parthenon stores historical DQD results for trend analysis. Navigate to Data Quality > History to view:

  • Score trend -- line chart of overall DQD score over time
  • Category trends -- individual trend lines for Conformance, Completeness, and Plausibility
  • Run comparison -- side-by-side diff of two DQD runs to see which checks improved or regressed after an ETL update

This historical view is invaluable for tracking data quality improvement over successive ETL iterations.
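The run-comparison logic amounts to diffing two runs by check name. A sketch, assuming each run is available as a mapping from check name to status:

```python
# Sketch: diff two DQD runs to find checks that regressed or improved
# after an ETL update. Run shapes (name -> 'PASS'/'FAIL') are illustrative.

def compare_runs(before, after):
    regressed = [n for n in after if before.get(n) == "PASS" and after[n] == "FAIL"]
    improved = [n for n in after if before.get(n) == "FAIL" and after[n] == "PASS"]
    return regressed, improved

before = {"plausible_gender": "PASS", "is_standard_valid_concept": "FAIL"}
after = {"plausible_gender": "FAIL", "is_standard_valid_concept": "PASS"}
regressed, improved = compare_runs(before, after)
```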

Do not ignore failing checks

Failing DQD checks can silently bias research results. For example, if 20% of condition_occurrence records have condition_concept_id = 0 (unmapped), prevalence estimates will be systematically underestimated. Always review and address failing checks before using a CDM for published research.
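The unmapped-record scenario above is easy to quantify directly. The sketch below uses an in-memory SQLite table reduced to the one relevant column; in OMOP, condition_concept_id = 0 means the source value could not be mapped to a standard concept.

```python
import sqlite3

# Sketch: measure the unmapped-record rate in condition_occurrence.
# Table reduced to the relevant column for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE condition_occurrence (condition_concept_id INTEGER)")
conn.executemany("INSERT INTO condition_occurrence VALUES (?)",
                 [(0,), (0,), (201826,), (316866,), (440383,)])

# condition_concept_id = 0 marks records that failed concept mapping.
unmapped, total = conn.execute(
    "SELECT SUM(condition_concept_id = 0), COUNT(*) "
    "FROM condition_occurrence").fetchone()
rate = 100.0 * unmapped / total
```

Any condition-prevalence estimate computed from this toy table would miss the unmapped share of records entirely, which is exactly the systematic underestimation the warning describes.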