Concept Mapping

Concept mapping (also called code mapping or vocabulary mapping) translates the source codes in your data -- ICD-10-CM diagnosis codes, NDC drug codes, LOINC lab codes, CPT procedure codes, and other terminologies -- into OMOP standard concept IDs. This step makes your CDM data queryable through the standardized OMOP vocabulary and enables cross-database analyses across institutions that use different source coding systems.

Why Concept Mapping Matters

Source data typically uses local or domain-specific terminologies that vary by institution, vendor, and country:

Source Terminology	Domain	Example Code	Example Meaning
ICD-10-CM	Conditions	`E11.9`	Type 2 diabetes without complications
NDC	Drugs	`00002-7510-01`	Insulin lispro (Humalog) 100 units/mL vial
CPT-4	Procedures	`99213`	Office visit, established patient
LOINC	Measurements	`2339-0`	Glucose [Mass/volume] in Blood
SNOMED CT	Conditions	`73211009`	Diabetes mellitus

OMOP CDM stores the original source codes in `*_source_value` and `*_source_concept_id` columns for full traceability, but all analytical queries use standardized concept IDs in the `*_concept_id` columns:

SNOMED CT for conditions
RxNorm for drugs
LOINC for measurements
SNOMED CT for procedures (CPT4 and HCPCS procedure concepts are also standard in the OMOP vocabulary)

The concept mapping step populates these standardized concept ID columns correctly, enabling your data to participate in network studies and standardized analytics.
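As a minimal sketch of why the standardized columns matter, the snippet below builds a toy `condition_occurrence` table in an in-memory SQLite database. The two rows come from hypothetical sites that code the same diagnosis differently. The concept IDs and the local code `LOCAL-DM2` are placeholders invented for illustration, not real vocabulary entries.

```python
import sqlite3

# Placeholder IDs for illustration only; look up real IDs in your OMOP vocabulary.
STANDARD_T2DM = 9000002  # hypothetical standard (SNOMED CT) concept
ICD10CM_E11_9 = 9000001  # hypothetical source concept for ICD-10-CM E11.9

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE condition_occurrence (
        person_id                   INTEGER,
        condition_concept_id        INTEGER,  -- standard concept used by analyses
        condition_source_value      TEXT,     -- raw code as received
        condition_source_concept_id INTEGER   -- concept for the source code, 0 if none
    )
    """
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (1, STANDARD_T2DM, "E11.9", ICD10CM_E11_9),  # site A: ICD-10-CM
        (2, STANDARD_T2DM, "LOCAL-DM2", 0),          # site B: made-up local code
    ],
)


def persons_with_concept(connection, concept_id):
    """Return person IDs whose condition carries the given standard concept."""
    cursor = connection.execute(
        "SELECT person_id FROM condition_occurrence "
        "WHERE condition_concept_id = ? ORDER BY person_id",
        (concept_id,),
    )
    return [row[0] for row in cursor]


# One query on the standard concept finds both patients, whatever the source coding.
print(persons_with_concept(conn, STANDARD_T2DM))
```

A query on `condition_source_value` would instead need to know every site's coding scheme, which is the problem the standard concept column avoids.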

Automatic Mapping

Parthenon leverages the OMOP vocabulary's concept_relationship table to automatically map source codes to standard concepts:

1. Navigate to the mapping results for your upload batch (after schema mapping is complete).
2. Click the Concept Mapping tab.
3. Click Auto-Map to start the automatic mapping process.
4. Parthenon queries the vocabulary for `Maps to` relationships from each source code.
5. The system categorizes results:
   - Mapped -- single unambiguous standard concept found; auto-applied
   - Review -- multiple possible standard concepts found; requires manual selection
   - Unmapped -- no `Maps to` relationship exists in the vocabulary
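The lookup and categorization described above can be approximated as follows. This is a sketch under stated assumptions, not Parthenon's actual implementation: it uses a cut-down copy of the OMOP `concept` and `concept_relationship` tables in SQLite, the function name `auto_map` is hypothetical, and every concept ID as well as the codes `SRC-B` and `SRC-C` are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id       INTEGER PRIMARY KEY,
        vocabulary_id    TEXT,
        concept_code     TEXT,
        standard_concept TEXT      -- 'S' marks a standard concept
    );
    CREATE TABLE concept_relationship (
        concept_id_1    INTEGER,
        concept_id_2    INTEGER,
        relationship_id TEXT,
        invalid_reason  TEXT       -- NULL while the relationship is valid
    );
    -- Toy rows: all IDs and the codes SRC-B / SRC-C are invented.
    INSERT INTO concept VALUES
        (1,   'ICD10CM', 'E11.9', NULL),
        (2,   'ICD10CM', 'SRC-B', NULL),
        (3,   'ICD10CM', 'SRC-C', NULL),
        (101, 'SNOMED',  'T-101', 'S'),
        (102, 'SNOMED',  'T-102', 'S'),
        (103, 'SNOMED',  'T-103', 'S');
    INSERT INTO concept_relationship VALUES
        (1, 101, 'Maps to', NULL),
        (2, 102, 'Maps to', NULL),
        (2, 103, 'Maps to', NULL);
    """
)


def auto_map(connection, vocabulary_id, source_code):
    """Return (status, target_concept_ids) for one source code."""
    cursor = connection.execute(
        """
        SELECT cr.concept_id_2
        FROM concept AS src
        JOIN concept_relationship AS cr ON cr.concept_id_1 = src.concept_id
        JOIN concept AS tgt ON tgt.concept_id = cr.concept_id_2
        WHERE src.vocabulary_id = ?
          AND src.concept_code = ?
          AND cr.relationship_id = 'Maps to'
          AND cr.invalid_reason IS NULL
          AND tgt.standard_concept = 'S'
        ORDER BY cr.concept_id_2
        """,
        (vocabulary_id, source_code),
    )
    targets = [row[0] for row in cursor]
    if len(targets) == 1:
        return "Mapped", targets      # single unambiguous target
    if len(targets) > 1:
        return "Review", targets      # several candidates, needs a human
    return "Unmapped", []             # no 'Maps to' relationship found


for code in ("E11.9", "SRC-B", "SRC-C", "NOT-IN-VOCAB"):
    print(code, auto_map(conn, "ICD10CM", code))
```

Codes that are absent from the vocabulary fall into the same Unmapped bucket as codes that exist but have no `Maps to` relationship, because both return zero targets.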

Auto-mapping coverage

For well-coded claims data using standard terminologies (ICD-10-CM, NDC, CPT-4), auto-mapping typically achieves 85-95% coverage. EHR data with local codes or free-text descriptions will have lower auto-mapping rates and require more manual review.

Manual Mapping Review

The mapping review table provides a comprehensive view of all source codes and their mapping status:

Column	Description
Source Value	The raw code from your data (e.g., `E11.9`)
Source Vocabulary	Detected vocabulary (ICD10CM, NDC, CPT4, LOINC, HCPCS, etc.)
Frequency	Number of rows in your data containing this source code
Auto-Mapped To	Standard concept ID and name from auto-mapping (if found)
Confidence	Mapping confidence score (High / Medium / Low)
Status	Mapped / Unmapped / Review Needed

Resolving Unmapped Codes

For codes with no automatic mapping or those flagged for review:

1. Click the search icon next to the unmapped code to open the concept search dialog.
2. Search by code, name, or keyword in the OMOP vocabulary.
3. Review candidate standard concepts with their domain, vocabulary, and validity dates.
4. Click Accept to apply the mapping.
5. Optionally check Apply to all to map all occurrences of this source code across all batches.

Bulk Review Workflow

For large mapping efforts, use the bulk review tools:

Sort by frequency -- map high-frequency codes first for maximum impact
Filter by status -- show only unmapped or review-needed codes
Filter by domain -- focus on one clinical domain at a time
Batch accept -- accept all high-confidence auto-mappings in one click
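The prioritization logic behind these tools is simple enough to sketch in a few lines. The `MappingRow` class and both function names are hypothetical; the field values mirror the columns of the mapping review table, and the sample rows are made up.

```python
from dataclasses import dataclass


@dataclass
class MappingRow:
    source_value: str
    domain: str
    frequency: int      # rows in the data that carry this code
    status: str         # "Mapped", "Unmapped", or "Review Needed"
    confidence: str     # "High", "Medium", or "Low"
    accepted: bool = False


def review_queue(rows, domain=None):
    """Codes still needing attention, most frequent first, optionally one domain."""
    pending = [
        r for r in rows
        if r.status in ("Unmapped", "Review Needed")
        and (domain is None or r.domain == domain)
    ]
    return sorted(pending, key=lambda r: r.frequency, reverse=True)


def batch_accept(rows):
    """Accept every high-confidence auto-mapping; return how many were accepted."""
    accepted = 0
    for r in rows:
        if r.status == "Mapped" and r.confidence == "High" and not r.accepted:
            r.accepted = True
            accepted += 1
    return accepted


rows = [
    MappingRow("E11.9", "Condition", 900, "Mapped", "High"),
    MappingRow("LOCAL-1", "Condition", 40, "Unmapped", "Low"),
    MappingRow("LOCAL-2", "Drug", 250, "Review Needed", "Medium"),
    MappingRow("LOCAL-3", "Condition", 120, "Unmapped", "Low"),
]

print([r.source_value for r in review_queue(rows)])
print([r.source_value for r in review_queue(rows, domain="Condition")])
print(batch_accept(rows))
```

Working the queue from the top means each manual decision covers the largest possible number of data rows.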

Unmappable Codes

Some source codes cannot be mapped to a standard OMOP concept. Common reasons:

Custom local codes -- institution-specific codes not in any standard vocabulary
Vocabulary gaps -- the OMOP vocabulary does not yet include this terminology
Malformed codes -- truncated, invalid, or data entry errors
Deprecated codes -- codes that have been retired from their source vocabulary

For unmappable codes, set the standard concept ID (`*_concept_id`) to 0 -- the OMOP convention meaning "no standard concept available." The original code is still stored in the `*_source_value` column and can be queried directly when needed. If the source code exists in the vocabulary, populate the source concept ID (`*_source_concept_id`) as well, even when no standard mapping exists.
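A small sketch of this convention, assuming two hypothetical lookup dictionaries (`source_concepts` keyed by source code and `maps_to` keyed by source concept ID) with invented IDs:

```python
NO_MATCHING_CONCEPT = 0  # OMOP convention for "no concept available"


def resolve_ids(source_code, source_concepts, maps_to):
    """Fill the three mapping columns for one source code.

    source_concepts: source code -> source concept ID (codes known to the vocabulary)
    maps_to:         source concept ID -> standard concept ID
    """
    source_concept_id = source_concepts.get(source_code, NO_MATCHING_CONCEPT)
    concept_id = maps_to.get(source_concept_id, NO_MATCHING_CONCEPT)
    return {
        "source_value": source_code,             # always preserved verbatim
        "source_concept_id": source_concept_id,  # set whenever the code is in the vocabulary
        "concept_id": concept_id,                # 0 when no standard mapping exists
    }


# Invented IDs for illustration.
source_concepts = {"E11.9": 1, "RETIRED-1": 2}
maps_to = {1: 101}

print(resolve_ids("E11.9", source_concepts, maps_to))      # fully mapped
print(resolve_ids("RETIRED-1", source_concepts, maps_to))  # known code, no standard concept
print(resolve_ids("LOCAL-9", source_concepts, maps_to))    # unknown local code
```

The middle case is the one the paragraph above calls out: the source concept ID is kept even though the standard concept ID is 0.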

Concept ID 0 impact

Records mapped to concept ID 0 are excluded from most standard OHDSI analyses (incidence rates, characterization, cohort definitions) because these analyses filter on standard concepts. Track your concept 0 rate as a key data quality metric -- high rates indicate vocabulary gaps that may bias your analyses.

Exporting and Importing Mapping Tables

Concept mapping tables can be exported and shared across Parthenon instances:

Export

1. Navigate to the completed concept mapping for a batch.
2. Click Export Mappings to download a CSV file containing:
   - Source vocabulary, source code, source name
   - Target standard concept ID, concept name, domain
   - Mapping status and reviewer notes
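To give a feel for such a file, the sketch below writes one row with Python's `csv` module. The column names are assumptions derived from the field list above; the actual header names in a Parthenon export may differ, and the concept ID and concept name are placeholders.

```python
import csv
import io

# Assumed column names; check a real export for the exact headers.
FIELDNAMES = [
    "source_vocabulary", "source_code", "source_name",
    "target_concept_id", "target_concept_name", "target_domain",
    "mapping_status", "reviewer_notes",
]


def export_mappings(rows):
    """Serialize mapping rows (dicts keyed by FIELDNAMES) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = export_mappings([
    {
        "source_vocabulary": "ICD10CM",
        "source_code": "E11.9",
        "source_name": "Type 2 diabetes without complications",
        "target_concept_id": 101,  # placeholder, not a real concept ID
        "target_concept_name": "Example standard concept",
        "target_domain": "Condition",
        "mapping_status": "Mapped",
        "reviewer_notes": "",
    }
])
print(csv_text)
```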

Import

1. Navigate to Data Ingestion > Concept Mappings.
2. Click Import Mapping File.
3. Upload a previously exported CSV mapping file.
4. Parthenon validates the target concept IDs against the current vocabulary and flags any concepts that have been deprecated or invalidated since the export.

This enables a "build once, reuse everywhere" approach for multi-site studies using the same source system or data vendor.
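The validation performed on import can be pictured roughly as below. This is an illustrative sketch, not Parthenon's code: `validate_targets` is a hypothetical name, the CSV headers are assumed, and the vocabulary is reduced to a dictionary from concept ID to the OMOP `invalid_reason` value (`None` while the concept is valid).

```python
import csv
import io


def validate_targets(csv_text, vocabulary):
    """Return (source_code, target_concept_id, problem) for each flagged row.

    vocabulary: concept ID -> invalid_reason (None means the concept is still valid)
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        concept_id = int(row["target_concept_id"])
        if concept_id not in vocabulary:
            flagged.append((row["source_code"], concept_id, "not in current vocabulary"))
        elif vocabulary[concept_id] is not None:
            flagged.append((row["source_code"], concept_id, "deprecated or invalidated"))
    return flagged


# Toy mapping file and vocabulary; all codes and IDs are invented.
mapping_csv = (
    "source_code,target_concept_id\n"
    "E11.9,101\n"
    "SRC-B,102\n"
    "SRC-C,999\n"
)
current_vocabulary = {101: None, 102: "D"}  # 102 was deprecated after the export

for problem in validate_targets(mapping_csv, current_vocabulary):
    print(problem)
```

Rows that pass the check can be applied as-is; flagged rows go back into the manual review queue.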

Integration with Usagi

For large-scale manual concept mapping projects, Parthenon integrates with OHDSI Usagi -- a standalone Java tool that provides NLP-assisted concept suggestions based on term similarity:

1. Export your unmapped codes from Parthenon as a Usagi-compatible CSV.
2. Open the export in Usagi and use its similarity-based suggestions to find appropriate standard concepts.
3. Review and approve mappings in Usagi's interface.
4. Export the completed Usagi mapping file.
5. Import the Usagi output back into Parthenon to apply all mappings at once.

Vocabulary updates

When you update your OMOP vocabulary (see Chapter 25 -- System Configuration), re-run auto-mapping on existing unmapped codes. New vocabulary releases often add mappings for previously unmappable codes, especially for newer ICD-10-CM codes and drug products.

Mapping Quality Metrics

After concept mapping is complete, the summary dashboard shows:

Metric	Description	Target
Overall mapping rate	% of source codes with a standard concept	> 90%
Frequency-weighted rate	% of data rows with a mapped concept	> 95%
Concept 0 rate	% of rows mapped to concept ID 0	< 5%
Review pending	Codes still awaiting manual review	0
Domain coverage	Mapping rates per clinical domain	Varies

These metrics help assess whether your ETL is producing analysis-ready data or requires additional vocabulary work.
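The difference between the overall and the frequency-weighted rate is easiest to see with numbers. The sketch below computes the first three metrics from a list of `(source_code, frequency, concept_id)` tuples; the function name and the sample data are invented for illustration.

```python
def mapping_metrics(codes):
    """codes: iterable of (source_code, frequency, concept_id) tuples."""
    codes = list(codes)
    total_codes = len(codes)
    total_rows = sum(freq for _, freq, _ in codes)
    if total_codes == 0 or total_rows == 0:
        return {"overall_mapping_rate": 0.0, "frequency_weighted_rate": 0.0, "concept_0_rate": 0.0}

    mapped_codes = sum(1 for _, _, cid in codes if cid != 0)
    mapped_rows = sum(freq for _, freq, cid in codes if cid != 0)
    return {
        # share of distinct source codes that have a standard concept
        "overall_mapping_rate": 100 * mapped_codes / total_codes,
        # share of data rows that carry a standard concept
        "frequency_weighted_rate": 100 * mapped_rows / total_rows,
        # share of data rows left at concept ID 0
        "concept_0_rate": 100 * (total_rows - mapped_rows) / total_rows,
    }


# Four distinct codes covering 1,000 rows; concept IDs are placeholders.
sample = [
    ("E11.9", 900, 101),
    ("99213", 60, 102),
    ("LOCAL-X", 30, 0),   # unmappable local code
    ("SRC-D", 10, 103),
]
print(mapping_metrics(sample))
```

In this sample three of four codes are mapped (75%, below the 90% target), yet 970 of 1,000 rows are mapped (97%, above the 95% target) and the concept 0 rate is 3%. A few rare unmapped codes pull the overall rate down much more than they affect the data rows available for analysis.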
