
Incidence Rates

Incidence rate analysis estimates how frequently a clinical outcome occurs in a defined population at risk. The primary result is an incidence rate (IR) expressed as cases per 1,000 person-years (PY), along with incidence proportions, 95% confidence intervals, and stratified breakdowns. Incidence analysis is the standard approach for establishing background event rates, comparing outcome frequencies across populations, and supporting safety surveillance.


Core Concepts

Understanding four key concepts is essential for configuring an incidence rate analysis:

Target Cohort (Denominator)

The target cohort defines the population at risk. Patients contribute person-time from their cohort start date to their cohort end date (or the outcome event, whichever comes first). The target cohort must be generated before the incidence analysis can execute.

Example: "New users of metformin with T2DM and 365 days prior observation" defines who is at risk.

Outcome Cohort (Numerator)

The outcome cohort defines the event of interest. For each patient in the target cohort, the analysis counts whether (and when) they experience the outcome. Typically, the first occurrence of the outcome after the target cohort start date is counted.

Example: "Myocardial infarction (inpatient)" is the event being measured.

Time at Risk (TAR)

The time at risk window specifies which portion of each patient's time in the target cohort counts toward the analysis:

  • Start: Days relative to cohort start when risk observation begins. Commonly 0 (from day of entry) or 1 (day after entry, excluding the index day).
  • End: Days relative to cohort start or cohort end when risk observation ends. Commonly set to cohort end date for open-ended follow-up.

The TAR window determines the denominator (total person-time) for the rate calculation.

TAR Configuration | Use case
Start = 0, End = cohort end | Full follow-up from entry to exit
Start = 1, End = cohort end | Exclude the index day (common for drug safety)
Start = 0, End = 365 days after start | Fixed 1-year risk window
Start = 31, End = cohort end | Skip the first 30 days (induction period)
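
The TAR configurations above can be sketched as a per-patient calculation. This is a minimal illustration, not Parthenon's implementation: the function names, the `end_anchor` values, the 365.25-day year, the choice to cap time at risk at the cohort end date, and the day-counting convention (end date minus start date) are all assumptions made for the example.

```python
from datetime import date, timedelta

def time_at_risk_days(cohort_start, cohort_end, start_offset=0,
                      end_anchor="cohort_end", end_offset=0,
                      outcome_date=None):
    """Days at risk for one patient (hypothetical helper)."""
    tar_start = cohort_start + timedelta(days=start_offset)
    anchor = cohort_end if end_anchor == "cohort_end" else cohort_start
    # Assumption: time at risk never extends beyond the cohort end date.
    tar_end = min(anchor + timedelta(days=end_offset), cohort_end)
    # Censor follow-up at the outcome if it falls inside the window.
    if outcome_date is not None and tar_start <= outcome_date <= tar_end:
        tar_end = outcome_date
    # A window that starts after it ends contributes no person-time.
    return max((tar_end - tar_start).days, 0)

def person_years(days):
    """Convert days to person-years (assumes 365.25 days per year)."""
    return days / 365.25

# Start = 1, End = cohort end: the index day is excluded.
print(time_at_risk_days(date(2020, 1, 1), date(2020, 12, 31), start_offset=1))  # 364

# Same window, censored at an outcome on 2020-03-01.
print(time_at_risk_days(date(2020, 1, 1), date(2020, 12, 31), start_offset=1,
                        outcome_date=date(2020, 3, 1)))  # 59
```

Summing `person_years(...)` over all patients in the target cohort gives the denominator of the rate.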

Clean Window (Washout)

The clean window excludes patients who already have the outcome in the N days before their target cohort start date. This ensures you are measuring incident (new) events rather than prevalent (pre-existing) conditions.

Example: A clean window of 365 days for MI means patients with any MI diagnosis in the year before entering the target cohort are excluded from the analysis.
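
As a rough sketch, the clean window check for a single patient might look like the following. The function name and the exact boundary handling (window includes the day N days before cohort start and excludes the cohort start date itself) are assumptions for illustration.

```python
from datetime import date, timedelta

def passes_clean_window(cohort_start, prior_outcome_dates, clean_window_days):
    """True if no outcome occurs in the clean window before cohort start.

    Hypothetical helper: the window is [cohort_start - N days, cohort_start).
    A clean window of 0 yields an empty window, so every patient passes.
    """
    window_start = cohort_start - timedelta(days=clean_window_days)
    return not any(window_start <= d < cohort_start for d in prior_outcome_dates)

entry = date(2020, 1, 1)
print(passes_clean_window(entry, [date(2019, 6, 1)], 365))  # False: MI within the prior year
print(passes_clean_window(entry, [date(2018, 6, 1)], 365))  # True: MI older than the window
```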


Creating an Incidence Rate Analysis

  1. Navigate to Analyses and select the Incidence Rates tab.
  2. Click New Analysis and select Incidence Rate.
  3. Configure the analysis on the design page.

Design Configuration

Target Cohorts:

  • Add one or more target cohorts. Multiple targets allow you to compare rates across different populations in a single analysis (e.g., metformin users vs. sulfonylurea users).
  • Target cohorts must already be generated on the source you plan to analyze.

Outcome Cohorts:

  • Add one or more outcome cohorts. Each target-outcome pair produces an independent rate estimate.
  • Multiple outcomes allow you to assess several safety or effectiveness endpoints simultaneously.

Time at Risk:

  • Start anchor: cohort start date
  • Start offset: Days after anchor (commonly 0 or 1)
  • End anchor: cohort start date or cohort end date
  • End offset: Days after anchor (commonly 0 when using cohort end date)

Clean Window:

  • Days before cohort start during which the outcome must not be present
  • Set to 0 to include all patients (no washout)

Stratification:

  • By gender: Separate rates for male/female
  • By age: Separate rates for configurable age bands
  • By calendar year: Separate rates for each calendar year of cohort entry
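
To illustrate how configurable age bands partition patients, here is a small sketch. The band boundaries (18, 35, 50, 65) and the label format are assumptions chosen for the example; the actual configuration is set in the design page.

```python
from bisect import bisect_right

def age_band(age, lower_bounds=(18, 35, 50, 65)):
    """Map an age in years to a band label (hypothetical helper).

    lower_bounds lists the first age of each band in ascending order.
    """
    idx = bisect_right(lower_bounds, age)
    if idx == 0:
        return f"<{lower_bounds[0]}"
    if idx == len(lower_bounds):
        return f"{lower_bounds[-1]}+"
    return f"{lower_bounds[idx - 1]}-{lower_bounds[idx] - 1}"

print(age_band(34))  # 18-34
print(age_band(35))  # 35-49
print(age_band(70))  # 65+
```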

Privacy:

  • Minimum cell count: Suppress results where the case count is below this threshold (default: 5)
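
The effect of the minimum cell count can be pictured with a short sketch. The row layout and field names are hypothetical, and masking the rate together with the case count is an assumption made here so that the count cannot be back-calculated; the sketch follows the rule as stated (any count below the threshold is suppressed).

```python
def suppress_small_cells(rows, min_cell_count=5):
    """Mask rows whose case count is below the threshold (hypothetical helper)."""
    masked = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left unchanged
        if row["cases"] < min_cell_count:
            row["cases"] = None
            row["ir_per_1000_py"] = None  # assumption: rate is masked as well
        masked.append(row)
    return masked

rows = [
    {"stratum": "A", "cases": 3, "ir_per_1000_py": 0.9},
    {"stratum": "B", "cases": 23, "ir_per_1000_py": 1.26},
]
for row in suppress_small_cells(rows):
    print(row)
```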

Reading the Results

After execution, the results table shows one row per target-outcome combination:

Column | Description
Target Cohort | Name of the at-risk population
Outcome Cohort | Name of the event being measured
Persons at Risk | Unique patients in the target cohort after clean window exclusions
Person-Time (PY) | Total time at risk in person-years
Cases | Number of outcome events observed
IR per 1,000 PY | Incidence rate with 95% confidence interval
IP (proportion) | Incidence proportion: cases / persons at risk

Stratified Results

When stratification is enabled, expandable sub-tables show rates broken down by the stratification variable:

Age-stratified example:

Age Band | Persons | Person-Years | Cases | IR per 1,000 PY (95% CI)
18--34 | 12,450 | 18,230 | 23 | 1.26 (0.80--1.89)
35--49 | 28,310 | 42,100 | 112 | 2.66 (2.19--3.20)
50--64 | 45,200 | 61,300 | 389 | 6.35 (5.73--7.01)
65+ | 31,040 | 38,500 | 542 | 14.08 (12.91--15.33)
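
The point estimates in the age-stratified example follow directly from cases divided by person-years. The sketch below recomputes them from the person-years and case counts in the table; the variable and function names are illustrative only.

```python
# (age band, person-years, cases) taken from the age-stratified example
strata = [
    ("18-34", 18_230, 23),
    ("35-49", 42_100, 112),
    ("50-64", 61_300, 389),
    ("65+", 38_500, 542),
]

def ir_per_1000_py(cases, person_years):
    """Incidence rate expressed as cases per 1,000 person-years."""
    return 1000 * cases / person_years

for band, py, cases in strata:
    print(band, round(ir_per_1000_py(cases, py), 2))
# 18-34 1.26
# 35-49 2.66
# 50-64 6.35
# 65+ 14.08

# The overall rate pools cases and person-time; it is not the mean of the strata.
total_cases = sum(cases for _, _, cases in strata)
total_py = sum(py for _, py, _ in strata)
print(round(ir_per_1000_py(total_cases, total_py), 2))  # 6.66
```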

Interpreting Results

Incidence Rate vs. Incidence Proportion

Metric | Formula | When to use
Incidence Rate | Cases / Person-Years | When follow-up time varies across patients; standard for pharmacoepi
Incidence Proportion | Cases / Persons at Risk | When all patients have similar follow-up; simpler to interpret

The incidence rate accounts for differential follow-up (patients who exit early contribute less person-time), making it the preferred metric for observational studies where follow-up is not uniform.
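
A toy example shows why the two metrics can diverge. The patient records below are invented for illustration, and `incidence_metrics` is a hypothetical helper: both cohorts have the same incidence proportion, but very different rates because follow-up differs.

```python
def incidence_metrics(patients):
    """patients: iterable of (years_at_risk, had_outcome) tuples."""
    patients = list(patients)
    cases = sum(1 for _, had_outcome in patients if had_outcome)
    person_years = sum(years for years, _ in patients)
    return {
        "ip": cases / len(patients),
        "ir_per_1000_py": 1000 * cases / person_years,
    }

# Four patients followed for 1 year each, one event.
short_follow_up = [(1.0, True), (1.0, False), (1.0, False), (1.0, False)]
# Four patients followed for 5 years each, one event.
long_follow_up = [(5.0, True), (5.0, False), (5.0, False), (5.0, False)]

print(incidence_metrics(short_follow_up))  # {'ip': 0.25, 'ir_per_1000_py': 250.0}
print(incidence_metrics(long_follow_up))   # {'ip': 0.25, 'ir_per_1000_py': 50.0}
```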

Confidence Intervals

Parthenon computes exact Poisson 95% confidence intervals for incidence rates. Narrow confidence intervals indicate precise estimates (large sample size and/or long follow-up). Wide intervals suggest insufficient data for reliable estimation.
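
For readers who want to check an interval by hand, the sketch below computes an exact (Garwood-type) Poisson interval using only the standard library, by numerically inverting the Poisson CDF. It is an illustration under the assumption that "exact" refers to this standard construction; it is not taken from Parthenon's code, and all function names are hypothetical.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if mu == 0:
        return 1.0
    log_mu = math.log(mu)
    terms = (math.exp(i * log_mu - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, math.fsum(terms))

def _solve_decreasing(f, target, lo, hi, iters=100):
    """Bisection for f(x) = target, where f is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(cases, alpha=0.05):
    """Exact two-sided CI for the Poisson mean, given an observed count."""
    if cases == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= cases | mu) = alpha / 2
        lower = _solve_decreasing(lambda mu: poisson_cdf(cases - 1, mu),
                                  1 - alpha / 2, 0.0, float(cases))
    # Upper limit: P(X <= cases | mu) = alpha / 2; expand the bracket as needed
    hi = cases + 1.0
    while poisson_cdf(cases, hi) > alpha / 2:
        hi *= 2
    upper = _solve_decreasing(lambda mu: poisson_cdf(cases, mu),
                              alpha / 2, float(cases), hi)
    return lower, upper

def ir_ci_per_1000_py(cases, person_years, alpha=0.05):
    """Scale the count limits to a rate per 1,000 person-years."""
    lower, upper = exact_poisson_ci(cases, alpha)
    return 1000 * lower / person_years, 1000 * upper / person_years

low, high = ir_ci_per_1000_py(23, 18_230)
print(f"{low:.2f} to {high:.2f}")
```

With 23 cases over 18,230 person-years, this should reproduce the interval reported for the 18--34 age band in the stratified example (0.80 to 1.89 per 1,000 PY).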

Common Applications

  • Background rates: Establishing the natural frequency of an outcome in a population (e.g., MI rate among T2DM patients)
  • Comparative rates: Comparing event rates between populations (e.g., metformin vs. sulfonylurea users)
  • Temporal trends: Tracking how rates change over calendar time (annual stratification)
  • Subgroup analysis: Identifying populations with elevated risk (age, gender stratification)
  • Safety surveillance: Monitoring adverse event rates in post-marketing drug safety studies
  • Sample size planning: Using observed rates to calculate required sample sizes for clinical trials

Observational data limitations

Incidence rates from claims or EHR data reflect diagnosed and recorded events, not true population incidence. Under-coding, care-seeking behavior, database coverage, and coding practices all affect rate estimates. Rates from administrative data are often lower than rates observed in clinical trials, where outcomes are actively ascertained. Always report the data source type and known limitations alongside the rates.


Multiple Target-Outcome Pairs

Parthenon's incidence analysis can evaluate multiple targets and multiple outcomes in a single analysis:

Targets: [Metformin users, Sulfonylurea users, DPP-4i users]
Outcomes: [MI, Stroke, Heart failure, Hypoglycemia]

This produces a 3 x 4 matrix of 12 incidence rate estimates, enabling comprehensive comparative safety profiling across drug classes in one execution.
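
The pairing is a simple Cartesian product of targets and outcomes, which the following sketch enumerates using the cohort names from the example:

```python
from itertools import product

targets = ["Metformin users", "Sulfonylurea users", "DPP-4i users"]
outcomes = ["MI", "Stroke", "Heart failure", "Hypoglycemia"]

# Every target is paired with every outcome; each pair gets its own estimate.
pairs = list(product(targets, outcomes))
print(len(pairs))  # 12
for target, outcome in pairs[:4]:
    print(f"{target} x {outcome}")
```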

Negative control outcomes

Include outcomes that have no plausible causal relationship with the exposure (e.g., appendicitis as a negative control for a diabetes drug). If the rate of a negative control outcome differs substantially between target cohorts, or from its expected background rate, it suggests residual confounding or systematic bias in the analysis design.


Exporting Results

Click Export CSV to download the complete results table, including all stratified breakdowns. The export includes:

  • Target and outcome cohort identifiers and names
  • Person counts, person-time, case counts
  • Rate estimates with confidence intervals
  • Stratification variables and values

The export format is suitable for inclusion in regulatory submissions, study reports, and journal publications.


Best Practices

  1. Set appropriate clean windows: A 365-day clean window is standard for chronic conditions. For acute events (e.g., fracture), a shorter window (90--180 days) may be more appropriate.

  2. Validate against published rates: Compare your computed rates against published epidemiologic literature. Major discrepancies suggest issues with cohort definition, outcome ascertainment, or data quality.

  3. Always report person-time: Incidence proportions without person-time context can be misleading. A 5% incidence over 1 year is very different from 5% over 10 years.

  4. Use age-gender stratification by default: Age and gender are among the strongest predictors of most clinical outcomes. Stratified rates reveal patterns that overall rates mask.

  5. Consider competing risks: In elderly populations, death is a competing risk that prevents outcome observation. Report mortality rates alongside outcome rates when relevant.