Methodology

This page provides detailed technical documentation of the statistical methods, calculations, and data processing techniques used throughout Mortality Watch. Understanding these methodologies is essential for correctly interpreting the visualizations and analyses.

Mortality Watch is fully open source on GitHub. All statistical methods, baseline calculations, and data processing pipelines are implemented in R and available for review, replication, and contribution.

Mortality Metrics

Deaths (Raw Counts)

The absolute number of deaths reported for a given time period and population. This is the most direct measure but doesn't account for population size or age structure, making comparisons between regions difficult.

Deaths = Total number of deaths in period

CMR (Crude Mortality Rate)

Deaths per 100,000 population. This standardizes mortality by population size, allowing comparison between differently-sized populations. However, it doesn't account for differences in age structure.

CMR = (Deaths / Population) × 100,000

Example: 1,000 deaths in a population of 500,000 = 200 deaths per 100,000 population
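As an illustration only (the site's own pipeline is written in R), the CMR formula can be sketched in Python. The function name and the `per` parameter are hypothetical and chosen for this example.

```python
def crude_mortality_rate(deaths: float, population: float, per: int = 100_000) -> float:
    """Return deaths per `per` people (default: per 100,000)."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return deaths * per / population


# The worked example from the text: 1,000 deaths among 500,000 people.
print(crude_mortality_rate(1_000, 500_000))  # 200.0
```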

ASMR (Age-Standardized Mortality Rate)

A mortality rate adjusted for differences in age structure between populations. This is crucial because mortality rates increase dramatically with age. Without age-standardization, comparisons between countries with different age structures (e.g., Japan vs. Nigeria) would be meaningless.

ASMR is a weighted average of age-specific mortality rates, where the weights are taken from a "standard population" (see Standard Populations section below).

ASMR = Σ(Age-specific rate × Standard population weight)

We support the WHO2015, ESP2013, US2000, and country-specific 2020 standard populations.
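A minimal Python sketch of the weighted average, not the project's R code. The three age groups, rates, and weights are invented for illustration and are not taken from any real standard population. The function normalizes the weights so they sum to 1.

```python
def asmr(age_rates: dict, std_weights: dict) -> float:
    """Weighted average of age-specific rates using standard population weights."""
    if age_rates.keys() != std_weights.keys():
        raise ValueError("rates and weights must cover the same age groups")
    total_weight = sum(std_weights.values())
    # Normalize so the result is a true weighted average even if the
    # weights are given as counts rather than shares.
    return sum(age_rates[g] * std_weights[g] / total_weight for g in age_rates)


# Illustrative values only: rates per 100,000 and made-up weights.
rates = {"0-64": 200.0, "65-79": 2_000.0, "80+": 10_000.0}
weights = {"0-64": 0.85, "65-79": 0.12, "80+": 0.03}
print(round(asmr(rates, weights), 1))  # 710.0
```

Because the weights come from the standard rather than from the population itself, two populations with identical age-specific rates get the same ASMR regardless of how old each population is.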

ASD (Age-Standardized Deaths)

Age-Standardized Deaths (ASD) is an alternative approach to account for population aging when calculating excess mortality. Unlike ASMR, which standardizes rates, ASD works directly with death counts by applying baseline mortality rates to the current population age structure.

The ASD method (based on Levitt et al.) calculates expected deaths in three steps:

  1. Calculate baseline mortality rates per age group during the reference period
  2. Apply these baseline rates to the current population age structure
  3. Sum across all age groups to get total expected deaths

For each age group a:
baseline_rate[a] = mean(deaths[a] / population[a]) during baseline period
Expected Deaths = Σ(baseline_rate[a] × current_population[a])
Excess ASD = Observed Deaths - Expected Deaths
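The three steps can be sketched in Python as follows. This is an illustrative reading of the formulas above, not the project's R implementation. The data layout (one dictionary per baseline year, mapping age group to a deaths/population pair) and all numbers are assumptions made for the example.

```python
from statistics import mean


def baseline_rates(baseline_years: list) -> dict:
    """Step 1: mean of deaths/population per age group over the baseline years."""
    groups = baseline_years[0].keys()
    return {g: mean(year[g][0] / year[g][1] for year in baseline_years) for g in groups}


def expected_deaths(rates: dict, current_population: dict) -> float:
    """Steps 2 and 3: apply baseline rates to the current age structure and sum."""
    return sum(rates[g] * current_population[g] for g in rates)


def excess_asd(observed_deaths: float, rates: dict, current_population: dict) -> float:
    """Observed deaths minus the age-adjusted expectation."""
    return observed_deaths - expected_deaths(rates, current_population)


# Invented data: three identical baseline years, then an older current population.
year = {"0-64": (1_000, 1_000_000), "65+": (4_000, 200_000)}
rates = baseline_rates([year, year, year])
current = {"0-64": 980_000, "65+": 230_000}
print(round(expected_deaths(rates, current)))    # 5580
print(round(excess_asd(6_000, rates, current)))  # 420
```

In this example the baseline years each saw 5,000 deaths, but aging alone raises the expectation to 5,580, so only 420 of the 1,000 additional deaths count as excess.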

When to use ASD vs ASMR

  • ASMR answers: "If this population had the same age structure as the standard, what would the mortality rate be?"
  • ASD answers: "Given how this population has aged, how many deaths would we expect if mortality rates stayed at baseline levels?"

ASD is particularly useful for understanding excess deaths in aging populations, as it separates the effect of population aging from changes in mortality rates.

Life Expectancy

Life Expectancy (LE) is the average number of remaining years of life for a hypothetical cohort that experiences the mortality rates of the year of interest for the rest of its life. We provide both period LE at birth (all ages) and remaining LE at specific ages.

LE is calculated using Chiang's abridged life table methodology.

Method: Chiang's abridged life table
ₙaₓ (average years lived in the interval by those who die in it):
• Age 0: Coale-Demeny coefficients (mortality-dependent)
• Ages 1-4: 1.5 years
• Ages 5+: n/2 (midpoint assumption)
• Open-ended (85+): 1/Mₓ (Keyfitz)
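A compact Python sketch of an abridged life table using the ₙaₓ rules listed above. It is an illustration under stated assumptions, not the project's R code: the age-0 coefficients are the Coale-Demeny values as commonly tabulated in demography textbooks, the conversion from rates to probabilities is the standard qₓ = n·Mₓ / (1 + (n − ₙaₓ)·Mₓ), and the example death rates are invented.

```python
def coale_demeny_a0(m0: float, sex: str = "male") -> float:
    """Average years lived by infants who die before age 1 (assumed coefficients)."""
    if sex == "male":
        return 0.330 if m0 >= 0.107 else 0.045 + 2.684 * m0
    return 0.350 if m0 >= 0.107 else 0.053 + 2.800 * m0


def life_expectancy(ages: list, mx: list, sex: str = "male") -> dict:
    """Map each interval's starting age to remaining life expectancy.

    `ages` are interval start ages (the last interval is open-ended) and
    `mx` are deaths per person-year in each interval.
    """
    if not ages or len(ages) != len(mx):
        raise ValueError("ages and mx must be non-empty and equally long")
    lx = 1.0  # survivors at the start of the interval (radix 1)
    survivors, person_years = [], []
    last = len(ages) - 1
    for i, (x, m) in enumerate(zip(ages, mx)):
        survivors.append(lx)
        if i == last:
            person_years.append(lx / m)  # open interval: a = 1/M, everyone dies
            break
        n = ages[i + 1] - x
        if x == 0 and n == 1:
            a = coale_demeny_a0(m, sex)
        elif x == 1 and n == 4:
            a = 1.5
        else:
            a = n / 2  # midpoint assumption
        q = min(1.0, n * m / (1 + (n - a) * m))  # probability of dying in interval
        deaths = lx * q
        person_years.append(n * (lx - deaths) + a * deaths)
        lx -= deaths
    ex, total = {}, 0.0
    for x, l, big_l in zip(reversed(ages), reversed(survivors), reversed(person_years)):
        total += big_l  # person-years lived above age x
        ex[x] = total / l
    return ex


# Invented death rates for illustration only.
ages = [0, 1, 5, 15, 25, 35, 45, 55, 65, 75, 85]
mx = [0.004, 0.0002, 0.0001, 0.0005, 0.001, 0.0015, 0.003, 0.007, 0.017, 0.045, 0.15]
le = life_expectancy(ages, mx)
print(round(le[0], 1), round(le[85], 1))
```

The same table yields both LE at birth (`le[0]`) and remaining LE at any interval start, which is why a single calculation can serve both displays.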

Sub-yearly Life Expectancy: Raw vs Seasonally Adjusted

For weekly, monthly, and quarterly data, Pro users can toggle between two display modes:

  • Raw Values: The directly calculated LE values. These show apparent "seasonal" variation (e.g., lower values in winter), but this is a calculation artifact from short-term mortality fluctuations—not real LE changes. A person's expected lifespan doesn't actually vary week to week.
  • Seasonally Adjusted (STL): We apply STL decomposition (Seasonal and Trend decomposition using Loess) and remove only the seasonal component, keeping both trend and remainder. This eliminates the artificial seasonal pattern while preserving the true underlying life expectancy signal and any real short-term variations.

The adjusted values are displayed by default and are recommended for most analyses. Raw values are useful for understanding the calculation artifacts or for research purposes.

Pro users can view remaining life expectancy at specific ages (e.g., LE at age 40 shows expected remaining years for someone aged 40).

Time Aggregations

Mortality data can be aggregated across different time periods. The choice of aggregation affects the granularity of analysis and can reveal different patterns.

Standard Calendar Periods

  • Weekly: ISO week numbers (W01-W53), highest temporal resolution
  • Monthly: Calendar months (Jan-Dec), balances detail and noise reduction
  • Quarterly: Calendar quarters (Q1-Q4), shows seasonal patterns
  • Yearly: Calendar years (Jan-Dec), removes seasonality

Alternative Year Definitions

Flu Season (Oct-Sep)

A year running from October to September, aligning with influenza season patterns in the Northern Hemisphere. This is useful for analyzing mortality impacts that follow seasonal respiratory illness patterns, as it captures each flu season as a single unit rather than splitting it across calendar years.

Display format: "2020/21" represents Oct 2020 - Sep 2021

Midyear (Jul-Jun)

A year running from July to June, sometimes used in Southern Hemisphere countries or for fiscal year analysis. This can align better with seasonal patterns in regions where winter mortality peaks occur mid-calendar-year.

Display format: "2020/21" represents Jul 2020 - Jun 2021
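Both alternative year definitions follow the same rule: a date belongs to the period that started in the most recent start month. A small Python sketch, with hypothetical function names, shows how a date maps to the "2020/21" display label.

```python
from datetime import date


def split_year_label(d: date, start_month: int) -> str:
    """Label such as '2020/21' for the 12-month period beginning in `start_month`."""
    start_year = d.year if d.month >= start_month else d.year - 1
    return f"{start_year}/{(start_year + 1) % 100:02d}"


def flu_season(d: date) -> str:
    """October to September."""
    return split_year_label(d, 10)


def midyear(d: date) -> str:
    """July to June."""
    return split_year_label(d, 7)


print(flu_season(date(2021, 3, 15)))  # 2020/21
print(midyear(date(2020, 7, 1)))      # 2020/21
print(flu_season(date(2020, 9, 30)))  # 2019/20
```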

Simple Moving Averages (SMA)

Weekly data smoothed using simple moving averages to reduce noise and reveal underlying trends. Each data point represents the average of the specified number of surrounding weeks.

  • 13-Week SMA: ~3 months smoothing, shows short-term trends
  • 26-Week SMA: ~6 months smoothing, dampens but does not fully remove seasonal variation
  • 52-Week SMA: 1 year smoothing, shows annual trends only
  • 104-Week SMA: 2 year smoothing, reveals only long-term trends

SMA(n) = Average of n consecutive weekly values
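A minimal Python sketch of the SMA formula. It assumes a trailing window (each point averages the current week and the n − 1 weeks before it); the text does not say whether the site's windows are trailing or centered, so treat that choice as an assumption.

```python
def sma(values: list, n: int) -> list:
    """Trailing simple moving average; None until n values are available."""
    if n <= 0:
        raise ValueError("window size must be positive")
    out = []
    window_sum = 0.0
    for i, v in enumerate(values):
        window_sum += v
        if i >= n:
            window_sum -= values[i - n]  # drop the value that left the window
        out.append(window_sum / n if i >= n - 1 else None)
    return out


print(sma([1, 2, 3, 4, 5], 3))  # [None, None, 2.0, 3.0, 4.0]
```

A window equal to one full year (52 weeks) averages over a complete seasonal cycle, which is why it shows annual trends without the within-year pattern.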

Excess Mortality Calculations

Excess mortality estimates the difference between observed deaths and expected deaths based on historical patterns. This is calculated by comparing actual mortality data to a baseline projection.

Important Caveat
Excess mortality estimates depend heavily on the baseline model chosen. Different baselines can produce substantially different results. These are model-based projections, not observed data.

Baseline Models

The baseline represents what mortality would have been expected in the absence of unusual events. We calculate baselines using time series forecasting methods from the R fable package, a forecasting framework for tidy time series.

The default baseline uses a simple 3-year pre-pandemic average (2017–2019). While straightforward, this may not capture underlying trends. As noted by Levitt et al.: "Changes in mortality rates may differ markedly year to year and across age and gender groups. During major events such as pandemics, wars, or natural disasters, estimates may diverge from observed deaths proportionally to the event's impact."

Available Baseline Methods

1. Last Value (Naive) — fable::NAIVE

Projects the most recent historical value forward. Assumes mortality rates remain constant at their last observed level. Best for stable populations with minimal trends.

Baseline[t] = Observed[last_baseline_period]

2. Average (Mean) — fable::TSLM

Calculates the mean of the baseline period (default 2017–2019) and projects it forward. This is the default method. Smooths out year-to-year fluctuations but ignores trends.

Baseline[t] = Mean(Observed[baseline_period])

3. Linear Regression (Trend) — fable::TSLM + trend()

Fits a linear trend to the baseline period and extrapolates it forward. Accounts for long-term improvement (or decline) in mortality rates. Useful when mortality has been consistently improving.

Baseline[t] = β₀ + β₁ × t (fitted from baseline period)
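The first three methods reduce to short formulas. The Python sketch below mirrors them for yearly data without seasonality; it is an illustration only and does not call or reproduce the fable models named above. The function names and the three-year history are invented.

```python
from statistics import mean


def naive_baseline(history: list, horizon: int) -> list:
    """Carry the last observed value forward."""
    return [history[-1]] * horizon


def mean_baseline(history: list, horizon: int) -> list:
    """Project the baseline-period mean forward."""
    return [mean(history)] * horizon


def linear_baseline(history: list, horizon: int) -> list:
    """Ordinary least squares line through the history, extrapolated forward."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    t = range(n)
    t_bar, y_bar = mean(t), mean(history)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, history)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar
    return [intercept + slope * (n + h) for h in range(horizon)]


# Invented ASMR values for 2017, 2018, 2019, projected two years ahead.
history = [1010.0, 1000.0, 990.0]
print(naive_baseline(history, 2))   # [990.0, 990.0]
print(mean_baseline(history, 2))    # [1000.0, 1000.0]
print(linear_baseline(history, 2))  # [980.0, 970.0]
```

The same history gives three different baselines, and therefore three different excess estimates, which is the point of the caveat about baseline choice.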

4. Exponential Smoothing (ETS) — fable::ETS + error() + trend()

An adaptive method that gives more weight to recent observations while accounting for trends. Can capture non-linear patterns in mortality improvement. It is the most flexible of the four methods.

Adaptive smoothing with error correction and trend components
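To make "adaptive smoothing with trend" concrete, here is a Python sketch of Holt's linear method, the additive-error, additive-trend member of the exponential smoothing family. It is a simplification: the smoothing parameters `alpha` and `beta` are fixed by hand here, whereas an ETS fit normally estimates them from the data. The values and the initialization rule are assumptions for the example.

```python
def holt_forecast(history: list, horizon: int, alpha: float = 0.5, beta: float = 0.3) -> list:
    """Holt's linear trend method with fixed smoothing parameters."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    level = history[0]
    trend = history[1] - history[0]  # simple initialization from the first two points
    for y in history[1:]:
        prev_level = level
        # New level: blend the observation with the previous one-step forecast.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # New trend: blend the latest level change with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


# A perfectly linear series is extrapolated along the same line.
print([round(v, 6) for v in holt_forecast([100.0, 98.0, 96.0, 94.0], 2)])  # [92.0, 90.0]
```

Higher `alpha` and `beta` make the baseline react faster to recent years; lower values make it behave more like a long-run average.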

Seasonal Adjustments: For sub-annual data (weekly, monthly, quarterly), all baseline methods automatically include seasonal components to account for recurring patterns (e.g., winter mortality peaks, summer troughs).

Excess Calculation

Once the baseline is established, excess mortality is simply the difference between observed and expected values:

Excess Deaths = Observed Deaths - Baseline Deaths
Excess CMR = Observed CMR - Baseline CMR
Excess ASMR = Observed ASMR - Baseline ASMR
Excess LE = Observed LE - Baseline LE

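For deaths, CMR, and ASMR, positive values indicate higher-than-expected mortality and negative values indicate lower-than-expected mortality. For life expectancy the sign is reversed: a negative excess LE means life expectancy fell below the baseline.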
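The subtraction is the same for every metric, but its interpretation depends on whether higher values of the metric are worse (deaths, CMR, ASMR) or better (life expectancy). A short Python sketch with hypothetical metric keys:

```python
HIGHER_IS_BETTER = {"le"}  # hypothetical metric keys; all others count deaths or rates


def excess(observed: list, baseline: list) -> list:
    """Element-wise observed minus baseline."""
    if len(observed) != len(baseline):
        raise ValueError("observed and baseline must have the same length")
    return [o - b for o, b in zip(observed, baseline)]


def indicates_excess_mortality(metric: str, value: float) -> bool:
    """True when the excess value points to worse-than-expected mortality."""
    return value < 0 if metric in HIGHER_IS_BETTER else value > 0


print(excess([1100.0, 950.0], [1000.0, 1000.0]))  # [100.0, -50.0]
print(indicates_excess_mortality("deaths", 100.0))  # True
print(indicates_excess_mortality("le", -1.2))       # True
```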

Confidence Intervals

Upper and lower bounds represent statistical uncertainty in the baseline projection. Strictly speaking, they are prediction intervals: the fable package calculates them from the prediction error distribution of each model.

Confidence intervals widen further into the future, reflecting increasing uncertainty. When observed values fall outside the confidence interval, it suggests a statistically significant deviation from historical patterns.

Standard Populations for Age-Standardization

Age-standardization requires a reference age distribution (standard population). Different standard populations are used in different contexts:

WHO2015 (World Health Organization 2015)

The WHO's standard population based on global age distribution. This is the default choice for international comparisons as it represents a global average age structure. Recommended for comparing countries worldwide.

ESP2013 (European Standard Population 2013)

Eurostat's European standard population, reflecting the age structure of the European Union. Use this for comparing European countries or when working with Eurostat data.

US2000 (United States 2000 Census)

The standard population used by the CDC and US health agencies, based on the 2000 US Census. Essential for comparisons with official US statistics or when analyzing US states.

2020 (Country-Specific)

Uses each country's own 2020 population age distribution as the standard. This shows internal trends over time without external age-structure assumptions, but makes international comparisons less meaningful.

Which to choose? For international comparisons, use WHO2015 or ESP2013 (for Europe). For US-specific analysis, use US2000. For country-level time trends, use 2020. The choice affects the absolute values but usually not the trends or patterns.

Data Processing & Quality

Data Sources & Updates

All data is sourced from official statistical agencies and international organizations (see the Sources page for details). Data is updated daily and processed through a validation pipeline to ensure consistency and accuracy.

Data Limitations

  • Reporting Delays: Recent data may be incomplete or subject to revision as official sources update their records
  • Suppression: Some jurisdictions suppress data when death counts are low to protect privacy (common in US CDC data)
  • Age Group Availability: Not all countries report mortality by age group, limiting ASMR calculations
  • Definition Changes: Occasional changes in geographic boundaries or reporting methodologies can create discontinuities
  • Population Estimates: Population denominators are estimates and may not perfectly align with the census cycle

Missing Data Handling

When data is missing or suppressed, we do not impute or estimate values. Gaps in the visualizations indicate missing data. For excess mortality calculations, baseline models are fitted only to available historical data.
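In code terms, "no imputation" means missing periods are skipped when fitting and stay missing in the output. A Python sketch for the mean baseline, where `None` marks a missing or suppressed value (a representation assumed for this example):

```python
from statistics import mean


def mean_baseline_available(history: list):
    """Mean of the non-missing baseline values, or None if nothing is available."""
    available = [v for v in history if v is not None]
    return mean(available) if available else None


def excess_with_gaps(observed: list, baseline: float) -> list:
    """Observed minus baseline, leaving gaps where the observation is missing."""
    return [None if o is None else o - baseline for o in observed]


baseline = mean_baseline_available([1000.0, None, 1020.0])
print(baseline)                                            # 1010.0
print(excess_with_gaps([1100.0, None, 1000.0], baseline))  # [90.0, None, -10.0]
```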

Technical References

Our statistical methods are based on peer-reviewed research and established public health practices.

Questions?

If you have questions about our methodology or need clarification on any calculations, please use our contact form or reach out on @MortalityWatch.