Montana Climate Office

CONUS Drought Monitoring Pipeline: Methods

Zachary H. Hoylman, R. Kyle Bocinsky, and Kelsey G. Jencso — Montana Climate Office, University of Montana

Updated: Version 1.1

Abstract The Montana Climate Office (MCO) CONUS Drought Monitoring Pipeline is an operational system that computes a suite of standardized drought indices on a daily basis across the contiguous United States (CONUS). The pipeline ingests daily gridded meteorological fields from gridMET [1], a high-resolution (~4 km; 1/24°) observationally constrained climate dataset that merges PRISM spatial patterns with reanalysis-derived temporal signals. Input variables include precipitation, potential evapotranspiration (PET), vapor pressure deficit (VPD), and maximum air temperature.

From these inputs the pipeline derives five standardized drought index families — the Standardized Precipitation Index (SPI), the Standardized Precipitation-Evapotranspiration Index (SPEI), the Evaporative Demand Drought Index (EDDI), the Standardized VPD Index (SVPDI), and maximum temperature percentiles (Tmax) — alongside ancillary products including precipitation accumulation (mm), percent of normal and deviation from normal for both precipitation and VPD, empirical percentile ranks for precipitation, VPD, and Tmax, and temperature deviation from normal. Each metric is computed across multiple aggregation timescales ranging from 15 to 730 days, as well as water-year and year-to-date accumulations.

A distinguishing feature of this pipeline is its support for flexible climatological reference periods. Because a warming climate produces non-stationary baseline statistics, drought classifications derived from a fixed historical normal can be misleading about contemporary conditions [2]. The pipeline therefore supports rolling trailing windows, fixed WMO-standard normals, and full-period baselines, all configurable at runtime. All outputs are written as Cloud-Optimized GeoTIFFs (COG) with LZW compression, tiled internally for efficient partial reads by downstream visualization and analysis tools.

2. Data

2.1 gridMET Climate Data

All meteorological inputs are obtained from gridMET [1], a daily surface meteorological dataset covering CONUS at a spatial resolution of approximately 4 km (1/24°). gridMET is produced by statistically downscaling NCEP/NCAR reanalysis fields using the spatial climatological patterns embedded in PRISM (Parameter-elevation Regressions on Independent Slopes Model), yielding a dataset that is both temporally consistent and spatially detailed. The archive extends from 1979 to near-present (updated daily), providing the multi-decadal record required for robust climatological fitting. Data are freely available at climatologylab.org/gridmet.html.

The following gridMET variables are used by this pipeline:

Variable gridMET name Units Used for
Precipitationprmm/daySPI, SPEI water balance, precip % of normal, precip deviation, precip percentile, precip accumulation (mm)
Reference ET (ETo)petmm/daySPEI water balance, EDDI
Vapor pressure deficitvpdkPaSVPDI, VPD % of normal, VPD deviation, VPD percentile
Maximum air temperaturetmmxKTmax percentile, Tmax deviation

Download & Caching Strategy

gridMET distributes data as annual NetCDF-4 files. The pipeline loops over every year from START_YEAR (default 1979) through the current calendar year for all four variables (pr, pet, vpd, tmmx), using timestamp-conditional downloads via curl -R -z:

  • -R — Sets the local file's modification time to the server's Last-Modified timestamp on download.
  • -z <file> — Sends an If-Modified-Since header. If the remote file has not been updated since the local copy's mtime, the download is skipped entirely (HTTP 304).

This means every year — including historical years — is checked on each run, but only files that have actually changed on the server are re-downloaded. In practice, completed historical years are never re-fetched after the initial download, while the current (in-progress) year file is updated whenever the provider appends new daily observations. Failed downloads are retried up to 3 times with a short backoff.

All annual NetCDFs are retained to allow reprocessing with alternative reference periods without re-downloading. The default archive start year is 1979 (the full gridMET record); this is configurable via the START_YEAR environment variable.

3. Climatological Reference Periods

Scientific Motivation

Standardized drought indices measure anomalies relative to a baseline climatology. The choice of baseline profoundly affects index values: the same precipitation total can indicate moderate drought against a 1950–1980 baseline but near-normal conditions against a 1991–2020 WMO normal. As the climate warms, the statistical properties of temperature, PET, and water balance are trending systematically, violating the stationarity assumption embedded in fixed-period normals. Hoylman et al. (2022) [2] demonstrate that non-stationarity in baseline climate substantially alters drought frequency and severity assessments, motivating the use of flexible, contemporary reference periods for operational monitoring.

Supported Modes

The reference period is controlled at runtime via the CLIM_PERIODS environment variable (comma-separated, multiple periods may be computed in a single run):

Spec syntax Mode Slug appended to filename Description
rolling:30 Rolling trailing window rolling_30 Trailing 30-year window ending at the current year; tracks contemporary climatology (default)
fixed:1991:2020 Fixed interval fixed_1991_2020 WMO 1991–2020 standard normal; suitable for comparison with published climatologies
full Full archive full All years from START_YEAR to present; maximizes sample size

Example configurations:

# Single rolling 30-year window (default)
CLIM_PERIODS="rolling:30"

# Two periods computed in one run
CLIM_PERIODS="rolling:30,fixed:1991:2020"

# Fixed period only
CLIM_PERIODS="fixed:1991:2020"

# Full archive
CLIM_PERIODS="full"

Each period produces a fully independent set of output files with the slug appended before the .tif extension. Intermediate tile directories are also slug-tagged to prevent collisions when multiple periods are computed simultaneously.

Minimum anchor requirement: At least 10 calendar-year anchors must fall within the selected reference period for any distribution fit to proceed at a given grid cell. Cells that do not meet this threshold are assigned NA for that index and timestep.

4. Drought Index Methods

All indices are computed at each grid cell independently. For a given aggregation window of W days, a rolling sum or mean of the input variable is first computed over the full historical record. The resulting time series is then stratified by calendar day-of-year, and a probability distribution (or empirical CDF) is fitted to the subsample of values that fall within the selected climatological reference period. The current (most recent) observation is then transformed to a standardized anomaly or probability using that fitted distribution. Index values are positive for above-average conditions and negative (or high) for drought conditions, following established conventions for each index.

4.1 Standardized Precipitation Index (SPI)

Motivation

The SPI [3] expresses precipitation anomalies in units of standard deviations relative to a fitted climatological distribution, making it comparable across locations and timescales with differing precipitation regimes. It is the most widely used standardized drought index and serves as a fundamental building block for multi-variable composite assessments.

Input

Daily precipitation (pr), in millimeters.

Aggregation

Rolling window sums over the following timescales:

  • Fixed windows: 15, 30, 45, 60, 90, 120, 180, 365, and 730 days
  • Water-year accumulation (Oct 1 – current date, or equivalent)
  • Year-to-date accumulation (Jan 1 – current date)

Statistical Method

Precipitation is inherently bounded at zero and right-skewed; a two-parameter Gamma distribution (shape α, rate β) is therefore fitted to the climatological sample for each grid cell and aggregation window. Parameter estimation follows the L-moment / probability-weighted moment (PWM) approach of Hosking (1990) [4], as implemented in the R package lmomco:

  1. Zero handling (Stagge et al., 2015) — Following the methodology of Stagge et al. (2015) [5], zero (or below-threshold) precipitation values are handled via a mixed distribution approach using the Weibull plotting position. The probability of zero precipitation is estimated as p0 = m / (n + 1), where m is the number of zero values and n is the total sample size. Zero-valued observations are assigned the center of probability mass below the threshold:
    0 = (m + 1) / [ 2(n + 1) ]
    Non-zero values are mapped to the upper portion of the probability space:
    H(x) = p0 + (1 − p0) ⋅ G(x; α, β)
    where G is the Gamma CDF fitted only to positive values. This approach correctly represents the discrete mass at zero without the artifacts introduced by replacing zeros with small positive constants.
  2. PWM estimation — Unbiased probability-weighted moments β0, β1 are computed from the positive (non-zero) values in the climatological sample.
  3. L-moment conversion — PWMs are converted to L-moments (λ1, λ2, τ3).
  4. Parameter estimationlmomco::pargam() returns (α, β) from L-moments of the positive values only.
  5. CDF evaluation — For non-zero observations, the mixed CDF H(x) is evaluated to obtain a non-exceedance probability p. For zero-valued observations, p = 0 (center of probability mass).
  6. Normal quantile transform — SPI = Φ−1(p), where Φ−1 is the standard normal quantile function. No clamping is applied to the resulting SPI values.
SPI = Φ−1[ FΓ(α,β)( XW ) ]

Interpretation

Category Description Percentile Range SPI / SPEI Values
Normal or wet conditionsx > 30x > −0.524
D0Abnormally Dry30 ≥ x > 20−0.524 ≥ x > −0.842
D1Moderate Drought20 ≥ x > 10−0.842 ≥ x > −1.281
D2Severe Drought10 ≥ x > 5−1.281 ≥ x > −1.644
D3Extreme Drought5 ≥ x > 2−1.644 ≥ x > −2.054
D4Exceptional Droughtx < 2x < −2.054

Minimum Data Requirement

At least 10 anchor years must fall within the selected reference period for a distribution fit to be attempted. Grid cells with fewer anchor years receive a fill value of NA for the current timestep.

4.2 Ancillary Precipitation Metrics

In addition to the primary drought indices, four ancillary metrics are derived directly from rolling precipitation sums to provide operational context:

Metric Formula Units Description
Percent of Normal (PON) ( XW / μclim ) × 100 % Current accumulation as a percentage of the climatological mean
Departure XW − μclim mm Absolute anomaly relative to climatological mean
Percentile n( XW ) 0–1 Empirical ECDF rank within the climatological sample
Accumulation (precip-mm) XW mm Raw rolling precipitation sum

The climatological mean (μclim) and ECDF are derived from the same reference period used for SPI fitting, ensuring internal consistency across all precipitation-based products.

4.3 Standardized Precipitation-Evapotranspiration Index (SPEI)

Motivation

The SPEI [6] extends SPI by incorporating the atmospheric evaporative demand, making it sensitive to warming-driven increases in PET even when precipitation is unchanged. In a warming climate, SPEI captures hydrological drought stress that SPI would miss.

Input

Water balance = precipitation (pr) − reference evapotranspiration (pet), in mm/day. Negative values indicate a moisture deficit; positive values indicate a moisture surplus.

Aggregation

Rolling window means over the same timescales as SPI.

Statistical Method

The water balance variable is unbounded below (large PET on dry days) and has heavier distributional tails than precipitation alone. The Generalized Logistic (GLO) distribution, which accommodates both heavy tails and asymmetry, is therefore used in place of the Gamma:

  1. L-moments estimated from the climatological sample via unbiased PWMs.
  2. lmomco::parglo() fits the three-parameter GLO (ξ, α, k).
  3. GLO CDF evaluated at current observation → probability p.
  4. SPEI = Φ−1(p).
SPEI = Φ−1[ FGLO(ξ,α,k)( P − PET )W ]

Interpretation

SPEI values follow the same classification thresholds as SPI. Because it incorporates evaporative demand, SPEI typically shows stronger drying trends over recent decades than SPI alone, consistent with the non-stationarity findings of [2].

4.4 Evaporative Demand Drought Index (EDDI)

Motivation

EDDI [7] isolates the atmospheric demand component of the hydrological cycle. Because it is derived solely from PET — without precipitation — it captures early warning signals of drought stress driven by high temperature, low humidity, or high wind speed before soil moisture depletion has occurred.

Input

Reference evapotranspiration (pet), in mm/day.

Aggregation

Rolling window sums (cumulative evaporative demand) over the same timescales as SPI.

Statistical Method

EDDI uses a nonparametric rank-based approach, avoiding assumptions about the distributional form of PET:

  1. Rank — Observations within the climatological sample are ranked from highest (rank 1) to lowest. A rank of 1 therefore corresponds to maximum evaporative demand (drought stress).
  2. Tukey plotting position — Following Hobbins et al. (2016) [7], the empirical probability is estimated via the Tukey plotting position (Wilks 2011):
    pi = (i − 1/3) / (n + 1/3)
    where i is the rank of the aggregated E0 in the historical time series (i = 1 for maximum E0) and n is the number of observations in the climatological sample. Because rank 1 = highest demand, p is small for drought conditions.
  3. Inverse normal approximation — EDDI is derived via the rational approximation of Abramowitz and Stegun (1965) [8], following the formulation in Vicente-Serrano et al. (2010). For p ≤ 0.5, W = √(−2⋅ln(p)); for p > 0.5, p is replaced with 1−p and the sign of EDDI is reversed.
  4. Sign convention — High PET (low p) produces positive EDDI values, indicating drier-than-normal conditions. Negative values indicate wet anomalies. A zero EDDI indicates that accumulated E0 equals the climatological median.
EDDI = Φ−1[ pTukey( E0,W ) ]

Interpretation

Positive EDDI indicates above-normal atmospheric demand (drought precursor or ongoing drought amplifier). EDDI provides a pure atmospheric signal unconfounded by antecedent precipitation and is particularly valuable for flash drought early warning.

4.5 Standardized VPD Index (SVPDI)

Motivation

Vapor pressure deficit (VPD) is the difference between the saturation vapor pressure and the actual vapor pressure. Elevated VPD increases transpiration demand and can cause stomatal closure even when soil moisture is adequate, directly linking atmospheric drought to vegetation stress. The SVPDI [9] provides a standardized measure of VPD anomaly that is complementary to the precipitation-based indices.

Input

Mean daily vapor pressure deficit (vpd), in kPa.

Aggregation

Rolling window means over the same timescales as SPI.

Statistical Method

Following Nwayor et al. (2024) [9], VPD is treated as a zero-limited, right-skewed variable and fitted with a Gamma distribution. Nwayor et al. compared Gamma, Weibull, and log-normal distributions via AIC and found the Gamma to be the best-performing distribution for VPD across most locations globally. Our implementation uses L-moment/PWM parameter estimation (rather than MLE) for consistency with the SPI fitting procedure, and applies the same Stagge et al. (2015) [5] mixed-distribution zero handling used for SPI:

  1. Zero handling (Stagge et al., 2015) — Zero VPD values (rare but possible at high latitudes or during cold conditions) are handled via the same mixed-distribution approach as SPI. The probability of zero is estimated as p0 = m / (n + 1), and zero-valued observations are assigned the center of probability mass: 0 = (m + 1) / [2(n + 1)]. The Gamma distribution is fitted only to positive values.
  2. Gamma fitting — Unbiased PWMs → L-moments → lmomco::pargam() on positive values only.
  3. CDF evaluation — For non-zero observations: H(x) = p0 + (1 − p0) ⋅ G(x; α, β). For zero observations: p = 0.
  4. Normal quantile transform — SVPDI = Φ−1(p). No clamping is applied.
SVPDI = Φ−1[ H( VPDW ) ]

Interpretation

Positive SVPDI indicates above-normal VPD (heightened atmospheric demand and potential vegetation stress). Negative SVPDI indicates below-normal VPD (reduced atmospheric demand). The sign convention matches that of SPI/SPEI, where positive values represent drier conditions.

4.6 Ancillary VPD Metrics

Alongside the SVPDI, three ancillary metrics are computed from the rolling VPD means to provide operational context:

Metric Formula Units Description
Percent of Normal (PON) ( VPDW / μclim ) × 100 % Current VPD as a percentage of the climatological mean
Departure VPDW − μclim kPa Absolute anomaly relative to climatological mean VPD
Percentile n( VPDW ) 0–1 Empirical ECDF rank within the climatological sample

4.7 Maximum Temperature Metrics (Tmax)

Motivation

Anomalously high temperatures amplify drought through increased PET and accelerated snowmelt. A percentile-based temperature index provides context for heat as a drought driver without coupling it to a precipitation signal.

Input

Daily maximum air temperature (tmmx), in Kelvin.

Aggregation

Rolling window means over the standard timescales.

Statistical Method

A nonparametric empirical CDF approach is used; no parametric distribution is assumed:

  1. The empirical cumulative distribution function (ECDF) is constructed from the climatological sample using ecdf() from base R.
  2. The ECDF is evaluated at the current observation to produce a probability (0–1 scale), where 1 indicates the warmest value on record within the reference period.
Tmaxpctile = F̂n( TmaxW )

Deviation from Normal

In addition to the percentile, the pipeline computes the deviation of the current rolling-mean Tmax from the climatological mean for each timescale:

Tmaxdev = TmaxW − μclim

Positive values indicate above-normal temperatures; negative values indicate below-normal. Units are K.

Interpretation

Values > 0.9 indicate exceptionally warm conditions (top decile of the reference climatology). The output is reported as a 0–1 probability rather than a standardized anomaly, making it directly interpretable as an exceedance percentile.

5. Spatial Processing Architecture

Tiling

The CONUS domain is partitioned into rectangular 2°×2° tiles (configurable via environment variable). Each tile spans approximately 220×220 km at mid-latitudes, containing roughly 2,500 grid cells (50×50). Tiles are processed independently, which enables embarrassingly parallel execution and constrains peak memory usage per process.

Parallelism

Within each R script, tiles are distributed across CPU cores using pbmcapply::pbmclapply(), a fork-based parallel backend (equivalent to parallel::mclapply()) that additionally renders a real-time text progress bar. Key design constraints:

Mosaicking

After all tiles for a given index and timescale are written, the pipeline assembles them into a seamless CONUS mosaic:

  1. A GDAL Virtual Raster (VRT) is constructed referencing all tile GeoTIFF files. VRT creation is instantaneous (no pixel copying) and is used to drive the subsequent translation.
  2. gdal_translate converts the VRT to a Cloud-Optimized GeoTIFF with LZW compression (COMPRESS=LZW, PREDICTOR=2) and internal tiling (TILED=YES, BLOCKXSIZE=512, BLOCKYSIZE=512). The COPY_SRC_OVERVIEWS=YES flag and gdaladdo pre-computation ensure efficient thumbnail rendering by web map clients.

Land Mask

An optional CONUS land mask derived from Natural Earth state boundary polygons can be applied to suppress ocean and non-CONUS pixels in the final mosaic. This is controlled by the CONUS_MASK environment variable.

6. Output Products

All final output layers are written to the conus_drought/ directory as Cloud-Optimized GeoTIFFs. Filenames follow the pattern:

{index}_{timescale}_{clim_slug}.tif
# e.g.: spi_30d_rolling_30.tif
#       spei_90d_fixed_1991_2020.tif
#       eddi_365d_full.tif
Index Timescales Units / Range Distribution
SPI 15d, 30d, 45d, 60d, 90d, 120d, 180d, 365d, 730d, WY, YTD Dimensionless z-score Gamma (L-moments)
SPEI 15d, 30d, 45d, 60d, 90d, 120d, 180d, 365d, 730d, WY, YTD Dimensionless z-score Generalized Logistic (L-moments)
EDDI 15d, 30d, 45d, 60d, 90d, 120d, 180d, 365d, 730d, WY, YTD Dimensionless z-score Nonparametric (Gringorten)
SVPDI 15d, 30d, 45d, 60d, 90d, 120d, 180d, 365d, 730d, WY, YTD Dimensionless z-score Gamma (L-moments)
Tmax percentile 15d, 30d, 45d, 60d, 90d, 120d, 180d, 365d, 730d, WY, YTD Probability [0, 1] Empirical ECDF
Percent of Normal Same as SPI % (unbounded) Climatological mean
Departure Same as SPI mm Climatological mean
Percentile (precip) Same as SPI Probability [0, 1] Empirical ECDF
Accumulation (precip-mm) Same as SPI mm Raw sum
VPD % of Normal Same as SVPDI % (unbounded) Climatological mean
VPD Departure Same as SVPDI kPa Climatological mean
VPD Percentile Same as SVPDI Probability [0, 1] Empirical ECDF
Tmax deviation Same as Tmax percentile K Climatological mean

All rasters share the native gridMET CRS (EPSG:4326, geographic/WGS84) and pixel size (1/24° ≈ 4 km). NoData is encoded as NaN (IEEE 754 32-bit float).

7. Software & Reproducibility

Runtime Environment

The pipeline runs inside a Docker container based on the rocker/geospatial image (R 4.4.1, GDAL 3.x, PROJ 9.x, GEOS). All system libraries are pinned at image-build time, ensuring byte-for-byte reproducibility of spatial operations.

Key R Packages

PackageRole
terraRaster I/O, spatial operations, masking
sfVector geometry (tile boundaries, land mask)
lmomcoL-moment parameter estimation; Gamma, GLO distributions
ncdf4Reading gridMET NetCDF-4 source files
fsFile system operations (path manipulation, directory management)
purrrFunctional programming utilities for list operations
pbmcapplyFork-based parallel apply with real-time progress bar
gdalUtilitiesR wrappers for gdal_translate, gdaladdo, gdalbuildvrt

Configurable Parameters (Environment Variables)

VariableDefaultDescription
START_YEAR1979First year of gridMET archive to download/use
CLIM_PERIODSrolling:30Comma-separated reference period specification(s)
CORESAll availableNumber of parallel worker processes
TILE_DX2Tile width in degrees
TILE_DY2Tile height in degrees
CONUS_MASK1Apply CONUS land mask (0 = no mask)
DATA_DIR~/mco-drought-conus-dataRoot directory for input NetCDF cache and outputs
TIMESCALESallTimescales to compute
TILE_IDSall tilesSubset of tile IDs to process

Docker Usage

# Build the image
docker compose build

# Run the full pipeline with default settings
docker compose up

# Run with multiple reference periods
docker compose run -e CLIM_PERIODS="rolling:30,fixed:1991:2020" mco-drought

# Run with limited cores for debugging
docker compose run -e CORES=4 mco-drought

# Quick test (single metric, few tiles)
docker compose run --rm mco-drought bash docker/run_test.sh

Source Code

Source code and Docker configuration are maintained in the mco-drought-conus repository. The pipeline is structured as follows:

mco-drought-conus/
├── R/
│   ├── drought-functions.R        # Core statistical fitting functions
│   ├── pipeline-common.R        # Shared pipeline infrastructure
│   ├── 1_gridmet-cache.R          # gridMET download and caching
│   ├── 2_metrics-precip.R         # SPI + ancillary metrics
│   ├── 3_metrics-spei.R           # SPEI
│   ├── 4_metrics-eddi.R           # EDDI
│   ├── 5_metrics-vpd.R            # SVPDI
│   └── 6_metrics-tmax.R           # Tmax percentile
├── docker/
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── run_once.sh                # Orchestration shell script
│   ├── make_web_cogs.sh         # COG post-processing
│   └── run_test.sh              # Quick-test runner
└── docs/
    └── methods.html               # This document

8. References

  1. Abatzoglou JT. Development of gridded surface meteorological data for ecological applications and modelling. International Journal of Climatology. 2013;33(1):121–131. doi:10.1002/joc.3413
  2. Hoylman ZH, Bocinsky RK, Jencso KG. Drought assessment has been outpaced by climate change: empirical arguments for a paradigm shift. Nature Communications. 2022;13:2715. doi:10.1038/s41467-022-30316-5
  3. McKee TB, Doesken NJ, Kleist J. The relationship of drought frequency and duration to time scales. Proceedings of the 8th Conference on Applied Climatology. 1993;17(22):179–183. American Meteorological Society.
  4. Hosking JRM. L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society, Series B. 1990;52(1):105–124. doi:10.1111/j.2517-6161.1990.tb01775.x
  5. Stagge JH, Tallaksen LM, Gudmundsson L, Van Loon AF, Stahl K. Candidate distributions for climatological drought indices (SPI and SPEI). International Journal of Climatology. 2015;35(13):4027–4040. doi:10.1002/joc.4267
  6. Beguería S, Vicente-Serrano SM, Reig F, Latorre B. Standardized precipitation evapotranspiration index (SPEI) revisited: parameter fitting, evapotranspiration models, tools, datasets and related applications. International Journal of Climatology. 2014;34(10):3001–3023. doi:10.1002/joc.3887
  7. Hobbins MT, Wood A, McEvoy DJ, Huntington JL, Morton C, Anderson M, Hain C. The Evaporative Demand Drought Index. Part I: Linking drought evolution to variations in evaporative demand. Journal of Hydrometeorology. 2016;17(6):1745–1761. doi:10.1175/JHM-D-15-0121.1
  8. Abramowitz M, Stegun IA, eds. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards Applied Mathematics Series 55. Washington, DC: U.S. Government Printing Office; 1965.
  9. Nwayor IJ, Robeson SM, Ficklin DL, Maxwell JT. A multiscalar standardized vapor pressure deficit index for drought monitoring and impacts. International Journal of Climatology. 2024;44:5825–5838. doi:10.1002/joc.8668