HydroLearn

Evaluating Deterministic and Ensemble Streamflow Estimates


How Good Is Your Forecast? Evaluating Deterministic and Ensemble Streamflow Models

Problem Statement

Hydrologic models are routinely used to forecast streamflow for flood warning, reservoir operation, and water-supply planning. Yet the same model can look great on one event and miss the next, and a colorful hydrograph plot can hide important errors. Worse, modern forecast systems no longer return a single number. They return a cloud of plausible streamflows, known as an ensemble, and that cloud needs its own kind of verification.

Module Overview

This module gives early-career hydrologists a compact, hands-on introduction to how forecasters actually judge whether a streamflow model is doing its job. It moves you from continuous metrics (RMSE, NSE, KGE, PBIAS) and Moriasi performance ratings, through categorical event verification (POD, FAR, POFD, CSI, ROC), to probabilistic ensemble verification (Brier score, BSS, reliability diagrams, rank histograms, CRPS). At every step you read a real hydrograph, score it, and decide whether the result is fit for an operational decision.

The hands-on activities use the HyMOD conceptual rainfall-runoff model on the Leaf River catchment through a sequence of four Python notebooks, runnable directly in Google Colab.
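As a rough preview of the continuous metrics named above, the following is a minimal sketch of RMSE, NSE, KGE, and PBIAS in numpy. It is not taken from the module notebooks. The function names are hypothetical, KGE is written in its common three-component form (correlation, variability ratio, bias ratio), and PBIAS assumes the sign convention in which positive values indicate underestimation; other sources flip that sign.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error, in the units of the flow series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))

def kge(obs, sim):
    """Kling-Gupta efficiency (three-component form, assumed here)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

def pbias(obs, sim):
    """Percent bias; positive means the simulation underestimates (assumed convention)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(100.0 * np.sum(obs - sim) / np.sum(obs))

# Illustrative values only: a simulation that is always 0.5 units too high.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.5, 2.5, 3.5, 4.5, 5.5]
print(rmse(obs, sim), nse(obs, sim), kge(obs, sim), pbias(obs, sim))
```

In this toy case the timing and variability are perfect, so the constant offset shows up only in RMSE, in the bias term of KGE, and in a negative PBIAS.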

Topics Covered

Deterministic evaluation metrics (RMSE, NSE, KGE, PBIAS); Moriasi performance ratings; hydrograph diagnostics; flow-duration curves; ensemble forecasting; contingency tables; categorical scores (POD, FAR, POFD, CSI); ROC curves and performance diagrams; Brier score and Brier skill score; reliability diagrams; rank histograms; Continuous Ranked Probability Score (CRPS); the HyMOD conceptual model.
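To make the categorical scores in this list concrete, here is a minimal sketch that builds a 2×2 contingency table from paired observed and forecast flows and derives POD, FAR, POFD, and CSI. The function name, the flood threshold, and the flow values are illustrative assumptions, not material from the module notebooks. FAR is taken here as the false alarm ratio (false alarms divided by all forecast events).

```python
import numpy as np

def contingency_scores(obs, fcst, threshold):
    """Return the 2x2 table counts and categorical scores for one flow threshold."""
    obs_event = np.asarray(obs, float) >= threshold
    fc_event = np.asarray(fcst, float) >= threshold

    hits = int(np.sum(fc_event & obs_event))
    false_alarms = int(np.sum(fc_event & ~obs_event))
    misses = int(np.sum(~fc_event & obs_event))
    correct_negatives = int(np.sum(~fc_event & ~obs_event))

    def ratio(num, den):
        # Return NaN instead of raising when a score is undefined.
        return num / den if den > 0 else float("nan")

    return {
        "hits": hits,
        "false_alarms": false_alarms,
        "misses": misses,
        "correct_negatives": correct_negatives,
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "POFD": ratio(false_alarms, false_alarms + correct_negatives),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }

# Illustrative flows and a hypothetical flood threshold of 50.
obs = [10, 60, 80, 20, 55, 30, 5, 70]
fcst = [15, 65, 40, 55, 60, 20, 10, 75]
scores = contingency_scores(obs, fcst, threshold=50)
print(scores)
```

Repeating the calculation over a range of thresholds, or over probability thresholds applied to an ensemble, gives the (POFD, POD) pairs that trace out a ROC curve.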

Prerequisites

Before starting this module, learners should have:

  • Basic familiarity with hydrologic concepts (precipitation, runoff, streamflow).
  • Working knowledge of Python (numpy, pandas, matplotlib), enough to run a Jupyter notebook end-to-end. No experience with HyMOD or with ensemble systems is required.
  • Comfort reading a hydrograph.

Learning Objectives

At the end of this module, learners will be able to:

  1. Choose, compute, and interpret the right deterministic metric (NSE, KGE, PBIAS, RMSE, peak error) for a given forecasting use case.
  2. Read a hydrograph and a flow-duration curve to diagnose what kind of error a model is making.
  3. Build a 2×2 contingency table and compute POD, FAR, POFD, and CSI for a flood-warning forecast.
  4. Read a ROC curve, a reliability diagram, and a rank histogram to judge whether an ensemble is sharp, reliable, and well-dispersed.
  5. Compute and interpret the Brier score, Brier skill score, and CRPS, and recommend whether an ensemble forecast is fit for a stated operational decision.
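For the probabilistic scores in the last objective, the sketch below shows one common way to compute the Brier score, a Brier skill score against sample climatology, and an ensemble CRPS for a single forecast. The function names and numbers are hypothetical, the choice of sample climatology as the reference forecast is an assumption, and the CRPS uses the standard empirical-ensemble estimator (mean absolute error of the members minus half the mean absolute difference between members).

```python
import numpy as np

def brier_score(prob, outcome):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, float)
    return float(np.mean((prob - outcome) ** 2))

def brier_skill_score(prob, outcome):
    """Skill relative to always forecasting the sample event frequency (assumed reference)."""
    outcome = np.asarray(outcome, float)
    reference = np.full_like(outcome, outcome.mean())
    return float(1.0 - brier_score(prob, outcome) / brier_score(reference, outcome))

def crps_ensemble(members, obs):
    """Empirical CRPS for one observation and one set of ensemble members."""
    members = np.asarray(members, float)
    error_term = np.mean(np.abs(members - obs))
    spread_term = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return float(error_term - spread_term)

# Illustrative event probabilities and outcomes (1 = threshold exceeded).
prob = [0.9, 0.1, 0.8, 0.2]
outcome = [1, 0, 1, 0]
print(brier_score(prob, outcome), brier_skill_score(prob, outcome))

# Illustrative three-member ensemble verified against an observed flow of 2.0.
print(crps_ensemble([1.0, 2.0, 3.0], 2.0))
```

For a one-member ensemble the CRPS reduces to the absolute error, which is why it is often read as the probabilistic counterpart of MAE. In practice the per-forecast CRPS is averaged over many forecast dates.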

This is accomplished through a series of short readings on fundamental concepts, accompanied by two hands-on Python learning activities and a capstone authentic task.

Suggested Implementation

This module is divided into four sections, each made up of short units. Each section is self-contained and can be completed on its own. Total estimated effort is 1 to 1.5 hours, self-paced.

Section                                 Estimated time
Section 1, Introduction                 5 min
Section 2, Deterministic evaluation     25 to 30 min (includes Learning Activity 1)
Section 3, Ensemble evaluation          30 to 35 min (includes Learning Activity 2)
Section 4, Authentic task               10 to 15 min

Course Authors

Mohamed Abdelkader

IIHR Hydroscience and Engineering, University of Iowa

mohamed-abdelkader@uiowa.edu

Humberto Vergara

IIHR Hydroscience and Engineering, University of Iowa

Target Audience

Early-career hydrologists and operational forecasters at National Meteorological and Hydrological Services (NMHSs) who want to confidently evaluate streamflow models for operational decisions.

Tools Needed

A computer with internet access, a modern browser, and a Google account to run the accompanying Jupyter notebooks in Google Colab (no local install required). Basic working knowledge of Python is helpful.

Expected Effort

About 1 hour 15 minutes in total (range: 1 to 1.5 hours). Self-paced.

Course Sharing and Adaptation

This course can be exported by clicking the "Export Link" at the top right of this page. Exporting requires a HydroLearn instructor Studio account. To get one, first sign up for a hydrolearn.org account, then register as an instructor by clicking 'studio.hydrolearn' and requesting course creation permissions.

Recommended Citation

Abdelkader, M., & Vergara, H. (2026). How Good Is Your Forecast? Evaluating Deterministic and Ensemble Streamflow Models. WMO Capacity-Building Activity. University of Iowa.

Acknowledgments

Module developed for the World Meteorological Organization (WMO) capacity-building activity, built around the HyMOD ensemble-verification notebook series by Humberto Vergara and Mohamed Abdelkader (University of Iowa). References to the supporting literature are listed in the appendix of each section.

Course Number: WMO_01

Estimated Effort: 3:00