Water Bodies Module

The water bodies module provides functions for downloading, rasterizing, and routing flow through lakes and reservoirs.

Overview

This module integrates water body data from two sources:

  • NHD (National Hydrography Dataset) - US-only, high detail

  • HydroLAKES - Global coverage, includes reservoir data

Water bodies are treated differently based on their context:

  • Lakes inside endorheic basins act as sinks (terminal drainage)

  • Lakes outside basins act as conduits (water flows through to outlet)

See Flow Routing and Accumulation Guide for detailed information on water body routing strategies.

Data Download

src.terrain.water_bodies.download_water_bodies(bbox, output_dir, data_source='hydrolakes', min_area_km2=0.1, force_download=False)[source]

Download lake polygons and outlets for bounding box.

Parameters:
  • bbox (tuple) – Bounding box (south, west, north, east) in degrees

  • output_dir (str) – Directory to save downloaded data

  • data_source (str) – Data source: “nhd” (USA only) or “hydrolakes” (global)

  • min_area_km2 (float) – Minimum lake area to include (km²)

  • force_download (bool) – If True, re-download even if cached

Returns:

Path to downloaded GeoJSON file

Return type:

Path

Main download function that dispatches to NHD or HydroLAKES.

Example:

from src.terrain.water_bodies import download_water_bodies

geojson_path = download_water_bodies(
    bbox=(32.5, -117.5, 33.5, -116.5),  # south, west, north, east
    output_dir="data/water_bodies",
    data_source="hydrolakes",
    min_area_km2=0.01,
)
src.terrain.water_bodies.download_nhd_water_bodies(bbox, output_dir)[source]

Download NHDWaterbody and NHDFlowline from USGS REST API.

Parameters:
  • bbox (tuple) – Bounding box (south, west, north, east)

  • output_dir (str) – Output directory

Returns:

  • waterbodies (dict) – GeoJSON of lake polygons

  • flowlines (dict) – GeoJSON of stream lines

Return type:

Tuple[Dict, Dict]

Download from National Hydrography Dataset (US only).

src.terrain.water_bodies.download_hydrolakes(bbox, output_dir)[source]

Get HydroLAKES data for bounding box.

Note: HydroLAKES is distributed as a shapefile that must be downloaded once from hydrosheds.org. This function filters the local data to bbox.

For initial implementation, returns empty FeatureCollection with instructions for manual download.

Parameters:
  • bbox (tuple) – Bounding box (south, west, north, east)

  • output_dir (str) – Output directory

Returns:

GeoJSON FeatureCollection with lake polygons and pour_point properties

Return type:

dict

Download from HydroLAKES global database.

Rasterization

src.terrain.water_bodies.rasterize_lakes_to_mask(lakes_geojson, bbox, resolution=0.001)[source]

Rasterize lake polygons to a labeled mask with geographic transform.

Uses rasterio.features.rasterize for robust handling of complex polygons, interior rings (islands), and MultiPolygon geometries.

Parameters:
  • lakes_geojson (dict) – GeoJSON FeatureCollection with lake polygons

  • bbox (tuple) – Bounding box (south, west, north, east) in degrees

  • resolution (float) – Cell size in degrees (default: 0.001 ≈ 100m)

Returns:

  • mask (np.ndarray (uint16)) – Labeled mask where 0 = no lake, 1+ = lake ID

  • transform (Affine) – Geographic transform (pixel to coordinate)

Return type:

Tuple[ndarray, Affine]

Convert GeoJSON lake polygons to a labeled raster mask.

Example:

import json
from src.terrain.water_bodies import rasterize_lakes_to_mask

with open("lakes.geojson") as f:
    lakes_geojson = json.load(f)

lake_mask, transform = rasterize_lakes_to_mask(
    lakes_geojson,
    bbox=(32.5, -117.5, 33.5, -116.5),
    resolution=30.0,  # meters
)
# lake_mask: 2D array where each lake has unique ID (1, 2, 3, ...)

Outlet and Inlet Detection

src.terrain.water_bodies.identify_outlet_cells(lake_mask, outlets, transform)[source]

Convert outlet points to raster cell locations.

Parameters:
  • lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, N = lake ID)

  • outlets (dict) – Mapping of lake_id -> (lon, lat) outlet coordinates

  • transform (Affine) – Geographic transform

Returns:

outlet_mask – Boolean mask where True = outlet cell

Return type:

np.ndarray (bool)

Identify outlet pixels from geographic outlet coordinates.

src.terrain.water_bodies.identify_lake_outlets_from_nhd(waterbodies, flowlines)[source]

Identify outlet points by finding where flowlines exit lake polygons.

Parameters:
  • waterbodies (dict) – GeoJSON of lake polygons

  • flowlines (dict) – GeoJSON of stream lines

Returns:

Mapping of lake_id -> (lon, lat) outlet coordinates

Return type:

dict

Extract outlet coordinates from NHD flowline data.

src.terrain.water_bodies.identify_lake_inlets(lake_mask, dem, outlet_mask=None)[source]

Identify inlet cells for each lake (where water enters from surrounding terrain).

For each unique lake, finds cells on the lake boundary that are adjacent to lower-elevation non-lake cells. These represent where water naturally flows into the lake from surrounding terrain.

Parameters:
  • lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, 1+ = lake ID)

  • dem (np.ndarray) – Conditioned digital elevation model

  • outlet_mask (np.ndarray, optional) – Boolean outlet mask to exclude outlet cells from inlet identification

Returns:

Mapping of lake_id -> list of (row, col) inlet cell coordinates

Return type:

dict

Find cells where streams flow INTO lakes.

Example:

from src.terrain.water_bodies import identify_lake_inlets

inlets_dict = identify_lake_inlets(
    lake_mask,
    dem_conditioned,
    outlet_mask=lake_outlets,  # exclude outlets from inlets
)
# Returns {lake_id: [(row, col), ...], ...}

Flow Routing

src.terrain.water_bodies.create_lake_flow_routing(lake_mask, outlet_mask, dem)[source]

Create flow direction grid that routes all lake cells toward outlet.

Uses BFS from outlet to assign converging flow directions. For endorheic lakes (no outlet), all cells get flow_dir=0.

Parameters:
  • lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, N = lake ID)

  • outlet_mask (np.ndarray (bool)) – Boolean mask of outlet cells

  • dem (np.ndarray) – Digital elevation model (used for tie-breaking if needed)

Returns:

flow_dir – D8 flow direction for lake cells (0 elsewhere)

Return type:

np.ndarray (uint8)

Create D8 flow directions for lake interiors using BFS from outlets.

Example:

from src.terrain.water_bodies import create_lake_flow_routing

lake_flow = create_lake_flow_routing(
    lake_mask,
    lake_outlets,
    dem_conditioned,
)
# Returns D8 direction grid for lake cells (0 at outlets)

# Merge with terrain flow
flow_dir = np.where(lake_mask > 0, lake_flow, terrain_flow_dir)

Usage with Flow Accumulation

Complete workflow integrating water bodies with flow routing:

from src.terrain.flow_accumulation import (
    condition_dem_spec,
    compute_flow_direction,
    compute_drainage_area,
    detect_ocean_mask,
    detect_endorheic_basins,
)
from src.terrain.water_bodies import (
    download_water_bodies,
    rasterize_lakes_to_mask,
    identify_outlet_cells,
    create_lake_flow_routing,
)
import json

# Load DEM
with rasterio.open("dem.tif") as src:
    dem = src.read(1)
    transform = src.transform

# Detect ocean and basins
ocean_mask = detect_ocean_mask(dem, threshold=0.0)
basin_mask, basins = detect_endorheic_basins(dem, min_size=5000)

# Download water bodies
geojson_path = download_water_bodies(
    bbox=bbox,
    output_dir="water_bodies",
    data_source="hydrolakes",
)

with open(geojson_path) as f:
    lakes_geojson = json.load(f)

# Rasterize lakes
lake_mask, lake_transform = rasterize_lakes_to_mask(lakes_geojson, bbox)

# Extract outlets
outlets_dict = {
    idx: feature["properties"]["outlet"]
    for idx, feature in enumerate(lakes_geojson["features"], 1)
    if "outlet" in feature["properties"]
}
lake_outlets = identify_outlet_cells(lake_mask, outlets_dict, lake_transform)

# Condition DEM (mask lakes inside basins, keep others as conduits)
lakes_in_basins = lake_mask & basin_mask
conditioning_mask = ocean_mask | basin_mask | lakes_in_basins

dem_conditioned, outlets = condition_dem_spec(dem, nodata_mask=conditioning_mask)

# Compute flow direction (terrain only)
flow_dir_terrain = compute_flow_direction(dem_conditioned, mask=ocean_mask)

# Apply lake routing (for lakes outside basins)
lakes_outside = lake_mask & ~basin_mask
lake_flow = create_lake_flow_routing(lakes_outside, lake_outlets, dem_conditioned)
flow_dir = np.where(lakes_outside > 0, lake_flow, flow_dir_terrain)

# Compute drainage area
drainage_area = compute_drainage_area(flow_dir)

See Also

  • Flow Routing and Accumulation Guide - Comprehensive guide including water body handling

  • src.terrain.flow_accumulation - Core flow routing functions

  • src.terrain.water - Water detection from slope analysis