Water Bodies Module
The water bodies module provides functions for downloading, rasterizing, and routing flow through lakes and reservoirs.
Overview
This module integrates water body data from two sources:
NHD (National Hydrography Dataset) - US-only, high detail
HydroLAKES - Global coverage, includes reservoir data
Water bodies are treated differently based on their context:
Lakes inside endorheic basins act as sinks (terminal drainage)
Lakes outside basins act as conduits (water flows through to outlet)
See Flow Routing and Accumulation Guide for detailed information on water body routing strategies.
Data Download
- src.terrain.water_bodies.download_water_bodies(bbox, output_dir, data_source='hydrolakes', min_area_km2=0.1, force_download=False)[source]
Download lake polygons and outlets for bounding box.
- Parameters:
bbox (tuple) – Bounding box (south, west, north, east) in degrees
output_dir (str) – Directory to save downloaded data
data_source (str) – Data source: “nhd” (USA only) or “hydrolakes” (global)
min_area_km2 (float) – Minimum lake area to include (km²)
force_download (bool) – If True, re-download even if cached
- Returns:
Path to downloaded GeoJSON file
- Return type:
Path
Main download function that dispatches to NHD or HydroLAKES.
Example:
from src.terrain.water_bodies import download_water_bodies geojson_path = download_water_bodies( bbox=(32.5, -117.5, 33.5, -116.5), # south, west, north, east output_dir="data/water_bodies", data_source="hydrolakes", min_area_km2=0.01, )
- src.terrain.water_bodies.download_nhd_water_bodies(bbox, output_dir)[source]
Download NHDWaterbody and NHDFlowline from USGS REST API.
- Parameters:
- Returns:
waterbodies (dict) – GeoJSON of lake polygons
flowlines (dict) – GeoJSON of stream lines
- Return type:
Download from National Hydrography Dataset (US only).
- src.terrain.water_bodies.download_hydrolakes(bbox, output_dir)[source]
Get HydroLAKES data for bounding box.
Note: HydroLAKES is distributed as a shapefile that must be downloaded once from hydrosheds.org. This function filters the local data to bbox.
For initial implementation, returns empty FeatureCollection with instructions for manual download.
- Parameters:
- Returns:
GeoJSON FeatureCollection with lake polygons and pour_point properties
- Return type:
Download from HydroLAKES global database.
Rasterization
- src.terrain.water_bodies.rasterize_lakes_to_mask(lakes_geojson, bbox, resolution=0.001)[source]
Rasterize lake polygons to a labeled mask with geographic transform.
Uses rasterio.features.rasterize for robust handling of complex polygons, interior rings (islands), and MultiPolygon geometries.
- Parameters:
- Returns:
mask (np.ndarray (uint16)) – Labeled mask where 0 = no lake, 1+ = lake ID
transform (Affine) – Geographic transform (pixel to coordinate)
- Return type:
Convert GeoJSON lake polygons to a labeled raster mask.
Example:
import json from src.terrain.water_bodies import rasterize_lakes_to_mask with open("lakes.geojson") as f: lakes_geojson = json.load(f) lake_mask, transform = rasterize_lakes_to_mask( lakes_geojson, bbox=(32.5, -117.5, 33.5, -116.5), resolution=30.0, # meters ) # lake_mask: 2D array where each lake has unique ID (1, 2, 3, ...)
Outlet and Inlet Detection
- src.terrain.water_bodies.identify_outlet_cells(lake_mask, outlets, transform)[source]
Convert outlet points to raster cell locations.
- Parameters:
lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, N = lake ID)
outlets (dict) – Mapping of lake_id -> (lon, lat) outlet coordinates
transform (Affine) – Geographic transform
- Returns:
outlet_mask – Boolean mask where True = outlet cell
- Return type:
np.ndarray (bool)
Identify outlet pixels from geographic outlet coordinates.
- src.terrain.water_bodies.identify_lake_outlets_from_nhd(waterbodies, flowlines)[source]
Identify outlet points by finding where flowlines exit lake polygons.
- Parameters:
- Returns:
Mapping of lake_id -> (lon, lat) outlet coordinates
- Return type:
Extract outlet coordinates from NHD flowline data.
- src.terrain.water_bodies.identify_lake_inlets(lake_mask, dem, outlet_mask=None)[source]
Identify inlet cells for each lake (where water enters from surrounding terrain).
For each unique lake, finds cells on the lake boundary that are adjacent to lower-elevation non-lake cells. These represent where water naturally flows into the lake from surrounding terrain.
- Parameters:
lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, 1+ = lake ID)
dem (np.ndarray) – Conditioned digital elevation model
outlet_mask (np.ndarray, optional) – Boolean outlet mask to exclude outlet cells from inlet identification
- Returns:
Mapping of lake_id -> list of (row, col) inlet cell coordinates
- Return type:
Find cells where streams flow INTO lakes.
Example:
from src.terrain.water_bodies import identify_lake_inlets inlets_dict = identify_lake_inlets( lake_mask, dem_conditioned, outlet_mask=lake_outlets, # exclude outlets from inlets ) # Returns {lake_id: [(row, col), ...], ...}
Flow Routing
- src.terrain.water_bodies.create_lake_flow_routing(lake_mask, outlet_mask, dem)[source]
Create flow direction grid that routes all lake cells toward outlet.
Uses BFS from outlet to assign converging flow directions. For endorheic lakes (no outlet), all cells get flow_dir=0.
- Parameters:
lake_mask (np.ndarray) – Labeled lake mask (0 = no lake, N = lake ID)
outlet_mask (np.ndarray (bool)) – Boolean mask of outlet cells
dem (np.ndarray) – Digital elevation model (used for tie-breaking if needed)
- Returns:
flow_dir – D8 flow direction for lake cells (0 elsewhere)
- Return type:
np.ndarray (uint8)
Create D8 flow directions for lake interiors using BFS from outlets.
Example:
from src.terrain.water_bodies import create_lake_flow_routing lake_flow = create_lake_flow_routing( lake_mask, lake_outlets, dem_conditioned, ) # Returns D8 direction grid for lake cells (0 at outlets) # Merge with terrain flow flow_dir = np.where(lake_mask > 0, lake_flow, terrain_flow_dir)
Usage with Flow Accumulation
Complete workflow integrating water bodies with flow routing:
from src.terrain.flow_accumulation import (
condition_dem_spec,
compute_flow_direction,
compute_drainage_area,
detect_ocean_mask,
detect_endorheic_basins,
)
from src.terrain.water_bodies import (
download_water_bodies,
rasterize_lakes_to_mask,
identify_outlet_cells,
create_lake_flow_routing,
)
import json
# Load DEM
with rasterio.open("dem.tif") as src:
dem = src.read(1)
transform = src.transform
# Detect ocean and basins
ocean_mask = detect_ocean_mask(dem, threshold=0.0)
basin_mask, basins = detect_endorheic_basins(dem, min_size=5000)
# Download water bodies
geojson_path = download_water_bodies(
bbox=bbox,
output_dir="water_bodies",
data_source="hydrolakes",
)
with open(geojson_path) as f:
lakes_geojson = json.load(f)
# Rasterize lakes
lake_mask, lake_transform = rasterize_lakes_to_mask(lakes_geojson, bbox)
# Extract outlets
outlets_dict = {
idx: feature["properties"]["outlet"]
for idx, feature in enumerate(lakes_geojson["features"], 1)
if "outlet" in feature["properties"]
}
lake_outlets = identify_outlet_cells(lake_mask, outlets_dict, lake_transform)
# Condition DEM (mask lakes inside basins, keep others as conduits)
lakes_in_basins = lake_mask & basin_mask
conditioning_mask = ocean_mask | basin_mask | lakes_in_basins
dem_conditioned, outlets = condition_dem_spec(dem, nodata_mask=conditioning_mask)
# Compute flow direction (terrain only)
flow_dir_terrain = compute_flow_direction(dem_conditioned, mask=ocean_mask)
# Apply lake routing (for lakes outside basins)
lakes_outside = lake_mask & ~basin_mask
lake_flow = create_lake_flow_routing(lakes_outside, lake_outlets, dem_conditioned)
flow_dir = np.where(lakes_outside > 0, lake_flow, flow_dir_terrain)
# Compute drainage area
drainage_area = compute_drainage_area(flow_dir)
See Also
Flow Routing and Accumulation Guide - Comprehensive guide including water body handling
src.terrain.flow_accumulation- Core flow routing functionssrc.terrain.water- Water detection from slope analysis