Flow Accumulation Module

The flow accumulation module provides hydrological flow routing algorithms for DEM data, implementing D8 flow direction and drainage area computation.

Overview

This module implements a complete hydrological flow routing pipeline based on peer-reviewed literature:

  1. Outlet Identification - Classify cells that serve as drainage termini

  2. DEM Conditioning - Breach sinks and fill residual depressions

  3. Flow Direction - Assign D8 flow directions (ESRI encoding)

  4. Flow Accumulation - Compute upstream contributing area

See Flow Routing and Accumulation Guide for detailed algorithm explanations and debugging strategies.

Main Functions

src.terrain.flow_accumulation.flow_accumulation(dem_path, precipitation_path, output_dir=None, flow_algorithm='d8', fill_method='breach', min_basin_size=10000, max_fill_depth=None, coastal_elev_threshold=10.0, edge_mode='all', max_breach_depth=50.0, max_breach_length=100, parallel_method='checkerboard', epsilon=None, masked_basin_outlets=None, detect_basins=False, min_basin_depth=1.0, cell_size=None, max_cells=None, target_vertices=None, mask_ocean=True, ocean_elevation_threshold=0.0, backend='spec', lake_mask=None, lake_outlets=None, upscale_precip=False, upscale_factor=4, upscale_method='auto', cache=False, cache_dir=None)[source]

Compute flow accumulation with precipitation weighting.

Automatically downsamples DEM if it exceeds max_cells to improve performance.

Parameters:
  • dem_path (str) – Path to DEM raster file (GeoTIFF)

  • precipitation_path (str) – Path to annual precipitation raster (mm/year)

  • output_dir (str, optional) – Directory for output files (default: same as DEM)

  • flow_algorithm (str, default 'd8') – Flow routing method (‘d8’ only for now)

  • fill_method (str, default 'breach') – Depression handling (‘breach’ or ‘fill’)

  • cell_size (float, optional) – Override DEM resolution in meters

  • max_cells (int, optional) – Maximum number of cells for flow computation. DEM will be downsampled if it exceeds this limit. Mutually exclusive with target_vertices.

  • target_vertices (int, optional) – Target number of vertices for final rendering. Automatically sets max_cells = target_vertices * 3 for flow accuracy. Mutually exclusive with max_cells.

  • mask_ocean (bool, default True) – If True, detect and exclude ocean/water bodies from flow computation. Ocean cells (elevation <= ocean_elevation_threshold, connected to border) will not be filled and will have flow_dir = 0.

  • ocean_elevation_threshold (float, default 0.0) – Elevation threshold (meters) for ocean detection. Cells at or below this elevation connected to the border are considered ocean.

  • min_basin_size (int, optional, default 10000) – Minimum basin size (cells) to preserve. Endorheic basins >= this size will not be filled (preserves large natural basins like Salton Sea). Set to None to disable basin preservation.

  • max_fill_depth (float, optional) – Maximum fill depth (meters). Depressions requiring fill > this depth will be preserved. Set to None to allow unlimited fill depth.

  • backend ({"legacy", "spec", "pysheds"}, default "spec") – Which implementation to use for core hydrology algorithms: - “spec”: Spec-compliant 4-stage pipeline (outlet ID + breaching + fill) - RECOMMENDED - “legacy”: Morphological reconstruction + workarounds (deprecated) - “pysheds”: PySheds library integration (experimental, may produce cycles)

  • backend="pysheds") (Legacy parameters (used when backend="legacy" or) –

    fill_methodstr, default “breach”

    Depression handling (“breach” or “fill”)

    min_basin_sizeint, default 10000

    Minimum basin size (cells) to preserve

    max_fill_depthfloat, optional

    Maximum fill depth (meters) for preserving deep basins

  • backend="spec") (Spec-compliant parameters (used when) –

    coastal_elev_thresholdfloat, default 10.0

    Maximum elevation for coastal outlets (meters above sea level)

    edge_mode{“all”, “local_minima”, “outward_slope”, “none”}, default “all”

    Boundary outlet strategy (see flow-spec.md for details)

    max_breach_depthfloat, default 50.0

    Maximum elevation drop at any cell during breaching (meters)

    max_breach_lengthint, default 100

    Maximum breach path length (cells)

    epsilonfloat, optional

    Minimum gradient in filled areas (meters/cell). If None (default), automatically calculated as 1e-5 * cell_resolution per flow-spec.md guidelines (e.g., 1e-4 for 10m DEM). Pass explicit value to override.

    masked_basin_outletsnp.ndarray (bool), optional

    User-supplied outlet locations for known lakes/basins

  • lake_mask (np.ndarray, optional) – Labeled mask of known water bodies (0 = no lake, N = lake ID). Lake interior cells will route flow toward their outlets.

  • lake_outlets (np.ndarray (bool), optional) – Boolean mask of lake outlet cells. Required if lake_mask is provided. Outlets receive accumulated flow from all lake cells.

  • detect_basins (bool, default False) – If True, automatically detect and preserve endorheic basins (closed drainage basins). Basins are masked during DEM conditioning to preserve their original topography. Works with spec backend only.

  • min_basin_depth (float, default 1.0) – Minimum basin depth (meters) to be considered endorheic. Only used when detect_basins=True. Basins shallower than this threshold are not preserved.

  • upscale_precip (bool, default False) – If True, upscale precipitation data to match DEM resolution using ESRGAN/bilateral upscaling before computing upstream rainfall. This preserves fine-scale precipitation patterns and reduces coastal artifacts. Upscaling happens BEFORE ocean masking.

  • upscale_factor (int, default 4) – Target upscaling factor for precipitation (2, 4, or 8). Only used if upscale_precip=True.

  • upscale_method (str, default "auto") – Upscaling method: “auto” (try ESRGAN, fall back to bilateral), “esrgan” (Real-ESRGAN neural network), “bilateral” (bilateral filter), or “bicubic” (simple interpolation). Only used if upscale_precip=True.

  • cache (bool, default False) – If True, cache computation results and load from cache if valid. Cache is invalidated if DEM file is modified or parameters change.

  • cache_dir (str, optional) – Directory for cached files. Defaults to DEM’s directory if not specified.

  • coastal_elev_threshold (float)

  • edge_mode (Literal['all', 'local_minima', 'outward_slope', 'none'])

  • max_breach_depth (float)

  • max_breach_length (int)

  • parallel_method (str)

  • epsilon (float | None)

  • masked_basin_outlets (ndarray | None)

Returns:

Dictionary with keys: - flow_direction: np.ndarray (D8 encoded) - drainage_area: np.ndarray (cells draining to each pixel) - upstream_rainfall: np.ndarray (mm·m² total upstream) - conditioned_dem: np.ndarray (pit-filled DEM) - metadata: dict (processing info, including downsampling details) - files: dict (output file paths)

Return type:

dict

Raises:
  • FileNotFoundError – If DEM or precipitation file doesn’t exist

  • ValueError – If spatial alignment fails or both max_cells and target_vertices specified

Example:

from src.terrain.flow_accumulation import flow_accumulation

# Run complete flow pipeline
result = flow_accumulation(
    dem_path="data/my_dem.tif",
    precip_path="data/precipitation.tif",  # optional
    backend="spec",
    max_breach_depth=25.0,
    max_breach_length=80,
)

flow_dir = result['flow_dir']
drainage_area = result['drainage_area']
upstream_rainfall = result['upstream_rainfall']

Flow Direction

src.terrain.flow_accumulation.compute_flow_direction(dem, mask=None)[source]

Compute D8 flow direction from DEM (Stage 3 of flow-spec.md).

Assigns each cell the direction of steepest descent, using ESRI’s power-of-2 D8 encoding for compatibility with GIS workflows.

Parameters:
  • dem (np.ndarray) – Digital elevation model (2D array, already conditioned by Stages 1-2)

  • mask (np.ndarray (bool), optional) – Boolean mask indicating cells to exclude from flow computation. Masked cells (ocean, lakes, etc.) will have flow_dir = 0 (no flow).

Returns:

Flow direction encoded as D8 using ESRI power-of-2 convention: - 0 = outlet or masked cell (no downstream flow) - 1,2,4,8,16,32,64,128 = ESRI D8 codes (see module docstring)

Return type:

np.ndarray (uint8)

Notes

This implements Stage 3 of flow-spec.md (lines 347-390).

D8 Direction Encoding (ESRI ArcGIS):

8 4 2

16 x 1 32 64 128

Direction codes represent powers of 2 for bitwise compatibility: - 1=East, 2=NE, 4=North, 8=NW, 16=West, 32=SW, 64=South, 128=SE

Distances: - Orthogonal (1,4,16,64): distance = 1 - Diagonal (2,8,32,128): distance = sqrt(2)

Flow Direction Selection: Each cell flows toward the neighbor with maximum slope (elevation drop per unit distance). If no neighbor is lower, the cell is an outlet (flow_dir = 0). This should not occur after proper DEM conditioning (Stage 2) unless the cell is an outlet or masked.

See also

identify_outlets

Stage 1 (outlet identification)

breach_depressions_constrained

Stage 2a (depression breaching)

priority_flood_fill_epsilon

Stage 2b (depression filling)

compute_drainage_area

Stage 4 (flow accumulation)

Example:

import numpy as np
from src.terrain.flow_accumulation import compute_flow_direction

# Create simple DEM
dem = np.array([
    [10, 9, 8],
    [9, 8, 7],
    [8, 7, 6],
], dtype=np.float32)

flow_dir = compute_flow_direction(dem)
# Returns D8 direction codes (ESRI convention: 1=E, 64=S, 128=SE, etc.)

Drainage Area

src.terrain.flow_accumulation.compute_drainage_area(flow_dir)[source]

Compute drainage area (unweighted flow accumulation).

Stage 4 of flow-spec.md pipeline. Uses topological sort (Kahn’s algorithm) to traverse the flow network and accumulate drainage areas.

Parameters:

flow_dir (np.ndarray) – D8 flow direction grid (ESRI codes: 0,1,2,4,8,16,32,64,128) where 0 indicates outlet/nodata (no downstream flow)

Returns:

Number of cells draining through each pixel (including itself)

Return type:

np.ndarray

Raises:

RuntimeError – If a cycle is detected in the flow network, indicating a bug in DEM conditioning. Cycles should never occur after proper Stage 2 (DEM conditioning).

Notes

Implementation matches flow-spec.md Stage 4 (lines 392-438). Uses topological sort with cycle detection to ensure each cell’s contributions are computed before it contributes to its receiver.

If cycle detection fires, check that: - DEM was properly conditioned (Stage 2a: breaching, 2b: filling) - All outlets are properly identified (Stage 1) - No invalid flow directions exist (Stage 3)

Example:

from src.terrain.flow_accumulation import (
    compute_flow_direction,
    compute_drainage_area,
)

flow_dir = compute_flow_direction(dem)
drainage_area = compute_drainage_area(flow_dir)
# Each cell contains count of upstream contributing cells

Upstream Rainfall

src.terrain.flow_accumulation.compute_upstream_rainfall(flow_dir, precipitation)[source]

Compute precipitation-weighted flow accumulation.

Parameters:
  • flow_dir (np.ndarray) – D8 flow direction grid

  • precipitation (np.ndarray) – Annual precipitation (mm/year)

Returns:

Total upstream precipitation (mm·m²) at each pixel Represents cumulative rainfall from entire upstream area

Return type:

np.ndarray

Example:

from src.terrain.flow_accumulation import (
    compute_flow_direction,
    compute_upstream_rainfall,
)

flow_dir = compute_flow_direction(dem)
precipitation = np.ones_like(dem) * 500  # 500mm everywhere
upstream_rainfall = compute_upstream_rainfall(flow_dir, precipitation)
# Each cell contains sum of precipitation from all upstream cells

DEM Conditioning

src.terrain.flow_accumulation.condition_dem_spec(dem, nodata_mask=None, coastal_elev_threshold=10.0, edge_mode='all', max_breach_depth=50.0, max_breach_length=100, epsilon=None, masked_basin_outlets=None, parallel_method='checkerboard')[source]

Condition DEM using spec-compliant 4-stage pipeline.

Orchestrates the complete flow-spec.md pipeline: 1. Stage 1: Identify outlets (coastal, edge, masked basins) 2. Stage 2a: Constrained breaching (Dijkstra least-cost) 3. Stage 2b: Priority-flood fill residuals (with epsilon) 4. (External) D8 flow direction → see compute_flow_direction() 5. (External) Flow accumulation → see compute_drainage_area()

Parameters:
  • dem (np.ndarray) – Input digital elevation model

  • nodata_mask (np.ndarray (bool), optional) – Mask of ocean/off-grid cells (True = NoData)

  • coastal_elev_threshold (float, default 10.0) – Maximum elevation for coastal outlets (meters above sea level)

  • edge_mode ({"all", "local_minima", "outward_slope", "none"}, default "all") – Boundary outlet strategy

  • max_breach_depth (float, default 50.0) – Maximum elevation drop at any single cell during breaching (meters)

  • max_breach_length (int, default 100) – Maximum breach path length (cells)

  • epsilon (float, default 1e-4) – Minimum elevation increment per cell in filled areas (meters)

  • masked_basin_outlets (np.ndarray (bool), optional) – User-supplied outlet locations for known lakes/basins

  • parallel_method (str)

Returns:

  • conditioned_dem (np.ndarray) – Breached and filled DEM

  • outlets (np.ndarray (bool)) – Outlet mask (useful for diagnostics and visualization)

  • breached_dem (np.ndarray) – DEM after breaching but before filling (for fill depth calculation)

Return type:

Tuple[ndarray, ndarray, ndarray]

Notes

This implements the complete spec-compliant pipeline from flow-spec.md.

Implementation vs. Spec: The implementation uses a two-pass breach approach rather than the priority-flood breach sketch in the spec (lines 125-172). Both approaches are equivalent and spec-compliant; two-pass is chosen for clarity:

  1. Identify all sinks (cells with no downslope neighbor)

  2. For each sink, attempt least-cost Dijkstra path to an outlet/drain-point

  3. Apply monotonic gradient along successful paths

The priority-flood approach integrates breach discovery into the same priority queue as flow processing, which is more complex but equivalent.

Key Advantages Over Legacy Approach: - Explicit outlet classification prevents boundary artifacts - Selective breaching (max_depth, max_length constraints) preserves

legitimate basins and endorheic systems

  • Epsilon gradient applied during fill (not after), eliminating need for post-hoc flat resolution

  • No workarounds needed (_fix_coastal_flow, fill_small_sinks)

  • Deterministic and reproducible (no randomness or ties)

Critical Correctness Requirements: After conditioning, the DEM must satisfy: 1. Every land cell (not NoData) has a path to an outlet 2. Every land cell has at least one downslope neighbor (guaranteed by fill) 3. No cycles in flow directions (validated by compute_drainage_area) 4. Outlets are lowest points in their regions (set explicitly in Stage 1)

If cycles are detected in compute_drainage_area, check: - Outlets are correctly identified (Stage 1) - Breach parameters (max_depth, max_length) aren’t too restrictive - Epsilon isn’t too large (causing reversals on filled areas)

Parameter Tuning: - max_breach_depth: Controls minimum basin depth preserved

  • Smaller values (5-20m) preserve more detailed basins

  • Larger values (50-100m) allow breaching of deeper basins

  • Default 50m is suitable for most 30m DEMs

  • max_breach_length: Controls maximum breach path extent - Smaller values (10-30 cells) for detailed hydrology - Larger values (100-300 cells) for regional studies - Default 100 cells suitable for outlet-finding on large DEMs

  • epsilon: Controls micro-gradient in filled areas - Auto-calculated as 1e-5 × cell_resolution by default - Usually no manual tuning needed - If flats look “stair-stepped”, epsilon is too large

Performance Notes: - Stage 1 (outlets): O(n) with 8-neighbor checks - Stage 2a (breach): O(n log n) per sink via Dijkstra + priority queue - Stage 2b (fill): O(n log n) via priority-flood - Overall: O(n log n) where n = number of cells - Typical performance: 100M cells in ~30s (C/Rust), ~5m (Python/numba)

Examples

>>> dem = np.array([[5, 5, 5],
...                 [5, 3, 5],
...                 [5, 5, 5]], dtype=np.float32)
>>> nodata = np.zeros((3, 3), dtype=bool)
>>> conditioned, outlets, breached = condition_dem_spec(dem, nodata)
>>> # Outlets: [[False, False, False],
>>> #           [False, True,  False],
>>> #           [False, False, False]]
>>> # Conditioned: [[5, 5, 5],
>>> #               [5, 5, 5],
>>> #               [5, 5, 5]]  (pit filled to neighbor level)
>>> # Now use with Stage 3 and 4:
>>> flow_dir = compute_flow_direction(conditioned)
>>> drainage_area = compute_drainage_area(flow_dir)

This is the recommended conditioning function, implementing the spec-compliant constrained breaching and priority-flood fill algorithm.

Example:

from src.terrain.flow_accumulation import condition_dem_spec

dem_conditioned, outlets = condition_dem_spec(
    dem,
    nodata_mask=ocean_mask,
    coastal_elev_threshold=0.0,
    edge_mode='all',
    max_breach_depth=25.0,
    max_breach_length=80,
    epsilon=1e-4,
)
src.terrain.flow_accumulation.condition_dem(dem, method='fill', ocean_mask=None, min_basin_size=None, max_fill_depth=None, min_basin_depth=0.5, fill_small_sinks=None)[source]

Condition DEM by filling pits and resolving depressions.

Uses morphological reconstruction (priority flood algorithm) to properly fill depressions and pits. This is the standard algorithm used by most hydrological analysis tools (e.g., GRASS, WhiteboxTools, ArcGIS).

Supports masking ocean areas and preserving large endorheic basins.

Parameters:
  • dem (np.ndarray) – Input DEM

  • method (str, default 'fill') – Depression handling method (‘fill’ or ‘breach’) - ‘fill’: Complete depression filling (raises elevations) - ‘breach’: Minimal filling to preserve terrain (uses epsilon)

  • ocean_mask (np.ndarray (bool), optional) – Boolean mask indicating ocean/water cells to exclude from conditioning. Masked cells maintain original elevation.

  • min_basin_size (int, optional) – Minimum basin size (cells) to preserve. Basins >= this size are not filled (preserves large endorheic basins like Salton Sea).

  • max_fill_depth (float, optional) – Maximum fill depth (meters). Depressions requiring fill > this depth are preserved (protects deep natural basins).

  • min_basin_depth (float, default 0.5) – Minimum depression depth (meters) to be considered a preservable basin. Higher values = only preserve truly deep basins, fill shallower ones. Increase this for noisy high-resolution DEMs.

  • fill_small_sinks (int, optional) – Maximum sink size (cells) to fill. After main filling, any remaining local minima (sinks) with contributing area < this size are filled. This removes small artifacts that create fragmented drainage. Typical values: 10-100 cells.

Returns:

Conditioned DEM with depressions resolved

Return type:

np.ndarray

Examples

>>> # Basic usage
>>> conditioned = condition_dem(dem)
>>> # Mask ocean
>>> ocean = detect_ocean_mask(dem, threshold=0.0)
>>> conditioned = condition_dem(dem, ocean_mask=ocean)
>>> # Preserve large basins
>>> conditioned = condition_dem(dem, min_basin_size=10000)
>>> # Fill small sinks (< 50 cells) to reduce fragmentation
>>> conditioned = condition_dem(dem, fill_small_sinks=50)

Legacy conditioning function with simpler fill-only approach.

src.terrain.flow_accumulation.breach_depressions_constrained(dem, outlets, max_breach_depth=50.0, max_breach_length=100, epsilon=0.0001, nodata_mask=None, parallel_method='checkerboard')[source]

Remove depressions via constrained breaching (Stage 2a of flow-spec.md).

Uses Lindsay (2016) constrained least-cost breaching algorithm. Implements two-pass approach: 1. Identify all sinks (cells with no downslope neighbor) 2. For each sink, attempt Dijkstra breach within constraints

Parameters:
  • dem (np.ndarray) – Input digital elevation model

  • outlets (np.ndarray (bool)) – Outlet mask from identify_outlets()

  • max_breach_depth (float, default 50.0) – Maximum elevation drop allowed at any single cell (meters)

  • max_breach_length (int, default 100) – Maximum breach path length (cells)

  • epsilon (float, default 1e-4) – Small gradient for breach paths (meters per cell)

  • nodata_mask (np.ndarray (bool), optional) – Cells to exclude from breaching

  • parallel_method ({"checkerboard", "iterative"}, default "checkerboard") –

    Parallelization strategy for breach path finding: - “checkerboard”: Two-phase parallel processing using checkerboard partitioning.

    Fast but can only terminate at outlets (misses chaining opportunities).

    • ”iterative”: Iterative refinement that runs multiple rounds, allowing each round’s breaches to become termination targets for the next round. Finds more breaches via chaining but may be slower.

Returns:

DEM with depressions breached where possible

Return type:

np.ndarray

Notes

This implements Stage 2a of flow-spec.md (lines 115-289).

Outlet Virtual Elevation: Outlets (identified in Stage 1) act as guaranteed sinks in the breach algorithm. Breach paths MUST terminate at an outlet or at a cell that can drain naturally. Outlets are marked in the resolved set before breaching starts, ensuring they are always considered valid drainage targets.

Breach Path Termination: Breach paths from sinks can terminate at: 1. An outlet cell (guaranteed sink) 2. A cell lower than the sink (natural downslope) 3. A cell that has already been resolved (has a path to an outlet)

This ensures the final flow network has no artificial endorheic basins.

Residual Sinks: Sinks that cannot be breached within constraints (max_breach_depth, max_breach_length) are left for Stage 2b (priority-flood fill) to handle. These are typically: - Large legitimate basins (lakes, endorheic systems) - Depressions too deep/long to breach efficiently

References

Lindsay, J.B. (2016). Efficient hybrid breaching-filling sink removal methods for flow path enforcement in digital elevation models. Hydrological Processes, 30, 846–857.

Spec Reference: flow-spec.md lines 42-108 (outlet identification), 115-289 (constrained breach)

Low-level breaching function for advanced use cases.

src.terrain.flow_accumulation.priority_flood_fill_epsilon(dem, outlets, epsilon=0.0001, nodata_mask=None)[source]

Fill residual depressions with epsilon gradient (Stage 2b of flow-spec.md).

Uses Barnes et al. (2014) priority-flood algorithm with epsilon gradient applied DURING fill (not after). This creates flow-directing gradients as flats form, ensuring proper drainage without post-processing.

Parameters:
  • dem (np.ndarray) – Input DEM (already breached by Stage 2a)

  • outlets (np.ndarray (bool)) – Outlet mask from identify_outlets()

  • epsilon (float, default 1e-4) – Minimum elevation increment per cell (meters per cell). Creates gradients in filled areas to ensure drainage.

  • nodata_mask (np.ndarray (bool), optional) – Cells to exclude from filling

Returns:

DEM with residual depressions filled

Return type:

np.ndarray

Notes

This implements Stage 2b of flow-spec.md (lines 291-328).

Outlet Virtual Elevation: Outlets are seeded into the priority queue with their original elevations. As the fill progresses outward from outlets, depression cells are raised with epsilon increments, creating micro-gradients that naturally point back toward the outlets. This ensures filled areas drain properly in Stage 3.

Epsilon Application (Flat Resolution Strategy): Epsilon is applied DURING fill as cells are raised. This is the most robust flat resolution approach for several reasons:

  1. Implicit Gradient Creation: As the fill progresses from outlets toward interior sinks, cells filled later get progressively higher elevations (by epsilon increments). This creates natural gradients pointing back toward outlets without explicit post-processing.

  2. Alternatives Mentioned in Literature: - Garbrecht & Martz (1997): Dual-gradient method assigns flow to

    flats based on proximity to higher terrain. More complex to implement.

    • Postprocessing: Some systems fill without gradients, then resolve flats afterward. Can be ambiguous for complex flat structures.

  3. Recommendation: The epsilon-during-fill approach (used here) is: - Simpler to understand and implement - Produces consistent drainage patterns - Guaranteed to resolve all flats into valid flow networks - No need for separate flat-resolution algorithm

Epsilon Selection: For epsilon tuning, see flow-spec.md lines 484-489: - Auto-calculated (default): epsilon = 1e-5 * cell_resolution

  • For 10m DEM: epsilon ≈ 1e-4 m/cell (0.1 mm per cell)

  • For 1m DEM: epsilon ≈ 1e-5 m/cell (0.01 mm per cell)

  • For integer DEMs: Use epsilon = 1 in native elevation units - If elevation in millimeters: epsilon = 1 mm/cell - If elevation in centimeters: epsilon = 1 cm/cell

  • Too small epsilon: Floating-point accumulation errors may create ties or reversals in flat areas. Rule of thumb: epsilon should exceed DEM measurement precision.

  • Too large epsilon: Creates obvious artificial “stair-stepping” in filled areas. Visual inspection usually reveals values > 0.1m.

Seed Cells: The priority queue is seeded with: 1. All identified outlets (from Stage 1) - guaranteed sinks 2. All cells adjacent to NoData (domain boundary) - implicit outlets

This ensures water drains both to identified outlets AND off the map edge.

Flat Area Behavior: After filling, flat areas will have cells at different elevations (differing by epsilon). During Stage 3 (flow direction), steepest descent will cause water on flats to flow toward outlet-adjacent cells, creating coherent drainage patterns. This is more realistic than assuming multiple flow directions on wide flats.

References

Barnes, R., Lehman, C., & Mulla, D. (2014). Priority-flood: An optimal depression-filling and watershed-labeling algorithm for digital elevation models. Computers & Geosciences, 62, 117–127.

Garbrecht, J., & Martz, L.W. (1997). The assignment of drainage direction over flat surfaces in raster digital elevation models. Journal of Hydrology, 193, 204–213.

Spec Reference: flow-spec.md lines 291-328 (Stage 2b)

flow-spec.md lines 484-494 (flat resolution discussion)

Low-level priority-flood fill with epsilon gradient.

Outlet and Basin Detection

src.terrain.flow_accumulation.identify_outlets(dem, nodata_mask, coastal_elev_threshold=10.0, edge_mode='all', masked_basin_outlets=None)[source]

Identify drainage outlet cells (Stage 1 of flow-spec.md).

Classifies cells that act as drainage termini where water leaves the system. Implements three outlet types: coastal, edge, and masked basin outlets.

Parameters:
  • dem (np.ndarray) – Digital elevation model

  • nodata_mask (np.ndarray (bool)) – Mask of ocean/off-grid cells (True = NoData)

  • coastal_elev_threshold (float, default 10.0) – Maximum elevation for coastal outlets (meters above sea level). Prevents high cliffs adjacent to ocean from being spurious outlets.

  • edge_mode ({"all", "local_minima", "outward_slope", "none"}, default "all") – Boundary outlet strategy: - “all”: All boundary cells are outlets (safest, prevents artificial basins) - “local_minima”: Only boundary cells that are local minima - “outward_slope”: Boundary cells with interior neighbors sloping toward them - “none”: No edge outlets (for islands fully surrounded by coastline)

  • masked_basin_outlets (np.ndarray (bool), optional) – User-supplied outlet locations for known lakes/basins

Returns:

Boolean mask where True = outlet cell

Return type:

np.ndarray (bool)

Notes

This implements Stage 1 of flow-spec.md (lines 42-108).

Edge mode “all” is the safest default - it ensures no artificial endorheic basins form at boundaries. The cost is some fragmentation of edge drainage networks, but this is usually preferable to missed outlets.

Examples

>>> dem = np.array([[5, 5, 5],
...                 [5, 3, 5],
...                 [5, 5, 5]], dtype=np.float32)
>>> nodata = np.array([[False, False, False],
...                    [True,  False, False],
...                    [False, False, False]])
>>> outlets = identify_outlets(dem, nodata, coastal_elev_threshold=10.0)
>>> outlets[1, 1]  # Low coastal cell adjacent to ocean
True
src.terrain.flow_accumulation.detect_ocean_mask(dem, threshold=0.0, border_only=True)[source]

Detect ocean or water bodies in DEM.

Identifies cells at or below elevation threshold that are connected to the border (assumed to be ocean/large water bodies).

Uses connected component labeling for efficient O(n) detection.

Parameters:
  • dem (np.ndarray) – Digital elevation model

  • threshold (float, default 0.0) – Elevation threshold (meters). Cells <= threshold are candidates.

  • border_only (bool, default True) – If True, only return border-connected low-elevation areas (ocean). If False, return all areas below threshold (includes inland lakes).

Returns:

Boolean mask where True = ocean/water

Return type:

np.ndarray (bool)

Examples

>>> dem = np.array([[0, 0, 5], [0, 1, 6], [5, 6, 7]])
>>> ocean = detect_ocean_mask(dem, threshold=0.0, border_only=True)
>>> ocean
array([[ True,  True, False],
       [ True, False, False],
       [False, False, False]])

Example:

from src.terrain.flow_accumulation import detect_ocean_mask

ocean_mask = detect_ocean_mask(dem, threshold=0.0, border_only=True)
src.terrain.flow_accumulation.detect_endorheic_basins(dem, min_size=10, exclude_mask=None, min_depth=0.5)[source]

Detect endorheic (closed) basins in DEM.

Identifies closed depressions (basins with no outlet) that exceed a minimum size threshold. Used to preserve large natural basins like the Salton Sea, Death Valley, etc.

Parameters:
  • dem (np.ndarray) – Digital elevation model

  • min_size (int, default 10) – Minimum basin size in cells to be considered significant

  • exclude_mask (np.ndarray (bool), optional) – Mask of areas to exclude from basin detection (e.g., ocean). This improves performance by only filling land areas.

  • min_depth (float, default 0.5) – Minimum depression depth in meters to be considered a basin. Higher values = only preserve truly deep basins, fill shallower ones.

Returns:

  • basin_mask (np.ndarray (bool)) – Boolean mask where True = part of endorheic basin

  • basin_sizes (dict) – Dictionary mapping basin_id to size in cells

Return type:

tuple[ndarray, dict]

Examples

>>> # Create closed basin surrounded by mountains
>>> dem = np.array([[50, 50, 50],
...                 [50, 10, 50],
...                 [50, 50, 50]])
>>> mask, sizes = detect_endorheic_basins(dem, min_size=1)
>>> mask[1, 1]  # Center is basin
True

Example:

from src.terrain.flow_accumulation import detect_endorheic_basins

basin_mask, basins = detect_endorheic_basins(
    dem,
    min_size=5000,      # cells
    min_depth=1.0,      # meters
    exclude_mask=ocean_mask,
)

D8 Direction Constants

The module uses ESRI’s standard D8 power-of-2 encoding:

# D8 Neighbor Geometry:
#   8  4  2
#  16  x  1
#  32 64 128
src.terrain.flow_accumulation.D8_DIRECTIONS: dict

Mapping from (row_offset, col_offset) to direction code.

src.terrain.flow_accumulation.D8_OFFSETS: dict

Reverse mapping from direction code to (row_offset, col_offset).

See Also