Generating Density Heatmaps from Parcel Centroids Using Rasterio

Generating density heatmaps from parcel centroids with rasterio involves four steps: converting vector point geometries into a raster grid aligned to a projected CRS, populating cell values with point counts, applying a spatial smoothing kernel, and exporting the result as a GeoTIFF that carries the grid's CRS and affine transform. In automated compliance and zoning analysis pipelines, this technique turns raw cadastral data into continuous intensity surfaces that reveal development pressure, infrastructure demand, and regulatory trigger zones. The implementation below pairs geopandas for centroid extraction, rasterio for raster I/O and coordinate system management, and scipy.ndimage for Gaussian smoothing.

import geopandas as gpd
import rasterio
import numpy as np
from rasterio.transform import from_origin
from scipy.ndimage import gaussian_filter
import warnings

# Silence rasterio's NotGeoreferencedWarning (emitted for datasets that lack georeferencing)
warnings.filterwarnings("ignore", category=rasterio.errors.NotGeoreferencedWarning)

def generate_parcel_density_heatmap(
    parcels_path: str,
    output_path: str,
    cell_size: float = 100.0,
    bandwidth_sigma: float = 2.5,
    target_crs: str = "EPSG:32610"
) -> str:
    """
    Converts parcel polygons to a density heatmap via centroid binning and Gaussian smoothing.

    Parameters:
        parcels_path: Path to input vector file (GeoJSON, Shapefile, GPKG, etc.)
        output_path: Destination path for the output GeoTIFF
        cell_size: Raster resolution in linear units of target_crs (meters for UTM)
        bandwidth_sigma: Standard deviation for Gaussian smoothing (in pixels)
        target_crs: Target projected CRS with linear units (the default, EPSG:32610, is UTM zone 10N in meters)
    """
    # 1. Load parcels and enforce metric CRS
    gdf = gpd.read_file(parcels_path)
    if gdf.crs is None:
        raise ValueError("Input dataset lacks a defined CRS. Assign one before processing.")
    if gdf.crs != target_crs:
        gdf = gdf.to_crs(target_crs)

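    # Drop missing or empty geometries; their centroids would be NaN and corrupt the binning
    gdf = gdf[gdf.geometry.notna() & ~gdf.geometry.is_empty]
    if gdf.empty:
        raise ValueError("Input dataset contains no usable geometries.")

    centroids = gdf.geometry.centroid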

    # 2. Calculate raster dimensions and affine transform
    x_min, y_min, x_max, y_max = gdf.total_bounds
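    # max(1, ...) guards against a zero-size raster when the extent collapses to a point or line
    width = max(1, int(np.ceil((x_max - x_min) / cell_size)))
    height = max(1, int(np.ceil((y_max - y_min) / cell_size)))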
    transform = from_origin(x_min, y_max, cell_size, cell_size)

    # 3. Initialize density array
    density_grid = np.zeros((height, width), dtype=np.float32)

    # 4. Bin centroids into raster cells
    cols = ((centroids.x - x_min) / cell_size).astype(int)
    rows = ((y_max - centroids.y) / cell_size).astype(int)

    # Filter out-of-bounds points (handles floating-point edge cases)
    valid = (rows >= 0) & (rows < height) & (cols >= 0) & (cols < width)
    np.add.at(density_grid, (rows[valid], cols[valid]), 1.0)

    # 5. Apply Gaussian smoothing for heatmap effect
    density_grid = gaussian_filter(density_grid, sigma=bandwidth_sigma)

    # 6. Export to GeoTIFF
    meta = {
        "driver": "GTiff",
        "dtype": "float32",
        "nodata": 0.0,
        "width": width,
        "height": height,
        "count": 1,
        "crs": target_crs,
        "transform": transform,
        "compress": "lzw",
        "tiled": True,
        "blockxsize": 256,
        "blockysize": 256
    }
    with rasterio.open(output_path, "w", **meta) as dst:
        dst.write(density_grid, 1)
    return output_path

Step-by-Step Breakdown

  1. CRS Enforcement & Centroid Extraction: Parcel geometries must be projected into a coordinate system with linear units (e.g., UTM) before distances or resolutions are calculated. The script validates the input CRS, reprojects if necessary, and extracts polygon centroids using geopandas. Binning one point per parcel is cheaper than rasterizing full polygon boundaries.

  2. Affine Transform & Grid Sizing: The from_origin function builds the affine transform that maps pixel coordinates to real-world coordinates, anchored at the upper-left corner (x_min, y_max). Grid dimensions are derived from the dataset’s total bounds divided by cell_size. Using np.ceil ensures the raster fully covers the extent without clipping edge parcels.

  3. Fast Binning with np.add.at: Instead of iterating through points, the script calculates row/column indices via vectorized arithmetic. np.add.at performs unbuffered accumulation, so every centroid that lands in the same cell is counted; with ordinary fancy-index assignment (density_grid[rows, cols] += 1), repeated indices are incremented only once. Out-of-bounds indices are filtered before accumulation to prevent an IndexError or silent wrap-around from negative indices.

  4. Gaussian Convolution: Raw point counts produce a blocky, pixelated surface. Applying a Gaussian kernel spreads the discrete counts into a continuous intensity surface; the values remain smoothed counts rather than probabilities unless they are normalized. The sigma parameter sets the kernel's standard deviation in pixels, so the smoothing distance on the ground is bandwidth_sigma × cell_size (250 m with the defaults).

  5. Georeferenced Export: The metadata dictionary configures a compressed, tiled GeoTIFF suited to web mapping and desktop GIS use. LZW compression reduces file size without loss, and 256×256 internal tiling matches the block layout commonly used in cloud-optimized GeoTIFF (COG) workflows, although a valid COG also needs overviews and a specific internal layout that this script does not produce.
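
The index arithmetic and the reason for np.add.at in step 3 can be seen on a toy grid. What follows is a minimal sketch with made-up values: the origin, cell size, and three centroid coordinates are illustrative assumptions, not taken from a real parcel dataset. It contrasts ordinary fancy-index assignment with np.add.at when two centroids fall into the same cell.

```python
import numpy as np

# Hypothetical grid: upper-left corner at (500000, 4200000), 100 m cells, 3 rows x 4 cols
x_min, y_max, cell_size = 500_000.0, 4_200_000.0, 100.0
height, width = 3, 4

# Three illustrative centroids; the first two fall into the same cell
xs = np.array([500_050.0, 500_080.0, 500_310.0])
ys = np.array([4_199_950.0, 4_199_920.0, 4_199_750.0])

# Same arithmetic as the main function: columns grow eastward, rows grow southward
cols = ((xs - x_min) / cell_size).astype(int)
rows = ((y_max - ys) / cell_size).astype(int)

# Buffered fancy indexing: a repeated (row, col) pair is incremented only once
naive = np.zeros((height, width), dtype=np.float32)
naive[rows, cols] += 1.0

# Unbuffered accumulation: every occurrence of a repeated pair is counted
counted = np.zeros((height, width), dtype=np.float32)
np.add.at(counted, (rows, cols), 1.0)

print("rows:", rows, "cols:", cols)
print("naive total:", naive.sum(), "np.add.at total:", counted.sum())
```

In this sketch the naive grid loses one of the three centroids, which is exactly the undercount that np.add.at avoids in dense urban blocks where many parcels share a cell.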

Critical Configuration & Tuning

| Parameter | Recommendation | Impact |
| --- | --- | --- |
| cell_size | 50–200 m (urban), 200–500 m (regional) | Halving the cell size quadruples the cell count and memory use. Match resolution to your analysis scale. |
| bandwidth_sigma | 1.5–4.0 pixels | Lower values preserve local clustering; higher values generalize regional trends. Validate against known infrastructure nodes. |
| target_crs | Projected CRS with linear units (UTM, State Plane) | Degree-based CRS distort distances and stretch the smoothing kernel unevenly on the ground. Verify linear units before execution; many State Plane zones are defined in feet, so set cell_size accordingly. |
| Memory management | Use rasterio.windows for very large grids (e.g., statewide extents, >10M parcels) | Processing in spatial windows avoids a MemoryError when the full array would otherwise be allocated at once. |
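
The resolution trade-off can be quantified before any array is allocated. The helper below is a hypothetical sketch (estimate_grid is not part of the script above, and the 40 km × 30 km extent is invented for illustration). It reports the grid shape and float32 memory for a given extent and converts bandwidth_sigma from pixels to map units.

```python
import math


def estimate_grid(x_min, y_min, x_max, y_max, cell_size, bandwidth_sigma):
    """Return grid shape, float32 memory in MiB, and smoothing sigma in map units."""
    width = math.ceil((x_max - x_min) / cell_size)
    height = math.ceil((y_max - y_min) / cell_size)
    memory_mib = width * height * 4 / 2**20  # float32 uses 4 bytes per cell
    return {
        "width": width,
        "height": height,
        "memory_mib": memory_mib,
        "sigma_map_units": bandwidth_sigma * cell_size,
    }


# Illustrative 40 km x 30 km extent in a metric CRS
coarse = estimate_grid(0, 0, 40_000, 30_000, cell_size=100.0, bandwidth_sigma=2.5)
fine = estimate_grid(0, 0, 40_000, 30_000, cell_size=50.0, bandwidth_sigma=2.5)

print(coarse)
print(fine)
```

Halving cell_size quadruples the cell count, and because bandwidth_sigma is expressed in pixels it also halves the smoothing distance on the ground, so the two parameters are best tuned together.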

The Gaussian smoothing step approximates Kernel Density Estimation (KDE) but operates on a pre-binned grid rather than raw coordinates, so each point is first snapped to its cell. This trade-off sacrifices some statistical precision for speed, since the cost of the filter depends on grid size rather than point count, making it suitable for compliance screening where relative intensity matters more than exact probability values. For formal statistical reporting, consider pairing this rasterized approach with scipy.stats.gaussian_kde on a sampled subset.
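
Because the smoothed grid holds redistributed counts rather than probabilities, it usually needs rescaling before it is compared with a density limit or a KDE result. The following is a minimal sketch on synthetic data; the random points, grid size, and 100 m cell size are assumptions chosen for illustration. It shows that gaussian_filter keeps the total count and gives two common rescalings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

cell_size = 100.0  # meters, illustrative
rng = np.random.default_rng(42)

# Synthetic counts: 500 points scattered over the interior of a 60 x 60 grid
counts = np.zeros((60, 60), dtype=np.float64)
rows = rng.integers(15, 45, size=500)
cols = rng.integers(15, 45, size=500)
np.add.at(counts, (rows, cols), 1.0)

# Same smoothing call as the main function (default boundary mode)
smoothed = gaussian_filter(counts, sigma=2.5)

# Smoothing moves counts between cells but keeps the total
total_before = counts.sum()
total_after = smoothed.sum()

# Two rescalings of the same surface
per_km2 = smoothed / (cell_size**2 / 1e6)   # parcels per square kilometer
probability = smoothed / smoothed.sum()     # cell probabilities that sum to 1

print(total_before, round(total_after, 6))
print(round(probability.sum(), 6))
```

The per_km2 form is the more natural one for regulatory thresholds expressed as units per area, while the probability form is the one comparable to a normalized KDE.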

Integration into Compliance Workflows

Density surfaces generated through this pipeline serve as foundational layers in broader spatial decision systems. When embedded within Spatial Analysis Pipelines for Density & Proximity Checks, the output raster can be thresholded to flag zones exceeding regulatory density limits, or used as a weighting layer for multi-criteria suitability models.
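
Thresholding can be sketched in a few lines of NumPy. The array below stands in for the smoothed raster band, and density_limit is a hypothetical regulatory value; in practice the band would be read from the GeoTIFF and the limit taken from the applicable ordinance.

```python
import numpy as np

# Stand-in for the smoothed raster band (parcels per cell); values are illustrative
density = np.array(
    [
        [0.2, 0.8, 1.5, 0.4],
        [0.9, 2.6, 3.1, 0.7],
        [0.3, 1.2, 2.2, 0.1],
    ],
    dtype=np.float32,
)

cell_size = 100.0     # meters, matching the raster resolution
density_limit = 2.0   # hypothetical regulatory limit, parcels per cell

# Mask of cells above the limit, stored as uint8 so it can be written as a raster band
flagged = (density > density_limit).astype(np.uint8)

flagged_cells = int(flagged.sum())
flagged_area_ha = flagged_cells * cell_size**2 / 10_000  # 1 hectare = 10,000 m^2

print(flagged)
print(flagged_cells, "cells,", flagged_area_ha, "ha")
```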

In municipal planning, these heatmaps directly feed Land Use Intersection Mapping workflows, where continuous density surfaces are intersected with zoning districts, environmental constraints, and utility corridors. The resulting overlay matrices identify parcels where development intensity conflicts with existing infrastructure capacity or conservation mandates.
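
One simple form of this intersection is a per-district summary of the density raster. The sketch below assumes the zoning districts have already been rasterized onto the same grid as integer zone ids (for example with rasterio.features.rasterize); both arrays are invented for illustration, and zone id 0 is assumed to mean unzoned.

```python
import numpy as np

# Illustrative density surface (parcels per cell)
density = np.array(
    [
        [1.0, 2.0, 4.0, 4.0],
        [1.0, 3.0, 6.0, 2.0],
        [0.0, 2.0, 5.0, 3.0],
    ]
)

# Hypothetical zone ids on the same grid; 0 = unzoned
zones = np.array(
    [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [0, 1, 2, 2],
    ]
)

n_zones = int(zones.max()) + 1

# Sum of density and number of cells per zone id
sums = np.bincount(zones.ravel(), weights=density.ravel(), minlength=n_zones)
cells = np.bincount(zones.ravel(), minlength=n_zones)

# Mean density per zone, leaving zones without cells at 0
means = np.divide(sums, cells, out=np.zeros_like(sums), where=cells > 0)

for zone_id in range(n_zones):
    print(zone_id, cells[zone_id], means[zone_id])
```

Districts whose mean (or maximum) density exceeds their permitted intensity can then be passed on to the overlay step for parcel-level review.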

For production deployments, wrap the function in a CLI or FastAPI endpoint and validate inputs against schema definitions. Validate the CRS up front to catch projection mismatches early (for example, by parsing it with rasterio.crs.CRS.from_user_input and checking is_projected), and log processing metrics (extent, cell count, execution time) for audit trails. When deploying at scale, consider migrating the smoothing step to cupy or dask for GPU/distributed acceleration, though the standard scipy.ndimage implementation is usually sufficient for municipal and county-scale datasets.
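
Execution-time logging can be added without touching the function body. The decorator below is a minimal sketch using only the standard library; log_metrics, the logger name, and fake_heatmap_job are hypothetical names introduced for illustration, and the stand-in job replaces generate_parcel_density_heatmap so the example runs without data. Extent and cell count are only known inside the function, so they would need to be logged from there.

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("heatmap_audit")  # hypothetical logger name


def log_metrics(func):
    """Log wall-clock duration and call arguments of the wrapped function."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        logger.info(
            "%s finished in %.3f s (args=%r, kwargs=%r)",
            func.__name__, elapsed, args, kwargs,
        )
        return result

    return wrapper


@log_metrics
def fake_heatmap_job(output_path, cell_size=100.0):
    # Stand-in for generate_parcel_density_heatmap so the sketch runs without data
    return output_path


result = fake_heatmap_job("density.tif", cell_size=50.0)
```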

The output is a standard GeoTIFF that can be read by GDAL-based tools and desktop GIS software. For advanced raster manipulation and coordinate transformation patterns, consult the official rasterio documentation and the GeoPandas user guide.