Comparing OSRM vs Valhalla for retail catchment analysis

This page resolves a single engineering decision: given a candidate storefront, which routing engine, OSRM or Valhalla, should produce the catchment polygon your site-selection pipeline consumes. It also shows how to query each one safely from Python.

When evaluating site viability for urban retail expansion, location intelligence teams must move beyond simple Euclidean buffers to network-constrained catchment polygons that reflect actual pedestrian, transit, and vehicular accessibility. The architectural divergence between OSRM and Valhalla directly shapes the accuracy of multi-modal drive-time and walk-time polygons, especially when wiring engine output into the multi-modal routing stage of an automated isochrone polygon workflow. Both engines consume OpenStreetMap data, but their graph construction, costing parameterization, and error propagation differ enough to demand distinct integration strategies.

Prerequisites

Before running the client below, provision the inputs and packages it depends on.

Requirement	Purpose	Notes
Python 3.10+	Runtime	Version tested here; the built-in generics in the type hints (`dict[str, Any]`) need 3.9+
`requests`	HTTP client	Connection pooling via `Session`
`shapely` >= 2.0	Geometry parsing and repair	Provides `make_valid`
`pyproj` >= 3.4	CRS assertion and reprojection	WGS84 → metric transform
A Valhalla service	Native `/isochrone` endpoint	Self-hosted Docker or managed host
An OSRM service	`/table` and `/route` endpoints	Built from a regional `.osm.pbf` extract
Regional OSM extract	Routing graph source	Same extract for both engines for parity

Both services should be built from the same OSM extract and the same snapshot date. If the catchments will be compared or unioned with one another, mismatched extracts introduce edge-set differences that look like engine behaviour but are really data drift.

Graph architecture and costing paradigms

The primary differentiator is how each engine handles multi-modal costing. OSRM builds one graph per Lua profile: the profile is applied when the extract is processed, so pedestrian, bicycle, and automotive routing live in separate pre-built graphs. This architecture delivers very low query latency via contraction hierarchies, but it requires explicit profile switching and post-processing to merge overlapping accessibility zones. For retail planners, that means generating separate drive-time and walk-time layers, then performing spatial unions downstream.

Valhalla implements a unified graph with dynamic costing_options that allow real-time weighting of transit transfers, walking penalties, and road classifications within a single request. This design supports native computation of hybrid catchments without external graph stitching. When the goal is a transit-aware urban polygon, the most consequential configuration parameters are the costing model and the contours time thresholds.

Dimension	OSRM	Valhalla
Graph model	Monolithic, per-profile pre-built graphs	Unified graph with dynamic costing
Multi-modal routing	Separate graphs, stitched downstream	Native within a single request
Isochrone output	Sampled via `/table` + interpolation	Native `/isochrone` GeoJSON endpoint
Costing control	Lua profile, fixed at graph build time	Runtime `costing_options` JSON
Query latency	Sub-second, contraction-hierarchy optimized	Higher, but more flexible
Best retail fit	High-throughput distance matrices	Multi-modal catchment polygons

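The decision usually collapses to a single question: does the catchment need several travel modes weighed within one request? If it does, Valhalla's unified graph is the natural fit; if the workload is high-volume, single-mode duration matrices, OSRM is.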

Configuration and execution parameters

The retail-relevant knobs differ per engine. Tune these rather than accepting defaults, which are calibrated for generic navigation, not catchment realism.

Parameter	Engine	Typical retail value	Effect
`costing`	Valhalla	`auto`, `pedestrian`, `multimodal`	Selects the cost model for the contour
`contours[].time`	Valhalla	5, 10, 15 (minutes)	Catchment time budget(s)
`polygons`	Valhalla	`true`	Returns filled polygons, not rings
`denoise`	Valhalla	0.5–1.0	Drops small disconnected fragments
`generalize`	Valhalla	0–50 (metres)	Douglas–Peucker tolerance; 0 keeps detail
`transfer_penalty`	Valhalla	60–180 (seconds)	Friction of each transit transfer
`annotations=duration`	OSRM	enabled	Returns the cost matrix for contouring
`profile`	OSRM	`car`, `foot`, `bike`	Selects the pre-built graph
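
As an illustration of how the Valhalla rows in the table combine, the sketch below assembles an `/isochrone` request body from those parameters. The helper name `build_isochrone_payload` and its defaults are assumptions made for this example, not part of Valhalla or of the client shown later on this page.

```python
from typing import Any


def build_isochrone_payload(
    lat: float,
    lon: float,
    minutes: list[int],
    costing: str = "pedestrian",
    denoise: float = 1.0,
    generalize: float = 0.0,
) -> dict[str, Any]:
    """Assemble a Valhalla /isochrone request body (hypothetical helper)."""
    if not minutes:
        raise ValueError("At least one contour time is required.")
    return {
        "locations": [{"lat": lat, "lon": lon}],   # Valhalla takes named lat/lon keys
        "costing": costing,
        "contours": [{"time": m} for m in sorted(minutes)],
        "polygons": True,          # filled polygons rather than line rings
        "denoise": denoise,        # 1.0 keeps only the largest contour per time
        "generalize": generalize,  # Douglas-Peucker tolerance in metres
    }


payload = build_isochrone_payload(51.5074, -0.1278, [15, 5, 10])
print([c["time"] for c in payload["contours"]])  # [5, 10, 15]
```

Requesting all time budgets in one payload keeps the contours consistent with one another, since they come from a single expansion of the same graph.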

A note on coordinate order, which is the single most common integration bug: Valhalla’s /isochrone accepts {"lat": ..., "lon": ...} objects, while OSRM’s URL path uses {lon},{lat} order. Mixing the two does not necessarily raise an error: when the swapped pair is still a valid coordinate, the request can succeed and return a polygon around the wrong location.
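
One way to make the ordering hard to get wrong is to carry the seed as a named pair and convert it at the boundary of each engine. This is a minimal sketch; `Seed`, `to_valhalla_location`, and `to_osrm_coordinate` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class Seed(NamedTuple):
    """A storefront location with explicitly named fields (hypothetical helper)."""
    lat: float
    lon: float


def to_valhalla_location(seed: Seed) -> dict[str, float]:
    # Valhalla locations are JSON objects keyed by name, so order cannot be confused.
    return {"lat": seed.lat, "lon": seed.lon}


def to_osrm_coordinate(seed: Seed) -> str:
    # OSRM URL paths are longitude first, then latitude.
    return f"{seed.lon},{seed.lat}"


store = Seed(lat=40.7128, lon=-74.0060)
print(to_valhalla_location(store))  # {'lat': 40.7128, 'lon': -74.006}
print(to_osrm_coordinate(store))    # -74.006,40.7128
```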

Annotated Python implementation

The client below queries Valhalla’s native isochrone, enforces strict WGS84 coordinate validation on input, repairs invalid GeoJSON, and reprojects the catchment to a metric CRS before any area computation. Per the routing-engine convention used across this stack, no geometry is measured in degrees: the raw response is asserted as EPSG:4326 and transformed with pyproj before area is ever read.

```python
import logging
import time
from typing import Any, Optional

import requests
from pyproj import CRS, Transformer
from shapely.geometry import shape, Polygon, MultiPolygon
from shapely.ops import transform as shp_transform
from shapely.validation import make_valid

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

WGS84 = CRS.from_epsg(4326)          # routing engines emit lon/lat degrees
# Web Mercator inflates areas away from the equator; swap for a local UTM zone for true areas.
METRIC = CRS.from_epsg(3857)


class RetailCatchmentRouter:
    def __init__(self, osrm_base: str, valhalla_base: str, timeout: int = 30, max_retries: int = 3):
        self.osrm_base = osrm_base.rstrip("/")
        self.valhalla_base = valhalla_base.rstrip("/")
        self.timeout = timeout
        self.max_retries = max_retries
        self.session = requests.Session()
        self.session.headers.update({"Accept": "application/json", "Content-Type": "application/json"})
        # Build the reprojection once; reused for every polygon.
        self._to_metric = Transformer.from_crs(WGS84, METRIC, always_xy=True).transform

    def _validate_coordinates(self, lat: float, lon: float) -> None:
        # Range check only: it catches out-of-range values and swaps where |lon| > 90,
        # but a swapped pair that is still in range passes silently.
        if not (-90.0 <= lat <= 90.0) or not (-180.0 <= lon <= 180.0):
            raise ValueError(f"Invalid WGS84 coordinates: lat={lat}, lon={lon}")

    def query_valhalla_isochrone(
        self, lat: float, lon: float, minutes: int, costing: str = "auto"
    ) -> Optional[dict[str, Any]]:
        """Fetch a native Valhalla isochrone with retail-tuned costing.

        Valhalla's /isochrone expects {"lat": ..., "lon": ...}, NOT lon/lat order.
        Returns None when every attempt fails or the response has no features.
        """
        self._validate_coordinates(lat, lon)
        url = f"{self.valhalla_base}/isochrone"
        payload = {
            "locations": [{"lat": lat, "lon": lon}],
            "costing": costing,
            "contours": [{"time": minutes, "color": "ff0000"}],
            "costing_options": {
                # The auto option is "use_tolls" (plural).
                "auto": {"use_ferry": 0.5, "use_tolls": 0.0, "maneuver_penalty": 5.0},
                "pedestrian": {"walking_speed": 5.0, "use_ferry": 0.0, "use_living_streets": 0.8},
                "transit": {"use_bus": 0.5, "use_rail": 0.8, "transfer_penalty": 120},
            },
            "polygons": True,   # return filled polygons, not just contour rings
            "denoise": 1.0,     # drop tiny disconnected fragments
            "generalize": 0.0,  # keep full vertex detail for accurate joins
        }

        for attempt in range(self.max_retries):
            try:
                response = self.session.post(url, json=payload, timeout=self.timeout)
                response.raise_for_status()
                data = response.json()
            except requests.exceptions.RequestException as exc:
                logger.warning("Valhalla attempt %d/%d failed: %s", attempt + 1, self.max_retries, exc)
                if attempt < self.max_retries - 1:
                    time.sleep(2 ** attempt)  # exponential backoff: 1 s, 2 s, 4 s, ...
                continue
            if not data.get("features"):
                # An empty result is deterministic (e.g. an unreachable seed); retrying will not help.
                logger.error("Valhalla response missing isochrone features.")
                return None
            return data
        return None

    def parse_to_valid_polygon(self, response: dict[str, Any]) -> Optional[Polygon]:
        """Extract, repair, and select the primary catchment polygon (still in WGS84)."""
        if not response or "features" not in response:
            return None
        try:
            geom = shape(response["features"][0]["geometry"])
            if not geom.is_valid:
                geom = make_valid(geom)  # repair self-intersections before any join
            if isinstance(geom, MultiPolygon):
                # The main catchment is the largest part; islands are denoise residue.
                return max(geom.geoms, key=lambda g: g.area)
            # Any other geometry type (lines, collections) is not a usable catchment.
            return geom if isinstance(geom, Polygon) else None
        except (KeyError, IndexError, ValueError) as exc:
            logger.error("GeoJSON parsing failed: %s", exc)
            return None

    def catchment_area_km2(self, polygon: Polygon) -> float:
        """Reproject WGS84 → metric CRS, THEN measure. Never measure area in degrees."""
        metric_poly = shp_transform(self._to_metric, polygon)
        return metric_poly.area / 1_000_000.0
```
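
The `METRIC` constant above points at Web Mercator, which stretches areas by roughly 1/cos²(latitude), or about 2.5 times at London's latitude. A local UTM zone avoids most of that distortion. The sketch below picks the WGS84 / UTM EPSG code for a seed; `utm_epsg_for` is a hypothetical helper, and it deliberately ignores the special-case zones around Norway and Svalbard.

```python
def utm_epsg_for(lat: float, lon: float) -> int:
    """Return the EPSG code of the WGS84 / UTM zone containing (lat, lon).

    Assumes the standard 6-degree zones; UTM is only defined between 80°S and 84°N.
    """
    if not (-80.0 <= lat <= 84.0):
        raise ValueError(f"UTM is not defined at latitude {lat}")
    zone = min(int((lon + 180.0) // 6) + 1, 60)   # lon = 180 folds into zone 60
    # 326xx codes are northern-hemisphere zones, 327xx are southern.
    return (32600 if lat >= 0 else 32700) + zone


print(utm_epsg_for(51.5074, -0.1278))    # 32630 (UTM zone 30N)
print(utm_epsg_for(-33.8688, 151.2093))  # 32756 (UTM zone 56S)
```

Under that assumption, the client could build its transformer from `CRS.from_epsg(utm_epsg_for(lat, lon))` per seed instead of the shared `METRIC` constant.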

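For OSRM, there is no native contour endpoint. Query /table with annotations=duration from the seed to a dense lattice of sample points, threshold the resulting duration row at your time budget, and contour the reachable points into a polygon (for example with shapely.concave_hull, or an alpha shape built on scipy.spatial.Delaunay, since scipy.spatial has no ready-made alpha-shape function). Then apply the same make_valid repair and catchment_area_km2 measurement. parse_to_valid_polygon itself expects a GeoJSON FeatureCollection, so either wrap the polygon in one or call make_valid directly. The validation and reprojection contract is identical regardless of which engine produced the geometry.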
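
The sampling and thresholding half of that OSRM workflow can be sketched with the standard library alone. The helper names, the square lattice, and the 111,320 m-per-degree approximation are assumptions for illustration; the contouring step is left to the geometry library of your choice, and a real deployment would also need to batch requests to respect the server's table-size limit.

```python
import math

LatLon = tuple[float, float]


def sample_lattice(lat: float, lon: float, radius_m: float, step_m: float) -> list[LatLon]:
    """Square lattice of (lat, lon) samples around a seed, spaced about step_m apart."""
    dlat = step_m / 111_320.0                                  # approx. metres per degree of latitude
    dlon = step_m / (111_320.0 * math.cos(math.radians(lat)))  # longitude degrees shrink with latitude
    n = int(radius_m // step_m)
    return [
        (lat + i * dlat, lon + j * dlon)
        for i in range(-n, n + 1)
        for j in range(-n, n + 1)
    ]


def osrm_table_url(base: str, profile: str, seed: LatLon, samples: list[LatLon]) -> str:
    """One-to-many /table request: the seed is index 0, every coordinate is a destination."""
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lat, lon in [seed, *samples])  # lon,lat order
    return f"{base.rstrip('/')}/table/v1/{profile}/{coords}?sources=0&annotations=duration"


def reachable(samples: list[LatLon], duration_row: list, budget_min: float) -> list[LatLon]:
    """Keep samples within the time budget; unroutable pairs come back as null (None)."""
    budget_s = budget_min * 60
    # duration_row[0] is the seed-to-seed entry, so it is skipped.
    return [p for p, d in zip(samples, duration_row[1:]) if d is not None and d <= budget_s]


seed = (52.5200, 13.4050)
points = sample_lattice(*seed, radius_m=1000, step_m=500)
print(len(points))  # 25
url = osrm_table_url("http://localhost:5000", "car", seed, points[:2])
```

Here `duration_row` stands for `response["durations"][0]` from the table response, and `http://localhost:5000` is only a placeholder host.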

Failure modes and debugging

Symptom	Likely cause	Fix
Polygon lands in the ocean	Swapped lat/lon between Valhalla and OSRM order	Build each request from named `lat`/`lon` values; `_validate_coordinates` only catches out-of-range swaps, so never reuse an OSRM URL tuple in a Valhalla payload
`area` returns a tiny float (~0.0001)	Area measured in degrees, not metres	Reproject via `catchment_area_km2` before reading `area`
`TopologyException` on spatial join	Self-intersecting contour ring	`make_valid` is mandatory before any downstream join
Many small islands in the result	`denoise` too low for a sparse graph	Raise `denoise` toward 1.0, or keep the largest polygon only
Empty `features` array	Seed snapped to a disconnected component	See troubleshooting disconnected road networks
HTTP 429 under batch load	Rate limit on a shared host	Exponential backoff plus coordinate deduplication

When scaling batch generation, deduplicate near-identical seeds and cache polygon geometries using a spatial index such as H3 or S2 so overlapping site evaluations do not re-query the engine — the same reuse pattern covered in caching strategies for repeated network queries.
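
A minimal sketch of both ideas, assuming a plain rounded-coordinate key as a stand-in for an H3 or S2 cell id and full-jitter exponential backoff for HTTP 429 responses. `cache_key` and `backoff_delays` are hypothetical helpers, not part of either engine.

```python
import random


def cache_key(lat: float, lon: float, minutes: int, costing: str, precision: int = 4) -> tuple:
    """Round the seed so near-identical sites share one cached catchment.

    Four decimal places is roughly an 11 m grid in latitude.
    """
    return (round(lat, precision), round(lon, precision), minutes, costing)


def backoff_delays(retries: int, base_s: float = 1.0, cap_s: float = 30.0) -> list[float]:
    """Exponential backoff with full jitter: delay k is uniform in [0, min(cap, base * 2**k)]."""
    return [random.uniform(0.0, min(cap_s, base_s * 2 ** k)) for k in range(retries)]


seeds = [(48.85661, 2.35222), (48.85663, 2.35219), (48.86000, 2.35000)]
unique = {cache_key(lat, lon, 10, "pedestrian") for lat, lon in seeds}
print(len(unique))  # 2
```

The time budget and costing model belong in the key because the same seed yields a different polygon for each combination.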

Verification

Confirm a catchment is correct before it enters the pipeline:

Feature count: the response carries exactly one polygon per requested contour time; len(response["features"]) should equal the number of contours.
Geometry validity: assert polygon.is_valid is True after parse_to_valid_polygon; a False here means a downstream spatial join will raise.
CRS sanity: the raw polygon’s bounds must lie within [-180, -90, 180, 90]; anything outside signals swapped axes or geometry that has already been projected.
Area plausibility: catchment_area_km2 for a 15-minute urban auto contour should land in roughly the 5–40 km² range — values near zero indicate a degree-space measurement, and absurdly large values indicate a disconnected-graph blow-out.
Containment: the seed point must fall inside its own catchment (polygon.contains(Point(lon, lat))); if not, the snap radius placed the origin off-network.
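
The first and third checks can run on the raw GeoJSON before any geometry library is involved. The sketch below is one way to do that; `precheck_response` is a hypothetical helper, and validity and containment still need the shapely calls named in the list.

```python
from typing import Any, Iterator


def _positions(coords: Any) -> Iterator[tuple[float, float]]:
    """Walk nested GeoJSON coordinate arrays and yield (lon, lat) pairs."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for child in coords:
            yield from _positions(child)


def precheck_response(response: dict[str, Any], n_contours: int) -> list[str]:
    """Return a list of problems; an empty list means the cheap checks passed."""
    problems = []
    features = response.get("features") or []
    if len(features) != n_contours:
        problems.append(f"expected {n_contours} features, got {len(features)}")
    for idx, feature in enumerate(features):
        for lon, lat in _positions(feature["geometry"]["coordinates"]):
            if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
                problems.append(f"feature {idx} has out-of-range position ({lon}, {lat})")
                break  # one report per feature is enough
    return problems


ring = [[2.35, 48.85], [2.36, 48.85], [2.36, 48.86], [2.35, 48.85]]
sample = {"features": [{"geometry": {"type": "Polygon", "coordinates": [ring]}}]}
print(precheck_response(sample, n_contours=1))  # []
```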

Once validated and reprojected, the polygon is ready to drive a point-in-polygon join against demographic layers — the standard handoff into trade-area scoring.
