When to Use Unit vs Integration Tests in GIS


Choosing whether a given spatial check belongs in the unit tier or the integration tier is a concrete classification problem, not a style preference: it decides whether the test runs in milliseconds against an in-memory shapely geometry or in seconds against a live PostGIS connection. Getting the boundary wrong is what produces flaky assertions, environment-dependent failures that slip past CI gates, and silent topology degradation. This page gives the decision rule, the pytest markers that enforce it, and — because GIS test fixtures frequently carry real coordinates — the isolation controls defined by the parent guide on security boundaries in spatial QA. It is one concrete decision inside the broader discipline of geospatial QA fundamentals and architecture.

Root-Cause Framing: Why the Boundary Blurs in GIS

In most domains the unit/integration split is obvious. Spatial code blurs it because the same call can be pure computation or a hidden I/O operation depending on library internals:

pyproj network fallbacks. A CRS transformation looks like pure math, but when PROJ network access is enabled, pyproj.Transformer.from_crs may fetch datum-shift grids from the PROJ CDN if the local PROJ data directory is incomplete. A test you wrote as a unit test then performs an HTTP call, which makes it slow and non-deterministic, and in a locked-down runner it can fail entirely.
GEOS is a shared C library. Predicates such as intersects, contains, and touches are deterministic for a given GEOS version, but a libgeos upgrade in the runner image can change DE-9IM results for geometries that touch only at a boundary. The logic is pure; the environment is not, so a “unit” assertion couples to container provisioning unless versions are pinned.
Geometry that only exists in the database. Operations like ST_DWithin, GiST index selection, and ST_Subdivide have no in-memory equivalent — their behaviour is a property of the PostGIS server and its planner. A test that asserts on them is inherently an integration test even if it looks like a one-line query.
Floating-point representation. IEEE 754 means coordinates rarely round-trip exactly, so geom1 == geom2 fails unpredictably. The fix (tolerance-based assertions) is pure unit logic, but engineers often “fix” the flake by snapshotting database output instead, dragging a pure check into the integration tier for the wrong reason.

The decision rule below cuts through all four: classify by what the test depends on at run time, not by what the code superficially resembles.
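The floating-point cause is the easiest of the four to show in isolation. The sketch below uses only the standard library; `coords_close` and its default tolerance are hypothetical, chosen for illustration, and are not part of shapely or pyproj.

```python
import math


def coords_close(a, b, abs_tol=1e-9):
    """Compare two coordinate tuples component-wise within an absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(p, q, rel_tol=0.0, abs_tol=abs_tol) for p, q in zip(a, b)
    )


# Arithmetic that is exact on paper but not in binary floating point
computed = (0.1 + 0.2, 37.7749)
expected = (0.3, 37.7749)

exact_match = computed == expected                 # exact equality is brittle here
tolerant_match = coords_close(computed, expected)  # tolerance absorbs the drift
```

The comparison stays pure in-process logic, which is why a flake of this kind is an argument for a tolerance and not for moving the test to the integration tier.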

The Classification Boundary

A test is a unit test if and only if it depends on nothing outside the process: no socket, no file descriptor, no database handle. Everything else is an integration test. Aligning this split with the GIS test pyramid keeps the fast deterministic checks at the base and the stateful ones above it.

Unit tier — coordinate math, geometry predicates (intersects, contains, touches), CRS projection of in-memory geometries, attribute validation, topology rules. Zero external services; runs in milliseconds.
Integration tier — reading/writing GeoParquet, GeoJSON or Shapefiles, executing PostGIS queries, validating spatial indexes, orchestrating ETL, and consuming OGC services (WFS/WMS). Validates that isolated components interact correctly under realistic data.
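As a rough illustration, the rule can be written as a function of what a test touches at run time. The `Dependency` taxonomy and `classify` helper below are hypothetical names invented for this sketch; they are not part of pytest.

```python
from enum import Flag, auto


class Dependency(Flag):
    """Run-time resources a test touches (hypothetical taxonomy)."""
    NONE = 0
    NETWORK = auto()
    FILESYSTEM = auto()
    DATABASE = auto()


def classify(deps: Dependency) -> str:
    """Return 'unit' only when the test depends on nothing outside the process."""
    return "unit" if deps == Dependency.NONE else "integration"


tiers = {
    "buffer area of an in-memory Point": classify(Dependency.NONE),
    "ST_DWithin query": classify(Dependency.DATABASE),
    "GeoParquet read": classify(Dependency.FILESYSTEM),
    "CRS transform that fetches a grid": classify(Dependency.NETWORK),
}
```

Note that the last entry is classified by what the call does, not by how it looks: a transform that reaches the network is an integration test even though the code is a single function call.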

Classification Reference

Map each trigger condition to a tier, a CI stage, and an execution target. This is the heuristic teams encode directly into marker selection.

Trigger condition	Tier	CI stage	Execution target
Coordinate math, CRS transform of in-memory geometry, topology rule	Unit	Pre-commit / PR	Local / ephemeral runner
File I/O (GeoParquet, Shapefile), spatial index validation	Integration	Nightly / merge	Dedicated test DB / MinIO
Network-bound OGC service, raster reprojection	Integration	Scheduled / canary	Isolated VPC / staging
Credential validation, PII masking of fixtures	Unit + Integration	Pre-merge	Secret scanner + test env

Unit-tier geometric assertions are never exact-equality checks. Because buffer() returns a polygonal approximation of a circle and projection introduces rounding, compare against an analytic value within a relative tolerance $\tau$:

$$\left| \frac{A_{\text{actual}} - A_{\text{expected}}}{A_{\text{expected}}} \right| \le \tau, \qquad 10^{-6} \le \tau \le 10^{-2}$$

with the looser bound for buffered approximations and the tighter bound for exact geometry algebra. Choosing $\tau$ deliberately is the same discipline covered in setting up spatial tolerance thresholds in assertions.
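To see why the buffered case needs the looser bound, the sketch below compares a circle's area with that of an inscribed regular polygon, using only the standard library. The 64-sided polygon is an assumption standing in for a buffer built from 16 segments per quarter circle; the helper names are hypothetical.

```python
import math


def regular_polygon_area(radius: float, sides: int) -> float:
    """Area of a regular polygon inscribed in a circle of the given radius."""
    return 0.5 * sides * radius**2 * math.sin(2 * math.pi / sides)


def within_relative_tolerance(actual: float, expected: float, tau: float) -> bool:
    """True when |actual - expected| / |expected| <= tau."""
    return abs(actual - expected) / abs(expected) <= tau


radius = 10.0
expected = math.pi * radius**2
approx = regular_polygon_area(radius, sides=64)  # assumed stand-in for buffer()

relative_error = abs(approx - expected) / expected
loose_ok = within_relative_tolerance(approx, expected, tau=1e-2)
tight_ok = within_relative_tolerance(approx, expected, tau=1e-6)
```

Under this assumption the relative error is on the order of a tenth of a percent, so the approximation passes at the looser bound and fails at the tighter one. A tolerance chosen without regard to the approximation would either mask real regressions or fail on correct output.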

Step-by-Step Implementation

Step 1 — Write pure unit tests for spatial logic (`pytest` 7+, `shapely` 2.x, `pyproj` 3.6+)

Unit tests operate on in-memory geometries, instantiate no readers or connections, and assert with explicit tolerance. For the semantics of the predicates and buffer operations below, the authoritative reference is the Shapely documentation.

# tests/unit/test_geometry_ops.py
import numpy as np
import pytest
from shapely.geometry import Point
from shapely.ops import transform
import pyproj


def calculate_buffer_area(geom: Point, distance: float) -> float:
    """Pure unit function: area of a buffered point."""
    return geom.buffer(distance).area


def project_geometry(geom, src_crs: str, dst_crs: str):
    """Pure unit function: project a geometry between CRSs."""
    transformer = pyproj.Transformer.from_crs(src_crs, dst_crs, always_xy=True)
    return transform(transformer.transform, geom)


@pytest.mark.unit
@pytest.mark.parametrize(
    "coords,distance,expected_area",
    [
        ((0.0, 0.0), 10.0, np.pi * 100.0),
        ((-122.4194, 37.7749), 5.0, np.pi * 25.0),
    ],
)
def test_buffer_area_tolerance(coords, distance, expected_area):
    actual = calculate_buffer_area(Point(coords), distance)
    # buffer() approximates a circle, so compare to pi*r^2 at ~1% relative tolerance
    assert np.isclose(actual, expected_area, rtol=1e-2, atol=1e-6)


@pytest.mark.unit
def test_crs_projection_determinism():
    projected = project_geometry(Point(-118.2437, 34.0522), "EPSG:4326", "EPSG:3857")
    # Assert bounds, not exact floats, to absorb precision drift across PROJ versions.
    # The expected Web Mercator position is roughly x = -13,162,800 m, y = 4,035,800 m.
    assert -13170000 < projected.x < -13160000
    assert 4000000 < projected.y < 4050000

Three rules keep this tier clean: never instantiate a file reader or DB connection; use np.isclose() or shapely.equals_exact() with an explicit tolerance; and stub pyproj so a missing grid does not trigger the network fallback described above. Synthetic in-memory geometry like this is exactly the territory of mocking geospatial data for tests.
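One way to stub the transform is to pass it in as a dependency, so the unit test can supply a closed-form stand-in. The sketch below is a minimal illustration under that assumption: `fake_to_web_mercator` and `project_point` are hypothetical names, and the formula is the spherical Web Mercator forward projection, not a substitute for PROJ in production code.

```python
import math
from typing import Callable, Tuple

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, as used by spherical Web Mercator

Transform = Callable[[float, float], Tuple[float, float]]


def fake_to_web_mercator(lon: float, lat: float) -> Tuple[float, float]:
    """Stand-in for a lon/lat to EPSG:3857 transform; needs no PROJ data or network."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def project_point(lon: float, lat: float, transform: Transform) -> Tuple[float, float]:
    """Code under test: receives the transform instead of constructing one itself."""
    return transform(lon, lat)


x, y = project_point(-118.2437, 34.0522, fake_to_web_mercator)
```

In production the same function would receive the `transform` method of a pyproj `Transformer`. pyproj also provides `pyproj.network.set_network_enabled`, which can be called with `False` in a session-level fixture as a second line of defence.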

Step 2 — Write integration tests for stateful data flow (`pytest` 7+, `SQLAlchemy` 2.x, `geopandas` 0.14+)

Integration tests provision an isolated schema, write through GeoPandas, and assert on server-side behaviour that has no in-memory analogue. For query planning and index validation, see the PostGIS documentation. Scope the spatial surface deliberately so the suite is not throttled by heavy I/O — the remit of scoping rules for map data validation.

# tests/integration/test_postgis_pipeline.py
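import os

import geopandas as gpd
import pytest
from shapely.geometry import Point
from sqlalchemy import create_engine, text


@pytest.fixture(scope="module")
def db_engine():
    # Dedicated test database, never staging or prod. The URI is read from the
    # environment (see Step 4) so credentials stay out of the repository.
    uri = os.environ.get("POSTGIS_TEST_URI")
    if not uri:
        pytest.skip("POSTGIS_TEST_URI is not set")
    engine = create_engine(uri)
    yield engine
    engine.dispose()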


@pytest.fixture(autouse=True)
def clean_test_schema(db_engine):
    with db_engine.begin() as conn:
        conn.execute(text("DROP SCHEMA IF EXISTS test_qa CASCADE;"))
        conn.execute(text("CREATE SCHEMA test_qa;"))
    yield
    with db_engine.begin() as conn:
        conn.execute(text("DROP SCHEMA IF EXISTS test_qa CASCADE;"))


@pytest.mark.integration
def test_spatial_join_and_index_creation(db_engine):
    """ETL flow: insert -> index -> spatial join -> result."""
    gdf = gpd.GeoDataFrame(
        {"id": [1, 2], "value": ["A", "B"]},
        geometry=[Point(0, 0), Point(1, 1)],
        crs="EPSG:4326",
    )
    gdf.to_postgis("test_points", db_engine, schema="test_qa", if_exists="replace", index=False)

    with db_engine.begin() as conn:
        conn.execute(text(
            "CREATE INDEX idx_test_points_geom ON test_qa.test_points USING GIST (geometry);"
        ))

    # ST_DWithin on geography measures metres. (0,0) and (1,1) are ~157 km apart,
    # so a 200 km threshold finds both neighbours.
    query = """
        SELECT a.id, b.id AS neighbor_id
        FROM test_qa.test_points a
        JOIN test_qa.test_points b
          ON ST_DWithin(a.geometry::geography, b.geometry::geography, 200000)
        WHERE a.id != b.id;
    """
    with db_engine.connect() as conn:
        result = conn.execute(text(query)).fetchall()
    assert len(result) == 2

Use scope="module" for the engine to avoid pool exhaustion, wrap data in schema teardowns or transactional rollbacks, and assert on EXPLAIN ANALYZE output when you need to prove the GiST index is actually used rather than bypassed by a sequential scan. Connection-level patterns are detailed in best practices for mocking PostGIS connections.
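The EXPLAIN check can be kept small by separating plan parsing from the database call. The helper below is a sketch: `plan_uses_index` is a hypothetical name, and the two sample plans are hand-written for illustration, not captured from a real server.

```python
from typing import Iterable


def plan_uses_index(plan_lines: Iterable[str], index_name: str) -> bool:
    """True if any plan node is an index-based scan that names the given index."""
    scan_kinds = ("Index Scan", "Index Only Scan", "Bitmap Index Scan")
    return any(
        index_name in line and any(kind in line for kind in scan_kinds)
        for line in plan_lines
    )


# Hand-written sample plans, for illustration only
indexed_plan = [
    "Index Scan using idx_test_points_geom on test_points a  (cost=0.14..8.16 rows=1 width=36)",
]
seq_plan = [
    "Seq Scan on test_points a  (cost=0.00..1.02 rows=2 width=36)",
]

uses_index = plan_uses_index(indexed_plan, "idx_test_points_geom")
falls_back = plan_uses_index(seq_plan, "idx_test_points_geom")
```

In an integration test the lines would come from executing the query prefixed with EXPLAIN and collecting the first column of each returned row. Two caveats are worth checking against your own server. On a table with only a handful of rows the planner will usually prefer a sequential scan, so load a realistic row count first. And a query that casts the column to geography is generally not served by an index built on the plain geometry column, so the index definition and the query expression need to match.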

Step 3 — Enforce the boundary with markers in CI

Markers turn the classification rule into a gate. The unit selection runs on every push; the integration selection runs only after it passes.

# pytest.ini
[pytest]
markers =
    unit: Pure spatial logic, no I/O
    integration: Requires DB, file system, or network
    slow: Takes >5s, run in the nightly pipeline
addopts = -m "unit" --strict-markers -q

--strict-markers makes an unregistered marker a hard error, so a test cannot silently escape classification. Run pytest -m unit on every push and fail fast if it exceeds ~30s; run pytest -m integration behind a Docker Compose PostGIS service once the unit gate is green; and cache the pyproj data directory in the runner so CRS resolution never reaches the network mid-suite.
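If the suite follows the tests/unit and tests/integration layout used in the file paths above, markers can also be applied by directory so that a forgotten decorator does not leave a test unclassified. The conftest.py sketch below is one possible approach; `tier_for_path` is a hypothetical helper, and the hook relies on `item.path`, which assumes pytest 7 or later.

```python
# conftest.py (sketch)
from pathlib import Path
from typing import Optional


def tier_for_path(path: Path) -> Optional[str]:
    """Map a test file to a marker name based on its parent directories."""
    parts = set(path.parts)
    if "unit" in parts:
        return "unit"
    if "integration" in parts:
        return "integration"
    return None


def pytest_collection_modifyitems(config, items):
    """pytest hook: mark each collected test according to its directory."""
    for item in items:
        tier = tier_for_path(Path(str(item.path)))
        if tier is not None:
            item.add_marker(tier)
```

Directory-based marking only records where a test lives. It does not prove that a test under tests/unit is free of I/O, so it complements the classification rule and does not replace it.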

Step 4 — Keep real location data out of the unit tier

Because spatial fixtures can encode real coordinates, the boundary is also a security boundary. Four controls follow from that. Inject database URIs via CI secrets ($POSTGIS_TEST_URI) and never hard-code them in conftest.py. Round fixture coordinates so they cannot be reverse-geocoded to a real address. Block outbound WMS/WFS requests in unit tests with responses or pytest-httpserver to prevent accidental metered usage or data egress. Run integration tests in ephemeral containers whose network policy only reaches approved spatial registries.
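The first two controls can be sketched with the standard library alone. `read_db_uri` and `coarsen` are hypothetical helper names, and rounding to two decimal places is an assumption; whether that is coarse enough depends on how sensitive the underlying locations are.

```python
import os
from typing import Optional, Tuple


def read_db_uri(env_var: str = "POSTGIS_TEST_URI") -> Optional[str]:
    """Return the URI injected by CI, or None so the caller can skip the test."""
    return os.environ.get(env_var) or None


def coarsen(lon: float, lat: float, decimals: int = 2) -> Tuple[float, float]:
    """Round a coordinate pair; two decimal places is roughly 1 km of latitude."""
    return round(lon, decimals), round(lat, decimals)


fixture_point = coarsen(-122.4194, 37.7749)
```

Returning None instead of a default URI keeps a misconfigured runner from silently connecting to whatever database happens to be on localhost.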

Verification Pattern

Confirm the split holds by proving the unit tier touches nothing external. Run it with sockets disabled — if any “unit” test reaches the network, the run fails loudly:

pip install pytest-socket
pytest -m unit --disable-socket -q

A green run is positive evidence the unit tier is pure; a failure points straight at a hidden pyproj fetch or an accidental DB handle that should be reclassified as integration.

Failure Modes and Edge Cases

Anti-meridian geometries. A LineString crossing ±180° longitude is valid in EPSG:4326, but a unit assertion on its planar bounds reports a near-global extent. Assert with geographic awareness (or split at the meridian) rather than treating the coordinates as a flat plane.
Polar CRS axis order. Projected polar systems such as EPSG:3413 and authority-ordered geographic CRSs can swap axes. A projection unit test that hard-codes (x, y) passes while every transformed point is wrong — always set always_xy=True and assert axis direction explicitly.
Empty geometry. Polygon().area is 0.0 and Point().buffer(1) is empty; an is_empty geometry slipping into a tolerance comparison yields a misleading pass. Guard for emptiness before asserting numeric equivalence.
Mixed Z/M coordinates. A PointZ buffered in a 2D unit test silently drops the Z dimension, so an assertion that “passes” hides a lost coordinate. Assert has_z where the contract requires three dimensions.
GEOS version drift across the boundary. A boundary-touching intersects result can differ between libgeos builds. Pin the GEOS/PROJ versions in the runner image so a unit assertion stays deterministic instead of failing only on a rebuilt container.
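The anti-meridian case above can be reproduced without any geometry library. The sketch below contrasts a planar longitude extent with one that accounts for wrap-around; both function names are hypothetical, and the shift-by-360 approach is a simple heuristic, not a general solution for arbitrary geometries.

```python
from typing import Iterable


def naive_lon_span(lons: Iterable[float]) -> float:
    """Planar extent: treats longitude as a flat axis."""
    values = list(lons)
    return max(values) - min(values)


def antimeridian_aware_lon_span(lons: Iterable[float]) -> float:
    """Extent after shifting western longitudes by 360 when that gives a shorter span."""
    values = list(lons)
    shifted = [lon + 360.0 if lon < 0 else lon for lon in values]
    return min(max(values) - min(values), max(shifted) - min(shifted))


line_lons = [179.5, -179.5]  # a short line crossing the antimeridian
planar = naive_lon_span(line_lons)
aware = antimeridian_aware_lon_span(line_lons)
```

A unit assertion written against the planar value would describe a line spanning nearly the whole globe, while the wrap-aware value reflects the one-degree segment the coordinates actually represent.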

Conclusion

Classify spatial tests by their run-time dependencies, not their surface syntax: pure in-memory geometry, CRS, and attribute logic belongs in a fast unit tier with explicit tolerance, while anything touching PostGIS, files, or the network belongs in an isolated integration tier. Enforce the split with pytest markers and prove it with --disable-socket, then return to the parent security boundaries in spatial QA to decide how strictly each tier must isolate sensitive location data.