| relation | value | np |
|---|---|---|
|
Reproduction and Replication of DGGS Benchmark
| ||
|
This study aims to reproduce and replicate the computational benchmark experiments from Law & Ardo (2024), "Using a discrete global grid system for a scalable, interoperable, and reproducible system of landuse mapping" (DOI: 10.1080/20964471.2024.2429847).
Specifically:
1. VECTOR BENCHMARK (Figure 6): Reproduces the comparison between traditional vector overlay operations and DGGS-based methods using H3 polyfilling, testing scalability across 5-500 input layers.
2. RASTER BENCHMARK (Figure 7):
- REPRODUCTION: Recreates the paper's comparison using H3 Python bindings for coordinate-to-cell conversion
- REPLICATION: Implements an alternative approach using xdggs for vectorized H3 indexing
The study aims to validate the paper's claims that (1) DGGS provides orders of magnitude performance improvement for vector operations, and (2) DGGS and raster methods show roughly equivalent performance for raster operations when using pre-indexed data.
| ||
|
REPRODUCTION METHODOLOGY:
- Vector benchmark: Implemented H3 polyfilling algorithm via h3-py library to convert Voronoi polygons to H3 cells at resolution 14, matching the paper's approach
- Raster benchmark: Used H3 Python loop (h3.latlng_to_cell) to index raster pixels to H3 cells, replicating the paper's indexing method
- Classification: Implemented all 7 number-theoretic classification functions (prime, perfect, triangular, square, pentagonal, hexagonal, Fibonacci) as described in the paper
- Data generation: Created synthetic Voronoi polygons and NLM raster landscapes following the paper's specifications
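The seven number-theoretic predicates named above can be sketched in pure Python as follows. This is a minimal illustration, not the paper's or the reproduction's actual code: the function names, the `CLASSIFIERS` mapping, and the `classify` helper are hypothetical, and the treatment of edge cases (for example, counting 0 as triangular and Fibonacci) is an assumption.

```python
from math import isqrt


def is_square(n: int) -> bool:
    return n >= 0 and isqrt(n) ** 2 == n


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    # Trial division by odd candidates up to sqrt(n)
    return all(n % d for d in range(3, isqrt(n) + 1, 2))


def is_perfect(n: int) -> bool:
    # n equals the sum of its proper divisors (6, 28, 496, ...)
    if n < 2:
        return False
    total = 1
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
    return total == n


def is_triangular(n: int) -> bool:
    # n = k(k+1)/2 exactly when 8n + 1 is a perfect square
    return n >= 0 and is_square(8 * n + 1)


def is_pentagonal(n: int) -> bool:
    # n = k(3k-1)/2 with k >= 1: sqrt(24n + 1) is an integer congruent to 5 mod 6
    if n < 1:
        return False
    r = isqrt(24 * n + 1)
    return r * r == 24 * n + 1 and r % 6 == 5


def is_hexagonal(n: int) -> bool:
    # n = k(2k-1) with k >= 1: sqrt(8n + 1) is an integer congruent to 3 mod 4
    if n < 1:
        return False
    r = isqrt(8 * n + 1)
    return r * r == 8 * n + 1 and r % 4 == 3


def is_fibonacci(n: int) -> bool:
    # n is a Fibonacci number exactly when 5n^2 + 4 or 5n^2 - 4 is a perfect square
    return n >= 0 and (is_square(5 * n * n + 4) or is_square(5 * n * n - 4))


# Hypothetical registry; how the benchmark combines the predicates is assumed.
CLASSIFIERS = {
    "prime": is_prime,
    "perfect": is_perfect,
    "triangular": is_triangular,
    "square": is_square,
    "pentagonal": is_pentagonal,
    "hexagonal": is_hexagonal,
    "fibonacci": is_fibonacci,
}


def classify(n: int) -> list[str]:
    """Return the names of every class that n belongs to."""
    return [name for name, fn in CLASSIFIERS.items() if fn(n)]
```

For example, `classify(28)` returns `["perfect", "triangular", "hexagonal"]`.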
REPLICATION METHODOLOGY:
- Raster benchmark: Replaced H3 Python loop with xdggs library (xdggs.H3Info.geographic2cell_ids) for vectorized coordinate-to-cell conversion
- This tests whether alternative DGGS implementations affect the benchmark conclusions
COMPUTATIONAL ENVIRONMENT:
- Python 3.11 with h3 4.x, xdggs, NumPy, GeoPandas, Polars
- Docker container for reproducibility
- Benchmarks run on standardized hardware with multiple iterations
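Running "multiple iterations" could look like the following timing harness. It is a sketch using only the standard library; the names `time_call` and `speedup`, the warm-up pass, and the choice of the median as the summary statistic are assumptions for illustration rather than the benchmark's actual procedure.

```python
import statistics
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> dict:
    """Time fn() over several iterations and summarise the wall-clock samples."""
    for _ in range(warmup):
        fn()  # untimed warm-up pass (caches, lazy imports)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        "repeats": repeats,
    }


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Placeholder workload standing in for one benchmark run
    stats = time_call(lambda: sum(range(100_000)), repeats=5)
    print(stats)
```

Reporting the median alongside the minimum and maximum makes it easier to see whether a single slow iteration is distorting a comparison.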
| ||
|
1. SCALE: The original paper tested up to 500 vector layers; our default configuration tests [5, 10, 20, 50, 100] layers but supports scaling to 500.
2. RASTER GENERATION: The paper used the NLMpy mid-point displacement algorithm. Our implementation uses NLMpy when available and falls back to a Gaussian filter otherwise.
3. RANDOM MISALIGNMENT: The paper mentions "jittering the origin point by up to one pixel" for raster alignment; this feature is not implemented in our reproduction.
4. ADDITIONAL COMPARISON: We added xdggs as an alternative DGGS implementation not present in the original study, extending the work from pure reproduction to include replication with different tools.
5. PRE-INDEXED SCENARIO: The paper's raster benchmark used pre-indexed data in Apache Parquet queried with Polars. Our benchmark includes both on-the-fly indexing and pre-indexed scenarios to enable direct comparison.
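The origin jittering described in item 3 is not part of this reproduction. The sketch below shows one way it might be added. The function names, the top-left origin with north-up rows, and the choice of a uniform offset in the range of plus or minus one pixel per axis are all assumptions, since the source only states "up to one pixel".

```python
import random


def jitter_origin(x0: float, y0: float, pixel_size: float,
                  rng: random.Random) -> tuple[float, float]:
    """Shift a raster origin by an independent random offset per axis.

    Assumption: offsets are uniform within one pixel in either direction.
    """
    dx = rng.uniform(-pixel_size, pixel_size)
    dy = rng.uniform(-pixel_size, pixel_size)
    return x0 + dx, y0 + dy


def pixel_centres(x0: float, y0: float, pixel_size: float,
                  n_cols: int, n_rows: int) -> tuple[list[float], list[float]]:
    """Pixel-centre coordinates for a north-up raster with a top-left origin."""
    xs = [x0 + (c + 0.5) * pixel_size for c in range(n_cols)]
    ys = [y0 - (r + 0.5) * pixel_size for r in range(n_rows)]
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so each layer's misalignment is repeatable
    ox, oy = jitter_origin(174.0, -41.0, 0.001, rng)
    xs, ys = pixel_centres(ox, oy, 0.001, n_cols=4, n_rows=3)
    print(xs[0], ys[0])
```

Drawing a fresh offset per layer would give each raster a slightly different grid, so pixels no longer coincide exactly across layers before they are indexed to cells.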
| ||
| relation | value | np |
|---|---|---|
|
DGGS Benchmark Outcome: Claim Partially Supported
| ||
|
2026-03-07
| ||
|
The claim "Effect of DGGS Indexing on Associating Vector and Raster
Geospatial Data" is PARTIALLY SUPPORTED.
VECTOR BENCHMARK (Figure 6): VALIDATED
DGGS provides orders of magnitude performance improvement over traditional
vector overlay operations. At 20 layers, DGGS was 16,000x faster than
vector methods.
RASTER BENCHMARK (Figure 7): PARTIALLY SUPPORTED
The paper's claim of "roughly equivalent performance" holds when comparing
classification time with pre-indexed DGGS data. However, on-the-fly H3
indexing in a Python loop adds significant overhead (5.0s versus 0.01s
pre-indexed at 100 layers). The replication using xdggs shows that
vectorized indexing reduces this overhead by ~100x (to 0.05s).
| ||
|
VECTOR BENCHMARK RESULTS:
| Layers | DGGS | Vector | Speedup |
|--------|--------|--------|---------|
| 5 | 0.01s | 0.4s | 40x |
| 10 | 0.015s | 10s | 670x |
| 20 | 0.03s | 400s | 16,000x |

DGGS shows near-linear scaling; vector shows super-linear growth. This validates the paper's Figure 6.

RASTER BENCHMARK RESULTS (100 layers):

| Method | Time | Note |
|------------------|-------|-----------------------------------|
| Raster (NumPy) | 0.02s | |
| DGGS Pre-indexed | 0.01s | Paper's scenario: VALIDATED |
| DGGS + H3 loop | 5.0s | Includes slow indexing |
| DGGS + xdggs | 0.05s | Replication: 100x faster indexing |

The pre-indexed scenario matches the paper's methodology and validates the claim of equivalent performance.
| ||
- Vector benchmark tested up to 100 layers (paper used 500)
- Raster pre-indexed scenario simulates, but does not exactly replicate,
the paper's Apache Parquet + Polars implementation
- Missing random misalignment ("jittering") from original methodology
- Single hardware configuration tested
|