spa-l2-lagos-hotspot-overlaps
Analyst notes
Description
This task probes whether the agent reasons about scale and projection before doing any geometry. The 100 m² sliver threshold is the gotcha: it only makes sense in a metric CRS, so an agent that filters area in WGS84 degrees keeps all the synthetic slivers and lands on a wrong top-10 % list. The overlay is also many-to-one: a hex can intersect several land-use polygons, so the per-cell aggregate has to weight each piece by its intersection area rather than take a plain average over the polygons.
Approach
- Reproject both inputs into a Lagos-appropriate metric CRS before any area work, so the 100 m² threshold and the area weights are in real units.
- Drop land-use polygons whose area is below 100 m² and keep a per-cell count of how many were dropped.
- Intersect the kept land-use polygons with the hex grid and, for each hex cell, compute the area-weighted mean of pop_density across all overlapping pieces.
- Rank the cells by area-weighted density descending, take the top 10 %, and assign a unique integer rank starting at 1.
- Write the two files with the same hex_id set: a GeoParquet with the hex polygons and a plain Parquet with the ranking table plus the overlap and sliver counts.
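The aggregation and ranking steps above can be sketched in plain Python. This is a minimal illustration, not the reference solution: it assumes the intersection pieces have already been computed in a metric CRS and flattened to `(hex_id, piece_area_m2, pop_density)` tuples. The function names, the tie-break on `hex_id`, and the round-up rule for the 10 % cut are all assumptions, since the notes do not specify them.

```python
import math
from collections import defaultdict


def area_weighted_density(pieces):
    """Area-weighted mean pop_density per hex cell.

    pieces: iterable of (hex_id, piece_area_m2, pop_density) tuples,
    one per hex/land-use intersection piece (hypothetical layout).
    """
    weighted_sum = defaultdict(float)
    area_sum = defaultdict(float)
    for hex_id, area_m2, density in pieces:
        weighted_sum[hex_id] += area_m2 * density
        area_sum[hex_id] += area_m2
    # Only cells with some overlapping area are eligible for ranking.
    return {h: weighted_sum[h] / area_sum[h] for h in area_sum if area_sum[h] > 0}


def top_fraction(density_by_hex, fraction=0.10):
    """Rank eligible cells by density descending and keep the top fraction."""
    # Tie-break on hex_id so ranks are unique and deterministic (assumption).
    ordered = sorted(density_by_hex.items(), key=lambda kv: (-kv[1], kv[0]))
    # Rounding up is an assumption; the grader's rule is not stated.
    n_top = math.ceil(len(ordered) * fraction)
    return [
        (rank, hex_id, density)
        for rank, (hex_id, density) in enumerate(ordered[:n_top], start=1)
    ]


pieces = [
    ("h1", 300.0, 1000.0),  # large piece, low density
    ("h1", 100.0, 5000.0),  # small piece, high density
    ("h2", 400.0, 1500.0),
]
densities = area_weighted_density(pieces)
ranking = top_fraction(densities)
```

For `h1` the area-weighted mean is (300 × 1000 + 100 × 5000) / 400 = 2000, whereas an unweighted mean of the two pieces would give 3000. The cell stays in the same position, but the density value is different.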
Pitfalls
- Filtering the 100 m² threshold on the EPSG:4326 inputs removes almost nothing, because every polygon's degree² area is far below 100, so the slivers stay in and pull random cells onto the top list.
- Using an unweighted mean of pop_density per cell instead of weighting by intersection area gives the right hex_ids but wrong density values, costing the density subcheck.
- Computing the top 10 % off the full 1 782-cell hex grid instead of the eligible cells (those that overlap at least one kept land-use polygon) inflates the row count well past the reference's 104, so the heavily weighted hex_id overlap check fails.
- Picking a metric CRS that is not Nigeria-specific (a generic Web Mercator, for example), or applying no projection at all, costs subcheck points even if the geometry math is otherwise correct, because the grader reads the output's original CRS as well as reprojecting it for the comparison.
- Letting the GeoParquet and the Parquet ranking disagree on which hex_ids are in the top list fails the heavily weighted cross-file consistency check even if both files individually look right.
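A cheap self-check before writing the files can catch the last pitfall. The sketch below is an assumption about how such a check might look, not part of the grader: `check_outputs` is a hypothetical helper that takes the `hex_id` values destined for the GeoParquet and the `(hex_id, rank)` rows destined for the ranking Parquet, and returns a list of problems.

```python
def check_outputs(geo_hex_ids, ranking_rows):
    """Return a list of consistency problems between the two outputs.

    geo_hex_ids: hex_id values going into the GeoParquet file.
    ranking_rows: (hex_id, rank) pairs going into the ranking Parquet.
    Both layouts are hypothetical stand-ins for the real tables.
    """
    problems = []
    geo_list = list(geo_hex_ids)
    rank_list = [hex_id for hex_id, _ in ranking_rows]

    # Each hex_id should appear exactly once in each file.
    if len(set(geo_list)) != len(geo_list):
        problems.append("duplicate hex_id in GeoParquet rows")
    if len(set(rank_list)) != len(rank_list):
        problems.append("duplicate hex_id in ranking rows")

    # Both files must describe the same set of cells.
    if set(geo_list) != set(rank_list):
        problems.append("hex_id sets differ between the two files")

    # Ranks must be the unique integers 1..n.
    ranks = sorted(rank for _, rank in ranking_rows)
    if ranks != list(range(1, len(ranking_rows) + 1)):
        problems.append("ranks are not the unique integers 1..n")

    return problems


ok = check_outputs(["h7", "h3"], [("h3", 1), ("h7", 2)])
bad = check_outputs(["h7", "h3"], [("h3", 1), ("h9", 1)])
```

An empty list means the two outputs agree on the `hex_id` set and the ranks are well formed. It says nothing about whether the densities or the CRS are right.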
Recent runs (task v3)
| adapter | started | score | steps | duration | cost | status |
|---|---|---|---|---|---|---|
| openrouter-gemma4-26b-basic | 2026-06-18T07:32:32Z | pending | — | — | — | pending |
| openrouter-deepseek-v4-flash-basic | 2026-06-18T03:08:04Z | 1.00 | 16 | 3:07 | 0.76¢ | done |
| openrouter-deepseek-v4-flash-detailed | 2026-06-17T22:01:33Z | 0.79 | 10 | 2:01 | 0.58¢ | done |
| openrouter-gemma4-26b-detailed | 2026-06-17T19:47:47Z | 0.95 | 20 | 3:01 | 1.08¢ | done |
| openrouter-deepseek-v4-flash-basic | 2026-06-16T21:43:55Z | 1.00 | 10 | 2:13 | 0.31¢ | done |