geo-l1-capetown-building-centroids
Analyst notes
Description
L1 geometric-ops task that probes a per-footprint centroid plus a deliberately-unstated CRS reprojection. The input shapefile ships in EPSG:32734 (UTM 34S, the canonical metric CRS for the south-western Cape), the output is GeoJSON so RFC 7946 pins WGS84, and the prompt never mentions either CRS. A second hidden gotcha sits in the dBase 10-char column-name limit: the persona names the join key `building_id`, but on disk the column lands as `building_i` and the agent has to restore the full name on output.
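The column-name gotcha can be illustrated without any GIS library. The sketch below is a minimal illustration, assuming plain truncation to 10 characters; the helper name `restore_truncated_names` is hypothetical, and real drivers may resolve name collisions differently, which this sketch ignores.

```python
DBF_MAX_FIELD_LEN = 10  # dBase field names are limited to 10 characters


def restore_truncated_names(on_disk, intended):
    """Map truncated on-disk column names back to their intended full names.

    Hypothetical helper: assumes the writer simply cut each name at
    10 characters and that no two intended names collide after the cut.
    """
    mapping = {}
    for full in intended:
        short = full[:DBF_MAX_FIELD_LEN]
        if short != full and short in on_disk:
            mapping[short] = full
    return mapping


# "building_id" has 11 characters, so it lands on disk as "building_i".
rename_map = restore_truncated_names(["building_i"], ["building_id"])
```

The resulting `rename_map` can be passed to whatever rename call the agent's library offers before writing the output.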
Approach
- Open the shapefile and notice that the column on disk is `building_i`, then plan to rename it back to `building_id` on output.
- Either compute the centroid directly on the EPSG:32734 geometry (it is already a metric CRS so centroid math is well-defined) and reproject the resulting Points to WGS84, or reproject the polygons to WGS84 first and let the library handle the centroid in lon/lat.
- Write the result as `building_centroids.geojson` with one Point per input polygon, the `building_id` column populated, and the file in WGS84 so it matches the GeoJSON convention.
- Sanity-check that the row count is 122 and that every row has a non-empty `building_id`.
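The steps above can be sketched with GeoPandas. This is one possible shape of a solution rather than the reference implementation: it takes the first route (centroid in EPSG:32734, then reproject), the function names `building_centroids` and `run` are hypothetical, and the input path passed to `run` is an assumption because the notes do not name the shapefile.

```python
import geopandas as gpd


def building_centroids(footprints: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Return one WGS84 centroid Point per footprint, keyed by building_id."""
    # Restore the column name truncated by the dBase 10-character limit.
    gdf = footprints.rename(columns={"building_i": "building_id"})
    # Compute centroids in the source metric CRS, keeping the attribute table.
    points = gdf.set_geometry(gdf.geometry.centroid)
    # Reproject the Points to WGS84, the CRS that RFC 7946 expects.
    points = points.to_crs("EPSG:4326")
    return points[["building_id", "geometry"]]


def run(src_path: str, dst_path: str = "building_centroids.geojson") -> None:
    """Hypothetical driver: src_path is an assumed input location."""
    footprints = gpd.read_file(src_path)
    out = building_centroids(footprints)
    # Sanity checks from the notes: 122 rows, no empty ids.
    assert len(out) == 122
    assert out["building_id"].notna().all()
    assert (out["building_id"].astype(str).str.len() > 0).all()
    out.to_file(dst_path, driver="GeoJSON")
```

Computing the centroid before reprojecting keeps the geometry work in metres; the other route described above (reproject first, then take the centroid) differs only in the order of the two calls.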
Pitfalls
- Forgetting to reproject to WGS84 before writing GeoJSON leaves the coordinates in EPSG:32734 metres, which costs the canonical-CRS subcheck: the score lands at about 0.95 instead of 1.0. EPSG:32734 stays in the meaningful set because it is the natural metric CRS for Cape Town, so the geometric work still gets credit.
- Computing `.centroid` on a GeoDataFrame loaded without inspecting the .prj and assuming the coordinates are in degrees produces a centroid biased toward the equator, which the per-building distance subchecks catch.
- Dropping the attribute table on write loses `building_i` entirely and Gate 1 rejects on the missing `building_id` column.
- Shipping `building_i` instead of `building_id` on output fails the column-name gate because the persona named the field `building_id`.
- Using `point_on_surface` instead of `centroid` lands close to the centroid on rectangular footprints but drifts by metres on the L-shaped and U-shaped buildings in the set, which the tighter median-distance subcheck flags.
- Unioning all buildings first and then computing one centroid fails the row-count check (1 vs 122) along with every id-keyed distance check.
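Two of the pitfalls above, `point_on_surface` drift and the union shortcut, can be reproduced on toy geometry with Shapely. The footprints below are hypothetical shapes in metres, not buildings from the task data, and `representative_point()` is used as Shapely's point-on-surface equivalent.

```python
from shapely.geometry import Polygon
from shapely.ops import unary_union

# Hypothetical L-shaped footprint (two arms, 4 m wide) and a small square.
l_shape = Polygon([(0, 0), (10, 0), (10, 4), (4, 4), (4, 10), (0, 10)])
square = Polygon([(20, 0), (24, 0), (24, 4), (20, 4)])

# Area-weighted centroid of the L-shape: (3.875, 3.875).
centroid = l_shape.centroid
# A point guaranteed to lie inside the polygon, but not the centroid.
surface_pt = l_shape.representative_point()
drift = centroid.distance(surface_pt)  # non-zero on this concave footprint

# Correct: one centroid per building.
per_building = [geom.centroid for geom in (l_shape, square)]

# Pitfall: unioning first collapses everything into a single geometry,
# so only one centroid comes out regardless of the number of buildings.
merged = unary_union([l_shape, square])
union_centroid = merged.centroid
```

On a plain rectangle the two point types tend to coincide, which is why the drift only shows up on the concave footprints.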
Recent runs (task v1)
| adapter | started | score | steps | duration | cost | status |
|---|---|---|---|---|---|---|
| openrouter-gemma4-26b-basic | 2026-06-18T07:32:32Z | pending | — | — | — | pending |
| openrouter-deepseek-v4-flash-basic | 2026-06-18T03:08:04Z | 0.00 | 5 | 0:46 | 0.10¢ | done |
| openrouter-deepseek-v4-flash-detailed | 2026-06-17T22:01:33Z | 0.00 | 6 | 0:36 | 0.11¢ | done |
| openrouter-gemma4-26b-detailed | 2026-06-17T19:47:47Z | 0.00 | 6 | 0:54 | 0.17¢ | done |
| openrouter-deepseek-v4-flash-basic | 2026-06-16T21:43:55Z | 0.95 | 7 | 0:46 | 0.27¢ | done |