dc-l2-cairo-invalid-dedup
Analyst notes
Description
This task tests whether the agent can plan and execute a multi-step polygon-cleanup chain on a deliberately corrupted parcel snapshot. The hidden gotcha is that the bowtie polygons report shapely `.area` of 0, so a naive `geometry.area < 1` sliver filter will silently delete the bowtie features instead of repairing them. The agent has to run `make_valid` first, then filter slivers on the repaired geometry, then dedup and recompute area before coercing to MultiPolygon.
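The gotcha can be reproduced on a toy shape. The snippet below is a minimal sketch, not taken from the task data: the coordinates are invented, and it assumes a symmetric bowtie whose two lobes cancel in the signed-area sum.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid

# Hypothetical symmetric bowtie: the ring crosses itself at (1, 1).
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

# The two lobes have opposite winding, so their signed areas cancel.
raw_area = bowtie.area
raw_valid = bowtie.is_valid

# make_valid splits the ring at the crossing into two triangles.
repaired = make_valid(bowtie)
repaired_area = repaired.area

print(raw_valid, raw_area, repaired.geom_type, repaired_area)
```

A `geometry.area < 1` filter applied to `bowtie` would drop it, while the same filter applied to `repaired` keeps it.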
Approach
- Read the input GeoJSON and inspect a few rows to spot the mixed Polygon and MultiPolygon types, the self-intersecting bowties, the duplicated rows with conflicting attributes, and the sub-1 m² slivers.
- Repair invalid geometries first so subsequent area-based filters see the real area, not the 0 area shapely returns for a self-intersecting ring.
- Drop fragments below 1 m² using the repaired geometry's own area.
- Collapse exact-duplicate geometries by WKB equality, keeping the row with the smallest record_seq.
- Recompute area_m2 from the surviving geometry rather than carrying over the stale legacy value.
- Promote every single-part Polygon to a 1-part MultiPolygon so the output schema is homogeneous, and write parcels_canonical.geoparquet with the requested columns.
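The steps above could be sketched with geopandas roughly as follows. This is an illustration under assumptions rather than the reference solution: the helper names `to_multipolygon` and `clean_parcels` are hypothetical, the toy rows are invented, and only the column names mentioned in these notes (`parcel_id`, `parcel_class`, `record_seq`, `area_m2`) are used.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon
from shapely.validation import make_valid


def to_multipolygon(geom):
    """Return the polygonal parts of geom as a MultiPolygon (hypothetical helper)."""
    if geom.geom_type == "MultiPolygon":
        return geom
    if geom.geom_type == "Polygon":
        return MultiPolygon([geom])
    # make_valid may return a GeometryCollection; keep only polygonal parts.
    parts = []
    for g in getattr(geom, "geoms", []):
        if g.geom_type == "Polygon":
            parts.append(g)
        elif g.geom_type == "MultiPolygon":
            parts.extend(g.geoms)
    return MultiPolygon(parts)


def clean_parcels(gdf):
    """Repair, drop slivers, dedup, recompute area, then coerce (hypothetical helper)."""
    out = gdf.copy()
    # 1. Repair first so area-based filters see the real area.
    out["geometry"] = out.geometry.apply(make_valid)
    # 2. Drop fragments below 1 m² using the repaired geometry.
    out = out[out.geometry.area >= 1.0].copy()
    # 3. Collapse exact duplicates by WKB, keeping the smallest record_seq.
    out["_wkb"] = out.geometry.apply(lambda g: g.wkb)
    out = (
        out.sort_values("record_seq")
        .drop_duplicates("_wkb", keep="first")
        .drop(columns="_wkb")
    )
    # 4. Recompute area from the surviving geometry, replacing the legacy value.
    out["area_m2"] = out.geometry.area
    # 5. Promote everything to MultiPolygon for a homogeneous schema.
    out["geometry"] = out.geometry.apply(to_multipolygon)
    return out


# Toy rows (invented): a parcel, its conflicting duplicate, a bowtie, a sliver.
square = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
gdf = gpd.GeoDataFrame(
    {
        "parcel_id": [1, 900001, 2, 3],
        "parcel_class": ["res", "com", "res", "res"],
        "record_seq": [1, 5, 2, 3],
        "area_m2": [100.0, 100.0, 80.0, 0.25],
    },
    geometry=[
        square,
        square,
        Polygon([(20, 0), (30, 10), (30, 0), (20, 10)]),
        Polygon([(40, 0), (40.5, 0), (40.5, 0.5), (40, 0.5)]),
    ],
    crs="EPSG:22992",
)
cleaned = clean_parcels(gdf)
print(cleaned[["parcel_id", "parcel_class", "area_m2"]])
```

On the real input, the frame would presumably come from `gpd.read_file` on the GeoJSON and be written with `GeoDataFrame.to_parquet("parcels_canonical.geoparquet")` after selecting the requested columns. None of the steps reproject, so the input CRS carries through as long as no `to_crs` call is added.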
Pitfalls
- Filtering slivers on raw geometry.area before running make_valid silently deletes the 20 bowtie parcels, because self-intersecting rings report area 0 and fall under the 1 m² threshold.
- Half-repairing a bowtie by keeping only one of the two triangles that make_valid produces halves the parcel's area and drops the union IoU below 0.99, which trips geometric_extent_preserved.
- Using keep="last" on the dedup step keeps the synthetic 900_000+ parcel_ids and the conflicting parcel_class values, so both the parcel_id Jaccard and the attribute-match subchecks collapse.
- Carrying the legacy area_m2 column through unchanged makes area_m2_recomputed fail on the 20 bowtie rows whose repaired area no longer matches the stale value.
- Writing the output in EPSG:4326 (the GeoJSON convention) instead of the input's EPSG:22992 costs both CRS subchecks even though the geometric work otherwise scores.
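The half-repair pitfall can be made concrete with a small calculation. The sketch below assumes a plain intersection-over-union definition; the notes do not show how geometric_extent_preserved actually computes its score, so the `iou` helper and the toy coordinates are illustrative only.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid


def iou(a, b):
    """Intersection over union of two geometries (assumed metric, hypothetical helper)."""
    union_area = a.union(b).area
    return a.intersection(b).area / union_area if union_area else 0.0


# Invented bowtie with two equal lobes.
bowtie = Polygon([(0, 0), (10, 10), (10, 0), (0, 10)])
full = make_valid(bowtie)  # both triangles, as a MultiPolygon

# Half-repair: keep a single triangle and discard the other.
one_triangle = max(full.geoms, key=lambda g: g.area)

half_iou = iou(one_triangle, full)
full_iou = iou(full, full)
print(round(half_iou, 3), round(full_iou, 3))
```

Keeping one lobe of an equal-lobed bowtie scores about 0.5 under this definition, far below the 0.99 threshold, while keeping the whole repaired geometry scores 1.0.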
Recent runs (task v3)
| adapter | started | score | steps | duration | cost | status |
|---|---|---|---|---|---|---|
| openrouter-gemma4-26b-basic | 2026-06-18T07:32:32Z | pending | — | — | — | pending |
| openrouter-deepseek-v4-flash-basic | 2026-06-18T03:08:04Z | 1.00 | 15 | 2:47 | 0.75¢ | done |
| openrouter-deepseek-v4-flash-detailed | 2026-06-17T22:01:33Z | 0.74 | 8 | 1:16 | 0.40¢ | done |
| openrouter-gemma4-26b-detailed | 2026-06-17T19:47:47Z | 0.94 | 15 | 1:00 | 0.86¢ | done |
| openrouter-deepseek-v4-flash-basic | 2026-06-16T21:43:55Z | 1.00 | 8 | 1:41 | 0.44¢ | done |