dc-l2-cairo-invalid-dedup

Analyst notes

Description

This task tests whether the agent can plan and execute a multi-step polygon-cleanup chain on a deliberately corrupted parcel snapshot. The hidden gotcha is that the bowtie polygons report a shapely `.area` of 0 (the two lobes wind in opposite directions, so their signed areas cancel), so a naive `geometry.area < 1` sliver filter silently deletes the bowtie features instead of repairing them. The agent has to run `make_valid` first, then filter slivers on the repaired geometry, then deduplicate and recompute area before coercing to MultiPolygon.
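
The gotcha can be reproduced in isolation. The sketch below uses a made-up symmetric bowtie (the coordinates are illustrative, not taken from the task data) to show the zero area before repair and the real area after `make_valid`.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid

# Hypothetical symmetric bowtie: the ring crosses itself at (1, 1).
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

print(bowtie.is_valid)  # False: the ring self-intersects
print(bowtie.area)      # 0.0: the two lobes cancel each other out

# A naive sliver filter would treat this feature as a fragment.
print(bowtie.area < 1)  # True

# Repairing first splits the bowtie into its two triangular lobes,
# each with an area of 1, so the area-based filter sees the real size.
repaired = make_valid(bowtie)
print(repaired.is_valid)
print(repaired.area)
```

Running the filter on `repaired` instead of `bowtie` keeps the feature, which is why the order of the steps matters.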

Approach

  1. Read the input GeoJSON and inspect a few rows to spot the mixed Polygon and MultiPolygon types, the self-intersecting bowties, the duplicated rows with conflicting attributes, and the sub-1 m² slivers.
  2. Repair invalid geometries first so subsequent area-based filters see the real area, not the 0 area shapely reports for the bowtie rings.
  3. Drop fragments below 1 m² using the repaired geometry's own area.
  4. Collapse exact-duplicate geometries by WKB equality, keeping the row with the smallest record_seq.
  5. Recompute area_m2 from the surviving geometry rather than carrying over the stale legacy value.
  6. Promote every single-part Polygon to a 1-part MultiPolygon so the output schema is homogeneous, and write parcels_canonical.geoparquet with the requested columns.
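
Steps 2 through 6 can be sketched as a single function. This is a minimal illustration, not the reference solution: the helper names `to_multipolygon` and `clean_parcels` are hypothetical, the geometry is assumed to be in a metric CRS so that `.area` is in m², and only the `record_seq` and `area_m2` columns named above are assumed to exist.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon
from shapely.validation import make_valid


def _polygonal_parts(geom):
    """Collect the Polygon parts of a geometry, ignoring lines and points."""
    if geom.geom_type == "Polygon":
        return [geom]
    if hasattr(geom, "geoms"):  # Multi* or GeometryCollection
        parts = []
        for sub in geom.geoms:
            parts.extend(_polygonal_parts(sub))
        return parts
    return []


def to_multipolygon(geom):
    """Wrap any polygonal geometry as a MultiPolygon (hypothetical helper)."""
    return MultiPolygon(_polygonal_parts(geom))


def clean_parcels(gdf, min_area=1.0):
    """Sketch of steps 2-6; assumes a metric CRS and a record_seq column."""
    # Step 2: repair first, so area-based filters see the real area.
    repaired = [make_valid(g) for g in gdf.geometry]
    out = gdf.set_geometry(
        gpd.GeoSeries(repaired, index=gdf.index, crs=gdf.crs)
    )

    # Step 3: drop slivers using the repaired geometry's own area.
    out = out[out.geometry.area >= min_area]

    # Step 4: collapse exact duplicates by WKB, keeping the smallest record_seq.
    out = out.assign(_wkb=[g.wkb for g in out.geometry])
    out = out.sort_values("record_seq").drop_duplicates("_wkb", keep="first")
    out = out.drop(columns="_wkb")

    # Step 5: recompute area_m2 from the surviving geometry.
    out = out.assign(area_m2=out.geometry.area)

    # Step 6: promote everything to MultiPolygon for a homogeneous schema.
    multi = [to_multipolygon(g) for g in out.geometry]
    return out.set_geometry(gpd.GeoSeries(multi, index=out.index, crs=out.crs))
```

In use, the result of `clean_parcels(gpd.read_file(...))` would be narrowed to the requested columns and written with `GeoDataFrame.to_parquet("parcels_canonical.geoparquet")`. The input path is left open here because the notes give only the dataset name, not its file name.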

Pitfalls

Inputs

name                  format   crs         geometry               features
cairo_parcels_legacy  geojson  EPSG:22992  MultiPolygon, Polygon  290

Expected outputs

name                          format      crs         geometry      features
parcels_canonical.geoparquet  geoparquet  EPSG:22992  MultiPolygon  210

Recent runs (task v3)

adapter                                started               score    steps  duration  cost   status
openrouter-gemma4-26b-basic            2026-06-18T07:32:32Z  pending                          pending
openrouter-deepseek-v4-flash-basic     2026-06-18T03:08:04Z  1.00     15     2:47      0.75¢  done
openrouter-deepseek-v4-flash-detailed  2026-06-17T22:01:33Z  0.74     8      1:16      0.40¢  done
openrouter-gemma4-26b-detailed         2026-06-17T19:47:47Z  0.94     15     1:00      0.86¢  done
openrouter-deepseek-v4-flash-basic     2026-06-16T21:43:55Z  1.00     8      1:41      0.44¢  done