spa-l2-cairo-shop-knn

Analyst notes

Description

This task chains three spatial primitives (k-nearest neighbours, within-distance flagging, and a small distance matrix) on a medium-scale point layer, with an attribute-cleaning step layered on top. The hidden gotcha is that the retail-chain names in the source are deliberately inconsistent across Arabic script, Latin script, and casing variants, and the prompt asks for consistency without naming the transliteration axes. The bundled GPKG is already in a metric CRS (EPSG:22992, Egypt 1907 / Red Belt), so a correct solution can compute distances directly, without reprojection.
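To make the naming gotcha concrete, here is a minimal sketch of one way to collapse script and casing variants onto a single canonical spelling. The alias table `ARABIC_TO_KEY`, the helper names `chain_key` and `canonical_names`, and the sample spellings are all hypothetical; the real layer's names have to be inspected first, and the rule for picking the canonical form (most frequent raw spelling) is an assumption rather than something the prompt specifies.

```python
import unicodedata
from collections import Counter, defaultdict

# Hypothetical alias table: Arabic-script spellings mapped to a Latin grouping key.
# In practice this would be built by hand after inspecting the raw names.
ARABIC_TO_KEY = {"كارفور": "carrefour"}

def chain_key(raw):
    """Reduce a raw shop name to a grouping key that ignores case, spacing and script."""
    text = unicodedata.normalize("NFKC", raw).strip().casefold()
    text = " ".join(text.split())  # collapse internal runs of whitespace
    return ARABIC_TO_KEY.get(text, text)

def canonical_names(raw_names):
    """Map every raw name to one canonical spelling per chain."""
    groups = defaultdict(Counter)
    for name in raw_names:
        groups[chain_key(name)][name.strip()] += 1
    # Assumption: the canonical form is the most frequent raw spelling;
    # ties are broken alphabetically so the result is deterministic.
    canon = {
        key: min(counts, key=lambda n: (-counts[n], n))
        for key, counts in groups.items()
    }
    return [canon[chain_key(name)] for name in raw_names]
```

Whatever rule is chosen, the point is that it is applied per chain and not per row, so every shop in a chain ends up with the same string.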

Approach

  1. Open the GPKG and inspect both layers to find the CRS, the column names, and the spread of raw shop names.
  2. For each of the 100 anchors, compute distances to all shops in the layer's metric CRS and keep the five nearest, sorted ascending.
  3. Derive the within-1 km boolean from the metric distance rather than guessing it from another column.
  4. For each anchor, find its three closest sibling anchors and build the 5 by 3 distance matrix from the chosen knn shops to those siblings, keeping rows aligned with the knn order.
  5. Cluster the raw shop names into chains across the transliteration variants, pick one canonical spelling per chain, and apply the same canonical name to every shop in that chain.
  6. Tidy the anchor names (strip whitespace and casing junk) and serialise the result as a top-level JSON array following the schema in the prompt.
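Steps 2 to 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the reference solution: coordinates are taken to be `(x, y)` tuples already in the layer's metric CRS, the function names and output keys (`knn`, `distance_m`, `within_1km`, `siblings`, `matrix`) are hypothetical and do not reproduce the schema from the prompt, and the 1 km test is assumed to be inclusive. A brute-force sort is used for clarity; a spatial index would be the usual choice at scale.

```python
import math

def nearest(origin, candidates, k):
    """Return the k closest candidates as (distance, index) pairs, ascending."""
    ranked = sorted((math.dist(origin, c), i) for i, c in enumerate(candidates))
    return ranked[:k]

def anchor_record(anchor_idx, anchors, shops, k_shops=5, k_siblings=3, radius_m=1000.0):
    """Build the kNN list, within-radius flags and shop-to-sibling matrix for one anchor."""
    origin = anchors[anchor_idx]

    # Step 2: nearest shops, sorted by ascending planar distance in metres.
    knn = nearest(origin, shops, k_shops)

    # Step 4a: closest sibling anchors, excluding the anchor itself.
    siblings = sorted(
        (math.dist(origin, a), i) for i, a in enumerate(anchors) if i != anchor_idx
    )[:k_siblings]

    # Step 4b: one row per kNN shop (same order as knn), one column per sibling.
    matrix = [
        [math.dist(shops[s], anchors[a]) for _, a in siblings]
        for _, s in knn
    ]

    return {
        "knn": [
            # Step 3: the flag is derived from the computed distance
            # (assumption: the boundary is inclusive).
            {"shop": s, "distance_m": d, "within_1km": d <= radius_m}
            for d, s in knn
        ],
        "siblings": [a for _, a in siblings],
        "matrix": matrix,
    }
```

Because the matrix rows are generated by iterating over `knn`, they stay aligned with the kNN order by construction.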

Pitfalls

Inputs

name          format  crs         geometry  features
cairo_retail  gpkg    EPSG:22992  Point     10,000

Expected outputs

name                        format  crs  geometry  features
market_neighbourhoods.json  json

Recent runs (task v2)

adapter                                started               score    steps  duration  cost   status
openrouter-gemma4-26b-basic            2026-06-18T07:32:32Z  pending                          pending
openrouter-deepseek-v4-flash-basic     2026-06-18T03:08:04Z  done     25     4:41      2.60¢  done
openrouter-deepseek-v4-flash-detailed  2026-06-17T22:01:33Z  done     21     2:43      0.87¢  done
openrouter-gemma4-26b-detailed         2026-06-17T19:47:47Z  done     14     2:29      2.82¢  done
openrouter-deepseek-v4-flash-basic     2026-06-16T21:43:55Z  1.00     19     3:29      1.38¢  done