dc-l2-lagos-snap-normalize
0:00
I need to get our legacy zoning layer ready for the state portal. lagos_zoning_legacy.gpkg has inconsistent zoning_class values (mixed casing, spelling variations, and some blank entries) plus vertex precision issues that cause tiny gaps and overlaps between adjacent parcels. Can you clean up the class names to consistent canonical TitleCase values, drop any rows with blank or whitespace-only classes, fix the vertex precision problems, and produce one merged geometry per canonical zoning class? Please write the result to zoning_aggregated.gpkg, with each row carrying two columns: zoning_class and a numeric area_m2.
0:00
System info
{
"adapter": "openrouter",
"model": "google/gemma-4-26b-a4b-it",
"base_url": "https://openrouter.ai/api/v1"
}
0:01
Bash
ls -l
total 2468
-rw-r--r-- 1 agent agent 2527232 Jun 16 12:14 lagos_zoning_legacy.gpkg
0:04
Write
/work/inspect_data.py
content · 12 lines · py
import geopandas as gpd
# Load the data
gdf = gpd.read_file('lagos_zoning_legacy.gpkg')
print("Columns:", gdf.columns.tolist())
print("First few rows:")
print(gdf.head())
print("\nValue counts for zoning_class:")
print(gdf['zoning_class'].value_counts(dropna=False))
print("\nCRS:", gdf.crs)
Wrote /work/inspect_data.py (294 chars).
0:06
Bash
python inspect_data.py
Columns: ['parcel_id', 'lga_source', 'zoning_class', 'area_m2', 'geometry']
First few rows:
parcel_id ... geometry
0 1 ... POLYGON ((540000 720000, 540010 720000, 540010...
1 2 ... POLYGON ((540010 720000, 540020 720000, 540020...
2 3 ... POLYGON ((540020 720000, 540030 720000, 540030...
3 4 ... POLYGON ((540030 720000, 540040 720000, 540040...
4 5 ... POLYGON ((540040 720000, 540050 720000, 540050...
[5 rows x 5 columns]
Value counts for zoning_class:
zoning_class
Residential 420
COMMERCIAL 420
industrial 420
Agri. 420
residential 419
resi. 419
Comm. 419
COMM. 419
indus. 419
AGRI. 419
Industrial 418
AGRICULTURAL 418
RESIDENTIAL 417
RESI. 417
comm. 417
commercial 417
Indus. 417
INDUSTRIAL 417
Agricultural 417
agri. 417
Resi. 416
Commercial 416
INDUS. 416
agricultural 416
13
13
NaN 12
\t 12
Name: count, dtype: int64
CRS: EPSG:26331
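The value counts above show four underlying classes spelled 24 different ways, plus empty, whitespace-only, tab, and NaN entries. A minimal sketch of one way to normalize them follows. It uses prefix matching after stripping case, whitespace, and a trailing dot, which is an assumption for illustration and not the approach the agent takes later; the `CANONICAL_BY_PREFIX` table and `canonical_class` helper are hypothetical names.

```python
# Prefix -> canonical TitleCase class. The four classes are read off the
# value counts above; the prefix table itself is an illustrative assumption.
CANONICAL_BY_PREFIX = {
    "resi": "Residential",
    "comm": "Commercial",
    "indus": "Industrial",
    "agri": "Agricultural",
}


def canonical_class(raw):
    """Return the canonical class name, or None for blank or unknown input."""
    if raw is None:
        return None
    # NaN is the only float that is not equal to itself.
    if isinstance(raw, float) and raw != raw:
        return None
    # Normalize case, surrounding whitespace (including tabs), and a trailing dot.
    key = str(raw).strip().lower().rstrip(".")
    if not key:
        return None
    for prefix, canonical in CANONICAL_BY_PREFIX.items():
        if key.startswith(prefix):
            return canonical
    # Unknown labels are returned as None so they can be reviewed
    # instead of being silently title-cased into a new class.
    return None
```

Returning `None` for unrecognized labels means a later `dropna` would remove them along with the blanks, so it may be worth counting them before dropping.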
0:52
Write
/work/solve.py
content · 95 lines · py
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon, MultiPolygon

def clean_zoning_class(val):
    if pd.isna(val):
        return None
    s = str(val).strip().lower()
    if not s:
        return None
    # Mapping dictionary for variations
    mapping = {
        'residential': 'Residential',
        'resi.': 'Residential',
        'resi': 'Residential',
        'commercial': 'Commercial',
        'comm.': 'Commercial',
        'comm': 'Commercial',
        'industrial': 'Industrial',
        'indus.': 'Industrial',
        'indus': 'Industrial',
        'agricultural': 'Agricultural',
        'agri.': 'Agricultural',
        'agri': 'Agricultural'
    }
    # Try direct mapping first
    if s in mapping:
        return mapping[s]
    # Fallback to Title Case if not in mapping, but let's see if we missed something
    return s.title()

def solve():
    # 1. Load data
    print("Loading data...")
    gdf = gpd.read_file('lagos_zoning_legacy.gpkg')

    # 2. Clean zoning_class
    print("Cleaning zoning_class...")
    gdf['zoning_class'] = gdf['zoning_class'].apply(clean_zoning_class)
    # Drop rows where zoning_class is None or whitespace-only (already handled by None in clean_zoning_class)
    gdf = gdf.dropna(subset=['zoning_class'])
    gdf = gdf[gdf['zoning_class'].str.strip() != '']

    # 3. Fix vertex precision issues (tiny gaps/overlaps)
    # A common way is to round coordinates to a certain precision or use a snap-to-grid approach.
    # Since it's a metric CRS (EPSG:26331), 1mm is 0.001. Let's round to 1cm (0.01) or 1mm (0.001).
    # Let's try rounding to 1mm (0.001) to avoid losing too much detail but snap vertices.
    print("Fixing vertex precision...")
    def snap_geometry(geom, precision=0.001):
        if geom is None:
            return None
        # Rounding coordinates is a simple way to snap to a grid
        from shapely.ops import transform
        def round_coords(x, y, z=None):
            return (round(x / precision) * precision, round(y / precision) * precision)
        return transform(lambda x, y, z=None: round_coords(x, y), geom)
    gdf['geometry'] = gdf['geometry'].apply(lambda g: snap_geometry(g))

    # 4. Merge geometries by zoning_class
    print("Merging geometries by zoning_class...")
    # We need to dissolve the geometries based on zoning_class.
    # To ensure we handle overlaps/gaps properly after snapping,
    # we should ideally use a buffer(0) or union.
    # Dissolve in geopandas is a union operation.
    # Group by zoning_class and dissolve
    dissolved = gdf.dissolve(by='zoning_class', aggfunc={'area_m2': 'sum'})
    # The dissolve operation results in 'zoning_class' being the index.
    # We need to move it back to a column and ensure 'area_m2' is preserved.
    dissolved = dissolved.reset_index()
    # Note: The area_m2 in the prompt might mean the area of the merged polygon.
    # "each row carrying two columns: zoning_class and a numeric area_m2"
    # If they want the total area of the resulting polygon, we should recalculate it.
    print("Recalculating merged area...")
    dissolved['area_m2'] = dissolved['geometry'].area

    # 5. Save to file
    print(f"Saving to zoning_aggregated.gpkg...")
    # Ensure columns are in correct order/subset
    output_gdf = dissolved[['zoning_class', 'area_m2', 'geometry']]
    output_gdf.to_file('zoning_aggregated.gpkg', driver='GPKG')
    print("Done!")
    print(output_gdf.head())

if __name__ == "__main__":
    solve()
Wrote /work/solve.py (3555 chars).
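The snapping step in solve.py rounds every vertex to a 1 mm grid so that neighbouring parcels agree on shared edges before the dissolve. The idea can be sketched without any GIS library, as below. The coordinate offsets are made up for illustration (the transcript does not show how large the real mismatches are), and `snap` and `snap_ring` are hypothetical helper names.

```python
def snap(value, decimals=3):
    """Round one coordinate to a fixed number of decimals.

    In a metric CRS, decimals=3 corresponds to a 1 mm grid.
    """
    return round(value, decimals)


def snap_ring(ring, decimals=3):
    """Snap every (x, y) vertex of a ring given as a list of tuples."""
    return [(snap(x, decimals), snap(y, decimals)) for x, y in ring]


# Two parcels whose shared edge disagrees by less than a millimetre
# (invented values, chosen only to demonstrate the effect).
left_edge = [(540010.0004, 720000.0), (540010.0004, 720010.0)]
right_edge = [(540009.9996, 720000.0), (540009.9996, 720010.0)]

# After snapping, both parcels describe the same edge, so a union of the
# two polygons would no longer leave a sliver gap or overlap there.
snapped_left = snap_ring(left_edge)
snapped_right = snap_ring(right_edge)
```

Rounding vertices one at a time does not by itself guarantee valid polygons. Shapely 2.x also provides `shapely.set_precision`, which snaps to a grid while keeping the output valid and may be a more robust choice if it is available in the environment.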
0:53
Bash
python solve.py
Loading data...
Cleaning zoning_class...
Fixing vertex precision...
Merging geometries by zoning_class...
Recalculating merged area...
Saving to zoning_aggregated.gpkg...
Done!
   zoning_class   area_m2                                           geometry
0  Agricultural  250000.0  POLYGON ((540500 720560, 540500 720570, 540500...
1    Commercial  250000.0  POLYGON ((540500 720100, 540500 720110, 540500...
2    Industrial  250000.0  POLYGON ((540000 720560, 540000 720570, 540000...
3   Residential  250000.0  POLYGON ((540000 720100, 540000 720110, 540000...
1:09
Bash
python -c "import geopandas as gpd; gdf = gpd.read_file('zoning_aggregated.gpkg'...
   zoning_class   area_m2                                           geometry
0  Agricultural  250000.0  POLYGON ((540500 720560, 540500 720570, 540500...
1    Commercial  250000.0  POLYGON ((540500 720100, 540500 720110, 540500...
2    Industrial  250000.0  POLYGON ((540000 720560, 540000 720570, 540000...
3   Residential  250000.0  POLYGON ((540000 720100, 540000 720110, 540000...
Index(['zoning_class', 'area_m2', 'geometry'], dtype='str')
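The read-back above confirms the column names but not the other requirements in the request. A small check over the attribute rows could cover them: one row per canonical class, and a finite, positive, numeric area_m2. The sketch below works on plain dictionaries so it runs without geopandas; the `check_rows` helper is hypothetical, and the sample rows copy the values printed above.

```python
import math

# The four canonical classes that appear in the aggregated output.
EXPECTED_CLASSES = {"Agricultural", "Commercial", "Industrial", "Residential"}


def check_rows(rows):
    """Return a list of problems found in (zoning_class, area_m2) records."""
    problems = []
    classes = [row["zoning_class"] for row in rows]
    if len(classes) != len(set(classes)):
        problems.append("duplicate zoning_class rows")
    if set(classes) != EXPECTED_CLASSES:
        problems.append(f"unexpected class set: {sorted(set(classes))}")
    for row in rows:
        area = row["area_m2"]
        is_number = isinstance(area, (int, float)) and not isinstance(area, bool)
        if not is_number or not math.isfinite(area) or area <= 0:
            problems.append(f"bad area for {row['zoning_class']}: {area!r}")
    return problems


# Attribute values as printed in the read-back above.
rows = [
    {"zoning_class": "Agricultural", "area_m2": 250000.0},
    {"zoning_class": "Commercial", "area_m2": 250000.0},
    {"zoning_class": "Industrial", "area_m2": 250000.0},
    {"zoning_class": "Residential", "area_m2": 250000.0},
]
problems = check_rows(rows)
```

With a GeoDataFrame, the same records could be obtained from the two attribute columns, for example via `to_dict("records")` on the non-geometry columns.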
1:32
Write
Invalid JSON arguments: Unterminated string starting at: line 1 column 13 (char 12)