fio-l2-capetown-landuse-dissolve
0:00
I'm prepping a transit-corridor study for the Cape Town metro, and I need a class-level summary of `landuse` so the team's spatial-SQL notebooks can join it against the bus-route table. Can you give me one row per landuse class with the geometry unified into a single MultiPolygon, the total area in m², and the count of source parcels that fed in? Please write the result to `landuse_dissolved.geoparquet` with `class`, `area_m2`, and `parcel_count` columns.
0:00
System info
{
"adapter": "openrouter",
"model": "google/gemma-4-26b-a4b-it",
"base_url": "https://openrouter.ai/api/v1"
}
0:02
Bash
ls -F
capetown_landuse.fgb
0:06
Bash
python -c "import geopandas as gpd; gdf = gpd.read_file('capetown_landuse.fgb');...
                                     id  ...                                           geometry
0  f328c054-b040-3a7d-a015-53747d608fef  ...  MULTIPOLYGON (((872000.605 6252622.417, 871205...
1  f1943847-f785-390c-abb0-edc923f97997  ...  MULTIPOLYGON (((253543.223 6283900.241, 254921...
2  c37e40aa-f202-3b15-b68d-89dafeeaa478  ...  MULTIPOLYGON (((255047.308 6280773.284, 255024...
3  b9d39db7-5dcc-379e-bfeb-c994628df5a7  ...  MULTIPOLYGON (((254978.723 6280842.179, 254992...
4  0033302c-c1d6-3624-b71e-0e300db64963  ...  MULTIPOLYGON (((264759.63 6280552.85, 264825.5...
[5 rows x 4 columns]
EPSG:32734
Index(['id', 'class', 'subtype', 'geometry'], dtype='str')
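The inspection reports EPSG:32734, and the area figures later in the session rely on that CRS being projected with metre units. A minimal sketch of how that assumption could be checked explicitly rather than inferred from coordinate magnitudes; the helper name `is_metric_projected` is hypothetical and not part of the session:

```python
from pyproj import CRS


def is_metric_projected(crs_like) -> bool:
    """Return True if the CRS is projected and every axis is in metres.

    Hypothetical helper for illustration; accepts anything that
    pyproj's CRS.from_user_input understands (EPSG string, WKT, ...).
    """
    crs = CRS.from_user_input(crs_like)
    return crs.is_projected and all(
        axis.unit_name in ("metre", "meter") for axis in crs.axis_info
    )


utm_34s_ok = is_metric_projected("EPSG:32734")  # UTM zone 34S
wgs84_ok = is_metric_projected("EPSG:4326")     # geographic, degrees
print(utm_34s_ok, wgs84_ok)
```

With a GeoDataFrame in hand, the same check could be applied to `gdf.crs` before calling `.area`.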
1:08
Write
/work/solve.py
content · 66 lines · py
import geopandas as gpd


def solve():
    # Load the landuse data
    input_file = 'capetown_landuse.fgb'
    output_file = 'landuse_dissolved.geoparquet'
    gdf = gpd.read_file(input_file)

    # Ensure we are working with a metric CRS for area calculations
    # The input is EPSG:32734, which is UTM zone 34S (metric)
    # We'll verify if it's already metric or if we need to project.
    # Since the print showed large numbers like 872000, it's likely metric.

    # Group by 'class'
    # We need:
    # 1. Unified geometry (dissolved)
    # 2. Total area in m2
    # 3. Count of source parcels (rows in the original gdf per class)

    # To get the count and area, we can aggregate first, then dissolve.
    # Or dissolve and then calculate area.
    # But we need the count of original parcels.

    # Step 1: Aggregate to get count and area per class
    # Note: area must be calculated on individual geometries or the dissolved one.
    # Let's calculate area for each parcel first to be safe.
    gdf['area_m2_indiv'] = gdf.geometry.area
    summary = gdf.groupby('class').agg(
        area_m2=('area_m2_indiv', 'sum'),
        parcel_count=('id', 'count')
    ).reset_index()

    # Step 2: Dissolve geometries by class
    # We dissolve the original gdf to get the unified MultiPolygon
    dissolved_gdf = gdf.dissolve(by='class')

    # The dissolved_gdf index is 'class'. We need to merge the summary stats.
    # dissolved_gdf currently has only 'geometry' (and whatever else was in gdf)
    # Let's make it cleaner.
    # Re-dissolve to ensure we only have the geometry and the class index
    dissolved_gdf = gdf.dissolve(by='class')

    # Merge the summary stats back
    # summary has 'class', 'area_m2', 'parcel_count'
    # dissolved_gdf has index 'class'
    final_gdf = dissolved_gdf[['geometry']].merge(summary, left_index=True, right_on='class')

    # Reorder columns to match requested: class, area_m2, parcel_count, geometry
    # (GeoDataFrame will keep geometry as the active geometry column)
    final_gdf = final_gdf[['class', 'area_m2', 'parcel_count', 'geometry']]

    # Set geometry back correctly
    final_gdf = final_gdf.set_geometry('geometry')

    # Write to GeoParquet
    final_gdf.to_parquet(output_file)
    print(f"Successfully wrote {len(final_gdf)} rows to {output_file}")
    print(final_gdf.head())


if __name__ == "__main__":
    solve()
Wrote /work/solve.py (2454 chars).
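The script above dissolves twice and merges a separate groupby summary back in. As an alternative sketch (not what the session ran), `GeoDataFrame.dissolve` accepts a per-column `aggfunc` dict, so geometry union, area sum, and parcel count can come out of one call. The three-parcel input below is synthetic and purely illustrative:

```python
import geopandas as gpd
from shapely.geometry import box

# Synthetic stand-in for the landuse layer: two adjacent "park" squares
# and one "retail" square, each 1 m x 1 m in a metric CRS.
parcels = gpd.GeoDataFrame(
    {"id": ["a", "b", "c"], "class": ["park", "park", "retail"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
    crs="EPSG:32734",
)

# Per-parcel area, computed before the geometries are unioned.
parcels["area_m2"] = parcels.geometry.area

# One dissolve call: union geometry per class, sum areas, count ids.
dissolved = parcels.dissolve(
    by="class",
    aggfunc={"area_m2": "sum", "id": "count"},
    as_index=False,
).rename(columns={"id": "parcel_count"})

# Match the requested column order.
dissolved = dissolved[["class", "area_m2", "parcel_count", "geometry"]]
print(dissolved)
```

Columns not named in the `aggfunc` dict (such as `subtype` in the real layer) are dropped from the result, which avoids the extra column selection step.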
1:10
Bash
python solve.py
Successfully wrote 72 rows to landuse_dissolved.geoparquet
class ... geometry
0 airfield ... POLYGON ((267829.758 6244743.665, 267820.919 6...
1 allotments ... MULTIPOLYGON (((263870.512 6232477.907, 263865...
2 animal_keeping ... MULTIPOLYGON (((264114.249 6261265.896, 264220...
3 barracks ... MULTIPOLYGON (((262288.681 6215543.331, 262280...
4 base ... POLYGON ((272069.73 6245748.538, 272064.541 62...
[5 rows x 4 columns]
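The preview above shows `airfield` and `base` as POLYGON, while the request asked for a single MultiPolygon per class. A dissolve returns a plain Polygon when a class unions to one connected part. A minimal sketch of one way to promote those rows, assuming only Polygon and MultiPolygon results occur; the helper name `to_multipolygon` is hypothetical:

```python
from shapely.geometry import MultiPolygon, Polygon, box


def to_multipolygon(geom):
    """Wrap a lone Polygon in a MultiPolygon; pass anything else through.

    Hypothetical helper for illustration. Geometry types other than
    Polygon and MultiPolygon are returned unchanged.
    """
    if isinstance(geom, Polygon):
        return MultiPolygon([geom])
    return geom


single = to_multipolygon(box(0, 0, 1, 1))
already_multi = to_multipolygon(
    MultiPolygon([box(0, 0, 1, 1), box(5, 5, 6, 6)])
)
print(single.geom_type, already_multi.geom_type)
```

In the script this could be applied with `final_gdf.geometry.apply(to_multipolygon)` before writing the file, so every row carries the same geometry type.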
1:21
Bash
python -c "import geopandas as gpd; gdf = gpd.read_parquet('landuse_dissolved.ge...
Index(['class', 'area_m2', 'parcel_count', 'geometry'], dtype='str')
class ... geometry
0 airfield ... POLYGON ((267829.758 6244743.665, 267820.919 6...
1 allotments ... MULTIPOLYGON (((263870.512 6232477.907, 263865...
2 animal_keeping ... MULTIPOLYGON (((264114.249 6261265.896, 264220...
3 barracks ... MULTIPOLYGON (((262288.681 6215543.331, 262280...
4 base ... POLYGON ((272069.73 6245748.538, 272064.541 62...
[5 rows x 4 columns]
class str
area_m2 float64
parcel_count int64
geometry geometry
dtype: object
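The checks above confirm column names and dtypes but not whether `area_m2` agrees with the dissolved geometry. Because the script sums per-parcel areas, any overlap between source parcels would make `area_m2` larger than the area of the unioned shape. Whether the Cape Town parcels overlap is not shown in the session, so the sketch below uses two synthetic overlapping rectangles to illustrate the difference:

```python
from shapely.geometry import box
from shapely.ops import unary_union

# Two 2 m x 2 m rectangles overlapping by a 1 m x 2 m strip (synthetic).
parcel_a = box(0, 0, 2, 2)
parcel_b = box(1, 0, 3, 2)

# Area as the script computes it: sum of individual parcel areas.
summed_area = parcel_a.area + parcel_b.area

# Area of the unioned footprint, where the overlap is counted once.
dissolved_area = unary_union([parcel_a, parcel_b]).area

# The gap between the two figures is the double-counted overlap.
overlap_area = summed_area - dissolved_area
print(summed_area, dissolved_area, overlap_area)
```

On the real output, comparing `area_m2` against `geometry.area` per row would show which classes, if any, contain overlapping parcels; which figure counts as "total area" depends on what the corridor study needs.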
1:29