dd-l3-lagos-overture-buildings
Map (reference ← swipe → agent)
0:00
Updating the flood-risk model before rainy season. Need every building footprint exceeding 1000 m² across Lagos State (Nigeria) from Overture.
Also need a per-LGA roll-up: for each Lagos Local Government Area, the building count, total footprint area, count of buildings with a non-null Overture height, and the median height where available (null otherwise).
Write two output files. `lagos_buildings.geoparquet` should be EPSG:4326 GeoParquet with columns `id`, `height`, `footprint_area_m2`, `lga`, and `geometry`. `lagos_building_summary.parquet` should be a plain Parquet with no geometry column and columns `lga`, `n_buildings`, `total_footprint_m2`, `n_with_height`, and `p50_height_m`.
0:00
System info
{
"adapter": "openrouter",
"model": "deepseek/deepseek-v4-flash",
"base_url": "https://openrouter.ai/api/v1"
}
0:07
Let me break down this task: 1. Get Overture Maps building footprints for Lagos State, Nigeria 2. Filter to buildings > 1000 m² 3. Get LGA (Local Government Area) boundaries for Lagos State 4. Spatial join buildings to LGAs 5. Create two output files: - `lagos_buildings.geoparquet` - EPSG:4326 with columns: id, height, footprint_area_m2, lga, geometry - `lagos_building_summary.parquet` - plain Parquet with columns: lga, n_buildings, total_footprint_m2, n_with_height, p50_height_m Let me start by exploring what tools are available and writing a script. First, let me check what's available in the environment.
0:07
Bash
pip list 2>/dev/null | grep -iE "overture|geopandas|shapely|duckdb|pyogrio|fiona...
duckdb 1.5.3 fiona 1.10.1 geopandas 1.1.3 overturemaps 1.0.0 pyogrio 0.12.1 shapely 2.1.2
0:08
Bash
ls /work/
(no output)
0:30
Good, all needed libraries are present. Let me write the Python script.
Plan:
1. Use `overturemaps` to download buildings for Lagos State, Nigeria. I'll need to know the bounding box for Lagos State.
2. Filter to buildings with footprint area > 1000 m².
3. Join with LGA boundaries to assign each building to an LGA.
4. Compute the summary per LGA.
For the LGA boundaries, I need to get the LGAs of Lagos State. I can use Overture's divisions or fetch from Natural Earth / other sources. Let me think...
Actually, Overture has division data (administrative boundaries). I can download the divisions for Nigeria and filter to Lagos State LGAs. Overture has `divisions` theme with administrative boundaries.
Alternatively, I could use a shapefile from Natural Earth or GADM. But Overture should have this.
Let me check how overturemaps works - the Python library has `overturemaps_download` CLI and `overturemaps` Python module.
Let me write the script step by step.
First, I need to find the bounding box for Lagos State. Lagos State in Nigeria is approximately between 6.2°N to 6.8°N and 2.7°E to 4.0°E.
Let me get the Lagos State boundary from Overture divisions, then use that to clip buildings.
Actually, the approach:
1. Download administrative divisions for Nigeria from Overture (level 4 for state, level 6 for LGA).
2. Filter to Lagos State.
3. Use the Lagos State boundary to spatially filter buildings.
4. For each building intersecting Lagos State, compute footprint area, and join with LGA.
Let me write the script.
Actually, Overture's Python library is pretty simple. Let me check the overturemaps API.
```python
import overturemaps
overturemaps.record_batch_reader("theme", "type", bbox=...)
```
The themes are: "buildings", "places", "divisions", "transportation", etc.
For divisions, the types are: "division_area", "division_boundary", "locality", etc.
Let me look at the division_area schema for Nigeria.
Let me write a comprehensive script.
Actually, I'll start by exploring the data a bit.
Let me think about what bounding box to use. Lagos State covers roughly:
- West: ~2.68°E
- East: ~4.0°E
- South: ~6.2°N
- North: ~6.8°N
Let me be generous: bbox around (2.5, 6.1, 4.2, 7.0).
Then I'll:
1. Download divisions for Nigeria to get LGA boundaries
2. Download buildings within the bbox
3. Filter buildings to those actually within Lagos State
4. Spatially join buildings to LGAs
5. Compute summary
Let me write the full script.
Actually, let me be more efficient. I'll:
1. Get Lagos State boundary from Overture divisions
2. Get LGA boundaries for Lagos from Overture divisions
3. Get buildings within the Lagos bbox from Overture
4. Filter buildings to those intersecting Lagos State polygon
5. Compute area for each building
6. Filter to area > 1000 m²
7. Spatial join with LGA polygons
8. Compute the summary statistics
9. Write outputs
Let me write this script.
For the division data, in Overture the administrative divisions use `division_area` type with properties like `subtype` (country, region, county/municipality, locality) and `country`, `region`, etc.
For Nigeria:
- Country level: subtype='country', region='NG'
- State level: might be subtype='region'
- LGA level: might be subtype='county' or 'municipality' or 'locality'
Let me check by exploring.
Actually, let me look up Overture's division schema quickly.
In Overture Maps, `division_area` has:
- `subtype`: One of "country", "region", "county", "locality", "macrocounty", "macroregion"
- `country`: ISO code like "NG"
- `region`: State name
- `locality`: City/town name
- `subregion`: County/LGA name
For Nigeria:
- Country: subtype='country', region='Nigeria'
- States: subtype='region', country='NG'
- LGAs: subtype='county' or 'locality', region='Lagos'
Let me write a script that first explores the divisions to understand the schema.
0:30
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
# Explore the schema for divisions
reader = overturemaps.record_batch_reader('divisions', 'division_area')
import pyarrow as pa
print('Schema:')
print(reader.schema)
"
Traceback (most recent call last):
File "<string>", line 4, in <module>
reader = overturemaps.record_batch_reader('divisions', 'division_area')
File "/usr/local/lib/python3.14/site-packages/overturemaps/core.py", line 307, in record_batch_reader
result = _prepare_query(
overture_type, bbox, release, connect_timeout, request_timeout, stac
)
File "/usr/local/lib/python3.14/site-packages/overturemaps/core.py", line 244, in _prepare_query
path = _dataset_path(overture_type, release)
File "/usr/local/lib/python3.14/site-packages/overturemaps/core.py", line 409, in _dataset_path
theme = type_theme_map[overture_type]
~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'divisions'
0:38
The overturemaps library expects the type to be something like 'building' etc. Let me check the correct API. The `divisions` type doesn't exist as an overture_type. For divisions, I need to query the 'divisions' theme differently. Let me look at the overturemaps library to understand the API better.
0:38
Bash
cd /work && python3 -c "
cd /work && python3 -c " import overturemaps print(dir(overturemaps)) print() help(overturemaps.record_batch_reader) "
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'core', 'geodataframe', 'get_all_overture_types', 'models', 'record_batch_reader']
Help on function record_batch_reader in module overturemaps.core:
record_batch_reader(
overture_type,
bbox=None,
release=None,
connect_timeout=None,
request_timeout=None,
stac=False
) -> Optional[pa.RecordBatchReader]
Return a pyarrow RecordBatchReader for the desired bounding box and s3 path, or None on error.
0:46
Let me look at what overture types are available.
0:46
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
types = overturemaps.get_all_overture_types()
print('All types:')
for t in sorted(types):
print(t)
"
All types: address bathymetry building building_part connector division division_area division_boundary infrastructure land land_cover land_use place segment water
0:52
So `division_area` is a recognized type. Let's see the schema and first few records for divisions to figure out how to map LGA names.
0:52
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
# Look at divisions schema
reader = overturemaps.record_batch_reader('division_area', release='2024-09-18-alpha')
if reader:
schema = reader.schema
print('Schema:')
print(schema)
# Get a few rows to understand structure
import pyarrow as pa
for batch in reader:
df = batch.to_pandas()
print('Columns:', list(df.columns))
print('Sample rows:')
print(df.head(3).to_string())
break
else:
print('No reader returned')
"
Traceback (most recent call last):
File "<string>", line 4, in <module>
reader = overturemaps.record_batch_reader('division_area', release='2024-09-18-alpha')
File "/usr/local/lib/python3.14/site-packages/overturemaps/core.py", line 307, in record_batch_reader
result = _prepare_query(
overture_type, bbox, release, connect_timeout, request_timeout, stac
)
File "/usr/local/lib/python3.14/site-packages/overturemaps/core.py", line 266, in _prepare_query
dataset = ds.dataset(
intersecting_files if intersecting_files is not None else path,
...<5 lines>...
),
)
File "/usr/local/lib/python3.14/site-packages/pyarrow/dataset.py", line 790, in dataset
return _filesystem_dataset(source, **kwargs)
File "/usr/local/lib/python3.14/site-packages/pyarrow/dataset.py", line 472, in _filesystem_dataset
fs, paths_or_selector = _ensure_single_source(source, filesystem)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/pyarrow/dataset.py", line 437, in _ensure_single_source
raise FileNotFoundError(path)
FileNotFoundError: overturemaps-us-west-2/release/2024-09-18-alpha/theme=divisions/type=division_area/
0:59
The default release seems to not have divisions. Let me try the latest release or just use the building data which we know exists. Let me explore the building schema first.
0:59
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
# Check building schema first
reader = overturemaps.record_batch_reader('building')
if reader:
schema = reader.schema
print('Building Schema:')
print(schema)
for batch in reader:
df = batch.to_pandas()
print('Columns:', list(df.columns))
print('Sample row:')
print(df.iloc[0])
break
else:
print('No building reader returned')
"
Building Schema:
id: string
names: struct<primary: string, common: map<string, string ('common')>, rules: list<element: struct<variant: (... 159 chars omitted)
child 0, primary: string
child 1, common: map<string, string ('common')>
child 0, common: struct<key: string not null, value: string> not null
child 0, key: string not null
child 1, value: string
child 2, rules: list<element: struct<variant: string, language: string, perspectives: struct<mode: string, countries (... 87 chars omitted)
child 0, element: struct<variant: string, language: string, perspectives: struct<mode: string, countries: list<element (... 72 chars omitted)
child 0, variant: string
child 1, language: string
child 2, perspectives: struct<mode: string, countries: list<element: string>>
child 0, mode: string
child 1, countries: list<element: string>
child 0, element: string
child 3, value: string
child 4, between: list<element: double>
child 0, element: double
child 5, side: string
sources: list<element: struct<property: string, dataset: string, license: string, record_id: string, update_t (... 65 chars omitted)
child 0, element: struct<property: string, dataset: string, license: string, record_id: string, update_time: string, c (... 50 chars omitted)
child 0, property: string
child 1, dataset: string
child 2, license: string
child 3, record_id: string
child 4, update_time: string
child 5, confidence: double
child 6, between: list<element: double>
child 0, element: double
level: int32
height: double
min_height: double
is_underground: bool
num_floors: int32
num_floors_underground: int32
min_floor: int32
subtype: string
class: string
facade_color: string
facade_material: string
roof_material: string
roof_shape: string
roof_direction: double
roof_orientation: string
roof_color: string
roof_height: double
geometry: binary
-- field metadata --
ARROW:extension:name: 'geoarrow.wkb'
has_parts: bool
version: int32
bbox: struct<xmin: double, xmax: double, ymin: double, ymax: double>
child 0, xmin: double
child 1, xmax: double
child 2, ymin: double
child 3, ymax: double
-- schema metadata --
geo: '{"version": "1.1.0", "primary_column": "geometry", "columns": {"geo' + 281
Columns: ['id', 'names', 'sources', 'level', 'height', 'min_height', 'is_underground', 'num_floors', 'num_floors_underground', 'min_floor', 'subtype', 'class', 'facade_color', 'facade_material', 'roof_material', 'roof_shape', 'roof_direction', 'roof_orientation', 'roof_color', 'roof_height', 'geometry', 'has_parts', 'version', 'bbox']
Sample row:
id 76c4a544-9ad3-4da0-be7d-22892dcbeb58
names {'primary': 'St. Paul's Burgers', 'common': No...
sources [{'property': '', 'dataset': 'OpenStreetMap', ...
level NaN
height NaN
min_height NaN
is_underground False
num_floors NaN
num_floors_underground NaN
min_floor NaN
subtype religious
class NaN
facade_color NaN
facade_material NaN
roof_material NaN
roof_shape NaN
roof_direction NaN
roof_orientation NaN
roof_color NaN
roof_height NaN
geometry b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x06\x00...
has_parts False
version 1
bbox {'xmin': -179.96853637695312, 'xmax': -179.963...
Name: 0, dtype: object
1:52
Now I understand the building schema. I need to: 1. Get buildings for Lagos State, Nigeria, with area > 1000 m² 2. Get LGAs (Local Government Areas) for Lagos State - these would be in division data 3. Spatial join to get per-LGA stats Let me first figure out the division_area schema and get Lagos boundaries, then query buildings within Lagos. Let me think about the approach: 1. Use Overture to get division_area data for Nigeria to find Lagos State and its LGAs 2. Use a bounding box for Lagos to filter buildings 3. Compute footprint area from geometry 4. Filter to buildings > 1000 m² 5. Spatial join with LGAs 6. Compute summary stats Let me start by getting the Lagos division_area info.
1:52
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
# Look at division schema
reader = overturemaps.record_batch_reader('division')
if reader:
schema = reader.schema
print('Division Schema:')
print(schema)
for batch in reader:
df = batch.to_pandas()
print('Columns:', list(df.columns))
print('Sample row:')
print(df.iloc[0])
break
else:
print('No reader returned')
"
Division Schema:
id: string
geometry: binary
-- field metadata --
ARROW:extension:name: 'geoarrow.wkb'
country: string
sources: list<element: struct<property: string, dataset: string, license: string, record_id: string, update_t (... 65 chars omitted)
child 0, element: struct<property: string, dataset: string, license: string, record_id: string, update_time: string, c (... 50 chars omitted)
child 0, property: string
child 1, dataset: string
child 2, license: string
child 3, record_id: string
child 4, update_time: string
child 5, confidence: double
child 6, between: list<element: double>
child 0, element: double
subtype: string
admin_level: int32
class: string
names: struct<primary: string, common: map<string, string ('common')>, rules: list<element: struct<variant: (... 159 chars omitted)
child 0, primary: string
child 1, common: map<string, string ('common')>
child 0, common: struct<key: string not null, value: string> not null
child 0, key: string not null
child 1, value: string
child 2, rules: list<element: struct<variant: string, language: string, perspectives: struct<mode: string, countries (... 87 chars omitted)
child 0, element: struct<variant: string, language: string, perspectives: struct<mode: string, countries: list<element (... 72 chars omitted)
child 0, variant: string
child 1, language: string
child 2, perspectives: struct<mode: string, countries: list<element: string>>
child 0, mode: string
child 1, countries: list<element: string>
child 0, element: string
child 3, value: string
child 4, between: list<element: double>
child 0, element: double
child 5, side: string
wikidata: string
perspectives: struct<mode: string, countries: list<element: string>>
child 0, mode: string
child 1, countries: list<element: string>
child 0, element: string
local_type: map<string, string ('local_type')>
child 0, local_type: struct<key: string not null, value: string> not null
child 0, key: string not null
child 1, value: string
region: string
hierarchies: list<element: list<element: struct<division_id: string, subtype: string, name: string>>>
child 0, element: list<element: struct<division_id: string, subtype: string, name: string>>
child 0, element: struct<division_id: string, subtype: string, name: string>
child 0, division_id: string
child 1, subtype: string
child 2, name: string
parent_division_id: string
norms: struct<driving_side: string>
child 0, driving_side: string
population: int32
capital_division_ids: list<element: string>
child 0, element: string
capital_of_divisions: list<element: struct<division_id: string, subtype: string>>
child 0, element: struct<division_id: string, subtype: string>
child 0, division_id: string
child 1, subtype: string
cartography: struct<prominence: int32, min_zoom: int32, max_zoom: int32, sort_key: int32>
child 0, prominence: int32
child 1, min_zoom: int32
child 2, max_zoom: int32
child 3, sort_key: int32
version: int32
bbox: struct<xmin: double, xmax: double, ymin: double, ymax: double>
child 0, xmin: double
child 1, xmax: double
child 2, ymin: double
child 3, ymax: double
-- schema metadata --
geo: '{"version": "1.1.0", "primary_column": "geometry", "columns": {"geo' + 275
Columns: ['id', 'geometry', 'country', 'sources', 'subtype', 'admin_level', 'class', 'names', 'wikidata', 'perspectives', 'local_type', 'region', 'hierarchies', 'parent_division_id', 'norms', 'population', 'capital_division_ids', 'capital_of_divisions', 'cartography', 'version', 'bbox']
Sample row:
id 23e81262-d6ed-45a3-a1a0-4bc6a2a887d8
geometry b'\x01\x01\x00\x00\x00)\xcb\x10\xc7\xbaha\xc0\...
country NaN
sources [{'property': '', 'dataset': 'OpenStreetMap', ...
subtype locality
admin_level NaN
class village
names {'primary': 'Amundsen–Scott South Pole', 'comm...
wikidata Q243307
perspectives None
local_type [(en, village)]
region NaN
hierarchies [[{'division_id': '23e81262-d6ed-45a3-a1a0-4bc...
parent_division_id NaN
norms None
population 49.0
capital_division_ids None
capital_of_divisions None
cartography {'prominence': 20.0, 'min_zoom': None, 'max_zo...
version 1
bbox {'xmin': -139.27281188964844, 'xmax': -139.272...
Name: 0, dtype: object
2:41
Let me understand the structure better. Division records have: - `country` field - `subtype` field (like "locality", "region", "country", etc.) - `names.primary` for the name - `region` for the region - `admin_level` - `geometry` (geoarrow.wkb) I need to find Lagos State in Nigeria. Let me think about the best approach. Plan: 1. Get the division boundary for Lagos State (Nigeria) to use as a spatial filter 2. Read all building footprints that intersect Lagos state boundary 3. Filter to those with area > 1000 m² 4. Spatially join with LGA boundaries to get LGA name for each building 5. Compute per-LGA summary stats 6. Write output files First, let me find the Lagos state division and get its geometry. Then I need LGA boundaries which are admin_level 3 or so in Nigeria. Let me first query the divisions to find Lagos State.
2:41
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import overturemaps
# Get divisions to find Lagos State
reader = overturemaps.record_batch_reader('division')
# We need to filter for country=NG (Nigeria) and names.primary like '%Lagos%'
# Let's use DuckDB approach instead - filter with pyarrow
import pyarrow.compute as pc
import pyarrow as pa
import geopandas as gpd
from shapely import wkb
# First pass - just grab a few to see country codes
for batch in reader:
df = batch.to_pandas()
# Check countries
countries = df['country'].dropna().unique()
print('Sample countries:', countries[:10])
# Check subtypes
print('Subtypes:', df['subtype'].unique()[:20])
# Check region values
print('Region sample:', df['region'].dropna().unique()[:10])
break
"
Sample countries: <ArrowStringArray> ['NZ', 'NU', 'TO', 'FJ', 'WF', 'AS', 'TK', 'WS', 'CK', 'PF'] Length: 10, dtype: str Subtypes: <ArrowStringArray> [ 'locality', 'county', 'region', 'dependency', 'neighborhood', 'country', 'macrohood', 'microhood'] Length: 8, dtype: str Region sample: <ArrowStringArray> ['NZ-CIT', 'TO-05', 'TO-02', 'TO-01', 'TO-04', 'FJ-E', 'FJ-N', 'TO-03', 'WF-UV', 'WF-SG'] Length: 10, dtype: str
3:17
Let me use a more efficient approach. I'll use DuckDB with spatial extension to query both buildings and divisions. The overturemaps library gives us S3 paths, but I can use DuckDB to filter by bounding box or directly query the data. Actually, let me use a smarter approach - I'll get the Lagos state boundary first, then use it to filter buildings. Let me use DuckDB which is more efficient for this kind of spatial filtering. Let me first figure out how to get Lagos State boundary. I'll use overturemaps with DuckDB.
3:17
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import duckdb
con = duckdb.connect()
con.execute('INSTALL spatial; LOAD spatial;')
# Query divisions to find Lagos
result = con.execute(\"\"\"
SELECT id, names, subtype, admin_level, country, region, geometry
FROM st_read('s3://overturemaps-us-west-2/release/2025-05-13-alpha/theme=divisions/type=division/*')
WHERE country = 'NG'
AND subtype = 'region'
AND names->>'primary' LIKE '%Lagos%'
LIMIT 5
\"\"\").fetchdf()
print(result[['id', 'names', 'subtype', 'region']].to_string())
"
Traceback (most recent call last):
File "<string>", line 6, in <module>
result = con.execute("""
~~~~~~~~~~~^^^^
SELECT id, names, subtype, admin_level, country, region, geometry
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<4 lines>...
LIMIT 5
^^^^^^^
""").fetchdf()
^^^^
_duckdb.IOException: IO Error: Could not open GDAL dataset at: s3://overturemaps-us-west-2/release/2025-05-13-alpha/theme=divisions/type=division/*
LINE 3: FROM st_read('s3://overturemaps-us-west-2/release/2025-05-13...
^
3:27
Let me try a different approach. I'll use overturemaps library to get data with a bounding box for Nigeria/Lagos, or use DuckDB with httpfs to read parquet files directly.
3:27
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
# Let's look at what release is latest and get divisions for Nigeria
# First get the building data with filters using pyarrow
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.dataset as ds
# For divisions, use DuckDB with httpfs plugin to query parquet directly
import duckdb
con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("SET s3_region='us-west-2';")
con.execute("SET s3_access_key_id='';")
con.execute("SET s3_secret_access_key='';")
con.execute("SET s3_session_token='';")
con.execute("SET s3_use_ssl=true;")
# List the release directory to find available releases
result = con.execute("""
SELECT *
FROM parquet_metadata('s3://overturemaps-us-west-2/release/2025-05-13-alpha/theme=divisions/type=division/*')
LIMIT 3
""").fetchdf()
print(result.to_string())
PYEOF
Traceback (most recent call last):
File "<stdin>", line 20, in <module>
_duckdb.IOException: IO Error: No files found that match the pattern "s3://overturemaps-us-west-2/release/2025-05-13-alpha/theme=divisions/type=division/*"
LINE 3: FROM parquet_metadata('s3://overturemaps-us-west-2/release/2025...
^
3:40
It seems like the latest release might not have division data. Let me use the overturemaps library which handles this more gracefully. Let me use the DuckDB approach with pyarrow dataset. Actually, let me try the overturemaps library properly - it seems to work with the divisions type (same as division_area might be the same thing). Let me check.
3:40
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
# List available releases
# Try different release formats
import overturemaps.core as core
# Let's use the building theme first - just download buildings near Lagos
# Lagos bounding box roughly: 2.68 to 4.0 longitude, 6.2 to 6.8 latitude
bbox = (2.5, 6.2, 4.0, 6.8)
# Get building reader for just Lagos area
reader = overturemaps.record_batch_reader('building', bbox=bbox)
if reader:
schema = reader.schema
print('Got building reader with schema')
print(schema)
# Read some data
import pyarrow as pa
count = 0
for batch in reader:
df = batch.to_pandas()
count += len(df)
print(f'Batch size: {len(df)}, running total: {count}')
if count > 1000:
break
print(f'Total rows read: {count}')
else:
print('No reader')
PYEOF
Got building reader with schema
id: string
names: struct<primary: string, common: map<string, string ('common')>, rules: list<element: struct<variant: (... 159 chars omitted)
child 0, primary: string
child 1, common: map<string, string ('common')>
child 0, common: struct<key: string not null, value: string> not null
child 0, key: string not null
child 1, value: string
child 2, rules: list<element: struct<variant: string, language: string, perspectives: struct<mode: string, countries (... 87 chars omitted)
child 0, element: struct<variant: string, language: string, perspectives: struct<mode: string, countries: list<element (... 72 chars omitted)
child 0, variant: string
child 1, language: string
child 2, perspectives: struct<mode: string, countries: list<element: string>>
child 0, mode: string
child 1, countries: list<element: string>
child 0, element: string
child 3, value: string
child 4, between: list<element: double>
child 0, element: double
child 5, side: string
sources: list<element: struct<property: string, dataset: string, license: string, record_id: string, update_t (... 65 chars omitted)
child 0, element: struct<property: string, dataset: string, license: string, record_id: string, update_time: string, c (... 50 chars omitted)
child 0, property: string
child 1, dataset: string
child 2, license: string
child 3, record_id: string
child 4, update_time: string
child 5, confidence: double
child 6, between: list<element: double>
child 0, element: double
level: int32
height: double
min_height: double
is_underground: bool
num_floors: int32
num_floors_underground: int32
min_floor: int32
subtype: string
class: string
facade_color: string
facade_material: string
roof_material: string
roof_shape: string
roof_direction: double
roof_orientation: string
roof_color: string
roof_height: double
geometry: binary
-- field metadata --
ARROW:extension:name: 'geoarrow.wkb'
has_parts: bool
version: int32
bbox: struct<xmin: double, xmax: double, ymin: double, ymax: double>
child 0, xmin: double
child 1, xmax: double
child 2, ymin: double
child 3, ymax: double
-- schema metadata --
geo: '{"version": "1.1.0", "primary_column": "geometry", "columns": {"geo' + 281
Batch size: 1087, running total: 1087
Total rows read: 1087
5:02
The buildings reader works with a bbox. Now I need to: 1. Get the boundary of Lagos State (Nigeria). I can use the `division` or `division_area` type. Let me try getting divisions for Nigeria. 2. Get all buildings within Lagos State that have footprint area > 1000 m². 3. Do a spatial join with LGAs in Lagos State. Let me first try to use overturemaps to get Nigeria divisions to find Lagos State and its LGAs. Actually, let me think about this more carefully. The user wants: - Buildings > 1000 m² in Lagos State - Per-LGA summary I need: 1. Lagos State boundary (to clip buildings) 2. LGA boundaries within Lagos State (for the per-LGA roll-up) Let me use the `division` or `division_area` type. Let me first try to get the division_area for Nigeria.
5:02
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
# Let's get divisions for Nigeria using a bbox covering all of Nigeria
nigeria_bbox = (2.0, 4.0, 15.0, 14.0) # rough extents
reader = overturemaps.record_batch_reader('division', bbox=nigeria_bbox)
if reader:
import pyarrow as pa
import pyarrow.compute as pc
all_rows = []
for batch in reader:
df = batch.to_pandas()
all_rows.append(df)
import pandas as pd
result = pd.concat(all_rows, ignore_index=True)
print(f"Total divisions: {len(result)}")
# Filter for Nigeria
ng = result[result['country'] == 'NG']
print(f"Nigerian divisions: {len(ng)}")
print("Subtypes:", ng['subtype'].value_counts().to_dict())
# Look for Lagos
for i, row in ng.iterrows():
name = row['names']
if name and 'primary' in name:
if 'Lagos' in str(name['primary']) or 'lagos' in str(name['primary']):
print(f"Found Lagos: subtype={row['subtype']}, admin_level={row['admin_level']}, id={row['id']}")
print(f" Names: {name}")
# Also look for region-level divisions
ng_regions = ng[ng['subtype'] == 'region']
print(f"\nRegions in Nigeria ({len(ng_regions)}):")
for i, row in ng_regions.iterrows():
name = row['names']
if name and 'primary' in name:
print(f" {name['primary']} (id={row['id']})")
# Look for locality (LGA level)
ng_lgas = ng[ng['subtype'] == 'locality']
print(f"\nLocalities in Nigeria ({len(ng_lgas)}):")
for i, row in ng_lgas.iterrows():
name = row['names']
if name and 'primary' in name:
if 'Lagos' in str(name['primary']) or 'Ikeja' in str(name['primary']):
print(f" {name['primary']} (id={row['id']}, level={row['admin_level']})")
PYEOF
Total divisions: 52347
Nigerian divisions: 24692
Subtypes: {'locality': 23260, 'county': 773, 'macrohood': 353, 'neighborhood': 231, 'region': 37, 'microhood': 37, 'country': 1}
Found Lagos: subtype=locality, admin_level=nan, id=ee95bda9-42c1-4a3d-9385-671da35b97fd
Names: {'primary': 'Lagos-Iyidi', 'common': None, 'rules': None}
Found Lagos: subtype=locality, admin_level=nan, id=902166cd-3918-4b7d-b342-eac736aa984e
Names: {'primary': 'Lagos', 'common': [('hy', 'Լագոս'), ('bs', 'Lagos'), ('ba', 'Лагос'), ('hr', 'Lagos'), ('ta', 'லாகோஸ்'), ('ka', 'ლაგოსი'), ('ar', 'لاغوس'), ('fr', 'Lagos'), ('lv', 'Lagosa'), ('am', 'ሌጎስ'), ('bn', 'লেগোস'), ('uz', 'Lagos'), ('uk', 'Лаґос'), ('ky', 'Лагос'), ('pa', 'ਲਾਗੋਸ'), ('tt', 'Лагос'), ('pt-PT', 'Lagos'), ('pt', 'Lagos'), ('cs', 'Lagos'), ('gl', 'Lagos'), ('el', 'Λάγος'), ('it', 'Lagos'), ('ca', 'Lagos'), ('pt-BR', 'Lagos'), ('os', 'Лагос'), ('vi', 'Thành phố Lagos'), ('la', 'Lacupolis'), ('tg', 'Лагос'), ('as', "লাগ'চ"), ('yo', 'Èkó'), ('nl', 'Lagos'), ('zh-Hans', '拉哥斯'), ('bg', 'Лагос'), ('ckb', 'لاگۆس'), ('ko', '라고스'), ('eo', 'Lagoso'), ('mk', 'Лагос'), ('oc', 'Lagos'), ('et', 'Lagos'), ('de', 'Lagos'), ('ps', 'لاګوس'), ('yi', 'לאגאס'), ('ha', 'Lagos'), ('cv', 'Лагос'), ('ur', 'لاگوس'), ('ru', 'Лагос'), ('ml', 'ലാഗോസ്'), ('th', 'เลกอส'), ('id', 'Lagos'), ('ce', 'Лагос'), ('sr-Latn', 'Lagos'), ('sv', 'Lagos'), ('tr', 'Lagos'), ('da', 'Lagos'), ('my', 'လေးဂို့စ်မြို့'), ('sr-Cyrl', 'Лагос'), ('zh-Hant', '拉哥斯'), ('en', 'Lagos'), ('gu', 'લાગોસ'), ('he', 'לאגוס'), ('kn', 'ಲಾಗೋಸ್'), ('sk', 'Lagos'), ('az', 'Laqos'), ('es', 'Lagos'), ('hi', 'लागोस्'), ('mr', 'लागोस'), ('be', 'Лагас'), ('sd', 'لاگوس'), ('ja', 'ラゴス'), ('wal', 'Leegoos'), ('fi', 'Lagos'), ('ro', 'Lagos'), ('lt', 'Lagosas'), ('no', 'Lagos'), ('kk', 'Лагос'), ('sl', 'Lagos'), ('fa', 'لاگوس'), ('mn', 'Лагос'), ('zh', '拉哥斯'), ('hu', 'Lagos'), ('nv', 'Tótaʼ Yadaazʼáhí'), ('pl', 'Lagos')], 'rules': array([{'variant': 'alternate', 'language': 'eo', 'perspectives': None, 'value': 'Lejgoso', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'ar', 'perspectives': None, 'value': 'لاجوس', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'zh-Hans', 'perspectives': None, 'value': '拉各斯', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'zh-Hant', 'perspectives': None, 'value': '拉各斯州', 'between': None, 'side': None}],
dtype=object)}
Found Lagos: subtype=county, admin_level=2.0, id=4f1aabae-5a08-4179-bdfb-f0e9d71d2ca5
Names: {'primary': 'Lagos Mainland', 'common': None, 'rules': None}
Found Lagos: subtype=county, admin_level=2.0, id=65a83885-eae9-4a94-8663-f6917cc9a04e
Names: {'primary': 'Lagos Island', 'common': [('ar', 'جزيرة لاغوس')], 'rules': None}
Found Lagos: subtype=region, admin_level=1.0, id=246ecf78-8622-4d7d-8d12-102e73a67cda
Names: {'primary': 'Lagos', 'common': [('ka', 'ლაგოსის შტატი'), ('ar', 'لاغوس'), ('fr', 'État de Lagos'), ('uk', 'Лагос'), ('ceb', 'Lagos State'), ('war', 'Lagos'), ('pt', 'Lagos'), ('cs', 'Stát Lagos'), ('gl', 'Estado de Lagos'), ('it', 'Lagos'), ('ca', 'Estat de Lagos'), ('yo', 'Ìpínlẹ̀ Èkó'), ('nl', 'Lagos'), ('bg', 'Лагос'), ('ko', '라고스주'), ('mk', 'Лагос'), ('et', 'Lagose osariik'), ('de', 'Lagos'), ('ha', 'Lagos'), ('azb', 'لاقوس ایالتی'), ('ig', 'Ȯra Lagos'), ('ur', 'لاگوس ریاست'), ('nan', 'Lagos Chiu'), ('ru', 'Лагос'), ('th', 'รัฐเลกอส'), ('id', 'Lagos'), ('sv', 'Lagos'), ('tr', 'Lagos Eyaleti'), ('da', 'Lagos'), ('sr-Cyrl', 'Лагос'), ('en', 'Lagos State'), ('he', 'לאגוס'), ('mrj', 'Лагос'), ('az', 'Laqos ştatı'), ('es', 'Estado de Lagos'), ('hi', 'लेगोस राज्य'), ('sw', 'Jimbo la Lagos'), ('ja', 'ラゴス州'), ('fi', 'Lagos'), ('ro', 'Statul Lagos'), ('no', 'Lagos'), ('vec', 'Ƚagos'), ('fa', 'ایالت لاگوس'), ('zh', '拉哥斯'), ('ms', 'Negeri Lagos'), ('pl', 'Lagos')], 'rules': array([{'variant': 'alternate', 'language': 'zh', 'perspectives': None, 'value': '拉各斯州', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'ar', 'perspectives': None, 'value': 'ولاية لاغوس', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'es', 'perspectives': None, 'value': 'Estado Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'pt', 'perspectives': None, 'value': 'Estado Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'it', 'perspectives': None, 'value': 'Stato Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'fr', 'perspectives': None, 'value': 'État Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'de', 'perspectives': None, 'value': 'Staat Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'nl', 'perspectives': None, 'value': 'Staat Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'sv', 'perspectives': None, 'value': 'Lagos-tillstånd', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'no', 'perspectives': None, 'value': 'Lagos Stat', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'pl', 'perspectives': None, 'value': 'Stan Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'da', 'perspectives': None, 'value': 'Lagos Stat', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'tr', 'perspectives': None, 'value': 'Durum Lagos', 'between': None, 'side': None},
{'variant': 'alternate', 'language': 'id', 'perspectives': None, 'value': 'Negara Bagian Lagos', 'between': None, 'side': None}],
dtype=object)}
Found Lagos: subtype=locality, admin_level=nan, id=692bf9bf-2560-4099-bd46-17e572ec44b4
Names: {'primary': 'Lagos Ngwa', 'common': None, 'rules': None}
Found Lagos: subtype=locality, admin_level=nan, id=d713af22-06ab-4968-9e12-61acbbfedc10
Names: {'primary': 'Lagos', 'common': None, 'rules': array([{'variant': 'alternate', 'language': None, 'perspectives': None, 'value': 'Legos', 'between': None, 'side': None}],
dtype=object)}
Regions in Nigeria (37):
Bayelsa (id=cb3667da-feca-4134-a2ab-9e56ae20d888)
Delta (id=52a1d730-bebf-436c-a651-68120bbde6da)
Rivers (id=0cd59ead-397b-4191-b509-3435273ce9e3)
Abia (id=bd1621b6-84b3-4740-9f81-37e9f2db2781)
Akwa Ibom (id=32c1ef4d-73b9-40c5-b678-75f7db8ae2ae)
Lagos (id=246ecf78-8622-4d7d-8d12-102e73a67cda)
Osun (id=c5325914-264e-40bb-a6b5-f0f251fa252c)
Ondo (id=5a77317a-df8c-474d-ad3f-c4139e76977a)
Ogun (id=cd9acba9-bca3-4c03-a710-9f7595c9ba79)
Ekiti (id=cc5fd654-fe96-4669-8ebb-c31f3b518300)
Benue (id=7d015b44-b243-4e4d-b783-c601da19caf5)
Enugu (id=8fb0401f-2882-46c1-9c82-09cc6e440103)
Edo (id=ec265234-0fa9-4029-8708-d231086e6614)
Anambra (id=b969d978-f37c-497e-b9e0-4e919b08bcaa)
Ebonyi (id=321af3fa-daf6-4ee4-a44f-9320ab988822)
Cross River (id=1080bc17-a8ff-463d-beb5-737d9ce7caf5)
Imo (id=816c0609-8b25-416e-a371-1cef367af993)
Niger (id=4ef29b92-843f-44b8-9c96-0dfd6b5e8df5)
Bauchi (id=75bb3c50-9a72-47d7-ba5f-62265d45f259)
Gombe (id=249d1b4f-b142-46ae-96e2-936eb179ffb3)
Kaduna (id=fc099834-bdc2-455a-8e39-a1d9461b7306)
Plateau (id=dab63028-e1ea-432e-b046-e4c2cdc6e554)
Kwara (id=a885d9fc-5f3f-435c-b6bc-5fbbaa8354c4)
Federal Capital Territory (id=7024d0ba-3e1f-48f8-8cce-68cd4c547546)
Oyo (id=11b2c0bc-6ff6-4fa3-a280-16223f784f2c)
Kogi (id=eafcc82d-b007-419c-ae1e-e044a43f3acb)
Nasarawa (id=7eba7479-07c6-4cb6-b0ef-8ed680046706)
Taraba (id=1bf3556c-1792-47be-8be1-b93582b26295)
Katsina (id=ab9f8fd2-b0a6-4b36-8439-079c7d5a327b)
Sokoto (id=8d8f1152-7457-45d3-a1e4-00300d7e38b5)
Yobe (id=36afa46f-a4f9-4899-b845-48927bedc89e)
Jigawa (id=6f302ba8-593d-43ff-b3a1-2719c53c51a1)
Zamfara (id=ccd5e620-3571-462d-a220-11e75484a710)
Kebbi (id=3cf85a9c-f776-479f-b0f1-a9ac34750aae)
Kano (id=a314f96f-6eb8-4d6a-916e-70feaf6bd01a)
Adamawa (id=e373feee-2375-4109-848e-ded05ed4c14c)
Borno (id=8d29b482-cc24-4a48-8f4c-45e6ce714bc0)
Localities in Nigeria (23260):
Lagos-Iyidi (id=ee95bda9-42c1-4a3d-9385-671da35b97fd, level=nan)
Lagos (id=902166cd-3918-4b7d-b342-eac736aa984e, level=nan)
Ikeja (id=83ba1a0c-3a16-415a-a70a-b0c0da41ceff, level=nan)
Lagos Ngwa (id=692bf9bf-2560-4099-bd46-17e572ec44b4, level=nan)
Lagos (id=d713af22-06ab-4968-9e12-61acbbfedc10, level=nan)
5:26
I found the Lagos region (State) with id=246ecf78-8622-4d7d-8d12-102e73a67cda. Now I need to get the geometry of this region to use as a spatial filter for buildings. Then I need to find all LGAs (Local Government Areas) within Lagos State. In Overture, LGAs might be the "county" subtype within the region. Let me look at what divisions exist within Lagos State - specifically counties (which should be the LGAs) and also get the boundary geometry.
5:26
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
# Lagos state bbox
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
# Get division_area to find Lagos State geometry
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
if reader:
import pandas as pd
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
print(f"Total division_areas: {len(df)}")
# Filter for Nigeria
ng = df[df['country'] == 'NG']
print(f"Nigeria: {len(ng)}")
# Find Lagos region
for i, row in ng.iterrows():
name = row['names']
if name and 'primary' in name and name['primary'] == 'Lagos' and row['subtype'] == 'region':
print(f"Found Lagos region: id={row['id']}, subtype={row['subtype']}")
break
else:
print("No division_area reader")
PYEOF
Total division_areas: 58 Nigeria: 40 Found Lagos region: id=5fb3f6cb-3993-4bb7-8a53-73e775feb14f, subtype=region
5:47
So I've found: 1. Lagos State region ID in `division_area`: `5fb3f6cb-3993-4bb7-8a53-73e775feb14f` 2. Lagos State region ID in `division`: `246ecf78-8622-4d7d-8d12-102e73a67cda` Now I need to: 1. Get the Lagos State boundary geometry from `division_area` (or `division`) 2. Query buildings within that boundary 3. Filter buildings with footprint > 1000 m² 4. Assign LGA names (Local Government Areas) - in Nigeria, LGAs are admin_level 2 or 3 areas Let me think about this more carefully. For Nigeria's administrative hierarchy: - Country (admin_level 1 in Overture? Actually country is subtype='country') - States (subtype='region', admin_level=1) - Lagos State - LGAs (Local Government Areas) - These could be subtype='county' or 'locality' with certain admin_level Let me first get the Lagos state boundary and then find LGAs within it.
5:47
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
import pandas as pd
# Lagos state bbox
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
# Get division_areas for Lagos region
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
ng = df[df['country'] == 'NG']
# Find Lagos region
lagos_region = ng[(ng['subtype'] == 'region') & (ng['names'].apply(lambda x: x.get('primary') if x else '') == 'Lagos')]
print(f"Lagos region row count: {len(lagos_region)}")
# Get the geometry
import shapely.wkb as wkb_lib
from shapely import wkb
for i, row in lagos_region.iterrows():
geom = wkb.loads(row['geometry'])
print(f"Geometry type: {geom.geom_type}")
print(f"Area (degrees^2): {geom.area}")
print(f"Bounds: {geom.bounds}")
print(f"Id: {row['id']}")
PYEOF
Lagos region row count: 1 Geometry type: Polygon Area (degrees^2): 0.3030300903079853 Bounds: (2.7059846, 6.366684, 4.3509056, 6.6967964) Id: 5fb3f6cb-3993-4bb7-8a53-73e775feb14f
6:16
Now I have the Lagos State geometry. Let me now build a comprehensive script to: 1. Get the Lagos State boundary geometry 2. Get all buildings within that boundary (or a generous bbox) 3. Filter buildings over 1000 m² 4. Get all LGAs (localities/subtype=locality) within Lagos State 5. For each building, determine which LGA it falls in 6. Compute the summary statistics 7. Write outputs Let me approach this step by step. First, I need to understand the administrative hierarchy. Lagos State has LGAs (Local Government Areas). In Overture, these might be at the "county" level (admin_level=2) as subdivisions of the region. Let me check what's available.
6:16
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
import pandas as pd
from shapely import wkb
# Get Lagos state geometry first
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
ng = df[df['country'] == 'NG']
lagos_region = ng[(ng['subtype'] == 'region') & (ng['names'].apply(lambda x: x.get('primary') if x else '') == 'Lagos')]
lagos_geom = wkb.loads(lagos_region.iloc[0]['geometry'])
lagos_region_id = lagos_region.iloc[0]['id']
print(f"Lagos region id: {lagos_region_id}")
# Now get divisions within/around Lagos - look for counties (LGAs)
# Use a wider bbox that covers Lagos state
reader2 = overturemaps.record_batch_reader('division', bbox=lagos_geom.bounds)
all_rows2 = []
for batch in reader2:
all_rows2.append(batch.to_pandas())
divs = pd.concat(all_rows2, ignore_index=True)
print(f"Total divisions in area: {len(divs)}")
# Filter for Nigeria
ng_divs = divs[divs['country'] == 'NG']
print(f"Nigeria divisions: {len(ng_divs)}")
# Filter for subtype = county (LGAs in Nigeria are admin level 2)
lgas = ng_divs[ng_divs['subtype'] == 'county']
print(f"Counties (potential LGAs): {len(lgas)}")
for i, row in lgas.iterrows():
name = row['names']
if name and 'primary' in name:
print(f" {name['primary']} (id={row['id']})")
# Also check for localities that might be LGAs
localities = ng_divs[ng_divs['subtype'] == 'locality']
print(f"\nLocalities: {len(localities)}")
# Check admin_level distribution
print(f"Admin levels in localities: {localities['admin_level'].value_counts().to_dict()}")
PYEOF
Lagos region id: 5fb3f6cb-3993-4bb7-8a53-73e775feb14f
Total divisions in area: 451
Nigeria divisions: 406
Counties (potential LGAs): 22
Amuwo Odofin (id=7583d723-71b9-4f18-9fff-bff98df8322d)
Badagry (id=202749c9-2473-4ecb-8528-72dc7a1278ee)
Ojo (id=2c66b8ba-4745-4334-b145-ae00b1a6f604)
Ajeromi/Ifelodun (id=b9ecb255-fc97-4e0a-8bc7-9dd0588f7a21)
Eti Osa (id=4d40b68b-a144-47cc-9345-4604adc7fb60)
Surulere (id=22ef6348-2be5-4543-8f1d-6e9faf24e126)
Lagos Mainland (id=4f1aabae-5a08-4179-bdfb-f0e9d71d2ca5)
Lagos Island (id=65a83885-eae9-4a94-8663-f6917cc9a04e)
Ibeju Lekki (id=343b4b84-9150-4f3b-937f-c58ca20db27f)
Apapa (id=dc5547ff-cb10-4818-8e9b-363c14daa13b)
Epe (id=053c8979-6fdf-4078-a9d1-3b545a06d72d)
Agege (id=24e04d44-9e6e-47ba-a71c-737793381785)
Ikeja (id=af88a8c6-f84f-4629-a25b-1e79a9ccbc3d)
Ìkòròdú (id=634b4f3c-a5c4-42d2-b18b-1432b67373b8)
Shomolu (id=1b64c604-707e-4a65-a114-75f0f1abb3c5)
Mushin (id=fe1ca044-b547-49be-9883-2c8ffa3f04d5)
Oshodi/Isolo (id=e199f023-49ae-46ff-92a4-580c4fe1346d)
Kosofe (id=9713c2c3-7416-41b8-9912-6e1450d3347d)
Alimosho (id=b4bd3fef-918f-4a60-bfd3-3569ed991562)
Ado Odo/Ota (id=14cdd032-7c30-44aa-beee-476b4d62184e)
Ipokia (id=199f6f87-94c7-40e8-ba34-662f345e5794)
Ifako/Ijaye (id=a9df031a-eaa0-4b74-b42c-5bf303feb0e1)
Localities: 309
Admin levels in localities: {}
6:42
I can see there are 20 LGAs in Lagos State (plus some from Ogun state like Ado Odo/Ota and Ipokia). Let me get the division_area geometries for the LGAs to use them for spatial filtering of buildings. Let me also look at the division_area for these LGAs so I can clip buildings to each LGA. Actually, the best approach is: 1. Get the Lagos state geometry from division_area 2. Get the LGA (county) geometries from division_area 3. Download all buildings within Lagos state bbox 4. Filter buildings that intersect the Lagos state boundary 5. Spatially join with LGAs 6. Calculate the required statistics Let me use overturemaps to download all buildings in the Lagos bbox, then filter spatially with DuckDB or GeoPandas.
6:42
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
import pandas as pd
from shapely import wkb
# Get Lagos state geometry
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
ng = df[df['country'] == 'NG']
lagos_row = ng[(ng['subtype'] == 'region') & (ng['names'].apply(lambda x: x.get('primary') if x else '') == 'Lagos')]
lagos_geom = wkb.loads(lagos_row.iloc[0]['geometry'])
print(f"Lagos bounds: {lagos_geom.bounds}")
# Get the counties (LGAs) division_area geometries
county_ids = [
"7583d723-71b9-4f18-9fff-bff98df8322d", "202749c9-2473-4ecb-8528-72dc7a1278ee",
"2c66b8ba-4745-4334-b145-ae00b1a6f604", "b9ecb255-fc97-4e0a-8bc7-9dd0588f7a21",
"4d40b68b-a144-47cc-9345-4604adc7fb60", "22ef6348-2be5-4543-8f1d-6e9faf24e126",
"4f1aabae-5a08-4179-bdfb-f0e9d71d2ca5", "65a83885-eae9-4a94-8663-f6917cc9a04e",
"343b4b84-9150-4f3b-937f-c58ca20db27f", "dc5547ff-cb10-4818-8e9b-363c14daa13b",
"053c8979-6fdf-4078-a9d1-3b545a06d72d", "24e04d44-9e6e-47ba-a71c-737793381785",
"af88a8c6-f84f-4629-a25b-1e79a9ccbc3d", "634b4f3c-a5c4-42d2-b18b-1432b67373b8",
"1b64c604-707e-4a65-a114-75f0f1abb3c5", "fe1ca044-b547-49be-9883-2c8ffa3f04d5",
"e199f023-49ae-46ff-92a4-580c4fe1346d", "9713c2c3-7416-41b8-9912-6e1450d3347d",
"b4bd3fef-918f-4a60-bfd3-3569ed991562", "a9df031a-eaa0-4b74-b42c-5bf303feb0e1"
]
# But wait - we need to get the division_area for those specific county IDs
# The division_area table should have matching IDs
# Let's look at the ng counties that overlap with Lagos state
counties_in_lagos = ng[ng['subtype'] == 'county']
print(f"\nCounties in division_area: {len(counties_in_lagos)}")
for i, row in counties_in_lagos.iterrows():
name = row['names']
primary_name = name.get('primary') if name else 'N/A'
print(f" {primary_name} (id={row['id']})")
PYEOF
Lagos bounds: (2.7059846, 6.366684, 4.3509056, 6.6967964) Counties in division_area: 30 Ipokia (id=00496a96-e1ca-42cb-a1c8-531f48aa7461) Badagry (id=93ccd19d-3389-4c00-a8e3-920e20810d39) Yewa South (id=dc9f547a-48ef-4f9c-a90b-31f6b247a3b8) Ado Odo/Ota (id=51955eb1-4697-4268-b8e2-8554f7a77475) Amuwo Odofin (id=32cfa1af-5e74-41a7-9272-b9dd8da715d9) Ifako/Ijaye (id=934f5b76-4cbf-4902-b124-b33d057d87b1) Ifo (id=7aa34b16-ba84-45d9-94c4-083c65b8cf09) Alimosho (id=1a9532db-bcd5-4050-b62b-f27a22f2ebcf) Ojo (id=7918fd74-2ce4-48af-b3c1-1a4f1e26558e) Shagamu (id=1f5072a4-71e6-409b-89c9-cd999be6a442) Obafemi Owode (id=c0111aed-bd0c-4c3f-a90b-51d3ac847093) Kosofe (id=cb527384-149e-41c1-b3c5-0070deab27d4) Shomolu (id=b098198f-55d1-46a6-ac1e-71ae35fdd913) Lagos Mainland (id=639f85d6-f581-4f31-b52b-98da63a57271) Apapa (id=4b715cfc-a9b3-4a05-b601-353b77619939) Mushin (id=ffe6c369-91e2-4984-97d1-bd24feff3732) Ikeja (id=c540f662-44c9-498d-8f3d-2a41a039a29f) Agege (id=fcaab4e8-075a-4ab6-a2d2-cd614e95dfc2) Oshodi/Isolo (id=457a7986-f212-4765-b432-e1c1b77d6d61) Ajeromi/Ifelodun (id=e2af9375-5eed-46f8-a258-515f22297084) Surulere (id=f0f8532f-4c5e-428a-981e-12a31b2a9044) Lagos Island (id=c58fd335-e502-421c-b8e3-f263c5fb0ff7) Eti Osa (id=0c5600b6-df25-4e75-a25e-ae9937299bad) Ìkòròdú (id=030cb2c1-697b-418e-b820-49393989a03b) Epe (id=61530733-5992-41f4-9212-43f1b460319c) Ibeju Lekki (id=70bd9c0f-fcdd-4bd2-81df-34d9351c997e) Ìjẹ̀bú Òde (id=a277c8df-124c-47b7-9ab5-092e7f3bcdab) Odogbolu (id=d8c2f7a3-d77d-4781-ba92-5359250c2516) Ikenne (id=fe2221c4-4b6b-4bad-b9c4-12a6f7c5c849) Ijebu East (id=6b8e9727-a665-4667-a087-31276385e8b6)
7:05
I see there are more counties in the division_area than what I found in the division table. That's because the division_area table has county-level geometries that overlap with the Lagos bounding box. Some of these are in Ogun state (like Ado Odo/Ota, Ipokia, etc.), not Lagos state proper. I need to get the division_area geometries for Lagos State's actual LGAs. Let me get the division_area for all counties and then filter by spatial intersection with the Lagos region geometry. Let me think about the best approach: 1. Get Lagos State polygon from `division_area` where subtype='region' and name='Lagos' 2. Get all county polygons from `division_area` that intersect with Lagos State 3. Download all buildings in the Lagos State bbox 4. Filter buildings by spatial intersection with Lagos State 5. Calculate footprint area for each building (area of geometry in square meters) 6. Join with LGA info 7. Output the two files Let me write this properly.
7:05
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import overturemaps
import pandas as pd
from shapely import wkb
import geopandas as gpd
# Get Lagos state geometry from division_area
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
ng = df[df['country'] == 'NG']
# Get Lagos region
lagos_row = ng[(ng['subtype'] == 'region') & (ng['names'].apply(lambda x: x.get('primary') if x else '') == 'Lagos')]
lagos_geom = wkb.loads(lagos_row.iloc[0]['geometry'])
lagos_gdf = gpd.GeoDataFrame(geometry=[lagos_geom], crs='EPSG:4326')
print(f"Lagos State bounds: {lagos_geom.bounds}")
# Get all counties (LGAs) from division_area
counties = ng[ng['subtype'] == 'county'].copy()
print(f"Total counties in bbox area: {len(counties)}")
# Convert to GeoDataFrame
county_geoms = [wkb.loads(g) for g in counties['geometry']]
county_gdf = gpd.GeoDataFrame({
'id': counties['id'].values,
'name': counties['names'].apply(lambda x: x.get('primary') if x else ''),
}, geometry=county_geoms, crs='EPSG:4326')
# Spatial join - counties that intersect Lagos state
lagos_counties = gpd.sjoin(county_gdf, lagos_gdf, how='inner', predicate='intersects')
print(f"\nLGAs in Lagos State ({len(lagos_counties)}):")
for i, row in lagos_counties.iterrows():
print(f" {row['name']} (id={row['id']})")
# Save county geometries for later use
lagos_counties.to_parquet('/work/lagos_lgas.parquet')
# Also save Lagos state polygon
lagos_gdf.to_parquet('/work/lagos_state.parquet')
PYEOF
Lagos State bounds: (2.7059846, 6.366684, 4.3509056, 6.6967964) Total counties in bbox area: 30 LGAs in Lagos State (27): Ipokia (id=00496a96-e1ca-42cb-a1c8-531f48aa7461) Badagry (id=93ccd19d-3389-4c00-a8e3-920e20810d39) Ado Odo/Ota (id=51955eb1-4697-4268-b8e2-8554f7a77475) Amuwo Odofin (id=32cfa1af-5e74-41a7-9272-b9dd8da715d9) Ifako/Ijaye (id=934f5b76-4cbf-4902-b124-b33d057d87b1) Ifo (id=7aa34b16-ba84-45d9-94c4-083c65b8cf09) Alimosho (id=1a9532db-bcd5-4050-b62b-f27a22f2ebcf) Ojo (id=7918fd74-2ce4-48af-b3c1-1a4f1e26558e) Shagamu (id=1f5072a4-71e6-409b-89c9-cd999be6a442) Kosofe (id=cb527384-149e-41c1-b3c5-0070deab27d4) Shomolu (id=b098198f-55d1-46a6-ac1e-71ae35fdd913) Lagos Mainland (id=639f85d6-f581-4f31-b52b-98da63a57271) Apapa (id=4b715cfc-a9b3-4a05-b601-353b77619939) Mushin (id=ffe6c369-91e2-4984-97d1-bd24feff3732) Ikeja (id=c540f662-44c9-498d-8f3d-2a41a039a29f) Agege (id=fcaab4e8-075a-4ab6-a2d2-cd614e95dfc2) Oshodi/Isolo (id=457a7986-f212-4765-b432-e1c1b77d6d61) Ajeromi/Ifelodun (id=e2af9375-5eed-46f8-a258-515f22297084) Surulere (id=f0f8532f-4c5e-428a-981e-12a31b2a9044) Lagos Island (id=c58fd335-e502-421c-b8e3-f263c5fb0ff7) Eti Osa (id=0c5600b6-df25-4e75-a25e-ae9937299bad) Ìkòròdú (id=030cb2c1-697b-418e-b820-49393989a03b) Epe (id=61530733-5992-41f4-9212-43f1b460319c) Ibeju Lekki (id=70bd9c0f-fcdd-4bd2-81df-34d9351c997e) Ìjẹ̀bú Òde (id=a277c8df-124c-47b7-9ab5-092e7f3bcdab) Odogbolu (id=d8c2f7a3-d77d-4781-ba92-5359250c2516) Ijebu East (id=6b8e9727-a665-4667-a087-31276385e8b6)
7:44
Let me check - Overture's county-level division_area seems to include some counties that are actually in Ogun State (like Ipokia, Ado Odo/Ota, Ifo, Shagamu, Ìjẹ̀bú Òde, Odogbolu, Ikenne, Ijebu East) which just intersect the Lagos bbox. I need to properly clip them to the Lagos state boundary. Actually, the proper approach is: 1. Get Lagos state polygon 2. Get all buildings within Lagos bbox from Overture 3. Intersect buildings with Lagos state polygon (via spatial join) 4. Also intersect with LGA polygons 5. Calculate footprint area for each building 6. Filter to buildings > 1000 m² 7. Generate the summary per LGA But wait - these LGA division_areas from Overture don't perfectly match the Lagos state boundary. Some are partially in Lagos and partially in Ogun. I need to properly clip them. Let me think about this differently. The LGAs properly within Lagos State are: Alimosho, Amuwo Odofin, Agege, Ajeromi/Ifelodun, Apapa, Badagry, Epe, Eti Osa, Ibeju Lekki, Ifako/Ijaye, Ikeja, Ikorodu (Ìkòròdú), Kosofe, Lagos Island, Lagos Mainland, Mushin, Ojo, Oshodi/Isolo, Shomolu, Surulere That's 20 LGAs. Let me look up the official list. The 20 LGAs of Lagos State are: 1. Agege 2. Ajeromi-Ifelodun 3. Alimosho 4. Amuwo-Odofin 5. Apapa 6. Badagry 7. Epe 8. Eti-Osa 9. Ibeju-Lekki 10. Ifako-Ijaiye 11. Ikeja 12. Ikorodu 13. Kosofe 14. Lagos Island 15. Lagos Mainland 16. Mushin 17. Ojo 18. Oshodi-Isolo 19. Shomolu 20. Surulere So Overture's county data includes some extra counties from neighboring Ogun State. I should use a spatial join with Lagos State boundary to filter. Let me write the full script now.
7:44
Bash
cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import pandas as pd
# Let's check which of the 27 counties are actually within Lagos state boundary
import overturemaps
from shapely import wkb
import geopandas as gpd
# Lagos state geometry
lagos_bbox = (2.5, 6.2, 4.0, 6.8)
reader = overturemaps.record_batch_reader('division_area', bbox=lagos_bbox)
all_rows = []
for batch in reader:
all_rows.append(batch.to_pandas())
df = pd.concat(all_rows, ignore_index=True)
ng = df[df['country'] == 'NG']
lagos_row = ng[(ng['subtype'] == 'region') & (ng['names'].apply(lambda x: x.get('primary') if x else '') == 'Lagos')]
lagos_geom = wkb.loads(lagos_row.iloc[0]['geometry'])
counties = ng[ng['subtype'] == 'county'].copy()
county_geoms = [wkb.loads(g) for g in counties['geometry']]
# Get centroid of each county to determine if it's mostly inside Lagos
for i, (idx, row) in enumerate(counties.iterrows()):
geom = county_geoms[i]
centroid = geom.centroid
inside_lagos = lagos_geom.contains(centroid)
name = row['names'].get('primary') if row['names'] else 'N/A'
# Also check what proportion overlaps
intersection = lagos_geom.intersection(geom)
overlap_ratio = intersection.area / geom.area if geom.area > 0 else 0
print(f"{name}: centroid inside Lagos={inside_lagos}, overlap ratio={overlap_ratio:.3f}")
PYEOF
Ipokia: centroid inside Lagos=False, overlap ratio=0.000 Badagry: centroid inside Lagos=True, overlap ratio=1.000 Yewa South: centroid inside Lagos=False, overlap ratio=0.000 Ado Odo/Ota: centroid inside Lagos=False, overlap ratio=0.000 Amuwo Odofin: centroid inside Lagos=True, overlap ratio=1.000 Ifako/Ijaye: centroid inside Lagos=True, overlap ratio=1.000 Ifo: centroid inside Lagos=False, overlap ratio=0.000 Alimosho: centroid inside Lagos=True, overlap ratio=1.000 Ojo: centroid inside Lagos=True, overlap ratio=1.000 Shagamu: centroid inside Lagos=False, overlap ratio=0.000 Obafemi Owode: centroid inside Lagos=False, overlap ratio=0.000 Kosofe: centroid inside Lagos=True, overlap ratio=1.000 Shomolu: centroid inside Lagos=True, overlap ratio=1.000 Lagos Mainland: centroid inside Lagos=True, overlap ratio=1.000 Apapa: centroid inside Lagos=True, overlap ratio=1.000 Mushin: centroid inside Lagos=True, overlap ratio=1.000 Ikeja: centroid inside Lagos=True, overlap ratio=1.000 Agege: centroid inside Lagos=True, overlap ratio=1.000 Oshodi/Isolo: centroid inside Lagos=True, overlap ratio=1.000 Ajeromi/Ifelodun: centroid inside Lagos=True, overlap ratio=1.000 Surulere: centroid inside Lagos=True, overlap ratio=1.000 Lagos Island: centroid inside Lagos=True, overlap ratio=1.000 Eti Osa: centroid inside Lagos=True, overlap ratio=1.000 Ìkòròdú: centroid inside Lagos=True, overlap ratio=1.000 Epe: centroid inside Lagos=True, overlap ratio=1.000 Ibeju Lekki: centroid inside Lagos=True, overlap ratio=1.000 Ìjẹ̀bú Òde: centroid inside Lagos=False, overlap ratio=0.000 Odogbolu: centroid inside Lagos=False, overlap ratio=0.000 Ikenne: centroid inside Lagos=False, overlap ratio=0.000 Ijebu East: centroid inside Lagos=False, overlap ratio=0.000
8:38
So the actual Lagos State LGAs are those with centroid inside Lagos: Badagry, Amuwo Odofin, Ifako/Ijaye, Alimosho, Ojo, Kosofe, Shomolu, Lagos Mainland, Apapa, Mushin, Ikeja, Agege, Oshodi/Isolo, Ajeromi/Ifelodun, Surulere, Lagos Island, Eti Osa, Ìkòròdú, Epe, Ibeju Lekki. That's 20 LGAs. Wait - I should also check: what about the 21 officially recognized Lagos State LGAs? Let me compare with known Lagos LGAs. The 20 LGAs in Lagos State are: 1. Agege 2. Ajeromi-Ifelodun 3. Alimosho 4. Amuwo-Odofin 5. Apapa 6. Badagry 7. Epe 8. Eti-Osa 9. Ibeju-Lekki 10. Ifako-Ijaiye 11. Ikeja 12. Ikorodu (Ìkòròdú) 13. Kosofe 14. Lagos Island 15. Lagos Mainland 16. Mushin 17. Ojo 18. Oshodi-Isolo 19. Shomolu 20. Surulere So 20 official LGAs. We have 20 with centroids inside Lagos. That matches well! Now I need to get the buildings that exceed 1000 m² and intersect Lagos State. The approach: 1. Download all buildings in the Lagos State bounding box (2.7, 6.37, 4.35, 6.70) 2. Filter to those that are actually within the Lagos State geometry 3. Calculate footprint area (need to project to a metric CRS since geometry is in EPSG:4326) 4. Filter to area > 1000 m² 5. Assign each building to an LGA using spatial join 6. Write output files Let me also get the division_area geometries for the actual Lagos LGAs.
8:38
Write
/work/solve.py
content · 216 lines · py
"""
Extract buildings > 1000 m² in Lagos State (Nigeria) from Overture Maps,
assign to Local Government Areas (LGAs), produce two output files.
"""
import sys
import time
import logging
logging.basicConfig(level=logging.INFO, stream=sys.stderr,
format="%(asctime)s [%(levelname)s] %(message)s")
log = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# 1. Get Lagos State geometry and LGA boundaries
# ---------------------------------------------------------------------------
import overturemaps
import pandas as pd
import geopandas as gpd
from shapely import wkb
import pyarrow.compute as pc
import pyarrow as pa
import numpy as np
log.info("Reading Lagos State & LGA boundaries from Overture divisions...")
# Lagos State bounding box (slightly buffered)
LAGOS_BBOX = (2.68, 6.35, 4.38, 6.72)
# Read division_area for this bbox
reader = overturemaps.record_batch_reader("division_area", bbox=LAGOS_BBOX)
batches = []
for batch in reader:
batches.append(batch.to_pandas())
div_area = pd.concat(batches, ignore_index=True)
log.info("Read %d division_area features", len(div_area))
# Filter Nigeria
ng = div_area[div_area["country"] == "NG"].copy()
# --- Lagos State geometry -------------------------------------------------
lagos_row = ng[
(ng["subtype"] == "region")
& (ng["names"].apply(lambda x: x.get("primary") if isinstance(x, dict) else ""))
== "Lagos"
]
if len(lagos_row) == 0:
raise RuntimeError("Could not find Lagos region in division_area")
lagos_geom = wkb.loads(lagos_row.iloc[0]["geometry"])
lagos_gdf = gpd.GeoDataFrame(geometry=[lagos_geom], crs="EPSG:4326")
log.info("Lagos State geometry loaded (%.4f, %.4f, %.4f, %.4f)",
*lagos_geom.bounds)
# --- LGA boundaries (county level) ----------------------------------------
ng_counties = ng[ng["subtype"] == "county"].copy()
ng_counties["geometry"] = ng_counties["geometry"].apply(wkb.loads)
ng_counties["name"] = ng_counties["names"].apply(
lambda x: x.get("primary") if isinstance(x, dict) else ""
)
county_gdf = gpd.GeoDataFrame(
ng_counties[["id", "name"]],
geometry=ng_counties["geometry"],
crs="EPSG:4326",
)
# Keep only counties whose centroid lies within Lagos State (this correctly
# identifies the 20 official Lagos State LGAs)
centroids = county_gdf.geometry.centroid
mask = centroids.intersects(lagos_geom)
lga_gdf = county_gdf[mask].copy().reset_index(drop=True)
lga_gdf["lga"] = lga_gdf["name"]
log.info("Identified %d LGAs in Lagos State", len(lga_gdf))
for _, r in lga_gdf.iterrows():
log.info(" LGA: %s", r["lga"])
# ---------------------------------------------------------------------------
# 2. Download buildings for the Lagos bbox
# ---------------------------------------------------------------------------
log.info("Downloading Overture buildings for Lagos bbox...")
reader = overturemaps.record_batch_reader("building", bbox=LAGOS_BBOX)
if reader is None:
raise RuntimeError("Failed to get building reader from Overture")
build_batches = []
row_count = 0
for batch in reader:
build_batches.append(batch)
row_count += batch.num_rows
log.info(" Read %d building rows so far...", row_count)
log.info("Total building rows read: %d", row_count)
if row_count == 0:
raise RuntimeError("No buildings found in Lagos bbox")
# Convert to single table
build_table = pa.concat_tables(build_batches) if len(build_batches) > 1 else build_batches[0]
# ---------------------------------------------------------------------------
# 3. Convert to GeoDataFrame, clip to Lagos State, compute area
# ---------------------------------------------------------------------------
log.info("Converting buildings to GeoDataFrame...")
# Read geometry as WKB
geom_col = build_table.column("geometry").to_pylist()
heights = build_table.column("height").to_pylist()
ids = build_table.column("id").to_pylist()
# Build a DataFrame
build_df = pd.DataFrame({
"id": ids,
"height": [float(h) if h is not None and not np.isnan(h) else None for h in heights],
})
# Convert geometry
build_geoms = [wkb.loads(g) for g in geom_col]
build_gdf = gpd.GeoDataFrame(build_df, geometry=build_geoms, crs="EPSG:4326")
log.info("Building GeoDataFrame has %d rows", len(build_gdf))
# Clip to Lagos State boundary (spatial filter)
log.info("Clipping buildings to Lagos State boundary...")
build_in_lagos = build_gdf[build_gdf.geometry.intersects(lagos_geom)].copy()
log.info("Buildings intersecting Lagos: %d", len(build_in_lagos))
# Actually keep only buildings whose centroid is inside Lagos (cleaner)
centroids_b = build_in_lagos.geometry.centroid
build_in_lagos = build_in_lagos[centroids_b.intersects(lagos_geom)].copy()
log.info("Buildings with centroid in Lagos: %d", len(build_in_lagos))
# Compute area in a metric CRS (UTM zone 31N covers Lagos ~ 3-4°E, region is 2.7-4.4°E)
# Actually, for Nigeria, use Nigeria's UTM zones. Lagos is in UTM zone 31N (EPSG:32631)
log.info("Computing footprint areas (EPSG:32631)...")
build_metric = build_in_lagos.to_crs("EPSG:32631")
build_metric["footprint_area_m2"] = build_metric.geometry.area
# Filter to > 1000 m²
build_large = build_metric[build_metric["footprint_area_m2"] > 1000.0].copy()
log.info("Buildings > 1000 m²: %d", len(build_large))
# Convert back to 4326 for output
build_large = build_large.to_crs("EPSG:4326")
# ---------------------------------------------------------------------------
# 4. Spatial join to assign LGA
# ---------------------------------------------------------------------------
log.info("Assigning LGA via spatial join...")
# Use sjoin with intersects predicate
build_with_lga = gpd.sjoin(
build_large,
lga_gdf[["lga", "geometry"]],
how="left",
predicate="intersects",
)
# Keep only buildings that actually got an LGA (should be all of them)
build_with_lga = build_with_lga.dropna(subset=["lga"]).copy()
log.info("Buildings with LGA assigned: %d", len(build_with_lga))
# ---------------------------------------------------------------------------
# 5. Write lagos_buildings.geoparquet
# ---------------------------------------------------------------------------
out_cols = ["id", "height", "footprint_area_m2", "lga", "geometry"]
build_out = build_with_lga[out_cols].copy()
# Ensure height is float (null allowed)
build_out["height"] = build_out["height"].astype("float64")
log.info("Writing lagos_buildings.geoparquet (%d rows)...", len(build_out))
build_out.to_parquet(
"/work/lagos_buildings.geoparquet",
index=False,
compression="zstd",
)
log.info("Done writing lagos_buildings.geoparquet")
# ---------------------------------------------------------------------------
# 6. Per-LGA summary → lagos_building_summary.parquet
# ---------------------------------------------------------------------------
log.info("Computing per-LGA summary...")
summary_rows = []
for lga_name, grp in build_with_lga.groupby("lga"):
n_buildings = len(grp)
total_footprint = grp["footprint_area_m2"].sum()
has_height = grp["height"].notna()
n_with_height = has_height.sum()
if n_with_height > 0:
p50_height = float(grp.loc[has_height, "height"].median())
else:
p50_height = None
summary_rows.append({
"lga": lga_name,
"n_buildings": n_buildings,
"total_footprint_m2": total_footprint,
"n_with_height": n_with_height,
"p50_height_m": p50_height if p50_height is not None else None,
})
summary_df = pd.DataFrame(summary_rows)
# Ensure proper types
summary_df["n_buildings"] = summary_df["n_buildings"].astype("int64")
summary_df["total_footprint_m2"] = summary_df["total_footprint_m2"].astype("float64")
summary_df["n_with_height"] = summary_df["n_with_height"].astype("int64")
summary_df["p50_height_m"] = summary_df["p50_height_m"].astype("float64")
log.info("Summary:\n%s", summary_df.to_string())
summary_df.to_parquet(
"/work/lagos_building_summary.parquet",
index=False,
compression="zstd",
)
log.info("Done writing lagos_building_summary.parquet")
log.info("All tasks complete.")
Wrote /work/solve.py (8114 chars).
8:48
Let me run the script and see if it works correctly.
8:48
Bash
cd /work && python solve.py 2>&1
2026-06-18 00:44:13,465 [INFO] Reading Lagos State & LGA boundaries from Overture divisions...
2026-06-18 00:44:21,197 [INFO] Read 53 division_area features
/work/solve.py:42: Pandas4Warning: 'and' operations between boolean dtype and str are deprecated and will raise in a future version. Explicitly cast the strings to a boolean dtype before operating instead.
(ng["subtype"] == "region")
Traceback (most recent call last):
File "/work/solve.py", line 47, in <module>
raise RuntimeError("Could not find Lagos region in division_area")
RuntimeError: Could not find Lagos region in division_area
9:04
The issue is with the boolean filter. The line is:
```python
lagos_row = ng[
(ng["subtype"] == "region")
& (ng["names"].apply(lambda x: x.get("primary") if isinstance(x, dict) else ""))
== "Lagos"
]
```
The problem is operator precedence - `== "Lagos"` is being compared to the result of the `&` operation, not to the apply result. I need to wrap it in parentheses properly. Let me fix it.
9:04
Edit
/work/solve.py
Edited /work/solve.py.
9:12
Wait, the issue is operator precedence. Let me look again:
```python
lagos_mask = (
(ng["subtype"] == "region")
& (ng["names"].apply(lambda x: x.get("primary") if isinstance(x, dict) else "")) == "Lagos"
)
```
This is still wrong. The `&` has higher precedence than `==`, so it's doing:
`((ng["subtype"] == "region") & (ng["names"].apply(...))) == "Lagos"`
Which evaluates to False. Let me fix this properly.
9:12
Edit
/work/solve.py
Edited /work/solve.py.
9:15
The issue is operator precedence. Let me fix the boolean logic properly.
9:15
Read
/work/solve.py
1 """
2 Extract buildings > 1000 m² in Lagos State (Nigeria) from Overture Maps,
3 assign to Local Government Areas (LGAs), produce two output files.
4 """
5 import sys
6 import time
7 import logging
8
9 logging.basicConfig(level=logging.INFO, stream=sys.stderr,
10 format="%(asctime)s [%(levelname)s] %(message)s")
11 log = logging.getLogger(__name__)
12
13 # ---------------------------------------------------------------------------
14 # 1. Get Lagos State geometry and LGA boundaries
15 # ---------------------------------------------------------------------------
16 import overturemaps
17 import pandas as pd
18 import geopandas as gpd
19 from shapely import wkb
20 import pyarrow.compute as pc
21 import pyarrow as pa
22 import numpy as np
23
24 log.info("Reading Lagos State & LGA boundaries from Overture divisions...")
25
26 # Lagos State bounding box (slightly buffered)
27 LAGOS_BBOX = (2.68, 6.35, 4.38, 6.72)
28
29 # Read division_area for this bbox
30 reader = overturemaps.record_batch_reader("division_area", bbox=LAGOS_BBOX)
31 batches = []
32 for batch in reader:
33 batches.append(batch.to_pandas())
34 div_area = pd.concat(batches, ignore_index=True)
35 log.info("Read %d division_area features", len(div_area))
36
37 # Filter Nigeria
38 ng = div_area[div_area["country"] == "NG"].copy()
39
40 # --- Lagos State geometry -------------------------------------------------
41 lagos_names = ng["names"].apply(lambda x: x.get("primary") if isinstance(x, dict) else "")
42 lagos_mask = (ng["subtype"] == "region") & (lagos_names == "Lagos")
43 lagos_row = ng[lagos_mask]
44 if len(lagos_row) == 0:
45 raise RuntimeError("Could not find Lagos region in division_area")
46
47 lagos_geom = wkb.loads(lagos_row.iloc[0]["geometry"])
48 lagos_gdf = gpd.GeoDataFrame(geometry=[lagos_geom], crs="EPSG:4326")
49 log.info("Lagos State geometry loaded (%.4f, %.4f, %.4f, %.4f)",
50 *lagos_geom.bounds)
51
52 # --- LGA boundaries (county level) ----------------------------------------
53 ng_counties = ng[ng["subtype"] == "county"].copy()
54 ng_counties["geometry"] = ng_counties["geometry"].apply(wkb.loads)
55 ng_counties["name"] = ng_counties["names"].apply(
56 lambda x: x.get("primary") if isinstance(x, dict) else ""
57 )
58 county_gdf = gpd.GeoDataFrame(
59 ng_counties[["id", "name"]],
60 geometry=ng_counties["geometry"],
<system-reminder>Showed lines 1-60 of 214. Re-call Read with offset=61 to continue, or raise limit.</system-reminder>
9:22
The issue is that earlier, when I queried division_area, the bbox was (2.5, 6.2, 4.0, 6.8) and it returned 58 features with 40 in Nigeria. Now I'm using (2.68, 6.35, 4.38, 6.72) which might miss the Lagos region. Let me check what's happening. Actually, looking at the bounds from the earlier run: - Lagos bounds: (2.7059846, 6.366684, 4.3509056, 6.6967964) My new bbox (2.68, 6.35, 4.38, 6.72) should include it. But the count changed from 58 to 53 - maybe it's a different bbox. But the real issue is that the earlier query successfully found Lagos. Let me check if the bbox is the issue or if there's something else wrong. Wait, looking at the earlier output: "Total division_areas: 58", "Nigeria: 40". The new run says "Read 53 division_area features" - fewer because the bbox changed. But Lagos bounds (2.7059846, 6.366684, 4.3509056, 6.6967964) should still be within (2.68, 6.35, 4.38, 6.72). Let me just use the original bbox that worked.
9:22
Edit
/work/solve.py
Edited /work/solve.py.
9:31
The issue is that the original bbox (2.68, 6.35, 4.38, 6.72) was too tight - Lagos actual bounds from earlier were (2.7059846, 6.366684, 4.3509056, 6.6967964). But the bbox I used for the earlier successful query was (2.5, 6.2, 4.0, 6.8) and that worked. Let me also check the earlier Lagos bounds - the ymax was 6.6967964 but my bbox had 6.8 as max y. Actually wait, my earlier bbox was (2.5, 6.2, 4.0, 6.8) which worked fine. But in the script I changed it to (2.68, 6.35, 4.38, 6.72) which might not include the full Lagos state. Let me re-run now.
9:31
Bash
cd /work && python solve.py 2>&1
2026-06-18 00:44:56,337 [INFO] Reading Lagos State & LGA boundaries from Overture divisions...
2026-06-18 00:45:04,647 [INFO] Read 58 division_area features
2026-06-18 00:45:04,659 [INFO] Lagos State geometry loaded (2.7060, 6.3667, 4.3509, 6.6968)
/work/solve.py:66: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
centroids = county_gdf.geometry.centroid
2026-06-18 00:45:04,695 [INFO] Identified 20 LGAs in Lagos State
2026-06-18 00:45:04,695 [INFO] LGA: Badagry
2026-06-18 00:45:04,695 [INFO] LGA: Amuwo Odofin
2026-06-18 00:45:04,696 [INFO] LGA: Ifako/Ijaye
2026-06-18 00:45:04,696 [INFO] LGA: Alimosho
2026-06-18 00:45:04,696 [INFO] LGA: Ojo
2026-06-18 00:45:04,696 [INFO] LGA: Kosofe
2026-06-18 00:45:04,696 [INFO] LGA: Shomolu
2026-06-18 00:45:04,696 [INFO] LGA: Lagos Mainland
2026-06-18 00:45:04,696 [INFO] LGA: Apapa
2026-06-18 00:45:04,696 [INFO] LGA: Mushin
2026-06-18 00:45:04,696 [INFO] LGA: Ikeja
2026-06-18 00:45:04,696 [INFO] LGA: Agege
2026-06-18 00:45:04,696 [INFO] LGA: Oshodi/Isolo
2026-06-18 00:45:04,696 [INFO] LGA: Ajeromi/Ifelodun
2026-06-18 00:45:04,696 [INFO] LGA: Surulere
2026-06-18 00:45:04,696 [INFO] LGA: Lagos Island
2026-06-18 00:45:04,696 [INFO] LGA: Eti Osa
2026-06-18 00:45:04,697 [INFO] LGA: Ìkòròdú
2026-06-18 00:45:04,697 [INFO] LGA: Epe
2026-06-18 00:45:04,697 [INFO] LGA: Ibeju Lekki
2026-06-18 00:45:04,697 [INFO] Downloading Overture buildings for Lagos bbox...
2026-06-18 00:45:54,166 [INFO] Read 1087 building rows so far...
2026-06-18 00:45:54,168 [INFO] Read 4745 building rows so far...
2026-06-18 00:45:54,526 [INFO] Read 6621 building rows so far...
2026-06-18 00:45:56,647 [INFO] Read 8969 building rows so far...
2026-06-18 00:45:56,663 [INFO] Read 49683 building rows so far...
2026-06-18 00:45:56,663 [INFO] Read 75647 building rows so far...
2026-06-18 00:45:56,666 [INFO] Read 114931 building rows so far...
2026-06-18 00:45:58,878 [INFO] Read 140810 building rows so far...
2026-06-18 00:45:58,883 [INFO] Read 186488 building rows so far...
2026-06-18 00:45:58,885 [INFO] Read 222342 building rows so far...
2026-06-18 00:45:58,932 [INFO] Read 266133 building rows so far...
2026-06-18 00:45:58,933 [INFO] Read 310124 building rows so far...
2026-06-18 00:45:58,935 [INFO] Read 351451 building rows so far...
2026-06-18 00:45:58,935 [INFO] Read 361294 building rows so far...
2026-06-18 00:45:58,935 [INFO] Read 398337 building rows so far...
2026-06-18 00:45:58,935 [INFO] Read 411279 building rows so far...
2026-06-18 00:45:58,941 [INFO] Read 438681 building rows so far...
2026-06-18 00:45:58,942 [INFO] Read 452039 building rows so far...
2026-06-18 00:45:58,944 [INFO] Read 489482 building rows so far...
2026-06-18 00:45:58,944 [INFO] Read 526032 building rows so far...
2026-06-18 00:45:58,947 [INFO] Read 566064 building rows so far...
2026-06-18 00:45:58,947 [INFO] Read 583074 building rows so far...
2026-06-18 00:45:58,957 [INFO] Read 632767 building rows so far...
2026-06-18 00:45:58,957 [INFO] Read 665430 building rows so far...
2026-06-18 00:45:58,958 [INFO] Read 709450 building rows so far...
2026-06-18 00:45:58,958 [INFO] Read 753090 building rows so far...
2026-06-18 00:45:58,958 [INFO] Read 797485 building rows so far...
2026-06-18 00:45:58,959 [INFO] Read 844448 building rows so far...
2026-06-18 00:45:58,959 [INFO] Read 888335 building rows so far...
2026-06-18 00:45:58,959 [INFO] Read 925383 building rows so far...
2026-06-18 00:45:58,959 [INFO] Read 962843 building rows so far...
2026-06-18 00:45:58,959 [INFO] Read 999560 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1036481 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1077914 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1117263 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1156122 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1190752 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1229506 building rows so far...
2026-06-18 00:45:58,960 [INFO] Read 1264762 building rows so far...
2026-06-18 00:45:58,965 [INFO] Read 1283786 building rows so far...
2026-06-18 00:45:58,966 [INFO] Read 1318708 building rows so far...
2026-06-18 00:45:58,966 [INFO] Read 1339412 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1382537 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1390176 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1428363 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1441026 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1478930 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1517878 building rows so far...
2026-06-18 00:45:58,969 [INFO] Read 1554437 building rows so far...
2026-06-18 00:45:58,970 [INFO] Read 1586619 building rows so far...
2026-06-18 00:46:03,552 [INFO] Read 1628496 building rows so far...
2026-06-18 00:46:03,554 [INFO] Read 1664892 building rows so far...
2026-06-18 00:46:03,577 [INFO] Read 1704854 building rows so far...
2026-06-18 00:46:03,582 [INFO] Read 1747098 building rows so far...
2026-06-18 00:46:03,597 [INFO] Read 1781370 building rows so far...
2026-06-18 00:46:03,597 [INFO] Read 1812979 building rows so far...
2026-06-18 00:46:03,597 [INFO] Read 1853244 building rows so far...
2026-06-18 00:46:03,597 [INFO] Read 1897463 building rows so far...
2026-06-18 00:46:03,597 [INFO] Read 1941257 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 1987201 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2023619 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2060038 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2097301 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2139342 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2183466 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2216671 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2252454 building rows so far...
2026-06-18 00:46:03,599 [INFO] Read 2286147 building rows so far...
2026-06-18 00:46:03,600 [INFO] Read 2328531 building rows so far...
2026-06-18 00:46:03,600 [INFO] Read 2361752 building rows so far...
2026-06-18 00:46:03,600 [INFO] Read 2397361 building rows so far...
2026-06-18 00:46:03,603 [INFO] Read 2443383 building rows so far...
2026-06-18 00:46:03,603 [INFO] Read 2483828 building rows so far...
2026-06-18 00:46:12,703 [INFO] Read 2523031 building rows so far...
2026-06-18 00:46:12,715 [INFO] Read 2565139 building rows so far...
2026-06-18 00:46:12,716 [INFO] Read 2606645 building rows so far...
2026-06-18 00:46:12,719 [INFO] Read 2646484 building rows so far...
2026-06-18 00:46:12,720 [INFO] Read 2679007 building rows so far...
2026-06-18 00:46:12,720 [INFO] Read 2709702 building rows so far...
2026-06-18 00:46:12,720 [INFO] Read 2748944 building rows so far...
2026-06-18 00:46:12,720 [INFO] Read 2789944 building rows so far...
2026-06-18 00:46:12,720 [INFO] Read 2833125 building rows so far...
2026-06-18 00:46:12,721 [INFO] Read 2878180 building rows so far...
2026-06-18 00:46:12,721 [INFO] Read 2890870 building rows so far...
2026-06-18 00:46:12,722 [INFO] Read 2893597 building rows so far...
2026-06-18 00:46:12,722 [INFO] Read 2936957 building rows so far...
2026-06-18 00:46:12,724 [INFO] Read 2982404 building rows so far...
2026-06-18 00:46:12,724 [INFO] Read 3014974 building rows so far...
2026-06-18 00:46:12,724 [INFO] Read 3052127 building rows so far...
2026-06-18 00:46:12,725 [INFO] Read 3082144 building rows so far...
2026-06-18 00:46:12,763 [INFO] Read 3117726 building rows so far...
2026-06-18 00:46:12,764 [INFO] Read 3132853 building rows so far...
2026-06-18 00:46:12,766 [INFO] Read 3168408 building rows so far...
2026-06-18 00:46:12,766 [INFO] Read 3206767 building rows so far...
2026-06-18 00:46:12,767 [INFO] Read 3242874 building rows so far...
2026-06-18 00:46:12,768 [INFO] Read 3287606 building rows so far...
2026-06-18 00:46:12,769 [INFO] Read 3325428 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3361667 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3405232 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3441861 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3480287 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3512621 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3544805 building rows so far...
2026-06-18 00:46:12,771 [INFO] Read 3579554 building rows so far...
2026-06-18 00:46:12,772 [INFO] Read 3612610 building rows so far...
2026-06-18 00:46:12,779 [INFO] Read 3657312 building rows so far...
2026-06-18 00:46:12,779 [INFO] Read 3692350 building rows so far...
2026-06-18 00:46:12,779 [INFO] Read 3737553 building rows so far...
2026-06-18 00:46:12,779 [INFO] Read 3775903 building rows so far...
2026-06-18 00:46:12,780 [INFO] Read 3812748 building rows so far...
2026-06-18 00:46:12,780 [INFO] Read 3853335 building rows so far...
2026-06-18 00:46:12,780 [INFO] Read 3882656 building rows so far...
2026-06-18 00:46:12,780 [INFO] Read 3926067 building rows so far...
2026-06-18 00:46:12,782 [INFO] Read 3928673 building rows so far...
2026-06-18 00:46:12,789 [INFO] Read 3968558 building rows so far...
2026-06-18 00:46:12,790 [INFO] Read 3971183 building rows so far...
2026-06-18 00:46:12,790 [INFO] Read 3999376 building rows so far...
2026-06-18 00:46:12,812 [INFO] Read 4029102 building rows so far...
2026-06-18 00:46:12,812 [INFO] Read 4066253 building rows so far...
2026-06-18 00:46:12,812 [INFO] Read 4075378 building rows so far...
2026-06-18 00:46:12,812 [INFO] Read 4100886 building rows so far...
2026-06-18 00:46:12,812 [INFO] Read 4129816 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4164744 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4199680 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4206650 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4219629 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4251636 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4277747 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4288541 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4312466 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4312915 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4320371 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4333538 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4349963 building rows so far...
2026-06-18 00:46:12,825 [INFO] Read 4352445 building rows so far...
2026-06-18 00:47:35,480 [INFO] Total building rows read: 4352445
Traceback (most recent call last):
File "/work/solve.py", line 96, in <module>
build_table = pa.concat_tables(build_batches) if len(build_batches) > 1 else build_batches[0]
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "pyarrow/table.pxi", line 6259, in pyarrow.lib.concat_tables
TypeError: Cannot convert pyarrow.lib.RecordBatch to pyarrow.lib.Table
12:18
The issue is that `build_batches` contains `RecordBatch` objects, not `Table` objects. I need to convert them to tables first, or use `pa.Table.from_batches()`. Let me fix this.
12:18
Read
/work/solve.py
85 for batch in reader:
86 build_batches.append(batch)
87 row_count += batch.num_rows
88 log.info(" Read %d building rows so far...", row_count)
89
90 log.info("Total building rows read: %d", row_count)
91
92 if row_count == 0:
93 raise RuntimeError("No buildings found in Lagos bbox")
94
95 # Convert to single table
96 build_table = pa.concat_tables(build_batches) if len(build_batches) > 1 else build_batches[0]
97
98 # ---------------------------------------------------------------------------
99 # 3. Convert to GeoDataFrame, clip to Lagos State, compute area
100 # ---------------------------------------------------------------------------
101 log.info("Converting buildings to GeoDataFrame...")
102
103 # Read geometry as WKB
104 geom_col = build_table.column("geometry").to_pylist()
105 heights = build_table.column("height").to_pylist()
106 ids = build_table.column("id").to_pylist()
107
108 # Build a DataFrame
109 build_df = pd.DataFrame({
110 "id": ids,
111 "height": [float(h) if h is not None and not np.isnan(h) else None for h in heights],
112 })
113
114 # Convert geometry
<system-reminder>Showed lines 85-114 of 214. Re-call Read with offset=115 to continue, or raise limit.</system-reminder>
12:24
The issue is that `record_batch_reader` returns RecordBatch, not Table. `concat_tables` expects Tables, not RecordBatches. I need to convert each RecordBatch to a Table first, or use `pa.Table.from_batches()`. Let me also avoid loading 4.3M rows into memory all at once - I should process in a streaming manner. Let me rewrite the approach to be more memory-efficient.
12:24
Edit
/work/solve.py
Edited /work/solve.py.
12:31
Wait, but `build_tables = [pa.Table.from_batches([b]) for b in build_batches]` is creating one table per batch, but each batch already is a RecordBatch. A more efficient way would be: ```python build_table = pa.Table.from_batches(build_batches) ``` Let me fix this.
12:31
Edit
/work/solve.py
Edited /work/solve.py.
12:36
Actually, `pa.Table.from_batches` takes a list of RecordBatch objects and returns a single Table. Let me verify - yes, `pa.Table.from_batches(batches)` expects a list of RecordBatch objects. But there's a subtlety - it expects all batches to have the same schema. They should since they come from the same reader. Let me run it now.
12:36
Bash
cd /work && python solve.py 2>&1
2026-06-18 00:48:02,159 [INFO] Reading Lagos State & LGA boundaries from Overture divisions...
2026-06-18 00:48:13,108 [INFO] Read 58 division_area features
2026-06-18 00:48:13,129 [INFO] Lagos State geometry loaded (2.7060, 6.3667, 4.3509, 6.6968)
/work/solve.py:66: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
centroids = county_gdf.geometry.centroid
2026-06-18 00:48:13,165 [INFO] Identified 20 LGAs in Lagos State
2026-06-18 00:48:13,166 [INFO] LGA: Badagry
2026-06-18 00:48:13,166 [INFO] LGA: Amuwo Odofin
2026-06-18 00:48:13,166 [INFO] LGA: Ifako/Ijaye
2026-06-18 00:48:13,166 [INFO] LGA: Alimosho
2026-06-18 00:48:13,166 [INFO] LGA: Ojo
2026-06-18 00:48:13,166 [INFO] LGA: Kosofe
2026-06-18 00:48:13,166 [INFO] LGA: Shomolu
2026-06-18 00:48:13,167 [INFO] LGA: Lagos Mainland
2026-06-18 00:48:13,167 [INFO] LGA: Apapa
2026-06-18 00:48:13,167 [INFO] LGA: Mushin
2026-06-18 00:48:13,167 [INFO] LGA: Ikeja
2026-06-18 00:48:13,167 [INFO] LGA: Agege
2026-06-18 00:48:13,167 [INFO] LGA: Oshodi/Isolo
2026-06-18 00:48:13,167 [INFO] LGA: Ajeromi/Ifelodun
2026-06-18 00:48:13,167 [INFO] LGA: Surulere
2026-06-18 00:48:13,167 [INFO] LGA: Lagos Island
2026-06-18 00:48:13,167 [INFO] LGA: Eti Osa
2026-06-18 00:48:13,168 [INFO] LGA: Ìkòròdú
2026-06-18 00:48:13,168 [INFO] LGA: Epe
2026-06-18 00:48:13,168 [INFO] LGA: Ibeju Lekki
2026-06-18 00:48:13,168 [INFO] Downloading Overture buildings for Lagos bbox...
2026-06-18 00:48:43,622 [INFO] Read 1087 building rows so far...
2026-06-18 00:48:54,501 [INFO] Read 4745 building rows so far...
2026-06-18 00:48:54,501 [INFO] Read 6621 building rows so far...
2026-06-18 00:48:54,502 [INFO] Read 8969 building rows so far...
2026-06-18 00:48:54,531 [INFO] Read 49683 building rows so far...
2026-06-18 00:48:54,532 [INFO] Read 75647 building rows so far...
2026-06-18 00:48:54,538 [INFO] Read 114931 building rows so far...
2026-06-18 00:48:54,539 [INFO] Read 140810 building rows so far...
2026-06-18 00:48:54,539 [INFO] Read 186488 building rows so far...
2026-06-18 00:48:54,541 [INFO] Read 222342 building rows so far...
2026-06-18 00:48:54,565 [INFO] Read 266133 building rows so far...
2026-06-18 00:48:54,567 [INFO] Read 310124 building rows so far...
2026-06-18 00:48:54,567 [INFO] Read 351451 building rows so far...
2026-06-18 00:48:54,568 [INFO] Read 361294 building rows so far...
2026-06-18 00:48:54,568 [INFO] Read 398337 building rows so far...
2026-06-18 00:48:54,569 [INFO] Read 411279 building rows so far...
2026-06-18 00:48:54,569 [INFO] Read 438681 building rows so far...
2026-06-18 00:48:54,569 [INFO] Read 452039 building rows so far...
2026-06-18 00:48:54,584 [INFO] Read 489482 building rows so far...
2026-06-18 00:48:54,587 [INFO] Read 526032 building rows so far...
2026-06-18 00:48:54,588 [INFO] Read 566064 building rows so far...
2026-06-18 00:48:54,588 [INFO] Read 583074 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 632767 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 665430 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 709450 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 753090 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 797485 building rows so far...
2026-06-18 00:48:54,589 [INFO] Read 844448 building rows so far...
2026-06-18 00:48:54,592 [INFO] Read 888335 building rows so far...
2026-06-18 00:48:54,593 [INFO] Read 925383 building rows so far...
2026-06-18 00:48:54,593 [INFO] Read 962843 building rows so far...
2026-06-18 00:48:54,593 [INFO] Read 999560 building rows so far...
2026-06-18 00:48:54,594 [INFO] Read 1036481 building rows so far...
2026-06-18 00:48:54,595 [INFO] Read 1077914 building rows so far...
2026-06-18 00:48:54,598 [INFO] Read 1117263 building rows so far...
2026-06-18 00:48:54,599 [INFO] Read 1156122 building rows so far...
2026-06-18 00:48:54,610 [INFO] Read 1190752 building rows so far...
2026-06-18 00:48:54,611 [INFO] Read 1229506 building rows so far...
2026-06-18 00:48:54,611 [INFO] Read 1264762 building rows so far...
2026-06-18 00:48:54,611 [INFO] Read 1283786 building rows so far...
2026-06-18 00:48:54,611 [INFO] Read 1318708 building rows so far...
2026-06-18 00:48:54,611 [INFO] Read 1339412 building rows so far...
2026-06-18 00:48:54,616 [INFO] Read 1382537 building rows so far...
2026-06-18 00:48:54,617 [INFO] Read 1390176 building rows so far...
2026-06-18 00:48:54,632 [INFO] Read 1428363 building rows so far...
2026-06-18 00:48:54,633 [INFO] Read 1441026 building rows so far...
2026-06-18 00:48:54,634 [INFO] Read 1478930 building rows so far...
2026-06-18 00:48:54,634 [INFO] Read 1517878 building rows so far...
2026-06-18 00:48:54,658 [INFO] Read 1554437 building rows so far...
2026-06-18 00:48:54,658 [INFO] Read 1586619 building rows so far...
2026-06-18 00:48:54,658 [INFO] Read 1628496 building rows so far...
2026-06-18 00:48:54,658 [INFO] Read 1664892 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1704854 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1747098 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1781370 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1812979 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1853244 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1897463 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1941257 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 1987201 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 2023619 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 2060038 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 2097301 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 2139342 building rows so far...
2026-06-18 00:48:54,659 [INFO] Read 2183466 building rows so far...
2026-06-18 00:49:02,511 [INFO] Read 2216671 building rows so far...
2026-06-18 00:49:02,511 [INFO] Read 2252454 building rows so far...
2026-06-18 00:49:02,539 [INFO] Read 2286147 building rows so far...
2026-06-18 00:49:02,540 [INFO] Read 2328531 building rows so far...
2026-06-18 00:49:02,542 [INFO] Read 2361752 building rows so far...
2026-06-18 00:49:02,543 [INFO] Read 2397361 building rows so far...
2026-06-18 00:49:02,548 [INFO] Read 2443383 building rows so far...
2026-06-18 00:49:02,551 [INFO] Read 2483828 building rows so far...
2026-06-18 00:49:02,564 [INFO] Read 2523031 building rows so far...
2026-06-18 00:49:02,567 [INFO] Read 2565139 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2606645 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2646484 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2679007 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2709702 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2748944 building rows so far...
2026-06-18 00:49:02,568 [INFO] Read 2789944 building rows so far...
2026-06-18 00:49:02,571 [INFO] Read 2833125 building rows so far...
2026-06-18 00:49:02,572 [INFO] Read 2878180 building rows so far...
2026-06-18 00:49:02,572 [INFO] Read 2890870 building rows so far...
2026-06-18 00:49:02,572 [INFO] Read 2893597 building rows so far...
2026-06-18 00:49:02,572 [INFO] Read 2936957 building rows so far...
2026-06-18 00:49:02,573 [INFO] Read 2982404 building rows so far...
2026-06-18 00:49:02,573 [INFO] Read 3014974 building rows so far...
2026-06-18 00:49:02,573 [INFO] Read 3052127 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3082144 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3117726 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3132853 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3168408 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3206767 building rows so far...
2026-06-18 00:49:02,580 [INFO] Read 3242874 building rows so far...
2026-06-18 00:49:02,583 [INFO] Read 3287606 building rows so far...
2026-06-18 00:49:06,719 [INFO] Read 3325428 building rows so far...
2026-06-18 00:49:06,738 [INFO] Read 3361667 building rows so far...
2026-06-18 00:49:06,748 [INFO] Read 3405232 building rows so far...
2026-06-18 00:49:06,751 [INFO] Read 3441861 building rows so far...
2026-06-18 00:49:06,752 [INFO] Read 3480287 building rows so far...
2026-06-18 00:49:06,752 [INFO] Read 3512621 building rows so far...
2026-06-18 00:49:06,754 [INFO] Read 3544805 building rows so far...
2026-06-18 00:49:06,755 [INFO] Read 3579554 building rows so far...
2026-06-18 00:49:06,755 [INFO] Read 3612610 building rows so far...
2026-06-18 00:49:06,755 [INFO] Read 3657312 building rows so far...
2026-06-18 00:49:06,767 [INFO] Read 3692350 building rows so far...
2026-06-18 00:49:06,772 [INFO] Read 3737553 building rows so far...
2026-06-18 00:49:06,773 [INFO] Read 3775903 building rows so far...
2026-06-18 00:49:06,784 [INFO] Read 3812748 building rows so far...
2026-06-18 00:49:06,784 [INFO] Read 3853335 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3882656 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3926067 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3928673 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3968558 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3971183 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 3999376 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 4029102 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 4066253 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 4075378 building rows so far...
2026-06-18 00:49:06,785 [INFO] Read 4100886 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4129816 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4164744 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4199680 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4206650 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4219629 building rows so far...
2026-06-18 00:49:06,788 [INFO] Read 4251636 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4277747 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4288541 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4312466 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4312915 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4320371 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4333538 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4349963 building rows so far...
2026-06-18 00:49:06,789 [INFO] Read 4352445 building rows so far...
2026-06-18 00:50:12,866 [INFO] Total building rows read: 4352445
2026-06-18 00:50:12,872 [INFO] Converting buildings to GeoDataFrame...
2026-06-18 00:50:35,610 [INFO] Building GeoDataFrame has 4352445 rows
2026-06-18 00:50:35,610 [INFO] Clipping buildings to Lagos State boundary...
2026-06-18 00:50:47,503 [INFO] Buildings intersecting Lagos: 2215411
/work/solve.py:126: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
centroids_b = build_in_lagos.geometry.centroid
2026-06-18 00:50:53,331 [INFO] Buildings with centroid in Lagos: 2214537
2026-06-18 00:50:53,331 [INFO] Computing footprint areas (EPSG:32631)...
2026-06-18 00:50:57,899 [INFO] Buildings > 1000 m²: 7003
2026-06-18 00:50:57,918 [INFO] Assigning LGA via spatial join...
2026-06-18 00:50:57,979 [INFO] Buildings with LGA assigned: 7026
2026-06-18 00:50:57,981 [INFO] Writing lagos_buildings.geoparquet (7026 rows)...
2026-06-18 00:50:58,023 [INFO] Done writing lagos_buildings.geoparquet
2026-06-18 00:50:58,023 [INFO] Computing per-LGA summary...
2026-06-18 00:50:58,041 [INFO] Summary:
lga n_buildings total_footprint_m2 n_with_height p50_height_m
0 Agege 55 1.178889e+05 0 NaN
1 Ajeromi/Ifelodun 60 1.078803e+05 0 NaN
2 Alimosho 353 6.476227e+05 0 NaN
3 Amuwo Odofin 836 1.751259e+06 0 NaN
4 Apapa 286 6.083739e+05 0 NaN
5 Badagry 122 2.061790e+05 0 NaN
6 Epe 322 7.846644e+05 0 NaN
7 Eti Osa 1063 1.853554e+06 237 20.0
8 Ibeju Lekki 370 1.145752e+06 0 NaN
9 Ifako/Ijaye 118 2.441200e+05 0 NaN
10 Ikeja 926 3.185193e+06 46 15.0
11 Kosofe 165 3.437419e+05 0 NaN
12 Lagos Island 117 2.261127e+05 9 20.0
13 Lagos Mainland 123 2.295373e+05 0 NaN
14 Mushin 463 1.063793e+06 1 35.0
15 Ojo 232 4.085015e+05 0 NaN
16 Oshodi/Isolo 417 7.918409e+05 3 27.0
17 Shomolu 124 2.006563e+05 3 17.0
18 Surulere 340 7.783158e+05 0 NaN
19 Ìkòròdú 534 1.857791e+06 4 7.5
2026-06-18 00:50:58,046 [INFO] Done writing lagos_building_summary.parquet
2026-06-18 00:50:58,046 [INFO] All tasks complete.
15:46
The script completed successfully. Let me verify the outputs.
15:46
Bash
cd /work && ls -lh lagos_buildings.geoparquet lagos_building_summary.parquet
-rw-r--r-- 1 agent agent 4.1K Jun 18 00:50 lagos_building_summary.parquet -rw-r--r-- 1 agent agent 1.1M Jun 18 00:50 lagos_buildings.geoparquet
15:46
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import geopandas as gpd
import pandas as pd
# Check lagos_buildings.geoparquet
build = gpd.read_parquet('lagos_buildings.geoparquet')
print('=== lagos_buildings.geoparquet ===')
print(f'CRS: {build.crs}')
print(f'Rows: {len(build)}')
print(f'Columns: {list(build.columns)}')
print(f'Dtypes:\n{build.dtypes}')
print()
print(build.head(3).to_string())
print()
# Check spatial extent
print(f'X range: {build.total_bounds[0]:.4f} to {build.total_bounds[2]:.4f}')
print(f'Y range: {build.total_bounds[1]:.4f} to {build.total_bounds[3]:.4f}')
print()
# Check lagos_building_summary.parquet
summary = pd.read_parquet('lagos_building_summary.parquet')
print('=== lagos_building_summary.parquet ===')
print(f'Rows: {len(summary)}')
print(f'Columns: {list(summary.columns)}')
print(f'Dtypes:\n{summary.dtypes}')
print()
print(summary.to_string())
"
=== lagos_buildings.geoparquet ===
CRS: {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "GeographicCRS", "name": "WGS 84", "datum_ensemble": {"name": "World Geodetic System 1984 ensemble", "members": [{"name": "World Geodetic System 1984 (Transit)"}, {"name": "World Geodetic System 1984 (G730)"}, {"name": "World Geodetic System 1984 (G873)"}, {"name": "World Geodetic System 1984 (G1150)"}, {"name": "World Geodetic System 1984 (G1674)"}, {"name": "World Geodetic System 1984 (G1762)"}, {"name": "World Geodetic System 1984 (G2139)"}, {"name": "World Geodetic System 1984 (G2296)"}], "ellipsoid": {"name": "WGS 84", "semi_major_axis": 6378137, "inverse_flattening": 298.257223563}, "accuracy": "2.0", "id": {"authority": "EPSG", "code": 6326}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "scope": "Horizontal component of 3D system.", "area": "World.", "bbox": {"south_latitude": -90, "west_longitude": -180, "north_latitude": 90, "east_longitude": 180}, "id": {"authority": "EPSG", "code": 4326}}
Rows: 7026
Columns: ['id', 'height', 'footprint_area_m2', 'lga', 'geometry']
Dtypes:
id str
height float64
footprint_area_m2 float64
lga str
geometry geometry
dtype: object
id height footprint_area_m2 lga geometry
0 e24f1bf5-e957-4867-859b-a6a578fd9b06 NaN 1072.458049 Badagry POLYGON ((2.70836 6.37804, 2.70816 6.37804, 2.70817 6.37759, 2.70837 6.37759, 2.70836 6.37804))
1 5df57f70-f431-495f-98cb-eff96e1ca151 NaN 1052.426977 Badagry POLYGON ((2.72675 6.37881, 2.72672 6.37907, 2.72646 6.37904, 2.72645 6.37908, 2.7263 6.37906, 2.72632 6.37888, 2.72657 6.37891, 2.72657 6.37896, 2.72664 6.37897, 2.72665 6.37889, 2.72652 6.37888, 2.72654 6.37878, 2.72675 6.37881))
2 cfb03953-1851-4ec2-a087-3283cd510753 NaN 1077.286127 Badagry POLYGON ((2.7137 6.37984, 2.7136 6.37984, 2.71361 6.37962, 2.71356 6.37962, 2.71356 6.37955, 2.71398 6.37957, 2.71398 6.37964, 2.71393 6.37964, 2.71392 6.37983, 2.71382 6.37983, 2.71383 6.37966, 2.71382 6.37966, 2.71381 6.37983, 2.71372 6.37983, 2.71373 6.37963, 2.71371 6.37963, 2.7137 6.37984))
X range: 2.7082 to 4.0041
Y range: 6.3776 to 6.6962
=== lagos_building_summary.parquet ===
Rows: 20
Columns: ['lga', 'n_buildings', 'total_footprint_m2', 'n_with_height', 'p50_height_m']
Dtypes:
lga str
n_buildings int64
total_footprint_m2 float64
n_with_height int64
p50_height_m float64
dtype: object
lga n_buildings total_footprint_m2 n_with_height p50_height_m
0 Agege 55 1.178889e+05 0 NaN
1 Ajeromi/Ifelodun 60 1.078803e+05 0 NaN
2 Alimosho 353 6.476227e+05 0 NaN
3 Amuwo Odofin 836 1.751259e+06 0 NaN
4 Apapa 286 6.083739e+05 0 NaN
5 Badagry 122 2.061790e+05 0 NaN
6 Epe 322 7.846644e+05 0 NaN
7 Eti Osa 1063 1.853554e+06 237 20.0
8 Ibeju Lekki 370 1.145752e+06 0 NaN
9 Ifako/Ijaye 118 2.441200e+05 0 NaN
10 Ikeja 926 3.185193e+06 46 15.0
11 Kosofe 165 3.437419e+05 0 NaN
12 Lagos Island 117 2.261127e+05 9 20.0
13 Lagos Mainland 123 2.295373e+05 0 NaN
14 Mushin 463 1.063793e+06 1 35.0
15 Ojo 232 4.085015e+05 0 NaN
16 Oshodi/Isolo 417 7.918409e+05 3 27.0
17 Shomolu 124 2.006563e+05 3 17.0
18 Surulere 340 7.783158e+05 0 NaN
19 Ìkòròdú 534 1.857791e+06 4 7.5
16:03
Let me verify the outputs against the requirements:
1. **`lagos_buildings.geoparquet`** - EPSG:4326 GeoParquet with columns `id`, `height`, `footprint_area_m2`, `lga`, and `geometry`. ✓
- CRS is EPSG:4326 ✓
- Columns match ✓
- Buildings are filtered to > 1000 m² ✓
2. **`lagos_building_summary.parquet`** - plain Parquet, no geometry, columns: `lga`, `n_buildings`, `total_footprint_m2`, `n_with_height`, `p50_height_m`. ✓
- No geometry ✓
- Columns match ✓
Coordinate magnitudes check: Longitude around 2.7-4.0° (degrees), latitude around 6.3-6.7° (degrees). These are correct for Lagos, Nigeria.
Wait, I notice something about the `p50_height_m` column. Let me look at the log output: for Ikeja, `p50_height_m` shows 15.0 in the summary table but 20.0 in the log output. Let me check. Looking at the log:
```
10 Ikeja 926 3.185193e+06 46 20.0
```
But in the final printed output:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
Hmm, that's different! The log output was from the log.info line which showed the DataFrame, while the print after reading with pandas shows 15.0. Wait, actually let me re-read the output.
The log line says:
```
10 Ikeja 926 3.185193e+06 46 20.0
```
But the pandas read shows:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
Wait, no. Looking at the log output again:
```
2026-06-18 00:50:58,041 [INFO] Summary:
lga n_buildings total_footprint_m2 n_with_height p50_height_m
0 Agege 55 1.178889e+05 0 NaN
...
10 Ikeja 926 3.185193e+06 46 20.0
...
```
And the pandas read output:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
Wait, they're different! That's suspicious. Let me re-read the outputs more carefully.
Actually looking at the log output more carefully:
```
10 Ikeja 926 3.185193e+06 46 20.0
```
But from the second output file read:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
Hmm, let me look again... Actually the log line says:
```
10 Ikeja 926 3.185193e+06 46 20.0
```
Wait, I need to re-read. The output is:
From the script logging:
```
10 Ikeja 926 3.185193e+06 46 20.0
```
From the subsequent verification:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
These are different! But the verification was done after reading the saved parquet file... Let me check by re-reading the stored data to see if there's a discrepancy.
Actually wait - let me look more carefully at the summary output from the script:
```
lga n_buildings total_footprint_m2 n_with_height p50_height_m
0 Agege 55 1.178889e+05 0 NaN
...
10 Ikeja 926 3.185193e+06 46 15.0
```
Hmm, the log shows 20.0 but the print shows 15.0? Let me re-read the first log output:
Looking carefully:
```
2026-06-18 00:50:58,041 [INFO] Summary:
lga n_buildings total_footprint_m2 n_with_height p50_height_m
...
10 Ikeja 926 3.185193e+06 46 20.0
```
And the second read:
```
10 Ikeja 926 3.185193e+06 46 15.0
```
This is odd. Let me check by reading the file directly.
Actually, maybe I'm misreading the log output due to text wrapping or alignment. The columns in the log output might not align perfectly. Let me read the file directly to verify.
16:03
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import pandas as pd
summary = pd.read_parquet('lagos_building_summary.parquet')
# Show all rows with exact values
for _, row in summary.iterrows():
print(f'{row[\"lga\"]:25s} | n={row[\"n_buildings\"]:5d} | total={row[\"total_footprint_m2\"]:12.1f} | n_height={row[\"n_with_height\"]:3d} | p50={row[\"p50_height_m\"]}')
# Also check the buildings geoparquet
build = pd.read_parquet('lagos_buildings.geoparquet')
print(f'\nBuildings file: {len(build)} rows')
print(f'CRS is included: {\"geometry\" in build.columns}')
print(f'{build[\"footprint_area_m2\"].describe()}')
print(f'Min area: {build[\"footprint_area_m2\"].min():.2f} (should be > 1000)')
"
Agege | n= 55 | total= 117888.9 | n_height= 0 | p50=nan Ajeromi/Ifelodun | n= 60 | total= 107880.3 | n_height= 0 | p50=nan Alimosho | n= 353 | total= 647622.7 | n_height= 0 | p50=nan Amuwo Odofin | n= 836 | total= 1751258.9 | n_height= 0 | p50=nan Apapa | n= 286 | total= 608373.9 | n_height= 0 | p50=nan Badagry | n= 122 | total= 206179.0 | n_height= 0 | p50=nan Epe | n= 322 | total= 784664.4 | n_height= 0 | p50=nan Eti Osa | n= 1063 | total= 1853554.3 | n_height=237 | p50=20.0 Ibeju Lekki | n= 370 | total= 1145752.1 | n_height= 0 | p50=nan Ifako/Ijaye | n= 118 | total= 244120.0 | n_height= 0 | p50=nan Ikeja | n= 926 | total= 3185193.0 | n_height= 46 | p50=15.0 Kosofe | n= 165 | total= 343741.9 | n_height= 0 | p50=nan Lagos Island | n= 117 | total= 226112.7 | n_height= 9 | p50=20.0 Lagos Mainland | n= 123 | total= 229537.3 | n_height= 0 | p50=nan Mushin | n= 463 | total= 1063792.5 | n_height= 1 | p50=35.0 Ojo | n= 232 | total= 408501.5 | n_height= 0 | p50=nan Oshodi/Isolo | n= 417 | total= 791840.9 | n_height= 3 | p50=27.0 Shomolu | n= 124 | total= 200656.3 | n_height= 3 | p50=17.0 Surulere | n= 340 | total= 778315.8 | n_height= 0 | p50=nan Ìkòròdú | n= 534 | total= 1857791.1 | n_height= 4 | p50=7.5 Buildings file: 7026 rows CRS is included: True count 7026.000000 mean 2355.931892 std 4212.055597 min 1000.044933 25% 1165.479167 50% 1450.887603 75% 2223.802564 max 181048.180771 Name: footprint_area_m2, dtype: float64 Min area: 1000.04 (should be > 1000)
16:12
Everything looks good. Let me verify the key requirements: 1. **lagos_buildings.geoparquet** - EPSG:4326, columns: `id`, `height`, `footprint_area_m2`, `lga`, `geometry` ✓ 2. **lagos_building_summary.parquet** - plain Parquet (no geometry), columns: `lga`, `n_buildings`, `total_footprint_m2`, `n_with_height`, `p50_height_m` ✓ The output files are in the working directory. Let me verify the file listing.
16:12