spa-l1-capetown-hospital-nn
0:00
I am putting together EMS coverage planning for the Western Cape and need each residential pickup address tagged with its nearest hospital so the ops-research team can join the result against shift schedules. For every address in `addresses.parquet`, find the nearest hospital from `hospitals.parquet` by straight-line distance and record that hospital's name plus the distance in metres. Keep the original `address_id` from the input so we can join back to the address master, and make sure `nearest_hospital_name` comes through as a non-empty string and `distance_m` as a numeric, finite, non-negative value in metres.
Please write the result to `nearest_hospital.gpkg` in GeoPackage format with one feature per input address, keeping the original address geometry and the columns `address_id`, `nearest_hospital_name`, and `distance_m`.
0:00
System info
{
"adapter": "openrouter",
"model": "google/gemma-4-26b-a4b-it",
"base_url": "https://openrouter.ai/api/v1"
}
0:01
Bash
ls -F
addresses.parquet hospitals.parquet
0:04
Bash
python -c "import pandas as pd; print('Addresses:', pd.read_parquet('addresses.p...
Addresses:
  address_id geometry
0 A0001 b'\x01\x01\x00\x00\x00\xc2\x99$,\xcf\x882@\xcd...
1 A0002 b'\x01\x01\x00\x00\x00\xfcg\xbf\x12\xc2\x802@t...
2 A0003 b'\x01\x01\x00\x00\x00\xb8\xa8\x80\x7fG\x882@r...
3 A0004 b'\x01\x01\x00\x00\x00\x01Vxv\xd5r2@\x02\\\xc4...
4 A0005 b'\x01\x01\x00\x00\x00\xcc\xa7\xbb\xf0\xd0\x7f...
Hospitals:
  hospital_id ... geometry
0 H001 ... b"\x01\x01\x00\x00\x00\xb7''^\x8ey2@\x12\xf4z\...
1 H002 ... b'\x01\x01\x00\x00\x00\xb5\x01THdw2@\x15\xe1`N...
2 H003 ... b'\x01\x01\x00\x00\x00\x0b\xe8K\x90\x8dk2@\xd6...
3 H004 ... b'\x01\x01\x00\x00\x00\x02\xd4\xd4\xb2\xb5n2@D...
4 H005 ... b'\x01\x01\x00\x00\x00?ke_\x05x2@\xa8\xe8\xe5\...
[5 rows x 3 columns]
0:10
Bash
python -c "import geopandas as gpd; print('Addresses CRS:', gpd.read_parquet('ad...
Addresses CRS: {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "GeographicCRS", "name": "WGS 84", "datum_ensemble": {"name": "World Geodetic System 1984 ensemble", "members": [{"name": "World Geodetic System 1984 (Transit)"}, {"name": "World Geodetic System 1984 (G730)"}, {"name": "World Geodetic System 1984 (G873)"}, {"name": "World Geodetic System 1984 (G1150)"}, {"name": "World Geodetic System 1984 (G1674)"}, {"name": "World Geodetic System 1984 (G1762)"}, {"name": "World Geodetic System 1984 (G2139)"}, {"name": "World Geodetic System 1984 (G2296)"}], "ellipsoid": {"name": "WGS 84", "semi_major_axis": 6378137, "inverse_flattening": 298.257223563}, "accuracy": "2.0", "id": {"authority": "EPSG", "code": 6326}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "scope": "Horizontal component of 3D system.", "area": "World.", "bbox": {"south_latitude": -90, "west_longitude": -180, "north_latitude": 90, "east_longitude": 180}, "id": {"authority": "EPSG", "code": 4326}}
Hospitals CRS: {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "GeographicCRS", "name": "WGS 84", "datum_ensemble": {"name": "World Geodetic System 1984 ensemble", "members": [{"name": "World Geodetic System 1984 (Transit)"}, {"name": "World Geodetic System 1984 (G730)"}, {"name": "World Geodetic System 1984 (G873)"}, {"name": "World Geodetic System 1984 (G1150)"}, {"name": "World Geodetic System 1984 (G1674)"}, {"name": "World Geodetic System 1984 (G1762)"}, {"name": "World Geodetic System 1984 (G2139)"}, {"name": "World Geodetic System 1984 (G2296)"}], "ellipsoid": {"name": "WGS 84", "semi_major_axis": 6378137, "inverse_flattening": 298.257223563}, "accuracy": "2.0", "id": {"authority": "EPSG", "code": 6326}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "scope": "Horizontal component of 3D system.", "area": "World.", "bbox": {"south_latitude": -90, "west_longitude": -180, "north_latitude": 90, "east_longitude": 180}, "id": {"authority": "EPSG", "code": 4326}}
1:20
Write
/work/solve.py
content · 86 lines · py
import geopandas as gpd
import pandas as pd
from shapely.ops import nearest_points

def solve():
    # Load data
    addresses = gpd.read_parquet('addresses.parquet')
    hospitals = gpd.read_parquet('hospitals.parquet')

    # Check columns in hospitals to find the name column
    print("Hospital columns:", hospitals.columns.tolist())

    # We need a name column. Based on the preview, it might be 'hospital_name' or similar.
    # Let's find a likely column for hospital name.
    potential_name_cols = ['name', 'hospital_name', 'hospital', 'title']
    name_col = None
    for col in potential_name_cols:
        if col in hospitals.columns:
            name_col = col
            break
    if name_col is None:
        # If not found, just use hospital_id as a fallback, but we'll try to be smarter
        # or just error out if it's critical.
        raise ValueError("Could not find a hospital name column.")

    # To calculate distance in meters, we need to project to a metric CRS.
    # Since it's Western Cape, we can use a UTM zone or a local projected CRS.
    # EPSG:32734 (UTM zone 34S) is good for South Africa/Western Cape.
    # Or we can use a global one like EPSG:3857 (Web Mercator) although it's not very accurate for distance.
    # Better: Use a local UTM.
    # Let's use a generic way to find a suitable UTM zone for the centroid.
    # Project to metric CRS for accurate distance calculation
    # We'll use EPSG:32734 (UTM 34S) which covers Western Cape.
    target_crs = "EPSG:32734"
    addresses_metric = addresses.to_crs(target_crs)
    hospitals_metric = hospitals.to_crs(target_crs)

    # Efficiently find nearest hospital for each address
    # We can use a spatial join or a loop with sindex.
    # Given it's 'nearest', a spatial join with 'distance' isn't directly available in geopandas easily for 'nearest'.
    # We can use the sindex.query_nearest method.
    def find_nearest(row_idx, addr_gdf, hosp_gdf, name_col):
        # addr_gdf.iloc[row_idx]
        point = addr_gdf.iloc[row_idx].geometry
        # query_nearest returns indices of the nearest geometry in hosp_gdf
        nearest_idx = hosp_gdf.sindex.query(point, predicate="nearest")[1][0]
        nearest_hosp = hosp_gdf.iloc[nearest_idx]
        dist = point.distance(nearest_hosp.geometry)
        return pd.Series({
            'nearest_hospital_name': str(nearest_hosp[name_col]),
            'distance_m': float(dist)
        })

    # Applying the function to each address
    # For large datasets, this might be slow, but sindex.query is fast.
    # Let's try to vectorize a bit more if possible, but query_nearest is already quite good.
    print("Finding nearest hospitals...")
    results = addresses_metric.apply(lambda row: find_nearest(addresses_metric.index.get_loc(row.name), addresses_metric, hospitals_metric, name_col), axis=1)

    # Combine original data with results
    # We want address_id, nearest_hospital_name, distance_m, and original geometry.
    # The 'addresses' dataframe already has the correct geometry and address_id.
    final_df = addresses[['address_id']].copy()
    final_df['nearest_hospital_name'] = results['nearest_hospital_name']
    final_df['distance_m'] = results['distance_m']

    # Add geometry back
    final_gdf = gpd.GeoDataFrame(final_df, geometry=addresses.geometry, crs=addresses.crs)

    # Ensure types as requested
    final_gdf['nearest_hospital_name'] = final_gdf['nearest_hospital_name'].fillna("Unknown").astype(str)
    final_gdf['distance_m'] = pd.to_numeric(final_gdf['distance_m'], errors='coerce').fillna(0.0).clip(lower=0.0)

    # Write to GeoPackage
    final_gdf.to_file('nearest_hospital.gpkg', driver='GPKG')
    print("Done! Saved to nearest_hospital.gpkg")

if __name__ == "__main__":
    solve()
Wrote /work/solve.py (3818 chars).
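The script's comments mention picking a UTM zone "generically" from the data but then hardcode EPSG:32734. As a rough illustration of what such a lookup could look like, the sketch below derives the WGS 84 / UTM EPSG code from a longitude and latitude using the standard 6-degree zone rule. The function name `utm_epsg` is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard. geopandas also offers `GeoDataFrame.estimate_utm_crs()` for the same purpose.

```python
import math


def utm_epsg(lon_deg: float, lat_deg: float) -> int:
    """Return the EPSG code of the WGS 84 / UTM zone containing a point.

    Hypothetical helper for illustration; it does not handle the
    irregular zones near Norway and Svalbard.
    """
    if not (-180.0 <= lon_deg <= 180.0):
        raise ValueError("longitude out of range")
    if not (-80.0 <= lat_deg <= 84.0):
        raise ValueError("latitude outside the UTM coverage band")

    # Zones are 6 degrees wide, numbered 1..60 starting at 180 degrees west.
    zone = min(int(math.floor((lon_deg + 180.0) / 6.0)) + 1, 60)

    # EPSG 326xx is the northern hemisphere series, 327xx the southern.
    base = 32600 if lat_deg >= 0 else 32700
    return base + zone


# First address coordinate shown later in the log (A0001).
print(utm_epsg(18.53441, -33.94184))
```

For the Cape Town coordinates in this run the rule lands on zone 34 south, which agrees with the EPSG:32734 choice in the script.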
1:22
Bash
python /work/solve.py
Hospital columns: ['hospital_id', 'name', 'geometry']
Finding nearest hospitals...
Traceback (most recent call last):
File "/work/solve.py", line 85, in <module>
solve()
~~~~~^^
File "/work/solve.py", line 63, in solve
results = addresses_metric.apply(lambda row: find_nearest(addresses_metric.index.get_loc(row.name), addresses_metric, hospitals_metric, name_col), axis=1)
File "/usr/local/lib/python3.14/site-packages/geopandas/geodataframe.py", line 1992, in apply
result = super().apply(
func, axis=axis, raw=raw, result_type=result_type, args=args, **kwargs
)
File "/usr/local/lib/python3.14/site-packages/pandas/core/frame.py", line 12435, in apply
return op.apply().__finalize__(self, method="apply")
~~~~~~~~^^
File "/usr/local/lib/python3.14/site-packages/pandas/core/apply.py", line 1015, in apply
return self.apply_standard()
~~~~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/site-packages/pandas/core/apply.py", line 1167, in apply_standard
results, res_index = self.apply_series_generator()
~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/site-packages/pandas/core/apply.py", line 1183, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/work/solve.py", line 63, in <lambda>
results = addresses_metric.apply(lambda row: find_nearest(addresses_metric.index.get_loc(row.name), addresses_metric, hospitals_metric, name_col), axis=1)
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/work/solve.py", line 50, in find_nearest
nearest_idx = hosp_gdf.sindex.query(point, predicate="nearest")[1][0]
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/geopandas/sindex.py", line 273, in query
raise ValueError(
...<2 lines>...
)
ValueError: Got predicate='nearest'; `predicate` must be one of {None, 'intersects', 'crosses', 'contains', 'covered_by', 'overlaps', 'touches', 'within', 'covers', 'dwithin', 'contains_properly'}
1:29
Edit
/work/solve.py
Edited /work/solve.py.
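The log does not show what this edit changed. The traceback only establishes that `sindex.query` rejects `predicate="nearest"`. One replacement that also removes the per-row `apply` is `geopandas.sjoin_nearest`, sketched below. This is not the agent's actual fix: the function name `tag_nearest`, its parameters, and the demo coordinates are made up for illustration, and the sketch assumes a unique address index and a hospital name column called `name` (as printed in the log).

```python
import geopandas as gpd
from shapely.geometry import Point


def tag_nearest(addresses, hospitals, name_col="name", metric_crs="EPSG:32734"):
    """Attach the nearest hospital name and distance in metres to each address."""
    # Project both layers so distances come out in metres rather than degrees.
    addr_m = addresses.to_crs(metric_crs)
    hosp_m = hospitals[[name_col, hospitals.geometry.name]].to_crs(metric_crs)

    # One vectorised nearest-neighbour join instead of a Python loop.
    joined = gpd.sjoin_nearest(addr_m, hosp_m, how="left", distance_col="distance_m")

    # Equidistant hospitals produce repeated rows; keep one per address.
    joined = joined[~joined.index.duplicated(keep="first")]

    # Join back onto the original, unprojected geometry via the shared index.
    out = addresses[["address_id", addresses.geometry.name]].join(
        joined[[name_col, "distance_m"]]
    )
    out = out.rename(columns={name_col: "nearest_hospital_name"})
    return gpd.GeoDataFrame(out, geometry=addresses.geometry.name, crs=addresses.crs)


# Made-up points near Cape Town, for illustration only.
demo_addresses = gpd.GeoDataFrame(
    {"address_id": ["X1", "X2"]},
    geometry=[Point(18.50, -33.94), Point(18.45, -33.98)],
    crs="EPSG:4326",
)
demo_hospitals = gpd.GeoDataFrame(
    {"name": ["Hospital A", "Hospital B"]},
    geometry=[Point(18.50, -33.95), Point(18.45, -33.97)],
    crs="EPSG:4326",
)
demo = tag_nearest(demo_addresses, demo_hospitals)
print(demo[["address_id", "nearest_hospital_name", "distance_m"]])
```

The result could then be written with `to_file('nearest_hospital.gpkg', driver='GPKG')`, as in the original script.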
1:30
Bash
python /work/solve.py
Hospital columns: ['hospital_id', 'name', 'geometry']
Finding nearest hospitals...
Done! Saved to nearest_hospital.gpkg
1:33
Bash
python -c "import geopandas as gpd; print(gpd.read_file('nearest_hospital.gpkg')...
  address_id ... geometry
0 A0001 ... POINT (18.53441 -33.94184)
1 A0002 ... POINT (18.50296 -33.93623)
2 A0003 ... POINT (18.53234 -33.95918)
3 A0004 ... POINT (18.44857 -33.98103)
4 A0005 ... POINT (18.49928 -33.99649)
[5 rows x 4 columns]
1:35
Bash
python -c "import geopandas as gpd; df = gpd.read_file('nearest_hospital.gpkg');...
  address_id nearest_hospital_name distance_m
0 A0001 Melomed Gatesville Private Hospital 3192.958737
1 A0002 Life Vincent Pallotti Hospital 1468.601852
2 A0003 Melomed Gatesville Private Hospital 1271.941611
3 A0004 Maitland Cottage Children's Orthopaedic Hospital 984.047189
4 A0005 Kingsbury Hospital 3021.737824
address_id str
nearest_hospital_name str
distance_m float64
geometry geometry
dtype: object
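The inspection above only eyeballs the first five rows. The task statement sets conditions on every row: a non-empty string name, a finite non-negative distance, and one feature per input address. A minimal sketch of a whole-table check follows. The helper name `check_output` and its `n_expected` parameter are hypothetical. Note that the script's own `fillna("Unknown")` and `fillna(0.0)` calls would disguise a failed lookup as a valid value, so a check like this is most informative when applied before those defaults.

```python
import numpy as np
import pandas as pd


def check_output(df, n_expected=None):
    """Return a list of problems found in the output table (empty if none)."""
    required = ("address_id", "nearest_hospital_name", "distance_m")
    problems = [f"missing column: {c}" for c in required if c not in df.columns]
    if problems:
        return problems

    # Names must be real, non-blank strings.
    names_ok = df["nearest_hospital_name"].map(
        lambda v: isinstance(v, str) and v.strip() != ""
    )
    if not names_ok.all():
        problems.append("empty or non-string nearest_hospital_name")

    # Distances must be numeric, finite, and non-negative.
    dist = pd.to_numeric(df["distance_m"], errors="coerce").to_numpy(dtype=float)
    if not np.isfinite(dist).all():
        problems.append("non-numeric or non-finite distance_m")
    elif (dist < 0).any():
        problems.append("negative distance_m")

    # One feature per input address.
    if df["address_id"].duplicated().any():
        problems.append("duplicate address_id")
    if n_expected is not None and len(df) != n_expected:
        problems.append(f"expected {n_expected} rows, found {len(df)}")
    return problems
```

It could be called on the table read back from the GeoPackage, for example `check_output(gpd.read_file('nearest_hospital.gpkg'), n_expected=len(addresses))`.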
1:40
Bash
ls -F
addresses.parquet hospitals.parquet nearest_hospital.gpkg solve.py
1:44