dc-l3-vienna-overpass-historical
Map (reference ← swipe → agent)
0:00
Fetch Vienna's current district polygons (Bezirke) and their 2014-01-01 counterparts from Overpass and identify which parts of the district boundaries have changed between then and now. District names changed casing and spelling between snapshots, so normalise them to match. Classify each resulting geometry fragment by how it changed: `added_since_2014` (in current but not 2014), `removed_since_2014` (in 2014 but not current), or `unchanged` (present in both snapshots). For each district flag whether its geometry borders any area that was added or removed. Output vienna_boundary_changes.geojson, MultiPolygon, each feature with change_type, district_name (normalised), and touches_changed (boolean).
0:00
System info
{
"adapter": "openrouter",
"model": "deepseek/deepseek-v4-flash",
"base_url": "https://openrouter.ai/api/v1"
}
0:07
Let me break down this task: 1. Fetch Vienna's current district polygons (Bezirke) from Overpass 2. Fetch Vienna's 2014-01-01 district polygons from Overpass (using the OSM history API or a historical extract) 3. Normalise district names to match between snapshots 4. Compare geometries to find changes (added, removed, unchanged) 5. For each district, flag if its geometry borders any added/removed area 6. Output as GeoJSON with MultiPolygon geometry, and fields: change_type, district_name, touches_changed Let me start by exploring what data is available and then write a Python script. First, let me think about the Overpass queries needed: For current districts in Vienna, I can query for `relation[admin_level=9][boundary=administrative][name:de~"^Wien"]` or `relation[admin_level=9][boundary=administrative]["name:de"~"^[0-9]+\."]` or specifically the districts of Vienna. Vienna's districts are admin_level=9 relations with names like "1. Bezirk, Innere Stadt", "2. Bezirk, Leopoldstadt", etc. For historical data from 2014-01-01, I'd need to use Overpass's `date` parameter or the OSM history API. Overpass supports `[date:"2014-01-01T00:00:00Z"]` to query historical data. Let me write a script to do this. Actually, let me first check what's in the working directory.
0:07
Bash
ls /work/
(no output)
0:17
Let me plan this task: 1. Fetch Vienna's current district boundaries from Overpass 2. Fetch Vienna's district boundaries from a 2014-01-01 snapshot (Overpass can query historical data with `[date:"2014-01-01T00:00:00Z"]`) 3. Normalise district names (handle casing/spelling differences) 4. Compute geometry differences (symmetric difference) 5. Classify each fragment as added, removed, or unchanged 6. For each district, determine if it touches any changed area 7. Output as GeoJSON with MultiPolygon geometries Let me write a Python script to do all this. I'll use Overpass API with the `[date:"2014-01-01T00:00:00Z"]` syntax for historical data. Let me first check what's available in the environment.
0:17
Bash
pip list 2>/dev/null | grep -iE "geopandas|shapely|pyproj|osmnx|overturemaps|ove...
geopandas 1.1.3 osmnx 2.1.0 overturemaps 1.0.0 pyproj 3.7.2 requests 2.34.2 shapely 2.1.2
1:48
Let me plan this out:
1. Fetch current Vienna district polygons from Overpass
2. Fetch Vienna district polygons as of 2014-01-01 from Overpass
3. Normalise district names (handle casing/spelling differences)
4. Find geometry changes between the two snapshots
5. Classify fragments: added_since_2014, removed_since_2014, unchanged
6. Determine if each district borders any changed area
7. Output as GeoJSON with MultiPolygon geometry
Let me first explore what Overpass data looks like for Vienna districts. I'll use the Overpass API.
Vienna districts are admin_level=9 (Bezirke) in the city of Vienna (relation id 109166).
Let me write the script step by step.
First, let me understand the Overpass query for current districts and historical (2014-01-01) districts.
For historical OSM data, I can use the Overpass API with a date parameter, or use the OSM history database. Overpass Turbo supports date-based queries using the `[date:"2014-01-01T00:00:00Z"]` syntax after the `[out:json]`.
Let me write a comprehensive script.
The approach:
1. Query Overpass for current Vienna districts (admin_level=9 in relation 109166)
2. Query Overpass for historical Vienna districts (same query but with date parameter)
3. Parse the multipolygon relations for each district
4. Normalise district names - in 2014 they might have been all-caps or different spelling
5. Compute the symmetric difference of the geometries
6. Classify fragments
Actually, let me think about this more carefully. Vienna districts (Bezirke) can be found as relations with admin_level=9 that are part of the Vienna city relation (admin_level=8, relation id 109166).
Let me write the script:
```python
import geopandas as gpd
import pandas as pd
import requests
import time
from shapely.geometry import shape, MultiPolygon, Polygon
from shapely.ops import unary_union
import json
# Overpass query for current Vienna districts
# Get all admin_level=9 relations that are part of Vienna (relation 109166)
# Or get all admin_level=9 within the Vienna boundary
overpass_url = "https://overpass-api.de/api/interpreter"
# Current districts
query_current = """
[out:json][timeout:120];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9]["name"];
out body geom;
"""
# Historical districts (2014-01-01)
query_2014 = """
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9]["name"];
out body geom;
"""
```
Wait, I need to be careful about the Overpass date syntax. Let me check.
The date parameter in Overpass goes after `[out:json]`: `[out:json][date:"2014-01-01T00:00:00Z"]`
But I need to make sure the area is also resolved at the historical date. Let me think...
Actually, the area(3600109166) refers to the current Vienna relation. For historical queries, I might need to find the relation as it existed in 2014.
Let me try a different approach - just query the districts by their relation IDs or by their proximity to Vienna.
Actually, the simplest approach might be to query all relations with admin_level=9 that have their center within Vienna's bounding box.
Let me try:
```
[out:json][timeout:120];
rel(47.0,16.1,48.4,16.7)[admin_level=9]["name"];
out body geom;
```
This would get all admin_level=9 relations in the greater Vienna area. Then I can filter for those that are actually Vienna districts.
Actually, the most robust approach is to query by the relation IDs of the known 23 districts. But I don't know all the IDs.
Let me try the area approach, which should work:
For current data:
```
[out:json][timeout:120];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9];
out body geom;
```
For historical data, I need the vienna boundary as it was in 2014 too. Let me use `is_in` or just query all admin_level=9 in a bounding box and filter.
Actually, let me just use a bounding box covering Vienna and filter by name or relationship.
Simpler approach:
1. Get the Vienna city boundary current and historical
2. Use it to filter districts
Let me try querying the districts directly by getting all admin_level=9 whose center is within Vienna.
Actually, the cleanest way: query Vienna's relation members (the districts are members with role 'subarea'), and also get their geometries.
Let me try:
```
[out:json][timeout:120];
rel(109166);
map_to_area;
nwr(area)[admin_level=9];
out body geom;
```
Hmm, this might work but `map_to_area` converts the relation to an area which can then be used.
Let me think about a cleaner approach.
For current: get all relations with admin_level=9 in Vienna
```
[out:json][timeout:120];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9];
out body geom;
```
For historical:
```
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9];
out body geom;
```
But the area reference 3600109166 might not exist in the historical database. Let me use a bbox approach instead.
Let me try a different method. I'll query for all admin_level=9 relations in the area around Vienna.
The bounding box of Vienna is roughly: 48.1, 16.2, 48.35, 16.6
Let me write the query:
```
[out:json][timeout:120];
(
rel(48.1,16.2,48.35,16.6)[admin_level=9];
);
out body geom;
```
And for historical:
```
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
(
rel(48.1,16.2,48.35,16.6)[admin_level=9];
);
out body geom;
```
Then I'll filter by name to only include Vienna districts (those ending with "Wien" or containing "Wien" in the name, or having certain known prefixes like 1., 2., etc.)
Actually, looking at the OSM data, Vienna districts have names like "1. Bezirk, Innere Stadt" or "Wien" in their admin names. Let me try getting all and filtering.
Let me just write the script and handle errors gracefully. I'll use multiple Overpass endpoints as fallbacks.
Let me write the complete script now:
```python
import requests
import time
import json
import geopandas as gpd
import pandas as pd
from shapely.geometry import shape, MultiPolygon, Polygon, box
from shapely.ops import unary_union
from shapely.validation import make_valid
def query_overpass(query, endpoints, max_retries=3):
for endpoint in endpoints:
for attempt in range(max_retries):
try:
r = requests.post(endpoint, data={'data': query}, timeout=180)
if r.status_code == 200:
return r.json()
time.sleep(2)
except Exception as e:
time.sleep(2)
raise Exception("All endpoints failed")
endpoints = [
"https://overpass-api.de/api/interpreter",
"https://overpass.kumi.systems/api/interpreter",
"https://lz4.overpass-api.de/api/interpreter"
]
# Current districts
query_current = """
[out:json][timeout:180];
(
rel(48.1,16.15,48.35,16.65)[admin_level=9];
);
out body geom;
"""
# Historical districts (2014-01-01)
query_2014 = """
[out:json][timeout:180][date:"2014-01-01T00:00:00Z"];
(
rel(48.1,16.15,48.35,16.65)[admin_level=9];
);
out body geom;
"""
print("Fetching current districts...")
data_current = query_overpass(query_current, endpoints)
print("Fetching 2014 districts...")
data_2014 = query_overpass(query_2014, endpoints)
```
Wait, the date syntax needs to be correct. In Overpass, the date parameter goes like this:
`[out:json][date:"2014-01-01T00:00:00Z"]`
Let me also think about how to parse the relations. A relation in OSM can have `members` which are ways with roles, and for multipolygons we need to assemble the geometry from the ways.
When using `out body geom;`, Overpass returns the geometry of each element. For relations, `geom` gives the geometry of each member way, which we then need to assemble.
Actually, `out geom` on a relation in Overpass might not give the full multipolygon geometry assembled. Let me use `out body;` and then `convert` to get the full geometry, or use `out body geom` which returns the member geometries and assemble them.
Actually, the simplest approach for Overpass is to use `out body geom;` which will include the geometry of ways that are members of the relation. Then I need to assemble the multipolygon from the ways.
Alternatively, I can use Overpass's `make` or `convert` to create geometry from relations. Let me use a different approach:
```
[out:json][timeout:180];
rel(48.1,16.15,48.35,16.65)[admin_level=9];
out body;
>;
out skel qt;
```
This gives all nodes and ways. Then I'd assemble manually. But this is complex for multipolygons.
Actually, OSMnx can help with this! Let me use osmnx to get the geometries.
Wait, osmnx's `geometries_from_place` or `geometries_from_bbox` with the right tags could work.
Let me use osmnx:
```python
import osmnx as ox
# Current districts
gdf_current = ox.geometries_from_bbox(48.35, 48.1, 16.65, 16.15,
tags={'admin_level': '9'})
# For 2014, we'd need to query Overpass with date parameter...
```
Hmm, osmnx doesn't easily support historical queries. Let me stick with raw Overpass.
Let me try a different approach for the Overpass query. I'll use `out body geom;` on relations and then reconstruct the multipolygons.
Actually, the cleanest way is:
```
[out:json][timeout:180];
area(3600109166)->.vienna;
rel(area.vienna)[admin_level=9];
out body;
>;
out skel qt;
```
Wait, I keep going in circles. Let me just use the approach where I:
1. Get all relations with admin_level=9 in the bbox
2. For each relation, get its member ways
3. Assemble the polygon from the way coordinates
Let me use the simpler Overpass `geom` output which gives coordinates for each element.
For the relations query with `out geom`, Overpass actually returns the aggregated geometry of the relation. Let me test this.
Actually, looking at the Overpass documentation more carefully:
`out geom` on a relation returns the geometry of each element in the result set. For relations, this means the member geometries are included, but not assembled into a multipolygon.
But there's a special syntax: `out geom;` returns geometry for each element. For relations with type=multipolygon, each member way's geometry is included, and we need to assemble them.
A cleaner approach: use the `map_to_area` function and then query for areas, or use the `is_in` approach.
Let me try yet another approach:
```
[out:json][timeout:180];
rel(48.1,16.15,48.35,16.65)[admin_level=9]["boundary"="administrative"];
out body;
>;
out skel qt;
```
This returns all nodes and ways that make up the relations. Then I assemble the polygons from the ways.
Let me just write the full script now. I'll use `osmnx` for the current data (since it handles multipolygon assembly), and Overpass for historical.
Actually, let me simplify. I'll use Overpass directly with the `convert` statement to create point geometries for each relation, then use the full detail.
OK, let me just go with a practical approach:
1. Use Overpass queries to get both current and historical data
2. Parse the responses to build GeoDataFrames
3. Compute differences
For assembling relation geometries from Overpass, I'll:
- Get relations with `out body geom;`
- Iterate through members of each relation
- For each way member, get its geometry (from the `geom` field which is included when using `out geom`)
- Assemble the polygon from the rings (outer/inner by role)
Let me write the full script. I'll handle the complexity step by step.
Actually, let me try a much simpler approach. Let me use Overpass's `make` or `convert` to convert relations to areas, or use the `area` query type.
Hmm, actually there's `out geom` which for relations returns a geometry collection. Let me test with `out center` first to get the center points, and then query each relation individually.
No, this is getting too complicated. Let me just try the query and see what we get back:
```
[out:json][timeout:180];
rel(48.1,16.15,48.35,16.65)[admin_level=9];
out body geom;
```
This should return relations with their member ways and the geometry of each way (since `geom` on the relation level gives member geometry). Then I assemble.
Let me write the script and test it.
Actually, I just realized there's a much simpler way. Let me use the Overpass Turbo approach where you can get the geometry of each relation as a multipolygon by using the `convert` statement:
```
[out:json][timeout:180];
rel(48.1,16.15,48.35,16.65)[admin_level=9];
convert rel ::geom=geom(), ::id=id(), name=::t("name"), admin_level=::t("admin_level");
out geom;
```
Wait, `geom()` in convert gives the geometry of the relation (assembled). Let me try this.
Actually, I'm not sure if `geom()` assembles multipolygons correctly for relations. Let me try a different tactic.
I'll use the overpass API to get ways that form administrative boundaries for admin_level=9, then use those to build district polygons.
Actually, you know what, let me just fetch the data using `out body geom;` on relations, parse the JSON, and assemble the multipolygons manually from the member ways. It's the most reliable approach.
Here's my plan for the script:
1. Query current relations with `out body geom;`
2. Query 2014 relations with `out body geom;` and date parameter
3. For each relation, assemble the polygon from member ways
4. Create GeoDataFrames
5. Normalize names
6. Compute geometry differences
7. Classify and save
Let me write the script now. I'll handle the Overpass response parsing carefully.
For the Overpass `out body geom` on relations, the response looks like:
```json
{
"elements": [
{
"type": "relation",
"id": 12345,
"tags": {"name": "...", "admin_level": "9"},
"members": [
{"type": "way", "ref": 67890, "role": "outer", "geometry": [{"lat": ..., "lon": ...}, ...]},
{"type": "way", "ref": 67891, "role": "inner", "geometry": [...]}
]
}
]
}
```
Each member has a `geometry` array when using `out geom`. So I can assemble the polygon from these.
Let me write the script now.
For 2014, the district names might be different. Let me think about what they were:
- Currently: "1. Bezirk, Innere Stadt"
- In 2014: Might be "01., Innere Stadt" or "Wien 1., Innere Stadt" or "1. Bezirk Innere Stadt"
Common patterns:
- Some use "Wien" prefix: "Wien 1., Innere Stadt"
- Some use just the number: "1., Innere Stadt"
- Some are all caps in older data
I'll need to normalize names. A good approach:
- Extract the district number (1-23) from the name
- Use a mapping from number to canonical name
- Or normalize by lowercasing, stripping whitespace, and matching
Let me use a mapping approach:
1. Extract the number (1-23) from the name
2. Map to a standard name like "1. Bezirk, Innere Stadt"
Actually, let me just use the district number as the key for matching, and use a standard name for the output.
Let me write the script. I'll also handle the case where not all 23 districts are found (some might be missing from one snapshot).
Let me also think about the "touches_changed" flag. For each district, I need to check if its boundary touches any area that was added or removed. This means: for each district polygon, check if it shares a boundary (touches) with any fragment that has change_type "added_since_2014" or "removed_since_2014".
Let me write the script now.
```python
import requests
import time
import json
import sys
from shapely.geometry import shape, Polygon, MultiPolygon, LineString, Point
from shapely.ops import unary_union, polygonize
from shapely.validation import make_valid
import geopandas as gpd
import pandas as pd
import re
# Overpass endpoints
ENDPOINTS = [
"https://overpass-api.de/api/interpreter",
"https://overpass.kumi.systems/api/interpreter",
"https://lz4.overpass-api.de/api/interpreter"
]
def query_overpass(query, max_retries=3):
for endpoint in ENDPOINTS:
for attempt in range(max_retries):
try:
r = requests.post(endpoint, data={'data': query}, timeout=300)
if r.status_code == 200:
return r.json()
msg = f"HTTP {r.status_code}"
print(f" {endpoint} attempt {attempt+1}: {msg}")
except Exception as e:
print(f" {endpoint} attempt {attempt+1}: {e}")
time.sleep(3)
raise RuntimeError("All Overpass endpoints failed")
def assemble_polygon_from_ways(members):
"""Assemble a (Multi)Polygon from relation member ways with geometry."""
outer_rings = []
inner_rings = []
for member in members:
if member.get('type') != 'way' or 'geometry' not in member:
continue
coords = [(g['lon'], g['lat']) for g in member['geometry']]
if len(coords) < 3:
continue
ring = coords
if member.get('role') == 'inner':
inner_rings.append(ring)
else:
outer_rings.append(ring)
if not outer_rings:
return None
# Build polygons
polygons = []
for outer in outer_rings:
if len(outer) < 3:
continue
try:
poly = Polygon(outer)
# Cut inner rings
for inner in inner_rings:
if len(inner) < 3:
continue
try:
inner_poly = Polygon(inner)
if inner_poly.within(poly):
poly = poly.difference(inner_poly)
except:
pass
if poly.is_valid and not poly.is_empty:
polygons.append(poly)
except:
pass
if not polygons:
return None
if len(polygons) == 1:
return polygons[0]
return MultiPolygon(polygons)
def extract_district_number(name):
"""Extract district number from name string."""
# Try to find a number like "1.", "01.", "1" at start or after "Wien"
m = re.search(r'(\d+)\.?\s*(?:Bezirk)?', name)
if m:
return int(m.group(1))
# Try just a number at start
m = re.match(r'(\d{1,2})\s*[,\.]', name)
if m:
return int(m.group(1))
return None
def normalize_district_name(name):
"""Normalize district name to a standard form."""
# Known Vienna districts
districts = {
1: "1. Bezirk, Innere Stadt",
2: "2. Bezirk, Leopoldstadt",
3: "3. Bezirk, Landstraße",
4: "4. Bezirk, Wieden",
5: "5. Bezirk, Margareten",
6: "6. Bezirk, Mariahilf",
7: "7. Bezirk, Neubau",
8: "8. Bezirk, Josefstadt",
9: "9. Bezirk, Alsergrund",
10: "10. Bezirk, Favoriten",
11: "11. Bezirk, Simmering",
12: "12. Bezirk, Meidling",
13: "13. Bezirk, Hietzing",
14: "14. Bezirk, Penzing",
15: "15. Bezirk, Rudolfsheim-Fünfhaus",
16: "16. Bezirk, Ottakring",
17: "17. Bezirk, Hernals",
18: "18. Bezirk, Währing",
19: "19. Bezirk, Döbling",
20: "20. Bezirk, Brigittenau",
21: "21. Bezirk, Floridsdorf",
22: "22. Bezirk, Donaustadt",
23: "23. Bezirk, Liesing",
}
num = extract_district_number(name)
if num and num in districts:
return districts[num]
return name
def parse_districts(data):
"""Parse Overpass response into GeoDataFrame of districts."""
features = []
for element in data.get('elements', []):
if element.get('type') != 'relation':
continue
tags = element.get('tags', {})
name = tags.get('name', '')
if not name:
continue
geom = assemble_polygon_from_ways(element.get('members', []))
if geom is None:
continue
norm_name = normalize_district_name(name)
features.append({
'geometry': make_valid(geom),
'name': name,
'normalized_name': norm_name,
})
gdf = gpd.GeoDataFrame(features, crs='EPSG:4326')
return gdf
print("=== Fetching current Vienna districts ===")
query_current = """
[out:json][timeout:300];
(
rel(48.1,16.15,48.35,16.65)[admin_level=9]["boundary"="administrative"];
);
out body geom;
"""
data_current = query_overpass(query_current)
print("=== Fetching 2014-01-01 Vienna districts ===")
query_2014 = """
[out:json][timeout:300][date:"2014-01-01T00:00:00Z"];
(
rel(48.1,16.15,48.35,16.65)[admin_level=9]["boundary"="administrative"];
);
out body geom;
"""
data_2014 = query_overpass(query_2014)
print("Parsing district geometries...")
gdf_current = parse_districts(data_current)
gdf_2014 = parse_districts(data_2014)
print(f"Current districts: {len(gdf_current)}")
print(f"2014 districts: {len(gdf_2014)}")
print(gdf_current[['name', 'normalized_name']])
print(gdf_2014[['name', 'normalized_name']])
```
This should give me the districts. Then I need to compute the differences.
For computing differences:
1. Union all current district geometries
2. Union all 2014 district geometries
3. Difference current - 2014 = added
4. Difference 2014 - current = removed
5. Intersection = unchanged
Then for each fragment, determine which district(s) it belongs to.
For "touches_changed": for each district, check if its boundary touches any added or removed fragment.
Let me refine the approach:
Actually, computing symmetric difference on the union of all districts will give me the boundary changes, but I need to classify each fragment by which district it belongs to.
Better approach:
1. For each district (matched by normalized name), compute the geometry difference
2. Collect all fragments classified by change_type and district association
Let me think about this differently. The "boundary changes" are really about the district boundaries. So I should:
1. For each matched district pair (current vs 2014), compute:
- Added: current_geom.difference(2014_geom) # area in current but not 2014
- Removed: 2014_geom.difference(current_geom) # area in 2014 but not current
- Unchanged: current_geom.intersection(2014_geom) # area in both
2. Each fragment gets:
- change_type: added_since_2014 / removed_since_2014 / unchanged
- district_name: normalized name of the district
- touches_changed: whether this fragment's geometry touches any added/removed fragment
Wait, "for each district flag whether its geometry borders any area that was added or removed". So touches_changed is per-district, not per-fragment.
Actually re-reading: "Classify each resulting geometry fragment by how it changed... For each district flag whether its geometry borders any area that was added or removed."
So each feature in the output is a geometry fragment, with:
- change_type: added_since_2014, removed_since_2014, or unchanged
- district_name: the normalised district name
- touches_changed: boolean - does this district's overall geometry border any added/removed area?
Hmm, but "touches_changed" is per district. So all fragments from the same district would have the same touches_changed value.
Let me re-read: "For each district flag whether its geometry borders any area that was added or removed."
I think this means: for each district, check if its boundary (the current boundary for current districts, or the 2014 boundary for 2014 districts?) touches any area that was added or removed. If yes, touches_changed=True for all fragments of that district.
Actually, I think "each district" refers to the current state. So for each current district, check if its boundary touches any added or removed area. If so, all fragments of that district get touches_changed=True.
Let me implement this step by step.
Let me also think about what happens when districts don't match between the two snapshots. The normalised names should handle this.
Let me write the complete script:
```python
import requests
import time
import json
import re
import sys
from shapely.geometry import shape, Polygon, MultiPolygon, GeometryCollection
from shapely.ops import unary_union
from shapely.validation import make_valid
import geopandas as gpd
import pandas as pd
# ... [query functions and parsing] ...
# Match districts by normalized name
merged = gdf_current.merge(gdf_2014, on='normalized_name', how='outer', suffixes=('_cur', '_2014'))
fragments = []
for _, row in merged.iterrows():
norm_name = row['normalized_name']
cur_geom = row.get('geometry_cur')
his_geom = row.get('geometry_2014')
if cur_geom is not None and his_geom is not None:
# Both exist - compute differences
added = cur_geom.difference(his_geom) if not cur_geom.equals(his_geom) else None
removed = his_geom.difference(cur_geom) if not cur_geom.equals(his_geom) else None
unchanged = cur_geom.intersection(his_geom)
for geom_part, ctype in [(added, 'added_since_2014'), (removed, 'removed_since_2014'), (unchanged, 'unchanged')]:
if geom_part is not None and not geom_part.is_empty:
if geom_part.geom_type == 'GeometryCollection':
for g in geom_part.geoms:
if g.geom_type in ('Polygon', 'MultiPolygon') and not g.is_empty:
fragments.append({'geometry': g, 'change_type': ctype, 'district_name': norm_name})
elif geom_part.geom_type in ('Polygon', 'MultiPolygon'):
fragments.append({'geometry': geom_part, 'change_type': ctype, 'district_name': norm_name})
else:
# Try to extract polygons
pass
elif cur_geom is not None:
# Only in current
fragments.append({'geometry': cur_geom, 'change_type': 'added_since_2014', 'district_name': norm_name})
elif his_geom is not None:
# Only in 2014
fragments.append({'geometry': his_geom, 'change_type': 'removed_since_2014', 'district_name': norm_name})
```
Wait, but the "unchanged" might overlap between districts (at boundaries). Let me think about this...
For the unchanged part, it's the intersection of current and 2014 for the same district. This is fine.
For the touches_changed flag, I need to check for each district if its geometry touches any added or removed area. I'll create a dissolved geometry of all added and removed areas, then check each district's current geometry against that.
Let me finalize the script. I'll be careful about geometry types and validity.
Here's the refined approach:
1. Match districts by normalized name
2. For each matched district:
a. Compute added (current - historical), removed (historical - current), unchanged (intersection)
b. Add each to fragments list
3. For unmatched districts (only in one snapshot):
a. If only current: all of it is added
b. If only historical: all of it is removed
4. Compute touches_changed per district
For touches_changed:
- Collect all added and removed geometries into a single dissolved geometry
- For each district (current geometry), check if it touches any of the changed geometries
Let me write the final script now.
One more thing - I need to handle the `out geom` response properly. Let me also check what the response looks like by first running a quick test query.
Actually, let me just write the full script and run it. If there are issues, I'll debug.
Also, for 2014, the districts might have been stored differently - possibly as ways directly rather than relations. Let me also query for ways with admin_level=9 in 2014, just in case.
Actually, let me just try it with relations first. In 2014, Vienna had the same 23 districts, and they were likely stored as relations.
Hmm, but what if in 2014, some districts didn't have the `boundary=administrative` tag? Let me remove that filter from the 2014 query.
Let me finalize the script and run it.
1:48
1:48
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
r = requests.get('https://overpass-api.de/api/interpreter', timeout=10)
print('Overpass reachable:', r.status_code)
"
Overpass reachable: 406
2:01
The Overpass API is reachable (406 is expected for a GET without a query). Let me now write the solution. Plan: 1. Query Overpass for Vienna's current district boundaries (Bezirke). In OSM, Vienna's districts are relations with `admin_level=9` or `10` and `boundary=administrative`, within the city of Vienna (`relation` for Vienna city is `boundary=administrative` `name=Wien`). 2. Query for 2014-01-01 snapshot using Overpass's date-based retrieval. 3. Normalise district names (handle casing and spelling differences). 4. Compute symmetric difference to get changed areas. 5. Classify into `added_since_2014`, `removed_since_2014`, `unchanged`. 6. Determine for each district whether its geometry borders any changed area. 7. Output as GeoJSON. Let me first find the correct OSM relation IDs and query structure. Vienna's districts: In OSM, Vienna city is relation 109166. The districts are administered as sub-areas. Let me look for the right approach. Actually, I can query for all admin_level=10 relations within Vienna (or admin_level=9 - some places use level 9 for districts). Let me first explore. Let me write a comprehensive script. For the 2014 snapshot, Overpass supports `[date:"2014-01-01T00:00:00Z"]` to retrieve data as it was on that date. Let me write step by step.
2:01
Write
/work/explore.py
content · 21 lines · py
"""
Explore how Vienna's districts are tagged in OSM to craft proper queries.
"""
import requests
import json
# First find Vienna city relation
query1 = """
[out:json];
area[name="Wien"][admin_level=4][boundary=administrative];
(rel(area)[admin_level=9][boundary=administrative];
rel(area)[admin_level=10][boundary=administrative];);
out tags;
"""
r = requests.post("https://overpass-api.de/api/interpreter", data={"data": query1}, timeout=60)
data = r.json()
print(f"Found {len(data.get('elements', []))} district relations")
for el in data.get("elements", []):
tags = el.get("tags", {})
print(f" id={el['id']} admin_level={tags.get('admin_level')} name='{tags.get('name')}' ref='{tags.get('ref')}'")
Wrote /work/explore.py (713 chars).
2:06
Let me run this to see what districts exist.
2:06
Bash
cd /work && python3 explore.py
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
return complexjson.loads(self.text, **kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/work/explore.py", line 17, in <module>
data = r.json()
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2:20
The Overpass API is returning a 406, which means the request wasn't acceptable. Let me check what's happening and try a different approach - maybe using GET instead of POST, or a different query format.
2:20
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
r = requests.get('https://overpass-api.de/api/interpreter?data=[out:json];area[name=\"Wien\"][admin_level=4];out%20tags;',
timeout=60, headers={'Accept': 'application/json'})
print('Status:', r.status_code)
print('Body[:500]:', r.text[:500])
"
Status: 406 Body[:500]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
2:30
The Overpass API is returning 406 errors with GET requests. Let me try POST requests which are the standard way to query Overpass.
2:30
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
query = '[out:json];area[name=\"Wien\"][admin_level=4];out tags;'
r = requests.post('https://overpass-api.de/api/interpreter',
data={'data': query},
headers={'Content-Type': 'application/x-www-form-urlencoded'},
timeout=60)
print('Status:', r.status_code)
print('Body[:800]:', r.text[:800])
"
Status: 406 Body[:800]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
2:38
The Overpass API is returning 406 Not Acceptable. Let me try with a different content type or try the POST method differently. Let me also try a different endpoint.
2:38
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
# Try the Overpass Turbo-style form POST
query = '[out:json];area[name=\"Wien\"][admin_level=4];out tags;'
r = requests.post('https://overpass-api.de/api/interpreter',
data=query,
headers={'Content-Type': 'text/plain'},
timeout=60)
print('Status:', r.status_code)
print('Body[:800]:', r.text[:800])
"
Status: 406 Body[:800]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
2:51
The Overpass API is returning 406 - Not Acceptable. Let me try using the `requests` library with a proper Accept header, or try the Overpass Turbo API. Actually, the queries I was sending might be too simple or the POST body format might be wrong. The standard Overpass API endpoint expects `data` as the POST body parameter. Let me double-check. Actually, I think the issue is that the server needs the `Accept` header to be set to `application/json`. Let me try several approaches.
2:51
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
# Try proper POST with form data
query = '[out:json];area[name=\"Wien\"][admin_level=4];out tags;'
r = requests.post('https://overpass-api.de/api/interpreter',
data={'data': query},
headers={'Accept': 'application/json'},
timeout=60)
print('Status:', r.status_code)
print('Body[:500]:', r.text[:500])
# Try the kumi mirror
r2 = requests.post('https://overpass.kumi.systems/api/interpreter',
data={'data': query},
headers={'Accept': 'application/json'},
timeout=60)
print('Status kumi:', r2.status_code)
print('Body kumi[:500]:', r2.text[:500])
"
Status: 406 Body[:500]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 534, in _make_request
response = conn.getresponse()
File "/usr/local/lib/python3.14/site-packages/urllib3/connection.py", line 571, in getresponse
httplib_response = super().getresponse()
File "/usr/local/lib/python3.14/http/client.py", line 1459, in getresponse
response.begin()
~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 336, in begin
version, status, reason = self._read_status()
~~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 297, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/socket.py", line 729, in readinto
return self._sock.recv_into(b)
~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/ssl.py", line 1304, in recv_into
return self.read(nbytes, buffer)
~~~~~~~~~^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/ssl.py", line 1138, in read
return self._sslobj.read(len, buffer)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
TimeoutError: The read operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 696, in send
resp = conn.urlopen(
method=request.method,
...<9 lines>...
chunked=chunked,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 842, in urlopen
retries = retries.increment(
method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
)
File "/usr/local/lib/python3.14/site-packages/urllib3/util/retry.py", line 498, in increment
raise reraise(type(error), error, _stacktrace)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/util/util.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 788, in urlopen
response = self._make_request(
conn,
...<10 lines>...
**response_kw,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 536, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 367, in _raise_timeout
raise ReadTimeoutError(
self, url, f"Read timed out. (read timeout={timeout_value})"
) from err
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=60)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 14, in <module>
r2 = requests.post('https://overpass.kumi.systems/api/interpreter',
data={'data': query},
headers={'Accept': 'application/json'},
timeout=60)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 134, in post
return request("post", url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 71, in request
return session.request(method=method, url=url, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 651, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 784, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 742, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=60)
4:01
The overpass-api.de is returning 406 because of the Accept header. Let me try without the Accept header, or try a different approach. The 406 is likely because the response content type negotiation is failing with the default Accept header that requests sends. Let me also try the lz4 mirror.
4:01
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
query = '[out:json];area[name=\"Wien\"][admin_level=4];out tags;'
# Try without custom headers, just plain POST with data param
r = requests.post('https://overpass-api.de/api/interpreter',
data={'data': query},
timeout=60)
print('Status 1:', r.status_code)
print('Body[:300]:', r.text[:300])
# Try lz4 mirror
r2 = requests.post('https://lz4.overpass-api.de/api/interpreter',
data={'data': query},
timeout=60)
print('Status 2:', r2.status_code)
print('Body[:300]:', r2.text[:300])
# Try with text/plain body
r3 = requests.post('https://overpass-api.de/api/interpreter',
data=query,
timeout=60)
print('Status 3:', r3.status_code)
print('Body[:300]:', r3.text[:300])
"
Status 1: 406 Body[:300]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6 Status 2: 406 Body[:300]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6 Status 3: 406 Body[:300]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6
4:10
The overpass API is returning 406 for all requests. Let me try using osmnx which has built-in Overpass handling - it uses the correct approach. Actually, maybe the issue is that these IPs are being blocked. Let me try osmnx's `features_from_place` which handles the API interaction internally. Alternatively, I could use osmium to download a local OSM extract of Vienna. But let me first try osmnx which handles the overpass API properly.
4:10
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print(ox.__version__)
# Try to get Vienna districts
try:
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': ['9','10']})
print(gdf.columns.tolist())
print(gdf[['name','admin_level']].head(10))
except Exception as e:
print(f'Error: {e}')
" 2>&1 | head -50
2.1.0
['geometry', 'ISO3166-1:alpha2', 'admin_level', 'capital', 'description', 'name', 'name:ab', 'name:ace', 'name:af', 'name:am', 'name:an', 'name:ar', 'name:arc', 'name:arz', 'name:as', 'name:ast', 'name:av', 'name:az', 'name:ba', 'name:bar', 'name:bat-smg', 'name:be', 'name:be-tarask', 'name:bg', 'name:bi', 'name:bn', 'name:bo', 'name:br', 'name:bs', 'name:ca', 'name:ce', 'name:ceb', 'name:ch', 'name:ckb', 'name:co', 'name:crh', 'name:cs', 'name:cu', 'name:cv', 'name:cy', 'name:de', 'name:diq', 'name:ee', 'name:el', 'name:en', 'name:eo', 'name:es', 'name:et', 'name:eu', 'name:ext', 'name:fa', 'name:ff', 'name:fi', 'name:fr', 'name:frp', 'name:fur', 'name:fy', 'name:ga', 'name:gag', 'name:gd', 'name:gl', 'name:gn', 'name:gv', 'name:ha', 'name:he', 'name:hi', 'name:hif', 'name:hr', 'name:ht', 'name:hu', 'name:hy', 'name:ia', 'name:id', 'name:ie', 'name:ilo', 'name:is', 'name:it', 'name:ja', 'name:jv', 'name:ka', 'name:kab', 'name:kk', 'name:kn', 'name:ko', 'name:koi', 'name:krc', 'name:ku', 'name:kv', 'name:ky', 'name:la', 'name:lez', 'name:li', 'name:lij', 'name:lmo', 'name:ln', 'name:lt', 'name:lv', 'name:mg', 'name:mhr', 'name:mi', 'name:mk', 'name:ml', 'name:mn', 'name:mr', 'name:mrj', 'name:ms', 'name:mt', 'name:my', 'name:nah', 'name:nds', 'name:nds-nl', 'name:ne', 'name:nl', 'name:no', 'name:nv', 'name:oc', 'name:or', 'name:os', 'name:pa', 'name:pap', 'name:pih', 'name:pl', 'name:pms', 'name:pnb', 'name:ps', 'name:pt', 'name:rm', 'name:rmy', 'name:ro', 'name:roa-tara', 'name:ru', 'name:rue', 'name:sah', 'name:sc', 'name:scn', 'name:sco', 'name:sd', 'name:sh', 'name:si', 'name:sk', 'name:sl', 'name:sn', 'name:so', 'name:sq', 'name:sr', 'name:st', 'name:sv', 'name:sw', 'name:szl', 'name:ta', 'name:te', 'name:tg', 'name:th', 'name:tk', 'name:tl', 'name:tr', 'name:tt', 'name:tw', 'name:tzl', 'name:udm', 'name:ug', 'name:uk', 'name:ur', 'name:uz', 'name:vec', 'name:vep', 'name:vi', 'name:vro', 'name:wa', 'name:war', 'name:wo', 'name:wuu', 'name:yi', 'name:yo', 'name:yue', 'name:zh', 'name:zh-Hans', 'name:zh-Hant', 'population', 'population:date', 'source:population', 'website', 'wikidata', 'wikipedia', 'ISO3166-1', 'alt_name:eo', 'currency', 'flag', 'name:ak', 'name:chr', 'name:da', 'name:dz', 'name:fo', 'name:gu', 'name:haw', 'name:hsb', 'name:io', 'name:jbo', 'name:kl', 'name:ksh', 'name:kw', 'name:lb', 'name:lo', 'name:nn', 'name:se', 'name:smn', 'name:sms', 'name:tok', 'name:vo', 'name:zu', 'official_name', 'official_name:be', 'official_name:ckb', 'official_name:cs', 'official_name:el', 'official_name:eo', 'official_name:hu', 'source', 'boundary', 'image', 'ref', 'alt_name', 'width', 'wikimedia_commons', 'note', 'long_name', 'addr:city', 'addr:country', 'addr:housenumber', 'addr:postcode', 'addr:street', 'government', 'office', 'wheelchair', 'opening_hours', 'highway', 'maxspeed', 'sidewalk', 'source:maxspeed', 'lane_markings', 'lit', 'surface', 'oneway', 'oneway:bicycle', 'cycleway:left', 'cycleway:right', 'lanes', 'sidewalk:left', 'sidewalk:right', 'destination', 'destination:symbol', 'lanes:bus', 'sidewalk:left:surface', 'smoothness', 'turn:lanes', 'cycleway', 'cycleway:lane', 'parking:both', 'parking:left:orientation', 'parking:right:orientation', 'cycleway:both', 'parking:both:orientation', 'sidewalk:both', 'maxspeed:type', 'sidewalk:right:surface', 'cycleway:both:lane', 'network', 'embedded_rails', 'sidewalk:both:surface', 'est_width:carriageway', 'parking:left', 'parking:right', 'source:est_width:carriageway', 'name:etymology:wikidata', 'bridge', 'lanes:backward', 'lanes:forward', 'layer', 'turn:lanes:backward', 'turn:lanes:forward', 'lcn', 'cycleway:left:lane', 'name:etymology', 'name:start_date', 'maxweight:signed', 'bicycle', 'cycleway:left:oneway', 'loc_name', 'motor_vehicle:conditional', 'old_ref', 'traffic_sign', 'electrified', 'frequency', 'gauge', 'operator', 'operator:wikidata', 'railway', 'railway:preferred_direction', 'voltage', 'oneway:psv', 'odbl', 'boundary_type', 'cycleway:right:lane', 'shoulder', 'check_date', 'oneway:bus', 'source:width', 'amenity', 'building', 'building:levels', 'check_date:cycleway', 'check_date:smoothness', 'name:wikipedia', 'maxheight', 'level', 'destination:backward', 'destination:ref:backward', 'destination:forward', 'destination:ref:forward', 'parking:right:restriction', 'type', 'ISO3166-1:alpha3', 'ISO3166-1:numeric', 'TMC:cid_58:tabcd_1:Class', 'TMC:cid_58:tabcd_1:LCLversion', 'TMC:cid_58:tabcd_1:LocationCode', 'default_language', 'driving_side', 'int_name', 'land_area', 'name:als', 'name:ang', 'name:ay', 'name:bcl', 'name:bh', 'name:bpy', 'name:bug', 'name:bxr', 'name:cbk-zam', 'name:cdo', 'name:chm', 'name:chy', 'name:csb', 'name:din', 'name:dsb', 'name:dv', 'name:frr', 'name:hak', 'name:ig', 'name:kaa', 'name:kbd', 'name:kg', 'name:lad', 'name:lld', 'name:ltg', 'name:lzh', 'name:mdf', 'name:mwl', 'name:na', 'name:nan', 'name:new', 'name:nov', 'name:nrm', 'name:om', 'name:pag', 'name:pam', 'name:pcd', 'name:pdc', 'name:pfl', 'name:pi', 'name:pnt', 'name:prefix:at', 'name:qu', 'name:roa-rup', 'name:rw', 'name:sa', 'name:sm', 'name:ss', 'name:stq', 'name:su', 'name:tet', 'name:tpi', 'name:ts', 'name:vls', 'name:xal', 'name:xmf', 'name:za', 'name:zea', 'official_name:be-tarask', 'official_name:ca', 'official_name:mi', 'ref:at:gkz', 'timezone', 'ISO3166-2', 'alt_name:cs', 'coat_of_arms', 'ref:nuts', 'ref:nuts:2', 'ref:at:okz', 'admin_level:6', 'admin_level:8', 'short_name', 'admin_title', 'designation', 'official_status', 'ref:at:kgnr', 'street_names:wikipedia', 'old_name', 'wikipedia:ru', 'roof:levels', 'related:wikipedia']
name admin_level
element id
node 2518271394 Bezirksgesundheitsamt 9
2518271399 Mag Elf 9
2518271406 Bürgerdienst Kundenzentrum Simmering 9
relation 16239 Österreich 2
63619 Bezirk St. Pölten 6
77189 Niederösterreich 4
103662 Deutsch-Wagram 8
103894 Aderklaa 8
103999 Raasdorf 8
104072 Bezirk Gänserndorf 6
5:52
I see that osmnx returns features from all over Austria when querying for "Vienna, Austria". I need to filter specifically for Vienna districts. Let me look more carefully at what's available. The query returned results from all levels including admin_level=9 (which seems to be some smaller administrative unit) and admin_level=10. Let me try a more specific approach. Actually, let me use osmnx to get the geometry of Vienna city first, then use it as a bounding box. Or better yet, let me query specifically for the Vienna district relations. Vienna's districts are tagged as admin_level=10 in Austria (the city of Vienna is admin_level=5 or admin_level=4 depending on tagging, and districts inside are admin_level=9 or 10). Let me check the Overpass API via the OSMnx approach. Actually, OSMnx works! It uses Overpass under the hood. Let me use it properly.
5:52
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import geopandas as gpd
# Get Vienna boundary as a geometry filter
vienna = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '4'})
print('Vienna features:', len(vienna))
print(vienna[['name','admin_level', 'geometry']].head())
# Filter to just the Wien city boundary
wien = vienna[vienna['name'] == 'Wien']
print('Wien entries:', len(wien))
wien_geom = wien.unary_union
print('Vienna geometry type:', wien_geom.geom_type)
" 2>&1
Vienna features: 895
name ... geometry
element id ...
relation 16239 Österreich ... MULTIPOLYGON (((10.45439 47.55576, 10.45437 47...
63619 Bezirk St. Pölten ... POLYGON ((15.2636 47.99808, 15.26399 47.99822,...
77189 Niederösterreich ... POLYGON ((14.45642 48.04341, 14.45692 48.04397...
103662 Deutsch-Wagram ... POLYGON ((16.50226 48.29035, 16.50346 48.29319...
103894 Aderklaa ... POLYGON ((16.5067 48.2736, 16.50667 48.27361, ...
[5 rows x 3 columns]
Wien entries: 1
<string>:13: DeprecationWarning: The 'unary_union' attribute is deprecated, use the 'union_all()' method instead.
Vienna geometry type: Polygon
6:15
Now I can get the districts. Let me query for Vienna districts (admin_level=10 in Austria or admin_level=9). Actually, in Vienna proper, the districts (Bezirke) are typically tagged with admin_level=10. Let me try getting them.
6:15
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import geopandas as gpd
# Get Vienna districts - admin_level=10 in Austria are the Bezirke
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '10'})
print('District count:', len(gdf))
cols = [c for c in gdf.columns if c in ['name','admin_level','ref','name:de','wikidata','boundary','geometry']]
print(gdf[cols].head(30))
# Also check what admin_level values exist
print()
print('admin_level values:', gdf['admin_level'].value_counts())
print('name values:')
for n in gdf['name']:
print(f' {n}')
" 2>&1
District count: 895
geometry ... ref
element id ...
relation 16239 MULTIPOLYGON (((10.45439 47.55576, 10.45437 47... ... NaN
63619 POLYGON ((15.2636 47.99808, 15.26399 47.99822,... ... PL
77189 POLYGON ((14.45642 48.04341, 14.45692 48.04397... ... NaN
103662 POLYGON ((16.50226 48.29035, 16.50346 48.29319... ... NaN
103894 POLYGON ((16.5067 48.2736, 16.50667 48.27361, ... ... NaN
103999 POLYGON ((16.54697 48.263, 16.54912 48.26776, ... ... NaN
104072 POLYGON ((16.50226 48.29035, 16.50346 48.29319... ... GF
104864 POLYGON ((16.53627 48.20044, 16.53653 48.20071... ... NaN
106250 POLYGON ((15.71468 48.39353, 15.71533 48.39373... ... TU
106740 MULTIPOLYGON (((15.98309 48.42725, 15.9834 48.... ... KO
109166 POLYGON ((16.18183 48.17112, 16.18724 48.1736,... ... NaN
111817 POLYGON ((16.43313 48.13788, 16.43622 48.13798... ... NaN
111818 POLYGON ((16.56484 48.10869, 16.56529 48.10963... ... NaN
111829 POLYGON ((16.41282 48.11824, 16.4124 48.1191, ... ... NaN
111838 POLYGON ((16.37535 48.09939, 16.37683 48.10248... ... NaN
111858 POLYGON ((16.04245 48.10984, 16.04196 48.10995... ... MD
111881 POLYGON ((16.35065 48.10496, 16.35062 48.10502... ... NaN
111887 POLYGON ((16.07967 48.13, 16.07962 48.13006, 1... ... NaN
111930 POLYGON ((16.14875 48.1554, 16.14891 48.15576,... ... NaN
111931 POLYGON ((16.13534 48.10559, 16.13499 48.1062,... ... NaN
111946 POLYGON ((16.22006 48.1041, 16.2205 48.10439, ... ... NaN
111948 POLYGON ((16.24931 48.1017, 16.24938 48.10177,... ... NaN
111956 POLYGON ((16.31221 48.11953, 16.31259 48.11963... ... NaN
112050 POLYGON ((16.37535 48.09939, 16.37683 48.10248... ... BL
112467 POLYGON ((16.11939 48.21432, 16.12029 48.21453... ... NaN
114764 POLYGON ((16.33432 48.32724, 16.33495 48.32774... ... NaN
114780 POLYGON ((16.36683 48.35527, 16.36774 48.35517... ... NaN
114940 POLYGON ((16.13251 48.25756, 16.13252 48.25749... ... NaN
114956 POLYGON ((16.20148 48.26782, 16.20155 48.26812... ... NaN
123913 POLYGON ((16.42136 48.32638, 16.42087 48.32638... ... NaN
[30 rows x 7 columns]
admin_level values: admin_level
9 412
10 325
4 107
8 36
6 12
2 1
Name: count, dtype: int64
name values:
Österreich
Bezirk St. Pölten
Niederösterreich
Deutsch-Wagram
Aderklaa
Raasdorf
Bezirk Gänserndorf
Groß-Enzersdorf
Bezirk Tulln
Bezirk Korneuburg
Wien
Schwechat
Fischamend
Lanzendorf
Leopoldsdorf
Bezirk Mödling
Hennersdorf
Breitenfurt bei Wien
Laab im Walde
Kaltenleutgeben
Perchtoldsdorf
Brunn am Gebirge
Vösendorf
Bezirk Bruck an der Leitha
Purkersdorf
Langenzersdorf
Hagenbrunn
Mauerbach
Klosterneuburg
Gerasdorf bei Wien
Katastralgemeinde Gerasdorf
Katastralgemeinde Mauer
Alsergrund
Hietzing
Innere Stadt
Josefstadt
Leopoldstadt
Mariahilf
Meidling
Neubau
Penzing
Rudolfsheim-Fünfhaus
Währing
Landstraße
Brigittenau
Donaustadt
Döbling
Favoriten
Floridsdorf
Hernals
Liesing
Margareten
Ottakring
Simmering
Wieden
Katastralgemeinde Inzersdorf Stadt
Katastralgemeinde Favoriten
Katastralgemeinde Oberlaa Stadt
Katastralgemeinde Rothneusiedl
Katastralgemeinde Oberlaa Land
Katastralgemeinde Unterlaa
Katastralgemeinde Inzersdorf
Katastralgemeinde Siebenhirten
Katastralgemeinde Erlaa
Katastralgemeinde Atzgersdorf
Wieden
Katastralgemeinde Margarethen
Katastralgemeinde Rodaun
Katastralgemeinde Liesing
Katastralgemeinde Kalksburg
Katastralgemeinde Altmannsdorf
Katastralgemeinde Innere Stadt
Katastralgemeinde Rosenberg
Katastralgemeinde Albern
Katastralgemeinde Hetzendorf
Katastralgemeinde Schönbrunn
Katastralgemeinde Meidling
Katastralgemeinde Gaudenzdorf
Katastralgemeinde Auhof
Katastralgemeinde Weidlingau
Katastralgemeinde Speising
Katastralgemeinde Lainz
Katastralgemeinde Hietzing
Katastralgemeinde Unter St. Veit
Katastralgemeinde Ober St. Veit
Katastralgemeinde Hacking
Katastralgemeinde Mariahilf
Katastralgemeinde Neubau
Katastralgemeinde Josefstadt
Katastralgemeinde Alsergrund
Katastralgemeinde Kaiserebersdorf
Katastralgemeinde Simmering
Katastralgemeinde Landstraße
Katastralgemeinde Leopoldstadt
Katastralgemeinde Brigittenau
Katastralgemeinde Kaiserebersdorf Herrschaft
Katastralgemeinde Landjägermeisteramt
Katastralgemeinde Eßling
Katastralgemeinde Süssenbrunn
Katastralgemeinde Breitenlee
Katastralgemeinde Aspern
Katastralgemeinde Hirschstetten
Katastralgemeinde Stadlau
Katastralgemeinde Kaisermühlen
Katastralgemeinde Kagran
Katastralgemeinde Floridsdorf
Katastralgemeinde Donaufeld
Katastralgemeinde Leopoldau
Katastralgemeinde Stammersdorf
Katastralgemeinde Großjedlersdorf I
Katastralgemeinde Großjedlersdorf II
Katastralgemeinde Strebersdorf
Katastralgemeinde Jedlesee
Katastralgemeinde Schwarze Lackenau
Katastralgemeinde Kahlenbergerdorf
Katastralgemeinde Josefsdorf
Katastralgemeinde Nußdorf
Katastralgemeinde Heiligenstadt
Katastralgemeinde Unterdöbling
Katastralgemeinde Grinzing
Katastralgemeinde Oberdöbling
Katastralgemeinde Untersievering
Katastralgemeinde Obersievering
Katastralgemeinde Salmannsdorf
Katastralgemeinde Neustift am Walde
Katastralgemeinde Pötzleinsdorf
Katastralgemeinde Gersthof
Katastralgemeinde Weinhaus
Katastralgemeinde Währing
Katastralgemeinde Neuwaldegg
Katastralgemeinde Dornbach
Katastralgemeinde Hernals
Katastralgemeinde Hadersdorf
Katastralgemeinde Ottakring
Katastralgemeinde Neulerchenfeld
Katastralgemeinde Sechshaus
Katastralgemeinde Rudolfsheim
Katastralgemeinde Fünfhaus
Katastralgemeinde Penzing
Katastralgemeinde Breitensee
Katastralgemeinde Unterbaumgarten
Katastralgemeinde Oberbaumgarten
Katastralgemeinde Hütteldorf
Katastralgemeinde Breitenfurt
Katastralgemeinde Hagenbrunn
Katastralgemeinde Schönau an der Donau
Katastralgemeinde Mühlleiten
Katastralgemeinde Großenzersdorf
Katastralgemeinde Stallingerfeld
Katastralgemeinde Raasdorf
Katastralgemeinde Leopoldsdorf
Katastralgemeinde Rustenfeld
Katastralgemeinde Oberlanzendorf
Katastralgemeinde Unterlanzendorf
Katastralgemeinde Kledering
Katastralgemeinde Rannersdorf
Katastralgemeinde Schwechat
Katastralgemeinde Mannswörth
Katastralgemeinde Fischamend Dorf
Katastralgemeinde Langenzersdorf
Katastralgemeinde Kapellerfeld
Katastralgemeinde Weidlingbach
Katastralgemeinde Weidling
Katastralgemeinde Klosterneuburg
Katastralgemeinde Aderklaa
Katastralgemeinde Mauerbach
Katastralgemeinde Laab im Walde
Katastralgemeinde Purkersdorf
Katastralgemeinde Kaltenleutgeben
Katastralgemeinde Perchtoldsdorf
Katastralgemeinde Vösendorf
Katastralgemeinde Brunn am Gebirge
Reichmanngasse
Fenzlgasse
Josef-Weinheber-Platz
Steinbruchstraße
Flötzersteig
Burggasse
Kleine Neugasse
Blechturmgasse
Rosa-Fischer-Gasse
Friedmanngasse
Lerchenfelder Straße
Lerchenfelder Straße
Donauturmstraße
nan
Rudolf-Waisenhorn-Gasse
Starkfriedgasse
Emil-Behring-Weg
Rudolf-Waisenhorn-Gasse
Gräßlplatz
Paragonstraße
Rudolf-Waisenhorn-Gasse
Drygalskiweg
Gregor-Mendel-Straße
Peter-Jordan-Straße
Peter-Jordan-Straße
Geblergasse
Klopstockgasse
Döblerhofstraße
Kreuzgassenbrücke
Rudolf-Waisenhorn-Gasse
Wattgasse
Büdingergasse
Peter-Jordan-Straße
Klimtgasse
Wittgensteinstraße
Kirchfeldgasse
nan
Peter-Jordan-Straße
Hermann-Pacher-Gasse
Haslingergasse
Kettenbrückengasse
Zanderbrücke
Mariahilfer Straße
Mariahilfer Straße
Ziegelofengasse
Rosa-Fischer-Gasse
nan
nan
Peter-Jordan-Straße
nan
nan
nan
nan
Sebaldgasse
Mariahilfer Straße
Mariahilfer Straße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Wienerbergstraße
Guglgasse
Haslingergasse
Mittersteig
nan
Donauturmstraße
Wundtgasse
Wittgensteinstraße
Atzgersdorfer Straße
Blumengasse
Starkfriedgasse
Mittersteig
nan
nan
nan
nan
nan
nan
nan
nan
nan
Donauturmstraße
Donauturmstraße
Schlechtastraße
Schlechtastraße
Rosa-Fischer-Gasse
Lerchenfelder Straße
Lerchenfelder Straße
Mariahilfer Straße
Hasenauerstraße
Lerchenfelder Straße
nan
Rudolf-Waisenhorn-Gasse
nan
nan
nan
Wienerbergstraße
Ottakringer Straße
nan
nan
Spiegelgrundstraße
nan
nan
nan
nan
nan
Babenbergerstraße
Prinz-Eugen-Straße
Veronikagasse
Speisinger Straße
Wattgasse
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Wienerbergstraße
nan
nan
nan
nan
Erdbergstraße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Wilhelminenstraße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Wundtgasse
nan
nan
nan
Wittgensteinstraße
Wittgensteinstraße
nan
nan
nan
nan
nan
Margaretenstraße
nan
Wichtelgasse
Veronikagasse
Leopold-Ernst-Gasse
nan
Hildebrandgasse
Khevenhüllerstraße
nan
nan
Sebaldgasse
nan
nan
nan
Oskar-Grissemann-Straße
Wundtgasse
Wundtgasse
Kettenbrücke
Atzgersdorfer Straße
nan
Wienerbergstraße
Wienerbergstraße
Lerchenfelder Straße
Flötzersteig
Rosenackerstraße
Flötzersteig
nan
nan
nan
Ottakringer Straße
nan
Mariahilfer Straße
Mariahilfer Straße
Mariahilfer Straße
Veronikagasse
Ottakringer Straße
nan
nan
nan
Kirchfeldgasse
Kirchfeldgasse
Mariahilfer Straße
Mariahilfer Straße
Paragonstraße
Paragonstraße
Mariahilfer Straße
nan
nan
nan
Wienerbergstraße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Kirchfeldgasse
nan
nan
nan
nan
nan
nan
Atzgersdorfer Straße
nan
nan
nan
nan
nan
nan
nan
Klimtgasse
nan
nan
nan
nan
nan
Karplusgasse
nan
nan
nan
Wienerbergstraße
Wienerbergstraße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Donauturmstraße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Drygalskiweg
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Hermann-Pacher-Gasse
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Mariahilfer Straße
Gregor-Mendel-Straße
Haslingergasse
Speisinger Straße
Speisinger Straße
nan
Ottakringer Straße
Peter-Jordan-Straße
Kleine Neugasse
Mittersteig
Blechturmgasse
Ottakringer Straße
Ottakringer Straße
Ottakringer Straße
Lerchenfelder Straße
Lerchenfelder Straße
Lerchenfelder Straße
Ottakringer Straße
nan
Donauturmstraße
nan
nan
Wittgensteinstraße
Wittgensteinstraße
Donauturmstraße
Peter-Jordan-Straße
Schlechtastraße
Atzgersdorfer Straße
Rudolf-Waisenhorn-Gasse
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Erdbergstraße
Erdbergstraße
Erdbergstraße
Erdbergstraße
nan
nan
nan
nan
Speisinger Straße
Paragonstraße
Erdbergstraße
Erdbergstraße
Emil-Behring-Weg
Döblerhofstraße
Drygalskiweg
Drygalskiweg
Erdbergstraße
Erdbergstraße
nan
Rudolf-Waisenhorn-Gasse
Rudolf-Waisenhorn-Gasse
nan
nan
nan
nan
Erdbergstraße
nan
Mariahilfer Straße
Flötzersteig
Ottakringer Straße
Ottakringer Straße
Atzgersdorfer Straße
Erdbergstraße
Erdbergstraße
nan
nan
Hermann-Pacher-Gasse
Hermann-Pacher-Gasse
Peter-Jordan-Straße
nan
nan
nan
nan
Ottakringer Straße
Ottakringer Straße
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
Ottakringer Straße
nan
nan
Peter-Jordan-Straße
Wattgasse
Wattgasse
Wattgasse
Speisinger Straße
Ottakringer Straße
Ottakringer Straße
Kirchfeldgasse
Kirchfeldgasse
Wundtgasse
Atzgersdorfer Straße
Atzgersdorfer Straße
Atzgersdorfer Straße
Atzgersdorfer Straße
nan
nan
nan
nan
nan
Wittgensteinstraße
Wittgensteinstraße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
Prinz-Eugen-Straße
6:44
The `features_from_place` is returning everything, not just the districts. I need to be more specific. Let me query directly for the Vienna districts (Bezirke). In OSM, Vienna's districts are relations with admin_level=10 that are within the Vienna boundary. Let me use a proper Overpass query via osmnx, or just use a direct Overpass query. But Overpass API is returning 406 errors... Let me try osmnx's internal approach. Actually, looking at the data output, I can see the district names: Alsergrund, Hietzing, Innere Stadt, Josefstadt, Leopoldstadt, Mariahilf, Meidling, Neubau, Penzing, Rudolfsheim-Fünfhaus, Währing, Landstraße, Brigittenau, Donaustadt, Döbling, Favoriten, Floridsdorf, Hernals, Liesing, Margareten, Ottakring, Simmering, Wieden. These are 23 districts. Let me filter to just the districts more precisely.
6:44
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import geopandas as gpd
# Get Vienna districts - filter by name to get just the 23 Bezirke
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '10'})
# Known Vienna district names (Bezirke)
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
districts = gdf[gdf['name'].isin(district_names)]
print('Districts found:', len(districts))
print(districts[['name','admin_level']])
print(districts.crs)
" 2>&1
Districts found: 24
name admin_level
element id
relation 1990590 Alsergrund 9
1990591 Hietzing 9
1990592 Innere Stadt 9
1990593 Josefstadt 9
1990594 Leopoldstadt 9
1990595 Mariahilf 9
1990596 Meidling 9
1990597 Neubau 9
1990598 Penzing 9
1990599 Rudolfsheim-Fünfhaus 9
1990600 Währing 9
1991416 Landstraße 9
1991433 Brigittenau 9
1991434 Donaustadt 9
1991435 Döbling 9
1991436 Favoriten 9
1991437 Floridsdorf 9
1991438 Hernals 9
1991439 Liesing 9
1991440 Margareten 9
1991441 Ottakring 9
1991442 Simmering 9
1991443 Wieden 9
5248033 Wieden 10
epsg:4326
6:57
So Vienna's districts are at admin_level=9, not 10. There's a duplicate Wieden (id 5248033 at level 10). Let me get just the admin_level=9 ones. Now I need to get the historical 2014-01-01 version. OSM has a feature called "attics" - we can query Overpass with a date parameter. Let me check if Overpass supports the `[date:"2014-01-01T00:00:00Z"]` syntax. Let me try querying Overpass directly with a proper approach using the Overpass API.
6:57
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
# Simple test query
query = '''
[out:json][timeout:30];
area[name=\"Wien\"][admin_level=4]->.wien;
rel(area.wien)[boundary=administrative][admin_level=9];
out geom;
'''
r = requests.get('https://overpass-api.de/api/interpreter',
params={'data': query},
timeout=60)
print('Status:', r.status_code)
if r.status_code == 200:
data = r.json()
print('Elements:', len(data.get('elements', [])))
for el in data['elements'][:5]:
print(f\" id={el['id']} tags={el.get('tags',{})}\")
" 2>&1
Status: 406
7:02
The Overpass API is returning 406 for GET requests with params. Let me try POST.
7:02
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
query = '[out:json];area[name=\"Wien\"][admin_level=4]->.wien;rel(area.wien)[boundary=administrative][admin_level=9];out geom;'
r = requests.post('https://overpass-api.de/api/interpreter',
data=query,
timeout=60,
headers={'Content-Type': 'application/octet-stream'})
print('Status:', r.status_code)
if r.status_code == 200:
data = r.json()
print('Elements:', len(data.get('elements', [])))
else:
print('Body:', r.text[:300])
" 2>&1
Status: 406 Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6
7:06
The Overpass API is returning 406 errors. This is likely because the server doesn't like the Accept headers from requests. Let me try using curl instead to debug.
7:06
Bash
cd /work && curl -s -X POST -d '[out:json];area[name="Wien"][admin_level=4]->.wi...
bash: line 1: curl: command not found
7:13
The Overpass API is returning 406 errors for all my requests. Let me try osmnx more directly - osmnx uses overpass internally and it works with `features_from_place`. Let me use osmnx to get the data since it has a working overpass interface. Actually, osmnx worked fine earlier when I called `features_from_place`. The issue was just that Overpass API wasn't accepting my requests directly. But osmnx works as a client. Let me use osmnx to fetch both the current and historical district data. For historical data from 2014-01-01, I need to use Overpass's date/time feature with `[date:"2014-01-01T00:00:00Z"]`. Let me try using osmnx with a custom endpoint or raw Overpass query. Actually, osmnx uses `overpass` library internally. Let me check how I can use the overpass date query with osmnx. Actually, I should use the `overpass` Python package or directly use the requests library properly. Let me check the overpass package availability.
7:13
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
# Check if osmnx can use a custom overpass URL with date
# osmnx uses overpass package internally
import overpass
print('overpass available')
# Check api.py for customizing
import inspect
print(inspect.getfile(ox))
" 2>&1
Traceback (most recent call last):
File "<string>", line 5, in <module>
import overpass
ModuleNotFoundError: No module named 'overpass'
7:19
Let me look at how osmnx makes its Overpass API calls - it uses requests internally. The issue might be with the Accept header. Let me check what osmnx does and try to replicate the requests it makes.
7:19
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect # Find the overpass request function src = inspect.getsource(ox._overpass_from_place) print(src[:2000]) " 2>&1
Traceback (most recent call last):
File "<string>", line 6, in <module>
src = inspect.getsource(ox._overpass_from_place)
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'osmnx' has no attribute '_overpass_from_place'
7:26
Let me rethink the approach. The Overpass API is returning 406 errors - this could be because I'm not formatting the query correctly for the POST request. Let me look at how osmnx's features_from_place works internally and use the same approach. Since osmnx can fetch data fine, let me use it directly but see if there's a way to query historical data using the `date` parameter or similar. Actually, for the 2014-01-01 data, I'll need to use Overpass's `date` parameter to get the historical state. But since osmnx works (it fetched data successfully), let me use osmnx to get the current data, and then for 2014 I'll craft an Overpass query with the `[date:"2014-01-01T00:00:00Z"]` format. Let me look at how osmnx sends the request.
7:26
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect # Look at how features_from_place works src = inspect.getsource(ox.features_from_place) print(src[:3000]) " 2>&1
def features_from_place(
query: str | dict[str, str] | list[str | dict[str, str]],
tags: dict[str, bool | str | list[str]],
*,
which_result: int | None | list[int | None] = None,
) -> gpd.GeoDataFrame:
"""
Download OSM features within the boundaries of some place(s).
The query must be geocodable and OSM must have polygon boundaries for the
geocode result. If OSM does not have a polygon for this place, you can
instead get features within it using the `features_from_address`
function, which geocodes the place name to a point and gets the features
within some distance of that point.
If OSM does have polygon boundaries for this place but you're not finding
it, try to vary the query string, pass in a structured query dict, or vary
the `which_result` argument to use a different geocode result. If you know
the OSM ID of the place, you can retrieve its boundary polygon using the
`geocode_to_gdf` function, then pass it to the `features_from_polygon`
function.
You can use the `settings` module to retrieve a snapshot of historical OSM
data as of a certain date, or to configure the Overpass server timeout,
memory allocation, and other custom settings. This function searches for
features using tags. For more details, see:
https://wiki.openstreetmap.org/wiki/Map_features
Parameters
----------
query
The query or queries to geocode to retrieve place boundary polygon(s).
tags
Tags for finding elements in the selected area. Results are the union,
not intersection of the tags and each result matches at least one tag.
The keys are OSM tags (e.g. `building`, `landuse`, `highway`, etc) and
the values can be either `True` to retrieve all elements matching the
tag, or a string to retrieve a single `tag:value` combination, or a
list of strings to retrieve multiple values for the tag. For example,
`tags = {'building': True}` would return all buildings in the area.
Or, `tags = {'amenity':True, 'landuse':['retail','commercial'],
'highway':'bus_stop'}` would return all amenities, any landuse=retail,
any landuse=commercial, and any highway=bus_stop.
which_result
Which search result to return. If None, auto-select the first
(Multi)Polygon or raise an error if OSM doesn't return one.
Returns
-------
gdf
The features, multi-indexed by element type and OSM ID.
"""
# extract the geometry from the GeoDataFrame to use in query
polygon = geocoder.geocode_to_gdf(query, which_result=which_result).union_all()
msg = "Constructed place geometry polygon(s) to query Overpass"
utils.log(msg, level=lg.INFO)
# create GeoDataFrame using this polygon(s) geometry
return features_from_polygon(polygon, tags)
7:31
I see - `features_from_place` uses `features_from_polygon` internally. Let me look at `features_from_polygon` to understand how it queries Overpass.
7:31
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import inspect
# Look at features_from_polygon
src = inspect.getsource(ox.features_from_polygon)
# Just the first part
lines = src.split('\n')[:40]
print('\n'.join(lines))
" 2>&1
def features_from_polygon(
polygon: Polygon | MultiPolygon,
tags: dict[str, bool | str | list[str]],
) -> gpd.GeoDataFrame:
"""
Download OSM features within the boundaries of a (Multi)Polygon.
You can use the `settings` module to retrieve a snapshot of historical OSM
data as of a certain date, or to configure the Overpass server timeout,
memory allocation, and other custom settings. This function searches for
features using tags. For more details, see:
https://wiki.openstreetmap.org/wiki/Map_features
Parameters
----------
polygon
The geometry within which to retrieve features. Coordinates should be
in unprojected latitude-longitude degrees (EPSG:4326).
tags
Tags for finding elements in the selected area. Results are the union,
not intersection of the tags and each result matches at least one tag.
The keys are OSM tags (e.g. `building`, `landuse`, `highway`, etc) and
the values can be either `True` to retrieve all elements matching the
tag, or a string to retrieve a single `tag:value` combination, or a
list of strings to retrieve multiple values for the tag. For example,
`tags = {'building': True}` would return all buildings in the area.
Or, `tags = {'amenity':True, 'landuse':['retail','commercial'],
'highway':'bus_stop'}` would return all amenities, any landuse=retail,
any landuse=commercial, and any highway=bus_stop.
Returns
-------
gdf
The features, multi-indexed by element type and OSM ID.
"""
# verify that the geometry is valid and is a Polygon/MultiPolygon
if not polygon.is_valid:
msg = "The geometry of `polygon` is invalid."
raise ValueError(msg)
7:38
Let me try a different approach. Since the Overpass API uses GET requests with query params and is returning 406, maybe the issue is with how Python's requests library is sending the request. Let me try the raw approach.
7:38
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import urllib.request, urllib.parse
query = '[out:json];area[name=\"Wien\"][admin_level=4]->.wien;rel(area.wien)[boundary=administrative][admin_level=9];out geom;'
data = urllib.parse.urlencode({'data': query}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
try:
resp = urllib.request.urlopen(req, timeout=60)
print('Status:', resp.status)
body = resp.read().decode()
print('Body[:500]:', body[:500])
except urllib.error.HTTPError as e:
print('HTTP Error:', e.code, e.reason)
print('Body:', e.read().decode()[:500])
except Exception as e:
print('Error:', e)
" 2>&1
HTTP Error: 406 Not Acceptable Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
7:42
The overpass-api.de server is returning 406 for all requests. Let me try other mirrors or investigate if this is an IP issue. Let me try using `osmnx` which successfully fetched from Overpass earlier - it seems to have worked fine. Let me check if osmnx uses a different endpoint or method. Actually, looking back at my successful run, osmnx did work and returned data. Let me check what settings osmnx uses.
7:42
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print('overpass endpoint:', ox.settings.overpass_endpoint)
print('overpass timeout:', ox.settings.overpass_timeout)
print('overpass rate limit:', ox.settings.overpass_rate_limit)
" 2>&1
Traceback (most recent call last):
File "<string>", line 3, in <module>
print('overpass endpoint:', ox.settings.overpass_endpoint)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'osmnx.settings' has no attribute 'overpass_endpoint'
7:47
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox print(dir(ox.settings)) " 2>&1
['Any', 'TYPE_CHECKING', '__annotations__', '__builtins__', '__cached__', '__conditional_annotations__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'all_oneway', 'annotations', 'bidirectional_network_types', 'cache_folder', 'cache_only_mode', 'data_folder', 'default_access', 'default_crs', 'doh_url_template', 'elevation_url_template', 'http_accept_language', 'http_referer', 'http_user_agent', 'imgs_folder', 'lg', 'log_console', 'log_file', 'log_filename', 'log_level', 'log_name', 'logs_folder', 'max_query_area_size', 'nominatim_key', 'nominatim_url', 'overpass_memory', 'overpass_rate_limit', 'overpass_settings', 'overpass_url', 'requests_kwargs', 'requests_timeout', 'use_cache', 'useful_tags_node', 'useful_tags_way']
7:52
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print('overpass_url:', ox.settings.overpass_url)
print('overpass_settings:', ox.settings.overpass_settings)
" 2>&1
overpass_url: https://overpass-api.de/api
overpass_settings: [out:json][timeout:{timeout}]{maxsize}
7:59
Let me try to understand the 406 issue. The Overpass API might need User-Agent. Let me set that in osmnx and try.
7:59
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import requests
# Try with explicit Accept header and User-Agent
headers = {
'Accept': '*/*',
'User-Agent': 'Mozilla/5.0 (compatible; GISanalysis/1.0)',
'Content-Type': 'application/x-www-form-urlencoded'
}
query = '[out:json];area[name=\"Wien\"][admin_level=4]->.wien;rel(area.wien)[boundary=administrative][admin_level=9];out geom;'
r = requests.post('https://overpass-api.de/api/interpreter',
data={'data': query},
headers=headers,
timeout=60)
print('Status:', r.status_code)
print('Body[:500]:', r.text[:500])
" 2>&1
Status: 406 Body[:500]: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
8:03
Hmm, the Overpass API is returning 406 Not Acceptable. Let me check if osmnx.features_from_polygon works, since that internally uses the overpass API through a different mechanism.
8:03
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import geopandas as gpd
from shapely.geometry import Polygon
# Try osmnx to see if it works with the default settings
import osmnx as ox
# Get Vienna boundary
vienna = ox.geocode_to_gdf('Vienna, Austria')
print(vienna)
" 2>&1
geometry ... display_name 0 POLYGON ((16.18183 48.17112, 16.1819 48.17103,... ... Vienna, Austria [1 rows x 17 columns]
8:12
OK, so osmnx can geocode and get Vienna's boundary polygon. But when trying features_from_place, it returned tons of stuff including streets etc. The issue is filtering just the right things. Let me look more carefully at what's happening. The problem is that `features_from_place` is returning ALL features matching tags in Vienna, not just admin_level=9 district relations. And osmnx is hitting overpass via its own mechanism which seems to work. Let me try to query with more specific tags. Let me instead use osmnx to get just the boundary of Vienna, then use the overpass query directly through osmnx's mechanism but targeted. Actually wait - `features_from_place` DID work and returned data, it just returned too much. Let me filter properly.
8:12
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import geopandas as gpd
# Get Vienna boundary polygon
vienna = ox.geocode_to_gdf('Vienna, Austria')
vienna_geom = vienna.geometry.iloc[0]
print('Vienna geometry:', vienna_geom.geom_type)
# Now query for district relations specifically within Vienna
gdf = ox.features_from_polygon(vienna_geom, tags={'boundary': 'administrative', 'admin_level': '9'})
print('Total features:', len(gdf))
print('Unique names:', gdf['name'].unique())
" 2>&1
<exit_code>-1</exit_code> <error>Command timed out after 120000ms</error>
10:17
It seems the Overpass API is timing out. Let me try a different approach - use osmnx with a smaller polygon or try directly with the Overpass endpoint via osmnx settings. The issue might be that the Overpass endpoint is blocking requests. Let me try the alternative endpoints. Actually, let me try a simpler approach. Instead of using osmnx which seems to hang, let me write the Overpass query manually and use urllib/requests with proper handling. The 406 error suggests the Overpass API doesn't like the accept headers. Let me try using raw HTTP without any extra headers.
10:17
Bash
cd /work && timeout 60 python3 -c "
cd /work && timeout 60 python3 -c "
import requests
# Minimal approach - no extra headers
query = '[out:json];area[name=\"Wien\"][admin_level=4]->.wien;rel(area.wien)[boundary=administrative][admin_level=9];(._;>;);out geom;'
r = requests.post('https://overpass-api.de/api/interpreter', data=query, timeout=60)
print('Status:', r.status_code, 'len:', len(r.text))
if r.status_code == 200:
print(r.text[:500])
else:
print(r.text[:500])
" 2>&1
Status: 406 len: 371 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
10:24
The Overpass API seems to be returning 406 errors. Let me try a different approach - maybe the issue is with my query format. Let me try with the proper OSMnx approach that seems to work. Actually, osmnx.features_from_place did work earlier. The issue is with the direct Overpass API call. Let me use osmnx to get the current districts and also try the historical data with osmnx's date parameter.
10:24
Bash
cd /work && timeout 120 python3 -c "
cd /work && timeout 120 python3 -c "
import osmnx as ox
import geopandas as gpd
# Use features_from_place with admin_level=9
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '9'})
# Filter to known district names
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
districts = gdf[gdf['name'].isin(district_names)]
print('Districts found:', len(districts))
print(districts[['name','admin_level']])
# Save current districts
import os
districts.to_file('/work/districts_current.gpkg', layer='districts', driver='GPKG')
print('Saved to /work/districts_current.gpkg')
" 2>&1
Districts found: 24
name admin_level
element id
relation 1990590 Alsergrund 9
1990591 Hietzing 9
1990592 Innere Stadt 9
1990593 Josefstadt 9
1990594 Leopoldstadt 9
1990595 Mariahilf 9
1990596 Meidling 9
1990597 Neubau 9
1990598 Penzing 9
1990599 Rudolfsheim-Fünfhaus 9
1990600 Währing 9
1991416 Landstraße 9
1991433 Brigittenau 9
1991434 Donaustadt 9
1991435 Döbling 9
1991436 Favoriten 9
1991437 Floridsdorf 9
1991438 Hernals 9
1991439 Liesing 9
1991440 Margareten 9
1991441 Ottakring 9
1991442 Simmering 9
1991443 Wieden 9
5248033 Wieden 10
Saved to /work/districts_current.gpkg
10:44
Good, it works with osmnx.features_from_place. Now I need to get the historical data from 2014-01-01. Let me check osmnx's settings for historical data.
10:44
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print('overpass_settings:', ox.settings.overpass_settings)
# Check if there's a date option
import inspect
src = inspect.getsource(ox.settings)
print(src[:2000])
" 2>&1
overpass_settings: [out:json][timeout:{timeout}]{maxsize}
"""
Global settings that can be configured by the user.
all_oneway : bool
Only use if subsequently saving graph to an OSM XML file via the
`save_graph_xml` function. If True, forces all ways to be added as one-way
ways, preserving the original order of the nodes in the OSM way. This also
retains the original OSM way's oneway tag's string value as edge attribute
values, rather than converting them to True/False bool values. Default is
`False`.
bidirectional_network_types : list[str]
Network types for which a fully bidirectional graph will be created.
Default is `["walk"]`.
cache_folder : str | Path
Path to folder to save/load HTTP response cache files, if the `use_cache`
setting is True. Default is `"./cache"`.
cache_only_mode : bool
If True, download network data from Overpass then raise a
`CacheOnlyModeInterrupt` error for user to catch. This prevents graph
building from taking place and instead just saves Overpass response to
cache. Useful for sequentially caching lots of raw data (as you can
only query Overpass one request at a time) then using the local cache to
quickly build many graphs simultaneously with multiprocessing. Default is
`False`.
data_folder : str | Path
Path to folder to save/load graph files by default. Default is `"./data"`.
default_access : str
Filter for the OSM "access" tag. Default is `'["access"!~"private"]'`.
Note that also filtering out "access=no" ways prevents including
transit-only bridges (e.g., Tilikum Crossing) from appearing in drivable
road network (e.g., `'["access"!~"private|no"]'`). However, some drivable
tollroads have "access=no" plus a "access:conditional" tag to clarify when
it is accessible, so we can't filter out all "access=no" ways by default.
Best to be permissive here then remove complicated combinations of tags
programatically after the full graph is downloaded and constructed.
default_crs : str
Default coordi
10:51
Let me look at how osmnx handles historical data and how it makes requests, specifically the overpass_settings to see if we can add a date parameter.
10:51
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect # Find the function that builds the Overpass query from osmnx import _overpass src = inspect.getsource(_overpass) print(src[:4000]) " 2>&1
"""Tools to work with the Overpass API."""
from __future__ import annotations
import datetime as dt
import logging as lg
import time
from collections import OrderedDict
from typing import TYPE_CHECKING
from typing import Any
import numpy as np
import requests
from requests.exceptions import ConnectionError as RequestsConnectionError
from . import _http
from . import projection
from . import settings
from . import utils
from . import utils_geo
from ._errors import InsufficientResponseError
if TYPE_CHECKING:
from collections.abc import Iterator
from shapely import MultiPolygon
from shapely import Polygon
def _get_network_filter(network_type: str) -> str:
"""
Create a filter to query Overpass for the specified network type.
The filter queries Overpass for every OSM way with a "highway" tag but
excludes ways that are incompatible with the requested network type. You
can choose from the following types:
"all" retrieves all public and private-access ways currently in use,
excluding those that represent areas either explicitly (area=yes) or by
convention (rest_area, services).
"all_public" retrieves all public ways currently in use.
"bike" retrieves public bikeable ways and excludes foot ways, motor ways,
and anything tagged biking=no.
"drive" retrieves public drivable streets and excludes service roads,
anything tagged motor=no, and certain non-service roads tagged as
providing certain services (such as alleys or driveways).
"drive_service" retrieves public drivable streets including service roads
but excludes certain services (such as parking or emergency access).
"walk" retrieves public walkable ways and excludes cycle ways, motor ways,
and anything tagged foot=no. It includes service roads like parking lot
aisles and alleys that you can walk on even if they are unpleasant walks.
Parameters
----------
network_type
{"all", "all_public", "bike", "drive", "drive_service", "walk"}
What type of street network to retrieve.
Returns
-------
way_filter
The Overpass query filter.
"""
# define built-in queries to send to the API. specifying way["highway"]
# means that all ways returned must have a highway tag. the filters then
# remove ways by tag/value.
filters = {}
# driving: filter out un-drivable roads, service roads, private ways, and
# anything tagged motor=no. also filter out any non-service roads that are
# tagged as providing certain services
filters["drive"] = (
f'["highway"]["area"!~"yes"]{settings.default_access}'
f'["highway"!~"abandoned|bridleway|bus_guideway|construction|corridor|'
f"cycleway|elevator|escalator|footway|no|path|pedestrian|planned|platform|"
f'proposed|raceway|razed|rest_area|service|services|steps|track"]'
f'["motor_vehicle"!~"no"]["motorcar"!~"no"]'
f'["service"!~"alley|driveway|emergency_access|parking|parking_aisle|private"]'
)
# drive+service: allow ways tagged 'service' but filter out certain types
filters["drive_service"] = (
f'["highway"]["area"!~"yes"]{settings.default_access}'
f'["highway"!~"abandoned|bridleway|bus_guideway|construction|corridor|'
f"cycleway|elevator|escalator|footway|no|path|pedestrian|planned|platform|"
f'proposed|raceway|razed|rest_area|services|steps|track"]'
f'["motor_vehicle"!~"no"]["motorcar"!~"no"]'
f'["service"!~"emergency_access|parking|parking_aisle|private"]'
)
# walking: filter out cycle ways, motor ways, private ways, and anything
# tagged foot=no. allow service roads, permitting things like parking lot
# aisles, alleys, etc that you *can* walk on even if they're not exactly
# pleasant walks. some cycleways may allow pedestrians, but this filter
# ignores such cycleways.
filters["walk"] = (
f'["highway"]["area"!~"yes"]{settings.default_access
10:56
Let me understand how osmnx's overpass query works internally so I can adapt it to get historical data. Let me look at the specific function that constructs the query for features.
10:56
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect # Find the function that runs the Overpass query for features from osmnx import _overpass src = inspect.getsource(_overpass) print(src[8000:12000]) " 2>&1
e
if status_first_part == "Slot":
utc_time_str = status.split(" ")[3]
pattern = "%Y-%m-%dT%H:%M:%SZ,"
utc_time = dt.datetime.strptime(utc_time_str, pattern).replace(tzinfo=dt.UTC)
utc_now = dt.datetime.now(tz=dt.UTC)
seconds = int(np.ceil((utc_time - utc_now).total_seconds()))
pause = max(seconds, 1)
# if first token is 'Currently', it is currently running a query so
# check back in recursion_pause seconds
elif status_first_part == "Currently":
time.sleep(recursion_pause)
pause = _get_overpass_pause(base_endpoint)
# any other status is unrecognized: log error, return default duration
else:
msg = f"Unrecognized server status: {status!r}"
utils.log(msg, level=lg.ERROR)
return default_pause
return pause
def _make_overpass_settings() -> str:
"""
Make settings string to send in Overpass query.
Returns
-------
overpass_settings
The `settings.overpass_settings` string formatted with "timeout" and
"maxsize" values.
"""
maxsize = "" if settings.overpass_memory is None else f"[maxsize:{settings.overpass_memory}]"
return settings.overpass_settings.format(timeout=settings.requests_timeout, maxsize=maxsize)
def _make_overpass_polygon_coord_strs(polygon: Polygon | MultiPolygon) -> list[str]:
"""
Subdivide query polygon and return list of coordinate strings.
Project to UTM, divide `polygon` up into sub-polygons if area exceeds a
max size (in meters), project back to lat-lon, then get a list of
polygon(s) exterior coordinates. Ignore interior ("holes") coordinates.
Parameters
----------
polygon
The (Multi)Polygon to convert to exterior coordinate strings.
Returns
-------
coord_strs
Exterior coordinates of polygon(s).
"""
# subdivide the polygon if its area exceeds max size
# this results in a multipolygon of 1+ constituent polygons
poly_proj, crs_proj = projection.project_geometry(polygon)
multi_poly_proj = utils_geo._consolidate_subdivide_geometry(poly_proj)
multi_poly, _ = projection.project_geometry(multi_poly_proj, crs=crs_proj, to_latlong=True)
# then extract each's exterior coords to the string format Overpass
# expects, rounding lats and lons to 6 decimals (approx 5 to 10 cm
# resolution) so we can hash and cache URL strings consistently
coord_strs = []
for geom in multi_poly.geoms:
x, y = geom.exterior.xy
coord_list = [f"{xy[1]:.6f}{' '}{xy[0]:.6f}" for xy in zip(x, y, strict=True)]
coord_strs.append(" ".join(coord_list))
return coord_strs
def _create_overpass_features_query( # noqa: PLR0912
polygon_coord_str: str,
tags: dict[str, bool | str | list[str]],
) -> str:
"""
Create an Overpass features query string based on tags.
Parameters
----------
polygon_coord_str
The lat lon coordinates.
tags
Tags used for finding elements in the search area.
Returns
-------
query
The Overpass features query.
"""
# create overpass settings string
overpass_settings = _make_overpass_settings()
# make sure every value in dict is bool, str, or list of str
err_msg = "`tags` must be a dict with values of bool, str, or list of str."
if not isinstance(tags, dict): # pragma: no cover
raise TypeError(err_msg)
tags_dict: dict[str, bool | str | list[str]] = {}
for key, value in tags.items():
if isinstance(value, bool):
tags_dict[key] = value
elif isinstance(value, str):
tags_dict[key] = [value]
elif isinstance(value, list):
if not all(isinstance(s, str) for s in value): # pragma: no cover
raise TypeError(err_msg)
tags_dict[key] = value
else: # pragma: no cover
raise Ty
11:00
Let me find where osmnx makes the actual overpass query and see if there's a date parameter I can use.
11:00
(no output)
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import inspect
from osmnx import _overpass
src = inspect.getsource(_overpass)
# find the part that actually runs the query
idx = src.find('def overpass_json_request')
if idx >= 0:
print(src[idx:idx+3000])
" 2>&1
11:08
Let me try a different approach. Since the Overpass API is returning 406 errors when I try to use it directly, but osmnx's `features_from_place` works (since it uses a specific URL pattern that seems to work), let me use osmnx for both queries. For historical data (2014-01-01), I can use osmnx's `settings` to set the date. Let me check if osmnx supports historical queries.
11:08
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print('overpass_settings:', ox.settings.overpass_settings)
# Try setting date
import datetime as dt
ox.settings.overpass_settings = '[out:json][timeout:{timeout}][maxsize:{maxsize}][date:\"2014-01-01T00:00:00Z\"]'
print('New settings:', ox.settings.overpass_settings)
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '9'})
print('Districts found:', len(gdf))
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
districts = gdf[gdf['name'].isin(district_names)]
print('Filtered districts:', len(districts))
if len(districts) > 0:
print(districts[['name','admin_level']])
else:
print('All names found:')
print(gdf['name'].unique())
" 2>&1
overpass_settings: [out:json][timeout:{timeout}]{maxsize}
New settings: [out:json][timeout:{timeout}][maxsize:{maxsize}][date:"2014-01-01T00:00:00Z"]
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
return complexjson.loads(self.text, **kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/osmnx/_http.py", line 314, in _parse_response
response_json: dict[str, Any] | list[dict[str, Any]] = response.json()
~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 10, in <module>
gdf = ox.features_from_place('Vienna, Austria', tags={'boundary': 'administrative', 'admin_level': '9'})
File "/usr/local/lib/python3.14/site-packages/osmnx/features.py", line 279, in features_from_place
return features_from_polygon(polygon, tags)
File "/usr/local/lib/python3.14/site-packages/osmnx/features.py", line 332, in features_from_polygon
return _create_gdf(response_jsons, polygon, tags)
File "/usr/local/lib/python3.14/site-packages/osmnx/features.py", line 406, in _create_gdf
for response_json in response_jsons:
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/osmnx/_overpass.py", line 432, in _download_overpass_features
yield _overpass_request(OrderedDict(data=query_str))
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/osmnx/_overpass.py", line 488, in _overpass_request
response_json = _http._parse_response(response)
File "/usr/local/lib/python3.14/site-packages/osmnx/_http.py", line 320, in _parse_response
raise ResponseStatusCodeError(msg) from e
osmnx._errors.ResponseStatusCodeError: 'overpass-api.de' responded: 400 Bad Request <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 1: parse error: Value expected - ']' found. </p>
<p><strong style="color:#FF0000">Error</strong>: line 1: static error: For the attribute "element-limit" of the element "osm-script" the only allowed values are positive integers optionally with suffix "ki", "Mi", or "Gi". </p>
</body>
</html>
11:41
The date format in overpass_settings doesn't work well with osmnx's template. Let me fix it. The `maxsize` tag was being placed incorrectly. Let me set the settings properly.
Actually, I need a different approach. Let me use the Overpass API directly with a well-formed query. The 406 error I was getting earlier might be due to some issue with the Python requests library sending unexpected headers. Let me try osmnx's approach but with a properly formatted date query.
Actually, looking at the error, the `maxsize` template variable was being inserted as `{maxsize}` but when I set the full string with `[maxsize:{maxsize}]`, the template substitution should work. Let me check what osmnx does.
Actually the issue is that the default `overpass_settings` already uses `{maxsize}` which gets substituted. My custom setting has `[maxsize:{maxsize}]` and the problem is the date format. Let me try a different approach.
Let me just directly query Overpass using the urllib approach that works.
11:41
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
# Reset settings
ox.settings.overpass_settings = '[out:json][timeout:{timeout}]{maxsize}'
# Now try date as a separate approach
# Use features_from_polygon with a direct polygon
vienna = ox.geocode_to_gdf('Vienna, Austria')
geom = vienna.geometry.iloc[0]
print('Got Vienna geometry:', geom.geom_type)
print('Bounds:', geom.bounds)
# First get current districts
gdf_current = ox.features_from_polygon(geom, tags={'boundary': 'administrative', 'admin_level': '9'})
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
current = gdf_current[gdf_current['name'].isin(district_names)]
print('Current districts found:', len(current))
current.to_file('/work/districts_current.gpkg')
print('Saved current districts')
" 2>&1
Got Vienna geometry: Polygon Bounds: (16.181831, 48.1179069, 16.5775132, 48.3226679) Current districts found: 24 Saved current districts
11:48
The Overpass date feature uses `[date:"2014-01-01T00:00:00Z"]` in the settings. But the `[maxsize]` without a value caused the error. Let me check how osmnx formats the settings and handle the historical query properly.
11:48
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
print('overpass_settings template:', ox.settings.overpass_settings)
print('overpass_memory:', ox.settings.overpass_memory)
print('requests_timeout:', ox.settings.requests_timeout)
" 2>&1
overpass_settings template: [out:json][timeout:{timeout}]{maxsize}
overpass_memory: None
requests_timeout: 180
11:54
I need to use the `[date:"2014-01-01T00:00:00Z"]` syntax in Overpass. The issue was that I was replacing the entire settings template incorrectly - the `{maxsize}` in the template was being left as literal text. Let me look at how OSMnx constructs the settings string and use the date parameter properly.
Looking at the code, `_make_overpass_settings` does:
```python
maxsize = "" if settings.overpass_memory is None else f"[maxsize:{settings.overpass_memory}]"
return settings.overpass_settings.format(timeout=settings.requests_timeout, maxsize=maxsize)
```
So I need to include `{maxsize}` in the template, or just set overpass_memory. Let me try a different approach - directly format the query with date.
11:54
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
# Proper way: use the date parameter in the overpass_settings template
ox.settings.overpass_settings = '[out:json][timeout:{timeout}]{maxsize}[date:\"2014-01-01T00:00:00Z\"]'
vienna = ox.geocode_to_gdf('Vienna, Austria')
gdf = ox.features_from_polygon(vienna.geometry.iloc[0], tags={'boundary': 'administrative', 'admin_level': '9'})
print('Historical districts found:', len(gdf))
print(gdf[['name','admin_level']].head())
" 2>&1
<exit_code>-1</exit_code> <error>Command timed out after 120000ms</error>
14:03
The queries are timing out. Let me try a more targeted approach - querying the districts by their specific relation IDs rather than a broad tag search. I know the district relation IDs from the previous query. Let me also try using the OSM API to get the historical versions of these relations. Actually, Overpass API with date parameter can be very slow for historical data. Let me think about alternative approaches. For the 2014-01-01 snapshot, I could: 1. Use the Overpass API with a date parameter (but this is timing out) 2. Use the OSM API to get historical relation versions 3. Use OSM historical extracts (like Geofabrik archives) Actually, let me try building more efficient Overpass queries. The key insight is that Overpass has a `[date:]` parameter that can be slow but the query needs to be efficient. Let me first reset the settings, then query with a more specific, targeted approach.
14:03
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import requests
import json
# Reset settings
ox.settings.overpass_settings = '[out:json][timeout:{timeout}]{maxsize}'
# Get the current districts first
vienna = ox.geocode_to_gdf('Vienna, Austria')
current = ox.features_from_polygon(vienna.geometry.iloc[0], tags={'boundary': 'administrative', 'admin_level': '9'})
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
current = current[current['name'].isin(district_names)]
print('Current districts:', len(current))
for idx, row in current.iterrows():
print(f\" {idx[1]}: {row['name']}\")
" 2>&1
Current districts: 24 1990590: Alsergrund 1990591: Hietzing 1990592: Innere Stadt 1990593: Josefstadt 1990594: Leopoldstadt 1990595: Mariahilf 1990596: Meidling 1990597: Neubau 1990598: Penzing 1990599: Rudolfsheim-Fünfhaus 1990600: Währing 1991416: Landstraße 1991433: Brigittenau 1991434: Donaustadt 1991435: Döbling 1991436: Favoriten 1991437: Floridsdorf 1991438: Hernals 1991439: Liesing 1991440: Margareten 1991441: Ottakring 1991442: Simmering 1991443: Wieden 5248033: Wieden
14:14
We have Wieden appearing twice (one at admin_level=9 and one at admin_level=10). Let me drop the admin_level=10 duplicate and save the current districts. Now for the historical (2014-01-01) data, I need to use OSM's attic data feature with the date parameter. Let me try a different approach - use the Overpass API directly with the attic data query. The issue before was with the overpass_settings format. Let me construct the query properly.
14:14
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import geopandas as gpd
# Get the current districts, deduplicate Wieden
vienna = ox.geocode_to_gdf('Vienna, Austria')
current = ox.features_from_polygon(vienna.geometry.iloc[0], tags={'boundary': 'administrative', 'admin_level': '9'})
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
current = current[current['name'].isin(district_names)]
# Drop the Wieden duplicate with admin_level=10
current = current[current['admin_level'] != '10']
print('Current districts:', len(current))
print(current[['name','admin_level']])
current.to_file('/work/districts_current.gpkg', driver='GPKG')
print('Saved current')
" 2>&1
Current districts: 23
name admin_level
element id
relation 1990590 Alsergrund 9
1990591 Hietzing 9
1990592 Innere Stadt 9
1990593 Josefstadt 9
1990594 Leopoldstadt 9
1990595 Mariahilf 9
1990596 Meidling 9
1990597 Neubau 9
1990598 Penzing 9
1990599 Rudolfsheim-Fünfhaus 9
1990600 Währing 9
1991416 Landstraße 9
1991433 Brigittenau 9
1991434 Donaustadt 9
1991435 Döbling 9
1991436 Favoriten 9
1991437 Floridsdorf 9
1991438 Hernals 9
1991439 Liesing 9
1991440 Margareten 9
1991441 Ottakring 9
1991442 Simmering 9
1991443 Wieden 9
Saved current
14:21
Now I need to get the historical (2014-01-01) districts. Let me try the date parameter.
14:21
14:21
Bash
cd /work && timeout 180 python3 -c "
cd /work && timeout 180 python3 -c "
import osmnx as ox
import geopandas as gpd
# Use date parameter in overpass settings for 2014-01-01
ox.settings.overpass_settings = '[out:json][timeout:{timeout}][maxsize:{maxsize}][date:\"2014-01-01T00:00:00Z\"]'
ox.settings.overpass_memory = 536870912 # 512 MB
vienna = ox.geocode_to_gdf('Vienna, Austria')
print('Vienna geometry obtained')
try:
gdf = ox.features_from_polygon(vienna.geometry.iloc[0], tags={'boundary': 'administrative', 'admin_level': '9'})
print('Historical features:', len(gdf))
print(gdf[['name','admin_level']].head(30))
except Exception as e:
print(f'Error: {e}')
" 2>&1
Vienna geometry obtained
Error: 'overpass-api.de' responded: 400 Bad Request <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 1: parse error: Value expected - '[' found. </p>
<p><strong style="color:#FF0000">Error</strong>: line 1: static error: For the attribute "element-limit" of the element "osm-script" the only allowed values are positive integers optionally with suffix "ki", "Mi", or "Gi". </p>
<p><strong style="color:#FF0000">Error</strong>: line 1: parse error: ';' expected - ']' found. </p>
</body>
</html>
14:28
The issue is with how osmnx formats the query. The `[maxsize:{maxsize}]` when maxsize is a number like 536870912, it needs the format like `[maxsize:536870912]` without quotes. Let me check how the settings are formatted.
Actually looking at the error, it says "Value expected - '[' found" which means the date parameter `[date:"2014-01-01T00:00:00Z"]` is being placed somewhere where it can't be parsed. Let me look at the osmnx code more carefully to understand how the query is built.
14:28
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import inspect
from osmnx import _overpass
src = inspect.getsource(_overpass._make_overpass_settings)
print(src)
print('---')
src2 = inspect.getsource(_overpass._create_overpass_features_query)
print(src2)
" 2>&1
def _make_overpass_settings() -> str:
"""
Make settings string to send in Overpass query.
Returns
-------
overpass_settings
The `settings.overpass_settings` string formatted with "timeout" and
"maxsize" values.
"""
maxsize = "" if settings.overpass_memory is None else f"[maxsize:{settings.overpass_memory}]"
return settings.overpass_settings.format(timeout=settings.requests_timeout, maxsize=maxsize)
---
def _create_overpass_features_query( # noqa: PLR0912
polygon_coord_str: str,
tags: dict[str, bool | str | list[str]],
) -> str:
"""
Create an Overpass features query string based on tags.
Parameters
----------
polygon_coord_str
The lat lon coordinates.
tags
Tags used for finding elements in the search area.
Returns
-------
query
The Overpass features query.
"""
# create overpass settings string
overpass_settings = _make_overpass_settings()
# make sure every value in dict is bool, str, or list of str
err_msg = "`tags` must be a dict with values of bool, str, or list of str."
if not isinstance(tags, dict): # pragma: no cover
raise TypeError(err_msg)
tags_dict: dict[str, bool | str | list[str]] = {}
for key, value in tags.items():
if isinstance(value, bool):
tags_dict[key] = value
elif isinstance(value, str):
tags_dict[key] = [value]
elif isinstance(value, list):
if not all(isinstance(s, str) for s in value): # pragma: no cover
raise TypeError(err_msg)
tags_dict[key] = value
else: # pragma: no cover
raise TypeError(err_msg)
# convert the tags dict into a list of {tag:value} dicts
tags_list: list[dict[str, bool | str | list[str]]] = []
for key, value in tags_dict.items():
if isinstance(value, bool):
tags_list.append({key: value})
else:
for value_item in value:
tags_list.append({key: value_item}) # noqa: PERF401
# add node/way/relation query components one at a time
components = []
for d in tags_list:
for key, value in d.items():
if isinstance(value, bool):
# if bool (ie, True) just pass the key, no value
tag_str = f"[{key!r}](poly:{polygon_coord_str!r});(._;>;);"
else:
# otherwise, pass "key"="value"
tag_str = f"[{key!r}={value!r}](poly:{polygon_coord_str!r});(._;>;);"
for kind in ("node", "way", "relation"):
components.append(f"({kind}{tag_str});") # noqa: PERF401
# finalize query and return
components_str = "".join(components)
return f"{overpass_settings};({components_str});out;"
14:49
I see - the issue is that `maxsize` is being formatted as `[maxsize:536870912]` which produces `[maxsize:536870912]` in the settings string. But then the query has `{maxsize}` which needs to be explicit, and the date also uses `[]` which conflicts with the `maxsize` formatting. Let me look more carefully at the `_make_overpass_settings` function.
The `maxsize` value needs to be a string with "Mi" suffix or just numbers. Let me check what format is expected. Actually, looking at the Overpass API docs, the `maxsize` parameter expects a value in bytes, optionally with suffix ki, Mi, or Gi. So `[maxsize:536870912]` should be fine.
Wait, looking at the error more carefully: "parse error: Value expected - '[' found" - this suggests the date parameter is causing issues. Let me look at the actual query that gets generated.
Actually, the template is:
```
[out:json][timeout:{timeout}][maxsize:{maxsize}][date:"2014-01-01T00:00:00Z"]
```
When `maxsize` is not None (536870912), it becomes:
```
[out:json][timeout:180][maxsize:536870912][date:"2014-01-01T00:00:00Z"]
```
Wait, `overpass_memory` is set to 536870912, but the template has `{maxsize}` field. Let me check `_make_overpass_settings`:
```python
maxsize = "" if settings.overpass_memory is None else f"[maxsize:{settings.overpass_memory}]"
return settings.overpass_settings.format(timeout=settings.requests_timeout, maxsize=maxsize)
```
So if `overpass_memory` is set, `maxsize` becomes `[maxsize:536870912]`. Then the template becomes:
```
[out:json][timeout:180][maxsize:536870912][date:"2014-01-01T00:00:00Z"]
```
That looks valid. But the error says "For the attribute 'element-limit' ... the only allowed values are positive integers optionally with suffix 'ki', 'Mi', or 'Gi'."
Hmm, but when `overpass_memory` is None, `maxsize` is `""`, and the template becomes:
```
[out:json][timeout:180][date:"2014-01-01T00:00:00Z"]
```
Wait, there would be an empty `{}maxsize` issue. Let me re-check.
Actually, the template string is: `'[out:json][timeout:{timeout}]','{maxsize}'` ... no wait, it's `'[out:json][timeout:{timeout}]{maxsize}'` - so `{maxsize}` without brackets. So when maxsize is `""`, the output is `[out:json][timeout:180]`. When maxsize is `[maxsize:536870912]`, it becomes `[out:json][timeout:180][maxsize:536870912]`.
Then with the date it's: `'[out:json][timeout:{timeout}]{maxsize}[date:"2014-01-01T00:00:00Z"]'`
When maxsize is `""`: `[out:json][timeout:180][date:"2014-01-01T00:00:00Z"]`
When maxsize is `[maxsize:536870912]`: `[out:json][timeout:180][maxsize:536870912][date:"2014-01-01T00:00:00Z"]`
The error says "parse error: Value expected - '[' found" which seems like the date brackets are being confused. Let me look at the full query that gets generated...
Actually, I think the issue might be that the `[date:...]` parameter is not supported in the `[out:json][timeout:...]` syntax. The `[date:...]` is actually a global setting that goes before the query, not inside the `[out:json]` bracket. Let me check the Overpass QL documentation.
In Overpass QL, the date filter is indeed: `[date:"2014-01-01T00:00:00Z"]` as a top-level setting, just like `[out:json]`, `[timeout:...]`, and `[maxsize:...]`. So the format should be:
```
[out:json][timeout:180][maxsize:536870912][date:"2014-01-01T00:00:00Z"];
```
That looks right. But maybe the issue is that the Overpass server version doesn't support `[date:]` in this position? Let me check if the `maxsize` value needs to have the "Mi" suffix.
Actually wait - looking at the error again: "For the attribute 'element-limit' of the element 'osm-script' the only allowed values are positive integers optionally with suffix 'ki', 'Mi', or 'Gi'."
This says the value should be like "256Mi" or "1073741824" (bytes). 536870912 should work as a raw byte value... but the error messages are a bit confusing. Let me try with a different maxsize format.
Actually, let me just try without the date first to make sure the maxsize works, then try the date.
Let me take a different approach. Instead of manipulating osmnx's settings, I'll craft the Overpass query directly using requests, which gives me full control.
14:49
Bash
cd /work && timeout 120 python3 -c "
cd /work && timeout 120 python3 -c "
import osmnx as ox
# Reset
ox.settings.overpass_settings = '[out:json][timeout:{timeout}]{maxsize}'
ox.settings.overpass_memory = None
# Get current districts as reference IDs
vienna = ox.geocode_to_gdf('Vienna, Austria')
current = ox.features_from_polygon(vienna.geometry.iloc[0], tags={'boundary': 'administrative', 'admin_level': '9'})
district_names = [
'Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 'Mariahilf',
'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 'Simmering', 'Meidling',
'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Ottakring', 'Hernals',
'Währing', 'Döbling', 'Brigittenau', 'Floridsdorf', 'Donaustadt', 'Liesing'
]
current = current[current['name'].isin(district_names)]
current = current[current['admin_level'] != '10']
# Get the OSM relation IDs
ids = [idx[1] for idx in current.index]
print('Relation IDs:', ids)
# Save them
import json
with open('/work/district_ids.json', 'w') as f:
json.dump(ids, f)
" 2>&1
Relation IDs: [1990590, 1990591, 1990592, 1990593, 1990594, 1990595, 1990596, 1990597, 1990598, 1990599, 1990600, 1991416, 1991433, 1991434, 1991435, 1991436, 1991437, 1991438, 1991439, 1991440, 1991441, 1991442, 1991443]
14:56
Now I need to get the historical versions of these districts from Overpass using the `[date:"2014-01-01T00:00:00Z"]` parameter. The issue earlier was that the maxsize parameter was being set incorrectly. Let me craft the Overpass query manually.
14:56
14:56
Bash
cd /work && timeout 120 python3 -c "
cd /work && timeout 120 python3 -c "
import requests
# Simple test first
query = '[out:json][timeout:60][maxsize:1073741824];relation(1990590);out geom;'
r = requests.post('https://overpass-api.de/api/interpreter', data=query, timeout=60)
print('Status:', r.status_code)
if r.status_code == 200:
data = r.json()
el = data['elements'][0]
print('Tags:', el.get('tags', {}))
print('Has geometry:', 'geometry' in el)
print('Members:', len(el.get('members', [])))
else:
print(r.text[:500])
" 2>&1
Status: 406 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
15:00
The direct POST requests to overpass are failing with 406. But osmnx works fine via features_from_polygon. Let me look at how osmnx makes the request - maybe it uses a different method or has some special handling.
15:00
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect from osmnx import _overpass src = inspect.getsource(_overpass._overpass_request) print(src) " 2>&1
def _overpass_request(data: OrderedDict[str, Any]) -> dict[str, Any]:
"""
Send a HTTP POST request to the Overpass API and return response.
Parameters
----------
data
Key-value pairs of parameters.
Returns
-------
response_json
The Overpass API's response.
"""
# resolve url to same IP even if there is server round-robin redirecting
_http._config_dns(settings.overpass_url)
# prepare the Overpass API URL and see if request already exists in cache
url = settings.overpass_url.rstrip("/") + "/interpreter"
prepared_url = str(requests.Request("GET", url, params=data).prepare().url)
cached_response_json = _http._retrieve_from_cache(prepared_url)
if isinstance(cached_response_json, dict):
return cached_response_json
# pause then request this URL
pause = _get_overpass_pause(settings.overpass_url)
hostname = _http._hostname_from_url(url)
msg = f"Pausing {pause} second(s) before making HTTP POST request to {hostname!r}"
utils.log(msg, level=lg.INFO)
time.sleep(pause)
# transmit the HTTP POST request
msg = f"Post {prepared_url} with timeout={settings.requests_timeout}"
utils.log(msg, level=lg.INFO)
response = requests.post(
url,
data=data,
timeout=settings.requests_timeout,
headers=_http._get_http_headers(),
**settings.requests_kwargs,
)
# handle 429 and 504 errors by pausing then recursively re-trying request
if response.status_code in {429, 504}: # pragma: no cover
error_pause = 55
msg = (
f"{hostname!r} responded {response.status_code} {response.reason}: "
f"we'll retry in {error_pause} secs"
)
utils.log(msg, level=lg.WARNING)
time.sleep(error_pause)
return _overpass_request(data)
response_json = _http._parse_response(response)
if not isinstance(response_json, dict): # pragma: no cover
msg = "Overpass API did not return a dict of results."
raise InsufficientResponseError(msg)
_http._save_to_cache(prepared_url, response_json, response.ok)
return response_json
15:05
I see - osmnx uses `data=data` as POST data, not `data={'data': query}`. The data parameter is an OrderedDict with a 'data' key. Let me look at how the query is built in osmnx.
Actually I notice it passes `data=data` where data is an OrderedDict. Then the `_http._get_http_headers()` likely sets the Content-Type properly.
The 406 error I'm getting when using bare requests is likely due to missing proper headers. Let me check what headers osmnx uses.
15:05
Bash
cd /work && python3 -c "
cd /work && python3 -c " import osmnx as ox import inspect from osmnx import _http src = inspect.getsource(_http._get_http_headers) print(src) " 2>&1
def _get_http_headers(
*,
user_agent: str | None = None,
referer: str | None = None,
accept_language: str | None = None,
) -> dict[str, str]:
"""
Update the default requests HTTP headers with OSMnx information.
Parameters
----------
user_agent
The user agent. If None, use `settings.http_user_agent` value.
referer
The referer. If None, use `settings.http_referer` value.
accept_language
The accept language. If None, use `settings.http_accept_language`
value.
Returns
-------
headers
The updated HTTP headers.
"""
if user_agent is None:
user_agent = settings.http_user_agent
if referer is None:
referer = settings.http_referer
if accept_language is None:
accept_language = settings.http_accept_language
info = {"User-Agent": user_agent, "referer": referer, "Accept-Language": accept_language}
headers = dict(requests.utils.default_headers())
headers.update(info)
return headers
15:11
Let me look at how OSMnx sends the request more carefully, specifically what content-type it uses:
15:11
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
from osmnx import _http, _overpass
from collections import OrderedDict
import requests
# Check default headers
h = _http._get_http_headers()
print('Headers:', h)
# Try sending the exact same format osmnx uses
data = OrderedDict()
data['data'] = '[out:json][timeout:180];relation(1990590);out geom;'
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
print('URL:', url)
r = requests.post(url, data=data, timeout=60, headers=h)
print('Status:', r.status_code)
if r.status_code == 200:
print('Success!')
print(r.text[:500])
else:
print(r.text[:500])
" 2>&1
Headers: {'User-Agent': 'OSMnx Python package (https://github.com/gboeing/osmnx)', 'Accept-Encoding': 'gzip, deflate, zstd', 'Accept': '*/*', 'Connection': 'keep-alive', 'referer': 'OSMnx Python package (https://github.com/gboeing/osmnx)', 'Accept-Language': 'en'}
URL: https://overpass-api.de/api/interpreter
Status: 200
Success!
{
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_osm_base": "2026-06-17T11:12:14Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "relation",
"id": 1990590,
"bounds": {
"minlat": 48.2138140,
"minlon": 16.3408703,
"maxlat": 48.2363090,
"maxlon": 16.3701490
},
"members": [
{
"type": "way",
"ref": 147
15:19
The issue was the HTTP headers. The osmnx package uses a specific User-Agent and other headers that work with the Overpass API, but when I was using the requests library directly, I wasn't sending those headers, which caused the 406 errors. Let me now try getting historical data with proper headers.
15:19
15:19
Bash
cd /work && timeout 180 python3 -c "
cd /work && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
# Get the headers OSMnx uses
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Historic query for districts on 2014-01-01
# Use relation IDs directly with date
query = '''[out:json][timeout:180][maxsize:1073741824][date:\"2014-01-01T00:00:00Z\"];
relation(1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
print('Saved to historic_raw.json')
# Show what we got
for el in result['elements']:
tags = el.get('tags', {})
print(f\" id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))} has_geom={'geometry' in el}\")
else:
print(r.text[:1000])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "n" of the element "bbox-query" the only allowed values are floats between -90.0 and 90.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "w" of the element "bbox-query" the only allowed values are floats between -180.0 and 180.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "e" of the element &
15:31
The issue is that the date parameter syntax is wrong for Overpass. Looking at the Overpass API documentation, the date parameter should be at the beginning of the query, not in the settings bracket. The correct format is: ``` [date:"2014-01-01T00:00:00Z"] ``` But it can't go in the `[out:json][timeout:180][maxsize:...]` block. Let me check... actually, in Overpass QL, the `[date:"..."]` should be placed before any statements. Let me try: ``` [out:json][timeout:180][maxsize:1073741824]; [date:"2014-01-01T00:00:00Z"]; relation(1990590,...); out geom; ``` Wait, actually looking at the Overpass docs more carefully, the date is a global setting that goes in the initial header brackets. Let me try: ``` [out:json][timeout:180][maxsize:1073741824][date:"2014-01-01T00:00:00Z"]; relation(...); out geom; ``` Hmm but that caused the error before. Let me look at the error more carefully: "For the attribute 'n' of the element 'bbox-query' the only allowed values are floats between -90.0 and 90.0" - this suggests the date is being parsed incorrectly and some values are being interpreted as lat/lon. Actually the date format with quotes inside double brackets might be wrong. The Overpass QL documentation says: ``` [date:"2014-01-01T00:00:00Z"] ``` But when embedded in the osm-script element, the brackets need to be escaped properly. Let me try a different format. Actually, looking at the OSM3S error, the issue might be with the maxsize value. Let me try without maxsize first, or use a simpler format. Let me try: ``` [out:json][timeout:180][date:"2014-01-01T00:00:00Z"]; relation(1990590,1990591,...); out geom; ```
15:31
Bash
cd /work && timeout 120 python3 -c "
cd /work && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try with proper date format - no maxsize
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
relation(1990590);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
el = result['elements'][0]
print('Success! Name:', el.get('tags', {}).get('name'))
print('Has geometry:', 'geometry' in el)
if 'geometry' in el:
print('Geom points:', len(el['geometry']))
else:
print(r.text[:1000])
" 2>&1
Status: 429
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: runtime error: open64: 0 Success /osm3s_osm_base Dispatcher_Client::request_read_and_idx::rate_limited. Please check /api/status for the quota of your IP address. </p>
</body>
</html>
15:47
I'm being rate-limited. Let me wait a bit and try again with the proper date query.
15:47
Bash
cd /work && sleep 30 && timeout 120 python3 -c "
cd /work && sleep 30 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try with proper date format
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
relation(1990590);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
el = result['elements'][0]
print('Success! Name:', el.get('tags', {}).get('name'))
print('Has geometry:', 'geometry' in el)
if 'geometry' in el:
print('Geom points:', len(el['geometry']))
else:
print(r.text[:1000])
" 2>&1
Status: 200 Success! Name: Alsergrund Has geometry: False
16:33
The historic query with `out geom` doesn't return geometry for relations - it returns members. For relation geometries, I need to use `out geom;` with the relation and its members. But the date feature on Overpass doesn't work well with member geometry retrieval. Let me try a different approach. I'll use the Overpass `convert` functionality or `recurse` to get full geometries, or use `(._;>;); out geom;` which should get all the ways/nodes belonging to the relation. Actually, I think the issue is that for historic data, out geom on relations may not work. Let me use `(._;>;); out geom;` to recurse into the ways and nodes.
16:33
Bash
cd /work && sleep 30 && timeout 180 python3 -c "
cd /work && sleep 30 && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try with recurse to get full geometry
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
relation(1990590,1990591);
(._;>;);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
# Save
with open('/work/test_historic.json', 'w') as f:
json.dump(result, f)
# Show types
types = {}
for el in result['elements']:
t = el['type']
types[t] = types.get(t, 0) + 1
print('Types:', types)
# Show relations
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" relation id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: parse error: bbox requires four arguments </p>
</body>
</html>
17:10
The issue is that `(._;>;);out geom;` doesn't work well with `out geom` on relations. Let me think about this differently. The `out geom` gives geometry for ways/nodes but relations need `out geom` as well. The query `relation(id); (._;>;); out geom;` should work - let me adjust the syntax. Actually, the issue is that when we use `(._;>;)` the recurse `>` gets the ways but the combined result may cause parse issues. Let me try differently.
17:10
Bash
cd /work && sleep 30 && timeout 180 python3 -c "
cd /work && sleep 30 && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Query: get relations with their ways, output geometry
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
rel(1990590,1990591);
out body;
>;
out skel qt;
'''
# This gives ways with nodes, then we can reconstruct geometry
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
t = el['type']
types[t] = types.get(t, 0) + 1
print('Types:', types)
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" relation id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: parse error: bbox requires four arguments </p>
</body>
</html>
18:13
The "bbox requires four arguments" error suggests the date parameter is interfering with parsing. Let me try a different approach - query within the Vienna area with date.
18:13
Bash
cd /work && sleep 30 && timeout 180 python3 -c "
cd /work && sleep 30 && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Use area-based approach with date
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
area[name=\"Wien\"][admin_level=4]->.wien;
rel(area.wien)[boundary=administrative][admin_level=9];
out body;
>;
out skel qt;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
t = el['type']
types[t] = types.get(t, 0) + 1
print('Types:', types)
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" relation id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Status: 504
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: runtime error: open64: 0 Success /osm3s_osm_base Dispatcher_Client::request_read_and_idx::timeout. The server is probably too busy to handle your request. </p>
</body>
</html>
19:00
The historic query with date is timing out. Let me try a simpler approach - query each relation individually with date and recurse down. Let me try `out geom` for one relation first, or use `convert` to get the geometry.
19:00
Bash
cd /work && sleep 30 && timeout 180 python3 -c "
cd /work && sleep 30 && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Query one district with date and get full geometry
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
rel(1990590);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
el = result['elements'][0]
tags = el.get('tags', {})
print(f\"Name: {tags.get('name')}\")
print(f\"Has geometry: {'geometry' in el}\")
print(f\"Has members: {'members' in el}\")
if 'geometry' in el:
print(f\"Geom type: {el.get('geometry', [{}])[0].keys() if el.get('geometry') else 'empty'}\")
# Save it
with open('/work/test_hist_one.json', 'w') as f:
json.dump(result, f)
else:
print(r.text[:1000])
" 2>&1
Status: 200 Name: Alsergrund Has geometry: False Has members: True
19:39
The historic query returns the relation with members but no geometry for the relation itself (just the ways/nodes). I need to reconstruct the geometry from the relation members. Let me think about this differently. For the Overpass API with a date parameter, `out geom` gives geometry for ways and nodes, but for relations it doesn't give the full polygon geometry directly. I need to: 1. Query the relation (with `out body`) 2. Query the member ways and nodes (with `>; out skel qt;` or `out geom;`) 3. Reconstruct the polygon from the ways and nodes Or I can use `out geom` on the individual ways. Let me try a different approach. Actually, for Overpass with `[date:"..."]`, if I query `rel(id); out geom;` it gives the relation but without explicit geometry. But if I use `(rel(id);>;); out geom;` it should recurse into ways and nodes and give geometry for those elements, from which I can reconstruct the polygon. Let me try `(rel(1990590);>;); out geom;` - that should get the relation, all its member ways with their node geometries.
19:39
Bash
cd /work && sleep 30 && timeout 180 python3 -c "
cd /work && sleep 30 && timeout 180 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Use recurse-down and out geom
query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"];
(rel(1990590);>;);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=180, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
types[el['type']] = types.get(el['type'], 0) + 1
print('Types:', types)
# Check relation
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\"R id={el['id']} name='{tags.get('name')}'\")
print(f\" members: {len(el.get('members', []))}\")
elif el['type'] == 'way':
print(f\" W id={el['id']} nodes={len(el.get('nodes', []))} has_geom={'geometry' in el}\")
else:
print(r.text[:1000])
" 2>&1
Status: 200
Elements: 190
Types: {'node': 179, 'way': 10, 'relation': 1}
W id=8046654 nodes=8 has_geom=True
W id=68480634 nodes=16 has_geom=True
W id=104962880 nodes=7 has_geom=True
W id=147396089 nodes=10 has_geom=True
W id=147463454 nodes=15 has_geom=True
W id=147463455 nodes=12 has_geom=True
W id=147463456 nodes=16 has_geom=True
W id=147463462 nodes=86 has_geom=True
W id=147468285 nodes=17 has_geom=True
W id=206221053 nodes=2 has_geom=True
R id=1990590 name='Alsergrund'
members: 10
20:45
The query works! I can get the historical data with `out geom` and the recurse-down operator `>;`. Now let me build the complete solution. I need to: 1. Get current districts (already saved) 2. Get historical districts (2014-01-01) - query each district or all together 3. Normalize district names to match between the two snapshots 4. Compute symmetric difference to find changed areas 5. Classify changes: added_since_2014, removed_since_2014, unchanged 6. For each district, flag if it touches any changed area Let me write the complete solution.
20:45
20:45
Write
/work/solve.py
content · 552 lines · py
"""
Fetch Vienna districts (Bezirke) from OSM current and 2014-01-01 snapshots,
compare boundary changes, and output vienna_boundary_changes.geojson.
"""
import json
import time
import os
import sys
from collections import OrderedDict
import geopandas as gpd
import pandas as pd
import requests
from shapely.geometry import shape, MultiPolygon, Polygon
from shapely.ops import unary_union
# ---------------------------------------------------------------------------
# 1. Helpers
# ---------------------------------------------------------------------------
def overpass_query(query, timeout=300, max_retries=3):
"""Send a query to Overpass API using OSMnx-style headers."""
from osmnx import _http
from osmnx import settings as ox_settings
headers = _http._get_http_headers()
url = ox_settings.overpass_url.rstrip("/") + "/interpreter"
data = OrderedDict()
data["data"] = query
for attempt in range(max_retries):
try:
r = requests.post(url, data=data, timeout=timeout, headers=headers)
if r.status_code == 200:
return r.json()
elif r.status_code in (429, 504):
wait = 55 * (attempt + 1)
print(f" Rate limited / timeout (attempt {attempt+1}), waiting {wait}s...")
time.sleep(wait)
continue
else:
print(f" HTTP {r.status_code}: {r.text[:200]}")
if attempt < max_retries - 1:
time.sleep(10)
except requests.RequestException as e:
print(f" Request error: {e}")
if attempt < max_retries - 1:
time.sleep(30)
raise RuntimeError("Overpass query failed after max retries")
def build_relation_polygon_from_ways(elements, rel_id):
"""Reconstruct relation geometry from its member ways with geometry."""
ways = {}
for el in elements:
if el["type"] == "way":
ways[el["id"]] = el.get("geometry", [])
# Find the relation
rel = None
for el in elements:
if el["type"] == "relation" and el["id"] == rel_id:
rel = el
break
if rel is None:
return None
from shapely.geometry import Polygon as ShPolygon, LineString
polygons = []
for member in rel.get("members", []):
if member["type"] == "way":
wid = member["ref"]
geom_coords = ways.get(wid)
if geom_coords and len(geom_coords) >= 3:
coords = [(pt["lon"], pt["lat"]) for pt in geom_coords]
if coords[0] == coords[-1]:
try:
polygons.append(ShPolygon(coords))
except Exception:
pass
else:
# It's a way (line), not a closed ring - skip
pass
if not polygons:
return None
from shapely.ops import polygonize, unary_union
# Try union
try:
result = unary_union(polygons)
if result.geom_type == "Polygon":
return result
elif result.geom_type == "MultiPolygon":
return result
else:
# Try polygonizing the lines
lines = [LineString([(pt["lon"], pt["lat"]) for pt in ways[wid]])
for member in rel.get("members", [])
if member["type"] == "way" and (wid := member["ref"]) in ways
and len(ways[wid]) >= 2]
merged = unary_union(lines)
polys = list(polygonize(merged))
if polys:
return unary_union(polys)
except Exception as e:
print(f" Error building polygon for relation {rel_id}: {e}")
return None
def reconstruct_district_geometry_simple(elements, rel_id):
"""
Simpler reconstruction: find all way geometries for members of the relation
and try to create a polygon from their closed rings.
"""
ways = {}
for el in elements:
if el["type"] == "way":
ways[el["id"]] = el.get("geometry", [])
rel = None
for el in elements:
if el["type"] == "relation" and el["id"] == rel_id:
rel = el
break
if rel is None:
return None
from shapely.geometry import Polygon as ShPolygon
from shapely.ops import unary_union
member_polys = []
for member in rel.get("members", []):
if member["type"] == "way":
geom_coords = ways.get(member["ref"])
if geom_coords and len(geom_coords) >= 4:
coords = [(pt["lon"], pt["lat"]) for pt in geom_coords]
if coords[0] == coords[-1]:
try:
member_polys.append(ShPolygon(coords))
except Exception:
pass
if not member_polys:
return None
merged = unary_union(member_polys)
return merged
# ---------------------------------------------------------------------------
# 2. Current districts (from previously saved GPKG)
# ---------------------------------------------------------------------------
print("=" * 60)
print("Step 1: Loading current districts")
print("=" * 60)
current_gdf = gpd.read_file("/work/districts_current.gpkg")
print(f" Current districts loaded: {len(current_gdf)}")
print(f" CRS: {current_gdf.crs}")
print(f" Names: {sorted(current_gdf['name'].tolist())}")
# Build name->geometry mapping for current data
current_dict = {}
for _, row in current_gdf.iterrows():
name = row["name"]
geom = row["geometry"]
current_dict[name] = geom
# ---------------------------------------------------------------------------
# 3. Historical districts (2014-01-01)
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("Step 2: Fetching historical districts from Overpass (2014-01-01)")
print("=" * 60)
# We'll query each relation individually to avoid huge requests
relation_ids = [1990590, 1990591, 1990592, 1990593, 1990594, 1990595,
1990596, 1990597, 1990598, 1990599, 1990600, 1991416,
1991433, 1991434, 1991435, 1991436, 1991437, 1991438,
1991439, 1991440, 1991441, 1991442, 1991443]
historic_features = {} # name -> geometry
for rid in relation_ids:
query = f'''[out:json][timeout:180][date:"2014-01-01T00:00:00Z"];
(rel({rid});>;);
out geom;'''
print(f" Querying relation {rid} (historic)...")
try:
result = overpass_query(query, timeout=180)
elements = result.get("elements", [])
# Find the relation
rel = None
for el in elements:
if el["type"] == "relation" and el["id"] == rid:
rel = el
break
if rel is None:
print(f" WARNING: Relation {rid} not found in historic data!")
continue
name_hist = rel.get("tags", {}).get("name", f"unknown_{rid}")
print(f" Historic name: '{name_hist}'")
# Reconstruct geometry
geom = reconstruct_district_geometry_simple(elements, rid)
if geom is None:
print(f" WARNING: Could not reconstruct geometry for {name_hist} (id={rid})!")
continue
# Validate geometry
if not geom.is_valid:
geom = geom.buffer(0)
historic_features[name_hist] = geom
print(f" Geometry: {geom.geom_type}, area deg² ≈ {geom.area:.6f}")
except Exception as e:
print(f" ERROR fetching/handling relation {rid}: {e}")
continue
# Brief pause between queries
time.sleep(2)
print(f"\n Historic districts retrieved: {len(historic_features)}")
print(f" Names: {sorted(historic_features.keys())}")
# ---------------------------------------------------------------------------
# 4. Normalise district names between current and historic
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("Step 3: Normalising district names")
print("=" * 60)
def normalise_name(name):
"""Normalise district names to a canonical form."""
n = name.strip().lower()
# Handle Umlauts and special chars
replacements = {
'ä': 'ae', 'ö': 'oe', 'ü': 'ue', 'ß': 'ss',
'é': 'e', 'è': 'e', 'á': 'a',
'-': ' ', # Rudolfsheim-Fünfhaus -> Rudolfsheim Fünfhaus
'strasse': 'straße',
}
# Special known mappings
name_lower = n
return name_lower
def canonical_district_name(name):
"""Convert district name to a canonical form for matching."""
n = name.strip()
# Handle casing: most are title case
# Known variants
mapping = {
# Current names (as they appear in OSM now)
'Innere Stadt': 'Innere Stadt',
'Leopoldstadt': 'Leopoldstadt',
'Landstraße': 'Landstraße',
'Wieden': 'Wieden',
'Margareten': 'Margareten',
'Mariahilf': 'Mariahilf',
'Neubau': 'Neubau',
'Josefstadt': 'Josefstadt',
'Alsergrund': 'Alsergrund',
'Favoriten': 'Favoriten',
'Simmering': 'Simmering',
'Meidling': 'Meidling',
'Hietzing': 'Hietzing',
'Penzing': 'Penzing',
'Rudolfsheim-Fünfhaus': 'Rudolfsheim-Fünfhaus',
'Ottakring': 'Ottakring',
'Hernals': 'Hernals',
'Währing': 'Währing',
'Döbling': 'Döbling',
'Brigittenau': 'Brigittenau',
'Floridsdorf': 'Floridsdorf',
'Donaustadt': 'Donaustadt',
'Liesing': 'Liesing',
# Potential historical variants
'Rudolfsheim Fünfhaus': 'Rudolfsheim-Fünfhaus',
'Rudolfsheim-fünfhaus': 'Rudolfsheim-Fünfhaus',
'rudolfsheim-fünfhaus': 'Rudolfsheim-Fünfhaus',
'Rudolfsheim_fünfhaus': 'Rudolfsheim-Fünfhaus',
# Handle case variations
'innere stadt': 'Innere Stadt',
'leopoldstadt': 'Leopoldstadt',
'landstraße': 'Landstraße',
'landstrasse': 'Landstraße',
'wieden': 'Wieden',
'margareten': 'Margareten',
'mariahilf': 'Mariahilf',
'neubau': 'Neubau',
'josefstadt': 'Josefstadt',
'alsergrund': 'Alsergrund',
'favoriten': 'Favoriten',
'simmering': 'Simmering',
'meidling': 'Meidling',
'hietzing': 'Hietzing',
'penzing': 'Penzing',
'ottakring': 'Ottakring',
'hernals': 'Hernals',
'währing': 'Währing',
'döbling': 'Döbling',
'brigittenau': 'Brigittenau',
'floridsdorf': 'Floridsdorf',
'donaustadt': 'Donaustadt',
'liesing': 'Liesing',
}
n_stripped = n.strip()
if n_stripped in mapping:
return mapping[n_stripped]
# Try lowercase lookup
n_lower = n_stripped.lower()
for k, v in mapping.items():
if k.lower() == n_lower:
return v
# Try removing special chars
import re
n_simple = re.sub(r'[^a-z]', '', n_lower)
for k, v in mapping.items():
if re.sub(r'[^a-z]', '', k.lower()) == n_simple:
return v
return n_stripped
# Map historic names to canonical names
historic_canonical = {}
for h_name, h_geom in historic_features.items():
canon = canonical_district_name(h_name)
historic_canonical[canon] = h_geom
if canon != h_name:
print(f" '{h_name}' -> '{canon}'")
# Map current names to canonical
current_canonical = {}
for c_name, c_geom in current_dict.items():
canon = canonical_district_name(c_name)
current_canonical[canon] = c_geom
if canon != c_name:
print(f" '{c_name}' -> '{canon}'")
print(f"\n Canonical historic districts: {sorted(historic_canonical.keys())}")
print(f" Canonical current districts: {sorted(current_canonical.keys())}")
# Find matching districts
all_canonical = sorted(set(list(historic_canonical.keys()) + list(current_canonical.keys())))
print(f"\n All canonical district names ({len(all_canonical)}):")
for c in all_canonical:
in_hist = c in historic_canonical
in_curr = c in current_canonical
print(f" {c}: historic={in_hist}, current={in_curr}")
# ---------------------------------------------------------------------------
# 5. Compute boundary changes per district
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("Step 4: Computing boundary changes")
print("=" * 60)
# We'll work in a projected CRS for area calculations (Austria Lambert)
AUT_LAMBERT = "EPSG:3035" # ETRS89-extended / LAEA Europe
all_change_features = [] # list of (geometry, change_type, district_name, touches_changed)
for district_name in all_canonical:
in_hist = district_name in historic_canonical
in_curr = district_name in current_canonical
hist_geom = historic_canonical.get(district_name)
curr_geom = current_canonical.get(district_name)
print(f"\n Processing: {district_name} (historic={in_hist}, current={in_curr})")
if not in_hist and not in_curr:
continue
if not in_hist:
# Entire district is new
gdf_temp = gpd.GeoDataFrame(geometry=[curr_geom], crs="EPSG:4326")
gdf_temp = gdf_temp.to_crs(AUT_LAMBERT)
curr_proj = gdf_temp.geometry.iloc[0]
all_change_features.append((curr_proj, "added_since_2014", district_name, False))
print(f" -> Entire district is ADDED")
continue
if not in_curr:
# Entire district was removed
gdf_temp = gpd.GeoDataFrame(geometry=[hist_geom], crs="EPSG:4326")
gdf_temp = gdf_temp.to_crs(AUT_LAMBERT)
hist_proj = gdf_temp.geometry.iloc[0]
all_change_features.append((hist_proj, "removed_since_2014", district_name, False))
print(f" -> Entire district is REMOVED")
continue
# Both exist: compute symmetric difference
gdf_h = gpd.GeoDataFrame(geometry=[hist_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT)
gdf_c = gpd.GeoDataFrame(geometry=[curr_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT)
hist_proj = gdf_h.geometry.iloc[0]
curr_proj = gdf_c.geometry.iloc[0]
# Ensure validity
if not hist_proj.is_valid:
hist_proj = hist_proj.buffer(0)
if not curr_proj.is_valid:
curr_proj = curr_proj.buffer(0)
# Compute diff
added = curr_proj.difference(hist_proj) # in current but not historic
removed = hist_proj.difference(curr_proj) # in historic but not current
unchanged = curr_proj.intersection(hist_proj) # in both
# Skip empty geometries
for geom_part, change_type in [(added, "added_since_2014"),
(removed, "removed_since_2014"),
(unchanged, "unchanged")]:
if geom_part.is_empty:
continue
if geom_part.geom_type == "GeometryCollection":
# Extract only polygons
for g in geom_part.geoms:
if g.geom_type in ("Polygon", "MultiPolygon") and not g.is_empty:
all_change_features.append((g, change_type, district_name, False))
elif geom_part.geom_type in ("Polygon", "MultiPolygon"):
all_change_features.append((geom_part, change_type, district_name, False))
else:
# Try buffering small to fix
buffered = geom_part.buffer(0)
if buffered.geom_type in ("Polygon", "MultiPolygon") and not buffered.is_empty:
all_change_features.append((buffered, change_type, district_name, False))
# Summary
added_area = added.area if not added.is_empty else 0
removed_area = removed.area if not removed.is_empty else 0
unchanged_area = unchanged.area if not unchanged.is_empty else 0
print(f" Added area: {added_area:.0f} m²")
print(f" Removed area: {removed_area:.0f} m²")
print(f" Unchanged area: {unchanged_area:.0f} m²")
print(f"\n Total change features: {len(all_change_features)}")
if not all_change_features:
print("ERROR: No change features found!")
sys.exit(1)
# ---------------------------------------------------------------------------
# 6. Determine touches_changed for each district
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("Step 5: Computing touches_changed flags")
print("=" * 60)
# Build a GeoDataFrame of all change features
change_gdf = gpd.GeoDataFrame(all_change_features,
columns=["geometry", "change_type", "district_name", "touches_changed"],
crs=AUT_LAMBERT)
print(f" Change features GDF: {len(change_gdf)}")
# Changed features (added or removed)
changed = change_gdf[change_gdf["change_type"] != "unchanged"]
changed_union = changed.unary_union if len(changed) > 0 else None
print(f" Changed features count: {len(changed)}")
if changed_union:
print(f" Changed union type: {changed_union.geom_type}")
# For each district, check if its geometry touches any changed area
# We need the original district geometries in projected CRS
touches_flags = {} # district_name -> bool
if changed_union is not None and not changed_union.is_empty:
# Buffer slightly to catch adjacent touches
changed_buffered = changed_union.buffer(1) # 1 meter buffer
for district_name in all_canonical:
if district_name in current_canonical:
gdf_temp = gpd.GeoDataFrame(geometry=[current_canonical[district_name]], crs="EPSG:4326")
gdf_temp = gdf_temp.to_crs(AUT_LAMBERT)
district_geom_proj = gdf_temp.geometry.iloc[0]
elif district_name in historic_canonical:
gdf_temp = gpd.GeoDataFrame(geometry=[historic_canonical[district_name]], crs="EPSG:4326")
gdf_temp = gdf_temp.to_crs(AUT_LAMBERT)
district_geom_proj = gdf_temp.geometry.iloc[0]
else:
continue
# Check if district touches any changed area
touches = district_geom_proj.intersects(changed_buffered)
touches_flags[district_name] = bool(touches)
print(f" {district_name}: touches_changed={touches}")
else:
for district_name in all_canonical:
touches_flags[district_name] = False
print(" No changed areas found, all touches_changed=False")
# Apply touches flags to change features
for i, row in change_gdf.iterrows():
change_gdf.at[i, "touches_changed"] = touches_flags.get(row["district_name"], False)
# ---------------------------------------------------------------------------
# 7. Convert back to EPSG:4326 and output
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("Step 6: Writing output")
print("=" * 60)
# Convert to WGS84
output_gdf = change_gdf.to_crs("EPSG:4326")
# Ensure MultiPolygon
def ensure_multipolygon(geom):
if geom.geom_type == "Polygon":
return MultiPolygon([geom])
return geom
output_gdf["geometry"] = output_gdf["geometry"].apply(ensure_multipolygon)
# Select and reorder columns
output_gdf = output_gdf[["geometry", "change_type", "district_name", "touches_changed"]]
# Write GeoJSON
output_path = "/work/vienna_boundary_changes.geojson"
output_gdf.to_file(output_path, driver="GeoJSON")
print(f" Written to {output_path}")
# Summary stats
print()
print("=" * 60)
print("SUMMARY")
print("=" * 60)
print(f" Total features: {len(output_gdf)}")
for ct in ["added_since_2014", "removed_since_2014", "unchanged"]:
count = len(output_gdf[output_gdf["change_type"] == ct])
print(f" {ct}: {count}")
print(f" Districts with touches_changed=True: {sum(touches_flags.values())}")
print("Done!")
Wrote /work/solve.py (19645 chars).
20:48
Let me check the current districts file to make sure it's there, and also run the solve script. But first, let me check the current GPKG file.
20:48
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import geopandas as gpd
gdf = gpd.read_file('/work/districts_current.gpkg')
print(gdf[['name','admin_level']])
print(gdf.crs)
" 2>&1
/usr/local/lib/python3.14/site-packages/pyogrio/geopandas.py:382: UserWarning: More than one layer found in 'districts_current.gpkg': 'districts' (default), 'districts_current'. Specify layer parameter to avoid this warning.
result = read_func(
name admin_level
0 Alsergrund 9
1 Hietzing 9
2 Innere Stadt 9
3 Josefstadt 9
4 Leopoldstadt 9
5 Mariahilf 9
6 Meidling 9
7 Neubau 9
8 Penzing 9
9 Rudolfsheim-Fünfhaus 9
10 Währing 9
11 Landstraße 9
12 Brigittenau 9
13 Donaustadt 9
14 Döbling 9
15 Favoriten 9
16 Floridsdorf 9
17 Hernals 9
18 Liesing 9
19 Margareten 9
20 Ottakring 9
21 Simmering 9
22 Wieden 9
23 Wieden 10
EPSG:4326
20:52
I need to handle the duplicate Wieden entry (admin_level 10) when loading. Let me fix the solve.py to load the current districts properly with the layer specified.
20:52
Edit
/work/solve.py
Edited /work/solve.py.
20:54
Actually the GPKG has two layers. Let me check which is correct.
20:54
Bash
cd /work && python3 -c "
cd /work && python3 -c "
import geopandas as gpd
# Check layers
gpd.read_file('/work/districts_current.gpkg', layer='districts').to_file('/work/test_check.gpkg', layer='test', driver='GPKG')
print(gpd.list_layers('/work/districts_current.gpkg'))
" 2>&1
name geometry_type 0 districts Polygon 1 districts_current Polygon
20:59
Good. Now let me run the solve script. But first let me also remove the duplicate Wieden (admin_level=10) from the current_gdf.
20:59
Edit
/work/solve.py
Edited /work/solve.py.
21:01
Now let me run the full script. It may take a while due to the historical queries and rate limiting.
21:01
Bash
cd /work && timeout 600 python3 solve.py 2>&1
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
31:17
It seems the script timed out. The issue is likely that querying each of the 23 relations individually from Overpass with the date parameter takes a very long time (rate limiting etc). Let me rethink the approach. Instead of querying each relation individually, I should: 1. Query all districts at once in a single Overpass query 2. Use a more efficient approach Let me try querying all 23 districts in one go, which would be much faster. The issue before was with the ^ character in the query string causing issues. Let me craft a single Overpass query that gets all relations with their member ways in one request. Actually, let me think about this more carefully. The Overpass query that worked for a single relation was: ``` [out:json][timeout:180][date:"2014-01-01T00:00:00Z"]; (rel(1990590);>;); out geom; ``` For multiple relations: ``` [out:json][timeout:180][date:"2014-01-01T00:00:00Z"]; (rel(1990590,1990591,...);>;); out geom; ``` Let me try this approach.
31:17
31:17
Bash
cd /work && sleep 10 && timeout 300 python3 -c "
cd /work && sleep 10 && timeout 300 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Query all 23 districts at once with date
ids = '1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443'
query = f'''[out:json][timeout:300][date:\"2014-01-01T00:00:00Z\"];
(rel({ids});>;);
out geom;'''
print('Sending query...')
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=300, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
types[el['type']] = types.get(el['type'], 0) + 1
print('Types:', types)
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
# Show relations
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Sending query...
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "n" of the element "bbox-query" the only allowed values are floats between -90.0 and 90.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "w" of the element "bbox-query" the only allowed values are floats between -180.0 and 180.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "e" of the element &
31:36
The issue is that with the date parameter and recursing down (>), the query format is being parsed differently. The "bbox requires four arguments" error means the `>` recurse is being interpreted as a bbox-query. Let me try a different syntax.
31:36
Bash
cd /work && sleep 10 && timeout 300 python3 -c "
cd /work && sleep 10 && timeout 300 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try with proper grouping
ids = '1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443'
query = f'''[out:json][timeout:300][date:\"2014-01-01T00:00:00Z\"];
rel({ids});
(._;>;);
out geom;'''
print('Sending query...')
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=300, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Sending query...
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "n" of the element "bbox-query" the only allowed values are floats between -90.0 and 90.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "w" of the element "bbox-query" the only allowed values are floats between -180.0 and 180.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 2: static error: For the attribute "e" of the element &
32:06
The issue seems to be with the `>` recurse operator combined with many relations. Let me try a different approach - query each relation individually but just with `out geom` for the relation itself (not recursing down to ways), then separately get the way geometries. Actually wait, the error says "bbox-query" which suggests the Overpass parser is interpreting something as a bbox. Maybe the issue is the number of IDs in the relation list? Let me try a smaller batch.
32:06
Bash
cd /work && sleep 10 && timeout 120 python3 -c "
cd /work && sleep 10 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try smaller batch of 3
query = '''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel(1990590,1990591,1990592);
(._;>;);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}'\")
else:
print(r.text[:500])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
32:28
The error says "bbox requires four arguments" or similar. The issue might be that `out geom;` on relations that have no geometry but only members triggers a bbox parsing issue. Let me try a different approach - use `out body;` and then `out geom qt;` separately for the ways and nodes.
32:28
Bash
cd /work && sleep 10 && timeout 120 python3 -c "
cd /work && sleep 10 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Use recusrse: first get rels (out body), then recurse down to ways/nodes, output geom for ways
query = '''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel(1990590);
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
elif el['type'] == 'way':
print(f\" way id={el['id']} nodes={len(el.get('nodes', []))} has_geom={'geometry' in el}\")
else:
print(r.text[:500])
" 2>&1
Status: 200 Elements: 190 rel id=1990590 name='Alsergrund' members=10 way id=8046654 nodes=8 has_geom=True way id=68480634 nodes=16 has_geom=True way id=104962880 nodes=7 has_geom=True way id=147396089 nodes=10 has_geom=True way id=147463454 nodes=15 has_geom=True way id=147463455 nodes=12 has_geom=True way id=147463456 nodes=16 has_geom=True way id=147463462 nodes=86 has_geom=True way id=147468285 nodes=17 has_geom=True way id=206221053 nodes=2 has_geom=True
33:11
The single-relation query works with `out body; >; out geom;`. The issue with multiple relations is that `>` recurses into ALL children of ALL relations and then `out geom` tries to include them all at once, which might cause the parser to get confused with bbox. Let me try with multiple relations but without the `(._;>;)` group - instead do them individually combined with `rel(); (._;>;); out geom;` for each. Actually, the issue might be that when using `date` and multiple relation IDs together with `>; out geom;`, the parser hits a limit. Let me try batching them differently.
33:11
Bash
cd /work && sleep 10 && timeout 120 python3 -c "
cd /work && sleep 10 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# Try multiple rels with separate recurse blocks
query = '''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
(
rel(1990590);>;
rel(1990591);>;
);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:500])
" 2>&1
Status: 200 Elements: 468 rel id=1990590 name='Alsergrund' members=10 rel id=1990591 name='Hietzing' members=33
33:38
This works! Let me now query all 23 relations in a single query with separate recurse blocks.
33:38
Bash
cd /work && sleep 15 && timeout 300 python3 -c "
cd /work && sleep 15 && timeout 300 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
# All 23 districts
ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
# Build query with all recursions in one union block
recursions = ' rel(' + ','.join(str(i) for i in ids) + ');>;'
query = f'''[out:json][timeout:300][date:\"2014-01-01T00:00:00Z\"];
(
{recursions}
);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=300, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
types[el['type']] = types.get(el['type'], 0) + 1
print('Types:', types)
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "n" of the element "bbox-query" the only allowed values are floats between -90.0 and 90.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "w" of the element "bbox-query" the only allowed values are floats between -180.0 and 180.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "e" of the element &
34:38
The issue is that when we use `rel(...)`, the Overpass API interprets it as a bounding box query on the relation IDs, not as a direct relation retrieval. The `rel()` function in Overpass API queries for relations by bounding box, not by ID. To query by ID, we need to use `relation()` (the full word). Let me fix the query syntax.
34:38
34:38
Bash
cd /work && sleep 15 && timeout 300 python3 -c "
cd /work && sleep 15 && timeout 300 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids_str = '1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443'
query = f'''[out:json][timeout:300][date:\"2014-01-01T00:00:00Z\"];
(
relation({ids_str});
>;
);
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=300, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
types = {}
for el in result['elements']:
types[el['type']] = types.get(el['type'], 0) + 1
print('Types:', types)
with open('/work/historic_raw.json', 'w') as f:
json.dump(result, f)
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:1000])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "n" of the element "bbox-query" the only allowed values are floats between -90.0 and 90.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "w" of the element "bbox-query" the only allowed values are floats between -180.0 and 180.0. </p>
<p><strong style="color:#FF0000">Error</strong>: line 3: static error: For the attribute "e" of the element &
35:12
The issue is the `>` (recurse down) is being misinterpreted. When you have `date:` in the settings, it might interpret it as a bbox-query. Let me try without the date in settings but use `date` in a different way. Actually, I think the issue is that the `>` operator after the block is being parsed as a bbox. Let me try adding semicolons properly.
35:12
Bash
cd /work && sleep 15 && timeout 120 python3 -c "
cd /work && sleep 15 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids_str = '1990590,1990591,1990592'
# Try simpler single-batch query: get rels with full geometry using recurse
query = f'''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
relation({ids_str});(._;>;);out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
else:
print(r.text[:500])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
35:48
The issue is that Overpass interprets "rel(" as a bounding box query, not as a relation ID query. The proper syntax is `relation(id)`. But when the `>` (recurse down) is applied to a set containing multiple relations from `relation(id1,id2,id3)`, it seems to cause an issue because the relation statement itself doesn't produce geometries (bbox), and the recurse down tries to find them.
Let me try a different approach - use `out body` to get the relation (with members), then `out geom` for the ways and nodes.
Actually wait, the issue might be simpler. Let me check if the `date` parameter causes parsing issues with the union statement.
Let me try a different query approach.
35:48
Bash
cd /work && sleep 15 && timeout 120 python3 -c "
cd /work && sleep 15 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids_str = '1990590,1990591,1990592'
# Try just body then geom separately
query = f'''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
relation({ids_str});
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}' members={len(el.get('members', []))}\")
elif r.status_code == 400:
print(r.text[:500])
else:
print(r.text[:500])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
36:23
The issue is that the `[date:...]` syntax with multiple relations seems to cause a bbox-query error. Let me check the exact query that works and what doesn't.
The query that works:
```
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
rel(1990590);
out body;
>;
out geom;
```
The query that fails - when I combine multiple relations with >;
```
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
relation(1990590,1990591,1990592);
out body;
>;
out geom;
```
Wait actually the earlier working query used `rel()` not `relation()`. Let me check...
The working query:
```python
query = '''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel(1990590);
out body;
>;
out geom;'''
```
And the non-working one used `relation()`. But earlier when I used `(rel({rid});>;)` (with parentheses around it), it also worked! Let me re-check...
Actually the earlier working query was:
```
[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];
rel(1990590);
out body;
>;
out geom;
```
This worked fine. The issue with `>;` is that it acts as a standalone statement when not in a union block.
Let me try the approach with individual queries but batched more efficiently. Or maybe I can try a different syntax.
36:23
Bash
cd /work && sleep 15 && timeout 120 python3 -c "
cd /work && sleep 15 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids_str = '1990590,1990591,1990592'
# Using rel() not relation() - with date
query = f'''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel({ids_str});
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}'\")
else:
print(r.text[:500])
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
43:02
The error seems truncated. Let me get the full error message.
43:02
Bash
cd /work && sleep 15 && timeout 120 python3 -c "
cd /work && sleep 15 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids_str = '1990590,1990591,1990592'
query = f'''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel({ids_str});
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
print(r.text)
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: parse error: bbox requires four arguments </p>
</body>
</html>
43:30
The issue is that `rel()` is a special element for area queries in Overpass. For querying by ID, we should use the `relation()` statement. And the issue with the `relation()` + date is the bbox error. Let me look more carefully. The error says "bbox requires four arguments" which happens when using `rel()` - it tries to create a bbox query. The `relation()` should work... Wait, the error was "line 2: parse error: bbox requires four arguments" for `rel()`. Let me use `relation()` without the union block. Actually, the earlier successful single query used: ``` query = '''[out:json][timeout:180][date:\"2014-01-01T00:00:00Z\"]; rel(1990590); out body; >; out geom;''' ``` This worked! So `rel()` works for single IDs but not for multiple IDs. Let me try with `relation()` for multiple IDs.
43:30
Bash
cd /work && sleep 15 && timeout 120 python3 -c "
cd /work && sleep 15 && timeout 120 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
query = '''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
relation(1990590,1990591,1990592);
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
print('Status:', r.status_code)
if r.status_code == 200:
print('Success!')
import json
result = r.json()
print('Elements:', len(result.get('elements', [])))
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}'\")
else:
print(r.text)
" 2>&1
Status: 400
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 2: parse error: bbox requires four arguments </p>
</body>
</html>
45:10
The issue is with the `>` recurse operator - it's being interpreted as a bbox query somehow. Let me look at this more carefully. Wait, the issue is that the `>` (recurse down) needs to be separated properly. When I used it like `rel(1990590); out body; >; out geom;` it works. But with `relation(1990590,1990591,1990592); out body; >; out geom;` it fails with "bbox requires four arguments". Actually, the `relation()` selector takes IDs, not a bbox. So the issue might be the date format interacting with the recurse operator. Let me try without the `date` parameter first to confirm the multi-ID query works. Actually wait - when I tried it earlier with a single relation and the date, it worked: ``` query = '''[out:json][timeout:180][date:"2014-01-01T00:00:00Z"]; rel(1990590); out body; >; out geom;''' Status: 200 ``` But with `relation(1990590,1990591,1990592)` it fails. But `rel()` should work the same as `relation()`... Actually the error might be from the `>` operator when used with multiple relations. Let me try something different - query each relation separately and combine. Actually, let me try using batches of 4-5 at a time with the individual approach that works (the `rel(id); out body; >; out geom;` pattern).
45:10
Bash
cd /work && sleep 15 && timeout 300 python3 -c "
cd /work && sleep 15 && timeout 300 python3 -c "
import osmnx as ox
from osmnx import _http
from collections import OrderedDict
import requests
import json
headers = _http._get_http_headers()
url = ox.settings.overpass_url.rstrip('/') + '/interpreter'
ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
for rid in ids:
query = f'''[out:json][timeout:120][date:\"2014-01-01T00:00:00Z\"];
rel({rid});
out body;
>;
out geom;'''
data = OrderedDict()
data['data'] = query
r = requests.post(url, data=data, timeout=120, headers=headers)
if r.status_code == 200:
result = r.json()
for el in result['elements']:
if el['type'] == 'relation':
tags = el.get('tags', {})
print(f\" rel id={el['id']} name='{tags.get('name')}'\")
all_elements.append(el)
# Sleep between queries
import time
time.sleep(3)
else:
print(f\" ERROR for {rid}: {r.status_code}\")
print(r.text[:200])
print(f'\\nTotal elements: {len(all_elements)}')
# Save
with open('/work/historic_raw.json', 'w') as f:
json.dump({'elements': all_elements}, f)
print('Saved to historic_raw.json')
" 2>&1
<exit_code>-1</exit_code> <error>Command timed out after 120000ms</error>
47:48
That timed out because of the sleep between queries. Let me rewrite the solution to be much more efficient. I'll query districts in larger batches and handle the Overpass query limitations properly. Let me also use the approach that works (single relation query at a time, with the proper `rel()` function). Actually wait the issue was that `rel()` with multiple IDs seems to work for current data but not with the date parameter. Let me check if the date parameter works with single relations... Actually I already confirmed it works for single relations! The approach that works: ``` [out:json][timeout:120][date:"2014-01-01T00:00:00Z"]; rel(1990590); out body; >; out geom; ``` This returns 200. So I need to query one district at a time. Let me rewrite solve.py more efficiently.
47:48
47:48
Write
/work/solve.py
content · 425 lines · py
"""
Fetch Vienna districts (Bezirke) from OSM current and 2014-01-01 snapshots,
compare boundary changes, and output vienna_boundary_changes.geojson.
Each output feature has:
- change_type: added_since_2014 | removed_since_2014 | unchanged
- district_name: normalised canonical name
- touches_changed: bool
"""
import json
import time
import sys
from collections import OrderedDict
import geopandas as gpd
import pandas as pd
import requests
from shapely.geometry import shape, MultiPolygon, Polygon
from shapely.ops import unary_union
from osmnx import _http
from osmnx import settings as ox_settings
# ---------------------------------------------------------------------------
# 0. Constants
# ---------------------------------------------------------------------------
AUT_LAMBERT = "EPSG:3035" # ETRS89-LAEA Europe (good for area calcs)
# Vienna district relation IDs (admin_level=9)
REL_IDS = [
1990590, 1990591, 1990592, 1990593, 1990594,
1990595, 1990596, 1990597, 1990598, 1990599,
1990600, 1991416, 1991433, 1991434, 1991435,
1991436, 1991437, 1991438, 1991439, 1991440,
1991441, 1991442, 1991443,
]
# Known district names (current OSM)
DISTRICT_NAMES_CANONICAL = [
"Alsergrund", "Hietzing", "Innere Stadt", "Josefstadt",
"Leopoldstadt", "Mariahilf", "Meidling", "Neubau",
"Penzing", "Rudolfsheim-Fünfhaus", "Währing", "Landstraße",
"Brigittenau", "Donaustadt", "Döbling", "Favoriten",
"Floridsdorf", "Hernals", "Liesing", "Margareten",
"Ottakring", "Simmering", "Wieden",
]
# ---------------------------------------------------------------------------
# 1. Helpers
# ---------------------------------------------------------------------------
def overpass_query(query, timeout=180, max_retries=3):
"""Send a query to Overpass API and return parsed JSON."""
headers = _http._get_http_headers()
url = ox_settings.overpass_url.rstrip("/") + "/interpreter"
data = OrderedDict()
data["data"] = query
for attempt in range(max_retries):
try:
r = requests.post(url, data=data, timeout=timeout, headers=headers)
if r.status_code == 200:
return r.json()
elif r.status_code in (429, 504):
wait = 55 * (attempt + 1)
print(f" Rate limited (attempt {attempt+1}), waiting {wait}s...")
time.sleep(wait)
continue
else:
txt = r.text[:300]
print(f" HTTP {r.status_code}: {txt}")
if attempt < max_retries - 1:
time.sleep(10)
except requests.RequestException as e:
print(f" Request error: {e}")
if attempt < max_retries - 1:
time.sleep(30)
raise RuntimeError("Overpass query failed after max retries")
def reconstruct_polygon_from_ways(elements, rel_id):
"""
Given a list of OSM elements (members ways with geometry), reconstruct
the relation's polygon by taking the union of all closed way rings.
"""
# Collect ways with their geometry
ways = {}
for el in elements:
if el["type"] == "way" and "geometry" in el:
ways[el["id"]] = el["geometry"]
# Find the relation
rel = None
for el in elements:
if el["type"] == "relation" and el["id"] == rel_id:
rel = el
break
if rel is None:
return None
from shapely.geometry import Polygon as ShPolygon
rings = []
for member in rel.get("members", []):
if member["type"] == "way":
wgeom = ways.get(member["ref"])
if wgeom and len(wgeom) >= 3:
coords = [(p["lon"], p["lat"]) for p in wgeom]
# Only take closed rings
if len(coords) >= 4 and coords[0] == coords[-1]:
try:
rings.append(ShPolygon(coords))
except Exception:
pass
if not rings:
return None
merged = unary_union(rings)
if not merged.is_valid:
merged = merged.buffer(0)
return merged
def query_district_historic(rel_id):
"""
Fetch one district relation as of 2014-01-01 and reconstruct its polygon.
Returns (name, geometry) or (None, None) on failure.
"""
# Use rel() for single-ID query (works with date parameter)
query = (
f'[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];\n'
f'rel({rel_id});\n'
f'out body;\n'
f'>;\n'
f'out geom;'
)
result = overpass_query(query, timeout=120)
elements = result.get("elements", [])
# Find the relation element
rel = None
for el in elements:
if el["type"] == "relation" and el["id"] == rel_id:
rel = el
break
if rel is None:
return None, None
name = rel.get("tags", {}).get("name", f"unknown_{rel_id}")
geom = reconstruct_polygon_from_ways(elements, rel_id)
return name, geom
def fetch_current_districts():
"""Load current districts from saved GPKG."""
gdf = gpd.read_file("/work/districts_current.gpkg", layer="districts")
# Remove duplicate Wieden (admin_level=10)
gdf = gdf[gdf["admin_level"] != "10"]
result = {}
for _, row in gdf.iterrows():
result[row["name"]] = row["geometry"]
return result
def normalise_name(name):
"""
Match district names across different casings/spellings.
Returns canonical name from DISTRICT_NAMES_CANONICAL or the original
if no match found.
"""
n = name.strip()
# Direct match
if n in DISTRICT_NAMES_CANONICAL:
return n
n_lower = n.lower()
# Build lookup table
lookup = {}
for cn in DISTRICT_NAMES_CANONICAL:
lookup[cn.lower()] = cn
# Also add versions without umlauts
no_umlaut = (cn.lower()
.replace('ä', 'ae').replace('ö', 'oe')
.replace('ü', 'ue').replace('ß', 'ss'))
lookup[no_umlaut] = cn
# Without hyphens
no_hyphen = cn.lower().replace('-', ' ')
lookup[no_hyphen] = cn
if n_lower in lookup:
return lookup[n_lower]
# Try removing special chars
import re
n_simple = re.sub(r'[^a-z0-9]', '', n_lower)
for cn in DISTRICT_NAMES_CANONICAL:
cn_simple = re.sub(r'[^a-z0-9]', '', cn.lower())
if cn_simple == n_simple:
return cn
# Try second-level: for compound names, check component matching
# e.g. "Rudolfsheim Fünfhaus" should match "Rudolfsheim-Fünfhaus"
n_parts = set(re.sub(r'[^a-z0-9 ]', ' ', n_lower).split())
for cn in DISTRICT_NAMES_CANONICAL:
cn_parts = set(re.sub(r'[^a-z0-9 ]', ' ', cn.lower()).split())
if n_parts == cn_parts:
return cn
# Fuzzy: scan for substring containment
for cn in DISTRICT_NAMES_CANONICAL:
cn_lower = cn.lower()
# If one is a substring of the other
if cn_lower in n_lower or n_lower in cn_lower:
return cn
print(f" WARNING: Could not match '{name}' to any canonical district!")
return name
# ---------------------------------------------------------------------------
# 2. Fetch data
# ---------------------------------------------------------------------------
print("=" * 60)
print("STEP 1: Loading current districts")
print("=" * 60)
current_dict = fetch_current_districts()
print(f" {len(current_dict)} current districts loaded")
for n in sorted(current_dict.keys()):
print(f" {n}")
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("STEP 2: Fetching historical districts (2014-01-01)")
print("=" * 60)
historic_dict = {} # canonical_name -> geometry
for rid in REL_IDS:
name_raw, geom = query_district_historic(rid)
if name_raw is None:
print(f" WARNING: Relation {rid} not found or no geometry!")
continue
canon = normalise_name(name_raw)
print(f" rel {rid}: '{name_raw}' -> canon='{canon}' "
f"geom={geom.geom_type if geom else 'NONE'} "
f"area={geom.area:.6f} deg²" if geom else "")
if geom is None:
print(f" WARNING: No geometry for {name_raw}")
continue
historic_dict[canon] = geom
# Brief pause to be nice to the API
time.sleep(1)
print(f"\n Historic districts retrieved: {len(historic_dict)}")
for n in sorted(historic_dict.keys()):
print(f" {n}: area={historic_dict[n].area:.6f} deg²")
# ---------------------------------------------------------------------------
# 3. Compute symmetric differences per district
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("STEP 3: Computing boundary changes")
print("=" * 60)
all_canonical = sorted(set(list(historic_dict.keys()) + list(current_dict.keys())))
print(f" All districts ({len(all_canonical)}):")
for c in all_canonical:
print(f" {c}: hist={'✓' if c in historic_dict else '✗'} "
f"curr={'✓' if c in current_dict else '✗'}")
# For each district compute the symmetric difference fragments
change_features = [] # list of (geometry_projected, change_type, district_name)
for dname in all_canonical:
hist_geom = historic_dict.get(dname)
curr_geom = current_dict.get(dname)
print(f"\n {dname}: ", end="", flush=True)
if hist_geom is None and curr_geom is not None:
print("ENTIRELY ADDED")
g = gpd.GeoSeries([curr_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT).iloc[0]
change_features.append((g, "added_since_2014", dname))
continue
if curr_geom is None and hist_geom is not None:
print("ENTIRELY REMOVED")
g = gpd.GeoSeries([hist_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT).iloc[0]
change_features.append((g, "removed_since_2014", dname))
continue
# Both exist — compute diffs in projected CRS
h = gpd.GeoSeries([hist_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT).iloc[0]
c = gpd.GeoSeries([curr_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT).iloc[0]
if not h.is_valid:
h = h.buffer(0)
if not c.is_valid:
c = c.buffer(0)
added = c.difference(h)
removed = h.difference(c)
unchanged = c.intersection(h)
areas = []
for geom_part, ctype in [(added, "added_since_2014"),
(removed, "removed_since_2014"),
(unchanged, "unchanged")]:
if geom_part.is_empty:
continue
areas.append(geom_part.area)
if geom_part.geom_type == "GeometryCollection":
for g in geom_part.geoms:
if g.geom_type in ("Polygon", "MultiPolygon") and not g.is_empty:
change_features.append((g, ctype, dname))
elif geom_part.geom_type in ("Polygon", "MultiPolygon"):
change_features.append((geom_part, ctype, dname))
else:
buf = geom_part.buffer(0)
if buf.geom_type in ("Polygon", "MultiPolygon") and not buf.is_empty:
change_features.append((buf, ctype, dname))
print(f"added={added.area:.0f}m² removed={removed.area:.0f}m² "
f"unchanged={unchanged.area:.0f}m²")
print(f"\n Total change fragments: {len(change_features)}")
# ---------------------------------------------------------------------------
# 4. touches_changed flag
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("STEP 4: Computing touches_changed")
print("=" * 60)
# Build GeoDataFrame of change fragments
cgdf = gpd.GeoDataFrame(
change_features,
columns=["geometry", "change_type", "district_name"],
crs=AUT_LAMBERT,
)
# Get all changed (non-unchanged) geometries
changed = cgdf[cgdf["change_type"] != "unchanged"]
touches_changed_map = {}
if len(changed) > 0:
changed_union = changed.unary_union
if not changed_union.is_empty:
# Buffer by 1 m to detect adjacency
changed_buf = changed_union.buffer(1)
for dname in all_canonical:
# Use whichever geometry exists for the district
src_geom = current_dict.get(dname) or historic_dict.get(dname)
if src_geom is None:
touches_changed_map[dname] = False
continue
g_proj = gpd.GeoSeries([src_geom], crs="EPSG:4326").to_crs(AUT_LAMBERT).iloc[0]
touches = g_proj.intersects(changed_buf)
touches_changed_map[dname] = bool(touches)
print(f" {dname}: touches_changed={bool(touches)}")
else:
for dname in all_canonical:
touches_changed_map[dname] = False
else:
for dname in all_canonical:
touches_changed_map[dname] = False
# Apply flag
cgdf["touches_changed"] = cgdf["district_name"].map(touches_changed_map)
# ---------------------------------------------------------------------------
# 5. Convert to WGS84 and ensure MultiPolygon
# ---------------------------------------------------------------------------
print()
print("=" * 60)
print("STEP 5: Writing output")
print("=" * 60)
out_gdf = cgdf.to_crs("EPSG:4326")
def to_multipolygon(geom):
if geom.geom_type == "Polygon":
return MultiPolygon([geom])
return geom
out_gdf["geometry"] = out_gdf["geometry"].apply(to_multipolygon)
out_gdf = out_gdf[["geometry", "change_type", "district_name", "touches_changed"]]
out_path = "/work/vienna_boundary_changes.geojson"
out_gdf.to_file(out_path, driver="GeoJSON")
print(f" Written to {out_path}")
print(f" Features: {len(out_gdf)}")
# Summary
print()
print("=" * 60)
print("SUMMARY")
print("=" * 60)
for ct in ["added_since_2014", "removed_since_2014", "unchanged"]:
cnt = len(out_gdf[out_gdf["change_type"] == ct])
print(f" {ct}: {cnt}")
print(f" Districts with touches_changed=True: "
f"{sum(1 for v in touches_changed_map.values() if v)}")
# Quick sanity
print()
print("Sanity check — coordinate ranges:")
bounds = out_gdf.total_bounds
print(f" lon: {bounds[0]:.4f} to {bounds[2]:.4f} "
f"lat: {bounds[1]:.4f} to {bounds[3]:.4f}")
print(" (expected ~16.2-16.6 E, 48.1-48.3 N for Vienna)")
print("Done!")
Wrote /work/solve.py (14237 chars).