dc-l3-vienna-overpass-historical
Map (reference ← swipe → agent)
0:00
Fetch Vienna's current district polygons (Bezirke) and their 2014-01-01 counterparts from Overpass and identify which parts of the district boundaries have changed between then and now. District names changed casing and spelling between snapshots, so normalise them to match. Classify each resulting geometry fragment by how it changed: `added_since_2014` (in current but not 2014), `removed_since_2014` (in 2014 but not current), or `unchanged` (present in both snapshots). For each district flag whether its geometry borders any area that was added or removed. Output vienna_boundary_changes.geojson, MultiPolygon, each feature with change_type, district_name (normalised), and touches_changed (boolean).
0:00
System info
{
"adapter": "openrouter",
"model": "deepseek/deepseek-v4-flash",
"base_url": "https://openrouter.ai/api/v1"
}
0:09
Let me break down this task: 1. Fetch Vienna's current district polygons (Bezirke) from Overpass API 2. Fetch Vienna's district polygons from 2014-01-01 from Overpass API 3. Normalise district names to match between snapshots 4. Identify geometry changes (added, removed, unchanged) 5. For each district, flag whether its geometry borders any area that was added or removed 6. Output as GeoJSON Let me start by writing a Python script to do this. First, let me understand the Overpass Query Language (Overpass QL) to fetch the data. For current data, I can use: ``` area[name="Wien"]; rel(area)[admin_level=9][type=boundary][boundary=administrative]; out geom; ``` For historical data (2014-01-01), I need to use the date parameter in Overpass. The query would be similar but with [date:"2014-01-01T00:00:00Z"]. Let me think about this more carefully. Overpass supports historical queries with the `date` parameter. But we need to be careful about how to query for historical data. Actually, for Overpass, the `[date:"..."]` qualifier can be added to the query. But it might be tricky to get historical relation data with full geometry. Let me try a different approach - use the Overpass API with the date parameter. Actually, let me write a Python script that: 1. Queries Overpass for current Vienna districts 2. Queries Overpass for 2014-01-01 Vienna districts 3. Processes the data Let me start with the script. For Overpass queries, I'll use the `overpass` Python library or just requests. Actually, let me check what libraries are available first.
0:09
Bash
pip list 2>/dev/null | grep -iE "overpass|requests|geopandas|shapely|pyproj|duck...
duckdb 1.5.3 geopandas 1.1.3 pyproj 3.7.2 requests 2.34.2 shapely 2.1.2
0:24
Let me plan this out. I need to:
1. Fetch Vienna's current district polygons from Overpass
2. Fetch Vienna's district polygons as of 2014-01-01 from Overpass (using the Overpass API's date filtering or the "old" data)
3. Normalize district names to match between the two snapshots
4. Compute symmetric difference to find changed areas
5. Classify fragments: added_since_2014, removed_since_2014, unchanged
6. For each district, flag whether its geometry borders any area that was added or removed
7. Output to vienna_boundary_changes.geojson
Let me first understand the Overpass API. For historical data (2014-01-01), we can use the `date` parameter in Overpass.
Let me write a Python script to do this.
Vienna districts (Bezirke) are typically mapped as relations with admin_level=9 or admin_level=10. Let me check.
Actually, in OpenStreetMap, Vienna's districts are admin_level=10 (municipality level in Austria, districts of Vienna are admin_level=10). Let me use this.
For historical data, Overpass supports `[date:"2014-01-01T00:00:00Z"]` to query data as of that date.
Let me write a comprehensive script.
First, let me figure out the relation IDs for Vienna districts. Vienna's 23 districts have relation IDs. Let me query the Overpass API.
Actually, I don't need to know them in advance. I can query for all relations with admin_level=10 within the Vienna boundary, or query for relations tagged with name: "Wien" or boundary=administrative and admin_level=10 within the area of Vienna.
Let me think about the approach:
1. Get the current Vienna districts: query for relations with admin_level=10, boundary=administrative, and name starting with "Wien" or located within the Vienna administrative area.
Actually, Vienna districts are tagged with:
- boundary=administrative
- admin_level=10 (for districts in Vienna) - actually let me check: in Austria, admin_level=9 is state (Bundesland), admin_level=10 is district (Bezirk). For Vienna which is both a state and a city, the districts are at admin_level=10.
Let me write a script:
```python
import geopandas as gpd
import requests
import json
from shapely.geometry import shape, MultiPolygon, Polygon
from shapely.ops import unary_union
import re
def query_overpass(query, date=None):
"""Query Overpass API with optional date for historical data."""
url = "https://overpass-api.de/api/interpreter"
if date:
# Add date to the query
pass
response = requests.post(url, data={"data": query})
response.raise_for_status()
return response.json()
```
Actually, Overpass supports date filtering with the `[date:"..."]` syntax at the beginning of the query, or by using the `date` parameter in the POST request. Let me use the `date` parameter approach.
Let me try a different approach: the Overpass Turbo format. Let me craft the query.
For current data:
```
[out:json];
area["name"="Wien"]["admin_level"="9"]->.vienna;
rel(area.vienna)["admin_level"="10"]["boundary"="administrative"];
out geom;
```
For 2014-01-01 data, I can use:
```
[date:"2014-01-01T00:00:00Z"];
[out:json];
area["name"="Wien"]["admin_level"="9"]->.vienna;
rel(area.vienna)["admin_level"="10"]["boundary"="administrative"];
out geom;
```
But careful: the `area` might not exist in 2014. Let me use a different approach - use the relation ID of Vienna.
Actually, let me use the Overpass API more carefully. The `[date:...]` filter applies to all elements returned.
Let me write a script step by step.
First, let me find the relation ID for Vienna (Bundesland Wien):
Actually, let me just use the Overpass API with `area` name "Wien" and see if it works for historical data. If not, I'll use the relation ID directly.
Let me first check what relation IDs exist for Vienna districts.
0:24
Bash
cat << 'EOF' > /work/explore.py
cat << 'EOF' > /work/explore.py
import requests
import json
# First, find Vienna's relation ID
query = """
[out:json];
area["name"="Wien"]["admin_level"="9"];
out body;
"""
resp = requests.post("https://overpass-api.de/api/interpreter", data={"data": query})
print("Status:", resp.status_code)
data = resp.json()
print(json.dumps(data, indent=2)[:2000])
EOF
python /work/explore.py
Status: 406
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
return complexjson.loads(self.text, **kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/work/explore.py", line 12, in <module>
data = resp.json()
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
0:29
The Overpass API returned a 406 error. Let me try a different approach - maybe I need to use a different endpoint or format the query differently.
0:29
Bash
cat << 'EOF' > /work/explore2.py
cat << 'EOF' > /work/explore2.py
import requests
import json
# Try Overpass API with proper format
query = """
[out:json];
area[name="Wien"][admin_level=9];
out geom;
"""
resp = requests.post("https://overpass-api.de/api/interpreter", data={"data": query})
print("Status:", resp.status_code)
print("Response text (first 2000):", resp.text[:2000])
EOF
python /work/explore2.py
Status: 406 Response text (first 2000): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
0:34
The Overpass API is returning 406. I think the issue is that the area query might need a specific area ID or the query format is wrong. Let me try a different approach - query for relations with admin_level=9 in Vienna. Actually, Vienna is both a city and a state. The districts of Vienna are admin_level=9. Let me first find the relation for Vienna city/state (admin_level=4 or 8) to get the area, then query districts within it. Let me try a different query format.
0:34
Bash
cat << 'EOF' > /work/explore3.py
cat << 'EOF' > /work/explore3.py
import requests
import json
# First find Vienna relation
query = """
[out:json];
rel[name="Wien"][admin_level=4];
out body;
"""
resp = requests.post("https://overpass-api.de/api/interpreter", data={"data": query})
print("Status:", resp.status_code)
print("Response text (first 2000):", resp.text[:2000])
EOF
python /work/explore3.py
Status: 406 Response text (first 2000): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
0:39
The Overpass API is returning 406 Not Acceptable. Maybe it needs specific Accept headers or a different content type. Let me try with proper headers.
0:39
Bash
cat << 'EOF' > /work/explore4.py
cat << 'EOF' > /work/explore4.py
import requests
import json
headers = {"Accept": "application/json"}
# Try finding Vienna districts
query = """
[out:json];
relation["admin_level"="9"]["name"="Wien"](47.0,16.0,49.0,17.5);
out body;
"""
resp = requests.post("https://overpass-api.de/api/interpreter", data={"data": query}, headers=headers)
print("Status:", resp.status_code)
print("Response text (first 2000):", resp.text[:2000])
EOF
python /work/explore4.py
Status: 406 Response text (first 2000): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
0:43
The Overpass API seems to be returning 406 when using POST with data parameter. Let me try a different approach - maybe using GET or a different URL.
0:43
Bash
cat << 'EOF' > /work/explore5.py
cat << 'EOF' > /work/explore5.py
import requests
import json
# Try GET with parameters
query = """
[out:json];
relation["admin_level"="9"]["name"="Wien"](47.0,16.0,49.0,17.5);
out body;
"""
resp = requests.get("https://overpass-api.de/api/interpreter", params={"data": query})
print("Status:", resp.status_code)
print("Response text (first 2000):", resp.text[:2000])
EOF
python /work/explore5.py
Status: 406 Response text (first 2000): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
0:48
The Overpass API is refusing with 406. Let me try a different endpoint or approach. Maybe I need to use the proper content-type header.
0:48
Bash
cat << 'EOF' > /work/explore6.py
cat << 'EOF' > /work/explore6.py
import requests
import json
# Try different endpoint
query = """
[out:json];
relation["admin_level"="9"]["name"="Wien"](47.0,16.0,49.0,17.5);
out body;
"""
resp = requests.post("https://lz4.overpass-api.de/api/interpreter", data={"data": query})
print("Status:", resp.status_code)
if resp.status_code == 200:
print("OK:", resp.text[:500])
else:
print("Error:", resp.text[:500])
EOF
python /work/explore6.py
Status: 406 Error: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at lz4.overpass-api.de Port 443</address> </body></html>
0:54
0:54
Bash
cat << 'EOF' > /work/explore7.py
cat << 'EOF' > /work/explore7.py
import requests
import json
# Try without Accept header manipulation, just POST
query = """
[out:json];
relation["admin_level"="9"]["name"="Wien"](47.0,16.0,49.0,17.5);
out body;
"""
resp = requests.post("https://overpass-api.de/api/interpreter", data={"data": query},
headers={"Content-Type": "application/x-www-form-urlencoded"})
print("Status:", resp.status_code, "Reason:", resp.reason)
print("Headers:", dict(resp.headers))
print("Text (first 1000):", resp.text[:1000])
EOF
python /work/explore7.py
Status: 406 Reason: Not Acceptable
Headers: {'Date': 'Thu, 18 Jun 2026 03:58:09 GMT', 'Server': 'Apache/2.4.67 (Debian)', 'Content-Length': '371', 'Connection': 'close', 'Content-Type': 'text/html; charset=iso-8859-1'}
Text (first 1000): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
0:59
The Overpass API is returning 406 Not Acceptable. This might be because the server doesn't recognize the content type. Let me try with curl to see the actual response.
0:59
Bash
curl -s -o /dev/null -w "%{http_code}" -X POST -d "data=[out:json];node(1);out;"...
bash: line 1: curl: command not found
1:03
Bash
python3 -c "
python3 -c "
import requests
resp = requests.post('https://overpass-api.de/api/interpreter', data={'data': '[out:json];node(1);out;'})
print(resp.status_code, resp.text[:200])
"
406 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate repres
1:09
The Overpass API is returning 406 Not Acceptable. This might be because the requests library sends an Accept header that the server doesn't like. Let me try to set the Accept header explicitly.
1:09
Bash
python3 -c "
python3 -c "
import requests
resp = requests.post('https://overpass-api.de/api/interpreter',
data={'data': '[out:json];node(1);out;'},
headers={'Accept': '*/*'})
print(resp.status_code, resp.text[:300])
"
406 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6
1:16
The 406 error seems persistent. Let me try the main API endpoint or the OSM API. Maybe try `overpass.kumi.systems` or `maps.mail.ru/osm/tools/overpass/` or simply use `overpass.openstreetmap.ru`. Let's try a few more. Actually, I recall that the Overpass API at overpass-api.de returns 406 if the Accept header doesn't include text/plain, text/html, etc. Or maybe there's a different issue. Let me try with `Accept: text/plain` or try the main overpass turbo API.
1:16
Bash
python3 -c "
python3 -c "
import requests
# Try with text/plain accept
resp = requests.post('https://overpass-api.de/api/interpreter',
data={'data': '[out:json];node(1);out;'},
headers={'Accept': 'text/plain,application/json,*/*'})
print(resp.status_code, resp.text[:300])
"
406 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6
1:24
The Overpass API is returning 406 errors. This might be a network issue or the server is blocking. Let me try a different approach - maybe use the alternative endpoint or check what's going on with network connectivity.
1:24
Bash
python3 -c "
python3 -c "
import requests
# Try overpass turbo alternative - use Kumi endpoint
try:
resp = requests.post('https://overpass.kumi.systems/api/interpreter',
data={'data': '[out:json];node(1);out;'},
timeout=10)
print('Kumi:', resp.status_code, resp.text[:300])
except Exception as e:
print('Kumi error:', e)
"
Kumi: 200 {
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_osm_base": "2026-04-21T17:01:53Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "node",
1:32
Great, the kumi endpoint works. Let me now fetch Vienna districts using that endpoint.
1:32
Bash
cat << 'EOF' > /work/fetch_districts.py
cat << 'EOF' > /work/fetch_districts.py
import requests
import json
# Fetch Vienna districts (admin_level=9 = district)
# First find the Vienna relation or area
query_current = """
[out:json];
area["name"="Wien"]["admin_level"=4]->.vienna;
rel(area.vienna)["admin_level"=9];
out geom;
"""
resp = requests.post("https://overpass.kumi.systems/api/interpreter",
data={"data": query_current}, timeout=60)
print("Status:", resp.status_code)
data = resp.json()
print("Number of elements:", len(data.get("elements", [])))
print(json.dumps(data, indent=2)[:2000])
EOF
python /work/fetch_districts.py
Status: 504
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
return complexjson.loads(self.text, **kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/work/fetch_districts.py", line 15, in <module>
data = resp.json()
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2:10
The query timed out (504). Let me try a simpler query or use a different approach. Let me first get the Vienna area ID and then fetch districts.
2:10
Bash
cat << 'EOF' > /work/fetch1.py
cat << 'EOF' > /work/fetch1.py
import requests
import json
# First get Vienna area
q1 = """
[out:json];
area["name"="Wien"]["admin_level"=4];
out;
"""
resp = requests.post("https://overpass.kumi.systems/api/interpreter",
data={"data": q1}, timeout=30)
print("Status:", resp.status_code)
data = resp.json()
print(json.dumps(data, indent=2)[:1000])
EOF
python /work/fetch1.py
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 534, in _make_request
response = conn.getresponse()
File "/usr/local/lib/python3.14/site-packages/urllib3/connection.py", line 571, in getresponse
httplib_response = super().getresponse()
File "/usr/local/lib/python3.14/http/client.py", line 1459, in getresponse
response.begin()
~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 336, in begin
version, status, reason = self._read_status()
~~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 297, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/socket.py", line 729, in readinto
return self._sock.recv_into(b)
~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/ssl.py", line 1304, in recv_into
return self.read(nbytes, buffer)
~~~~~~~~~^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/ssl.py", line 1138, in read
return self._sslobj.read(len, buffer)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
TimeoutError: The read operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 696, in send
resp = conn.urlopen(
method=request.method,
...<9 lines>...
chunked=chunked,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 842, in urlopen
retries = retries.increment(
method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
)
File "/usr/local/lib/python3.14/site-packages/urllib3/util/retry.py", line 498, in increment
raise reraise(type(error), error, _stacktrace)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/util/util.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 788, in urlopen
response = self._make_request(
conn,
...<10 lines>...
**response_kw,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 536, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 367, in _raise_timeout
raise ReadTimeoutError(
self, url, f"Read timed out. (read timeout={timeout_value})"
) from err
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=30)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/work/fetch1.py", line 10, in <module>
resp = requests.post("https://overpass.kumi.systems/api/interpreter",
data={"data": q1}, timeout=30)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 134, in post
return request("post", url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 71, in request
return session.request(method=method, url=url, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 651, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 784, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 742, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=30)
2:45
The Overpass API is timing out. Let me try a different approach - maybe use the main overpass-api.de with proper headers, or use a different approach entirely. Let me check what's available and try a different method. Actually, let me try using the Overpass API with a simpler query first to see if it works.
2:45
Bash
python3 -c "
python3 -c "
import requests
# Try with just a tiny query
resp = requests.post('https://overpass.kumi.systems/api/interpreter',
data={'data': '[out:json];node(47.0,16.0,49.0,17.5);out 1;'},
timeout=60)
print('Status:', resp.status_code)
print('Text:', resp.text[:500])
" 2>&1
Status: 200
Text: {
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_osm_base": "2026-06-08T06:09:44Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "node",
"id": 199732,
"lat": 48.2084326,
"lon": 16.3600662,
"tags": {
"crossing": "no",
"highway": "traffic_signals"
}
}
]
}
3:12
The Kumi endpoint works with small queries. Let me try a more targeted query for Vienna districts.
3:12
Bash
python3 -c "
python3 -c "
import requests
# Find Vienna relation first
q = '[out:json];relation[\"name\"=\"Wien\"][\"admin_level\"=4];out;'
resp = requests.post('https://overpass.kumi.systems/api/interpreter', data={'data': q}, timeout=120)
print('Status:', resp.status_code)
data = resp.json()
print('Elements:', len(data['elements']))
for e in data['elements']:
print(e.get('tags', {}))
"
Status: 504
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
return complexjson.loads(self.text, **kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 7, in <module>
data = resp.json()
File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
3:57
The kumi.systems endpoint times out on relation queries. Let me try a more targeted query. Maybe I should use bounding box and find Vienna's districts directly. Let me try with a smaller query. Actually, let me try the main overpass-api.de with proper Accept headers.
3:57
Bash
python3 -c "
python3 -c "
import requests
# Try with Accept: text/html to bypass Apache 406 issue
resp = requests.post('https://overpass-api.de/api/interpreter',
data={'data': '[out:json];node(47.0,16.0,49.0,17.5);out 1;'},
timeout=60)
print('Status:', resp.status_code, resp.reason)
print('Headers:', dict(resp.headers))
print('Text:', resp.text[:500])
"
Status: 406 Not Acceptable
Headers: {'Date': 'Thu, 18 Jun 2026 04:01:11 GMT', 'Server': 'Apache/2.4.67 (Debian)', 'Content-Length': '371', 'Connection': 'close', 'Content-Type': 'text/html; charset=iso-8859-1'}
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
4:04
The 406 error on overpass-api.de seems to be an Apache configuration issue. Let me try overpass.kumi.systems with a more careful query, or use the French endpoint (overpass.openstreetmap.fr).
4:04
Bash
python3 -c "
python3 -c "
import requests
# Try French endpoint
try:
resp = requests.post('https://overpass.openstreetmap.fr/api/interpreter',
data={'data': '[out:json];node(47.0,16.0,49.0,17.5);out 1;'},
timeout=60)
print('French Status:', resp.status_code)
print('French Text:', resp.text[:500])
except Exception as e:
print('French error:', e)
"
French Status: 403 French Text: This service is only available to white-listed usages
4:12
The main overpass-api.de is giving 406, and overpass.kumi.systems is timing out on larger queries. Let me try using the main endpoint but maybe with a different Accept header, or try using the `overpass-api.de` with proper content negotiation. Actually, looking more carefully, the 406 error from `overpass-api.de` is happening because of the Accept header that Python requests sends by default. Let me check what headers requests sends and try to override them. Actually, looking at the error - Python requests sends `Accept: */*` by default. But Apache is configured to only serve certain content types. Let me check if there's a user-agent issue or something else. Wait - a simple query `[out:json];node(47.0,16.0,49.0,17.5);out 1;` also gives 406 from overpass-api.de. This is strange. Let me try with urllib instead.
4:12
Bash
python3 -c "
python3 -c "
import urllib.request, urllib.parse
data = urllib.parse.urlencode({'data': '[out:json];node(47.0,16.0,49.0,17.5);out 1;'}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
# Set a browser-like user-agent
req.add_header('User-Agent', 'Mozilla/5.0')
try:
resp = urllib.request.urlopen(req, timeout=30)
print('Status:', resp.status)
print('Text:', resp.read().decode()[:500])
except urllib.error.HTTPError as e:
print('Error:', e.code, e.reason)
print('Body:', e.read().decode()[:500])
except Exception as e:
print('Error:', e)
"
Error: 406 Not Acceptable Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
4:18
The 406 error from overpass-api.de is likely because the server is checking the Accept header. Let me check what the default Accept header is from Python requests and urllib. Actually, the requests library sends `Accept: */*` by default, and urllib sends something similar. But the server returns 406 anyway. Let me try with a different user agent and perhaps the `accept: application/json` header. Actually looking at the Apache docs, 406 is returned when the server can't produce a response matching the Accept header. The default Accept from requests might be too broad but the server should accept `*/*`. Actually, maybe the server is blocking based on User-Agent. Let me try with a browser-like user agent. Wait, I already tried various approaches. Let me try with urllib and a browser-like User-Agent properly.
4:18
Bash
python3 -c "
python3 -c "
import urllib.request, urllib.parse
data = urllib.parse.urlencode({'data': '[out:json];node(1);out;'}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('User-Agent', 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36')
req.add_header('Accept', 'application/json, text/plain, */*')
try:
resp = urllib.request.urlopen(req, timeout=30)
print('Status:', resp.status)
print('Text:', resp.read().decode()[:1000])
except urllib.error.HTTPError as e:
print('Error:', e.code)
print('Body:', e.read().decode()[:500])
except Exception as e:
print('Error:', e)
"
Error: 406 Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
4:28
The overpass-api.de main endpoint seems to be giving 406 errors consistently. Let's try the kumi endpoint with a larger timeout, and use a more efficient query. Let's also try finding the Vienna relation ID first with a simpler query.
4:28
Bash
python3 -c "
python3 -c "
import requests
# Try a simple query with kumi with longer timeout
q = '[out:json];node(48.0,16.2,48.3,16.5)[\"amenity\"];out 5;'
resp = requests.post('https://overpass.kumi.systems/api/interpreter', data={'data': q}, timeout=120)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data['elements']))
print('OK')
else:
print('Error:', resp.text[:500])
"
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 534, in _make_request
response = conn.getresponse()
File "/usr/local/lib/python3.14/site-packages/urllib3/connection.py", line 571, in getresponse
httplib_response = super().getresponse()
File "/usr/local/lib/python3.14/http/client.py", line 1459, in getresponse
response.begin()
~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 336, in begin
version, status, reason = self._read_status()
~~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.14/http/client.py", line 297, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/socket.py", line 729, in readinto
return self._sock.recv_into(b)
~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.14/ssl.py", line 1304, in recv_into
return self.read(nbytes, buffer)
~~~~~~~~~^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/ssl.py", line 1138, in read
return self._sslobj.read(len, buffer)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
TimeoutError: The read operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 696, in send
resp = conn.urlopen(
method=request.method,
...<9 lines>...
chunked=chunked,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 842, in urlopen
retries = retries.increment(
method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
)
File "/usr/local/lib/python3.14/site-packages/urllib3/util/retry.py", line 498, in increment
raise reraise(type(error), error, _stacktrace)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/util/util.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 788, in urlopen
response = self._make_request(
conn,
...<10 lines>...
**response_kw,
)
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 536, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 367, in _raise_timeout
raise ReadTimeoutError(
self, url, f"Read timed out. (read timeout={timeout_value})"
) from err
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=120)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 5, in <module>
resp = requests.post('https://overpass.kumi.systems/api/interpreter', data={'data': q}, timeout=120)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 134, in post
return request("post", url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 71, in request
return session.request(method=method, url=url, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 651, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 784, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 742, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=120)
6:35
The kumi.systems endpoint is timing out on larger queries. Let me try a different approach. Let me try using the main overpass-api.de but with urllib and proper Accept headers, or try a different endpoint. Actually, the 406 error from overpass-api.de suggests it's rejecting our requests. Let me try with urllib directly and see if setting proper headers works.
6:35
Bash
pip install --user httpx 2>/dev/null; python3 -c "
pip install --user httpx 2>/dev/null; python3 -c "
import httpx
# Try with httpx
try:
resp = httpx.post('https://overpass-api.de/api/interpreter',
data={'data': '[out:json];node(1);out;'},
timeout=30)
print('Status:', resp.status_code)
print('Text:', resp.text[:300])
except Exception as e:
print('Error:', e)
"
Collecting httpx Downloading httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB) Collecting anyio (from httpx) Downloading anyio-4.14.0-py3-none-any.whl.metadata (4.6 kB) Requirement already satisfied: certifi in /usr/local/lib/python3.14/site-packages (from httpx) (2026.5.20) Collecting httpcore==1.* (from httpx) Downloading httpcore-1.0.9-py3-none-any.whl.metadata (21 kB) Requirement already satisfied: idna in /usr/local/lib/python3.14/site-packages (from httpx) (3.18) Collecting h11>=0.16 (from httpcore==1.*->httpx) Downloading h11-0.16.0-py3-none-any.whl.metadata (8.3 kB) Downloading httpx-0.28.1-py3-none-any.whl (73 kB) Downloading httpcore-1.0.9-py3-none-any.whl (78 kB) Downloading h11-0.16.0-py3-none-any.whl (37 kB) Downloading anyio-4.14.0-py3-none-any.whl (123 kB) Installing collected packages: h11, anyio, httpcore, httpx Successfully installed anyio-4.14.0 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 Status: 406 Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.6
6:45
The overpass-api.de is returning 406 for all requests from this environment. Let me try using the kumi endpoint with a simpler approach.
6:45
Bash
python3 -c "
python3 -c "
import httpx
# Try kumi with httpx
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': '[out:json];node(47.0,16.0,49.0,17.5);out 1;'},
timeout=120)
print('Status:', resp.status_code)
print('Text:', resp.text[:500])
except Exception as e:
print('Error:', e)
"
Status: 200
Text: {
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_osm_base": "2026-06-08T06:09:44Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "node",
"id": 199732,
"lat": 48.2084326,
"lon": 16.3600662,
"tags": {
"crossing": "no",
"highway": "traffic_signals"
}
}
]
}
7:18
OK, so the overpass-api.de returns 406 for some reason but kumi.systems works with httpx. Let me try a bigger query on kumi with httpx. Let me try fetching the Vienna districts with a simpler query first - just the relation IDs.
7:18
Bash
cat << 'EOF' > /work/fetch_districts.py
cat << 'EOF' > /work/fetch_districts.py
import httpx
import json
# First find Vienna's area ID
q = '[out:json];area["name"="Wien"]["admin_level"=4];out body;'
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=180)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print(json.dumps(data, indent=2)[:2000])
else:
print('Error:', resp.text[:500])
EOF
timeout 200 python3 /work/fetch_districts.py
Status: 200
{
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_osm_base": "2026-06-08T06:09:44Z",
"timestamp_areas_base": "2026-06-08T06:09:44Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "area",
"id": 3600109166,
"tags": {
"ISO3166-2": "AT-9",
"admin_level": "4",
"admin_level:6": "yes",
"admin_level:8": "yes",
"boundary": "administrative",
"capital": "yes",
"coat_of_arms": "File:Wien 3 Wappen.svg",
"name": "Wien",
"name:ar": "\u0641\u064a\u064a\u0646\u0627",
"name:az": "Vyana",
"name:be": "\u0412\u0435\u043d\u0430",
"name:be-tarask": "\u0412\u0435\u043d\u0430",
"name:br": "Vienna",
"name:ca": "Viena",
"name:de": "Wien",
"name:ee": "Viin",
"name:en": "Vienna",
"name:es": "Viena",
"name:fr": "Vienne",
"name:fy": "Wenen",
"name:gl": "Viena",
"name:hr": "Be\u010d",
"name:hu": "B\u00e9cs",
"name:ka": "\u10d5\u10d4\u10dc\u10d0",
"name:ko": "\ube48",
"name:nl": "Wenen",
"name:no": "Wien",
"name:pl": "Wiede\u0144",
"name:prefix:at": "Statutarstadt",
"name:pt": "Viena",
"name:ro": "Viena",
"name:ru": "\u0412\u0435\u043d\u0430",
"name:sk": "Viede\u0148",
"name:sl": "Dunaj",
"name:szl": "Wiedziy\u0144",
"name:tzl": "Viadna",
"name:vi": "Vi\u00ean",
"name:zh": "\u7ef4\u4e5f\u7eb3\u5dde",
"name:zh-Hans": "\u7ef4\u4e5f\u7eb3\u5dde",
"name:zh-Hant": "\u7dad\u4e5f\u7d0d\u90a6",
"note": "Wien (Vienna) is \"Bundesland\" and \"Statutarstadt\" which equals admin_levels 4, 6, 8",
"ref:at:gkz": "90001",
"ref:nuts": "AT13",
"ref:nuts:2": "AT13",
"type": "boundary"
8:06
Great, the area ID for Wien is 3600109166. Now I need to fetch the districts (admin_level=9) within this area. Let me get the current districts and their geometries.
8:06
Bash
cat << 'EOF' > /work/fetch_districts2.py
cat << 'EOF' > /work/fetch_districts2.py
import httpx
import json
# Get districts (admin_level=9) within Vienna area
q = '''
[out:json];
area(3600109166)->.vienna;
rel(area.vienna)["admin_level"="9"];
out geom;
'''
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements', [])))
# Save to file
with open('/work/districts_current.json', 'w') as f:
json.dump(data, f)
print('Saved to /work/districts_current.json')
# Print first element summary
for e in data['elements'][:2]:
print('Type:', e['type'], 'ID:', e['id'])
print('Tags:', json.dumps(e.get('tags', {}), indent=2))
print('Members:', len(e.get('members', [])))
if 'geometry' in e:
print('Geometry points:', len(e['geometry']))
else:
print('Error:', resp.text[:500])
EOF
timeout 300 python3 /work/fetch_districts2.py
Status: 504
Error: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
9:04
The query timed out with 504 (gateway timeout). Let me try a different approach - first find the relation IDs of Vienna districts, then fetch them individually or use a simpler query.
9:04
Bash
cat << 'EOF' > /work/fetch_districts3.py
cat << 'EOF' > /work/fetch_districts3.py
import httpx
import json
# Get districts (admin_level=9) within Vienna - just IDs and names
q = '''
[out:json];
area(3600109166)->.vienna;
rel(area.vienna)["admin_level"="9"];
out body;
'''
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements', [])))
for e in data['elements']:
tags = e.get('tags', {})
print(f" ID: {e['id']} Name: {tags.get('name','?')} Ref: {tags.get('ref','?')}")
with open('/work/districts_current_ids.json', 'w') as f:
json.dump(data, f)
else:
print('Error:', resp.text[:500])
EOF
timeout 300 python3 /work/fetch_districts3.py
Status: 200 Elements: 24 ID: 1990590 Name: Alsergrund Ref: 9 ID: 1990591 Name: Hietzing Ref: 13 ID: 1990592 Name: Innere Stadt Ref: 1 ID: 1990593 Name: Josefstadt Ref: 8 ID: 1990594 Name: Leopoldstadt Ref: 2 ID: 1990595 Name: Mariahilf Ref: 6 ID: 1990596 Name: Meidling Ref: 12 ID: 1990597 Name: Neubau Ref: 7 ID: 1990598 Name: Penzing Ref: 14 ID: 1990599 Name: Rudolfsheim-Fünfhaus Ref: 15 ID: 1990600 Name: Währing Ref: 18 ID: 1991416 Name: Landstraße Ref: 3 ID: 1991433 Name: Brigittenau Ref: 20 ID: 1991434 Name: Donaustadt Ref: 22 ID: 1991435 Name: Döbling Ref: 19 ID: 1991436 Name: Favoriten Ref: 10 ID: 1991437 Name: Floridsdorf Ref: 21 ID: 1991438 Name: Hernals Ref: 17 ID: 1991439 Name: Liesing Ref: 23 ID: 1991440 Name: Margareten Ref: 5 ID: 1991441 Name: Ottakring Ref: 16 ID: 1991442 Name: Simmering Ref: 11 ID: 1991443 Name: Wieden Ref: 4 ID: 3980995 Name: Amtshaus für den 16. Bezirk Ref: ?
11:31
Good, 23 districts plus one weird entry (Amtshaus für den 16. Bezirk). Let's get the geometries now. We need full geometries with out geom. The issue is that the query with out geom timed out. Let's try a more targeted approach - use `out geom` on individual relations.
11:31
Bash
cat << 'EOF' > /work/fetch_districts_geom.py
cat << 'EOF' > /work/fetch_districts_geom.py
import httpx
import json
# Build query for all 23 district relations with geom
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
# Use a simpler approach - fetch each relation individually
query_parts = []
for rid in rel_ids:
query_parts.append(f"rel({rid});")
q = f'''
[out:json];
(
{';'.join(query_parts)}
);
out geom;
'''
print("Query length:", len(q))
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements', [])))
with open('/work/districts_current_geom.json', 'w') as f:
json.dump(data, f)
print('Saved.')
for e in data['elements']:
tags = e.get('tags', {})
members = e.get('members', [])
geom = e.get('geometry', [])
print(f"ID: {e['id']} Name: {tags.get('name','?')} Members: {len(members)} DirectGeom: {len(geom)}")
else:
print('Error:', resp.text[:1000])
EOF
timeout 300 python3 /work/fetch_districts_geom.py
Query length: 352
Status: 400
Error: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: line 4: parse error: Unknown type ";" </p>
<p><strong style="color:#FF0000">Error</strong>: line 4: parse error: An empty query is not allowed </p>
<p><strong style="color:#FF0000">Error</strong>: line 4: parse error: ';' expected - 'rel' found. </p>
<p><strong style="color:#FF0000">Error</strong>: line 4: parse error: Unknown type ";" </p>
<p><strong style="color:#FF0000">Error</strong>: line 4: parse error: An empty query is not allowed </p>
<p><strong sty
13:51
The Overpass query has a syntax issue. The problem is the semicolon placement. Let me fix: ``` [out:json]; ( rel(1990590); rel(1990591); ... ); out geom; ``` Note: I need to avoid semicolons before rel() in the union block. Let me check the exact syntax.
13:51
Bash
cat << 'EOF' > /work/fetch_districts_geom.py
cat << 'EOF' > /work/fetch_districts_geom.py
import httpx
import json
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
# Correct Overpass QL union syntax
rel_statements = ';\n '.join([f"rel({rid})" for rid in rel_ids])
q = f'''[out:json];
(
{rel_statements};
);
out geom;
'''
print("Query:")
print(q[:500])
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements', [])))
with open('/work/districts_current_geom.json', 'w') as f:
json.dump(data, f)
print('Saved.')
for e in data['elements']:
tags = e.get('tags', {})
members = e.get('members', [])
geom = e.get('geometry', [])
print(f"ID: {e['id']} Name: {tags.get('name','?')} Members: {len(members)} DirectGeom: {len(geom)}")
else:
print('Error:', resp.text[:1000])
EOF
timeout 300 python3 /work/fetch_districts_geom.py
Query:
[out:json];
(
rel(1990590);
rel(1990591);
rel(1990592);
rel(1990593);
rel(1990594);
rel(1990595);
rel(1990596);
rel(1990597);
rel(1990598);
rel(1990599);
rel(1990600);
rel(1991416);
rel(1991433);
rel(1991434);
rel(1991435);
rel(1991436);
rel(1991437);
rel(1991438);
rel(1991439);
rel(1991440);
rel(1991441);
rel(1991442);
rel(1991443);
);
out geom;
Status: 504
Error: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: runtime error: open64: 0 Success /osm3s_osm_base Dispatcher_Client::request_read_and_idx::timeout. The server is probably too busy to handle your request. </p>
</body>
</html>
14:33
The server is too busy. Let me try querying one relation at a time and collecting the results.
14:33
Bash
cat << 'EOF' > /work/fetch_one.py
cat << 'EOF' > /work/fetch_one.py
import httpx
import json
import time
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
for rid in rel_ids:
q = f'[out:json];rel({rid});out geom;'
for attempt in range(3):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=120)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
all_elements.extend(data['elements'])
tags = data['elements'][0].get('tags', {})
print(f"OK: ID={rid} Name={tags.get('name','?')}")
break
else:
print(f"Retry {rid}: {resp.status_code} - {resp.text[:100]}")
time.sleep(2)
except Exception as e:
print(f"Retry {rid}: {e}")
time.sleep(2)
print(f"\nTotal elements: {len(all_elements)}")
with open('/work/districts_current_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved to /work/districts_current_geom.json")
EOF
timeout 600 python3 /work/fetch_one.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
24:38
The fetch is timing out. Let me try a different approach. Let me use the Overpass API in a more efficient way - maybe I can query them in batches, or use a different endpoint. Also I could try using `out skel` (skeleton, which is fast) first and then get geometries separately, or I could try querying using the area approach but with a simpler output. Actually, let me try using the mainstream Overpass API endpoint but with different Accept headers, or maybe I can use a simpler query approach.
24:38
Bash
cat << 'EOF' > /work/fetch_fast.py
cat << 'EOF' > /work/fetch_fast.py
import httpx
import json
# Try getting all districts in one query but just IDs/names first
q = '''
[out:json];
area(3600109166)->.vienna;
rel(area.vienna)["admin_level"="9"]["ref"]["name"];
out body ids;
'''
resp = httpx.post('https://overpass-api.de/api/interpreter',
data={'data': q, 'Accept': 'application/json'},
timeout=120,
headers={'Accept': 'application/json,text/plain,*/*'})
print('Status:', resp.status_code)
print('Text:', resp.text[:500])
EOF
python3 /work/fetch_fast.py
Status: 406 Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
24:48
The overpass-api.de endpoint keeps returning 406. Let me focus on getting data from kumi.systems with smaller batches.
24:48
Bash
cat << 'EOF' > /work/fetch_batch.py
cat << 'EOF' > /work/fetch_batch.py
import httpx
import json
import time
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
# Process in batches of 3-4 to avoid timeout
batch_size = 4
for i in range(0, len(rel_ids), batch_size):
batch = rel_ids[i:i+batch_size]
rel_statements = ';\n '.join([f"rel({rid})" for rid in batch])
q = f'''[out:json];
(
{rel_statements};
);
out geom;
'''
print(f"Fetching batch {i//batch_size + 1}: IDs {batch}...")
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=120)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
all_elements.extend(data['elements'])
for e in data['elements']:
tags = e.get('tags', {})
print(f" OK: {tags.get('name','?')}")
else:
print(f" Error {resp.status_code}: {resp.text[:200]}")
except Exception as e:
print(f" Exception: {e}")
time.sleep(2)
print(f"\nTotal elements: {len(all_elements)}")
with open('/work/districts_current_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_batch.py
Fetching batch 1: IDs [1990590, 1990591, 1990592, 1990593]...
Error 504: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Fetching batch 2: IDs [1990594, 1990595, 1990596, 1990597]...
OK: Leopoldstadt
OK: Mariahilf
OK: Meidling
OK: Neubau
Fetching batch 3: IDs [1990598, 1990599, 1990600, 1991416]...
Error 504: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Fetching batch 4: IDs [1991433, 1991434, 1991435, 1991436]...
OK: Brigittenau
OK: Donaustadt
OK: Döbling
OK: Favoriten
Fetching batch 5: IDs [1991437, 1991438, 1991439, 1991440]...
OK: Floridsdorf
OK: Hernals
OK: Liesing
OK: Margareten
Fetching batch 6: IDs [1991441, 1991442, 1991443]...
OK: Ottakring
OK: Simmering
OK: Wieden
Total elements: 15
Saved!
26:26
I got 15 out of 23 districts. Some failed. Let me retry the failed ones individually.
26:26
Bash
cat << 'EOF' > /work/fetch_retry.py
cat << 'EOF' > /work/fetch_retry.py
import httpx
import json
import time
# Failed: 1990590 (Alsergrund), 1990591 (Hietzing), 1990592 (Innere Stadt), 1990593 (Josefstadt)
# Failed: 1990598 (Penzing), 1990599 (Rudolfsheim-Fünfhaus), 1990600 (Währing), 1991416 (Landstraße)
rel_ids = [1990590, 1990591, 1990592, 1990593, 1990598, 1990599, 1990600, 1991416]
# Load existing data
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
all_elements = data['elements']
existing_ids = {e['id'] for e in all_elements}
print(f"Existing: {len(all_elements)} elements with IDs: {existing_ids}")
for rid in rel_ids:
q = f'[out:json];rel({rid});out geom;'
print(f"Fetching {rid}...")
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=120)
if resp.status_code == 200:
d = resp.json()
if d.get('elements'):
e = d['elements'][0]
all_elements.append(e)
print(f" OK: {e.get('tags',{}).get('name','?')}")
else:
print(f" Error {resp.status_code}")
except Exception as ex:
print(f" Exception: {ex}")
time.sleep(2)
print(f"\nTotal elements: {len(all_elements)}")
with open('/work/districts_current_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_retry.py
Existing: 15 elements with IDs: {1990594, 1990595, 1990596, 1990597, 1991433, 1991434, 1991435, 1991436, 1991437, 1991438, 1991439, 1991440, 1991441, 1991442, 1991443}
Fetching 1990590...
Exception: The read operation timed out
Fetching 1990591...
Error 504
Fetching 1990592...
Error 504
Fetching 1990593...
Exception: The read operation timed out
Fetching 1990598...
OK: Penzing
Fetching 1990599...
OK: Rudolfsheim-Fünfhaus
Fetching 1990600...
Error 504
Fetching 1991416...
Error 504
Total elements: 17
Saved!
33:26
Bash
cat << 'EOF' > /work/fetch_retry2.py
cat << 'EOF' > /work/fetch_retry2.py
import httpx
import json
import time
# Still missing: 1990590 (Alsergrund), 1990591 (Hietzing), 1990592 (Innere Stadt), 1990593 (Josefstadt), 1990600 (Währing), 1991416 (Landstraße)
rel_ids = [1990590, 1990591, 1990592, 1990593, 1990600, 1991416]
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
all_elements = data['elements']
existing_ids = {e['id'] for e in all_elements}
print(f"Existing: {len(all_elements)} elements, IDs: {existing_ids}")
for rid in rel_ids:
q = f'[out:json];rel({rid});out geom;'
print(f"Fetching {rid}...")
for attempt in range(3):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=180)
if resp.status_code == 200:
d = resp.json()
if d.get('elements'):
e = d['elements'][0]
all_elements.append(e)
print(f" OK: {e.get('tags',{}).get('name','?')}")
break
else:
print(f" Attempt {attempt+1}: Error {resp.status_code}")
time.sleep(3)
except Exception as ex:
print(f" Attempt {attempt+1}: {ex}")
time.sleep(3)
time.sleep(2)
print(f"\nTotal elements: {len(all_elements)}")
with open('/work/districts_current_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_retry2.py
Existing: 17 elements, IDs: {1990594, 1990595, 1990596, 1990597, 1990598, 1990599, 1991433, 1991434, 1991435, 1991436, 1991437, 1991438, 1991439, 1991440, 1991441, 1991442, 1991443}
Fetching 1990590...
Attempt 1: Error 504
Attempt 2: Error 504
OK: Alsergrund
Fetching 1990591...
OK: Hietzing
Fetching 1990592...
OK: Innere Stadt
Fetching 1990593...
Attempt 1: Error 504
OK: Josefstadt
Fetching 1990600...
OK: Währing
Fetching 1991416...
OK: Landstraße
Total elements: 23
Saved!
36:13
Great! Now I have all 23 current districts. Let me check the geometry structure - whether it uses the 'geometry' field on the relation or needs way geometries.
36:13
Bash
python3 -c "
python3 -c "
import json
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
for e in data['elements']:
tags = e.get('tags', {})
geom = e.get('geometry', [])
members = e.get('members', [])
print(f\"ID: {e['id']} Name: {tags.get('name','?')} Members: {len(members)} DirectGeom: {len(geom)}\")
if members:
print(f\" First member: {members[0]}\")
"
ID: 1990594 Name: Leopoldstadt Members: 11 DirectGeom: 0
First member: {'type': 'way', 'ref': 147463460, 'role': 'outer', 'geometry': [{'lat': 48.2251075, 'lon': 16.3674575}, {'lat': 48.2253741, 'lon': 16.368439}, {'lat': 48.225683, 'lon': 16.3693506}, {'lat': 48.2256151, 'lon': 16.3697574}, {'lat': 48.2258255, 'lon': 16.3703725}, {'lat': 48.2261717, 'lon': 16.3704212}, {'lat': 48.2264456, 'lon': 16.3707049}, {'lat': 48.2266748, 'lon': 16.3709455}, {'lat': 48.2278334, 'lon': 16.3721614}, {'lat': 48.2282613, 'lon': 16.3735354}, {'lat': 48.2291536, 'lon': 16.3762505}, {'lat': 48.2292792, 'lon': 16.3766339}, {'lat': 48.2292994, 'lon': 16.3766953}, {'lat': 48.2295707, 'lon': 16.3775195}, {'lat': 48.2294971, 'lon': 16.3776716}, {'lat': 48.2288677, 'lon': 16.3789703}, {'lat': 48.2286756, 'lon': 16.379366}, {'lat': 48.2284433, 'lon': 16.3798446}, {'lat': 48.2283022, 'lon': 16.3801373}, {'lat': 48.228313, 'lon': 16.3804719}, {'lat': 48.2279137, 'lon': 16.3813061}, {'lat': 48.2256828, 'lon': 16.383051}, {'lat': 48.2256566, 'lon': 16.383074}, {'lat': 48.2256355, 'lon': 16.383103}, {'lat': 48.2256194, 'lon': 16.383138}, {'lat': 48.2256042, 'lon': 16.3831846}, {'lat': 48.2255989, 'lon': 16.3832231}, {'lat': 48.2255961, 'lon': 16.3832676}, {'lat': 48.2256004, 'lon': 16.3833051}, {'lat': 48.2256078, 'lon': 16.3833445}, {'lat': 48.2260849, 'lon': 16.3847}, {'lat': 48.2278318, 'lon': 16.3879656}, {'lat': 48.2279555, 'lon': 16.3879758}, {'lat': 48.2280722, 'lon': 16.3880499}, {'lat': 48.2281664, 'lon': 16.3881807}, {'lat': 48.2293718, 'lon': 16.3879148}, {'lat': 48.2303683, 'lon': 16.3867834}, {'lat': 48.2305039, 'lon': 16.387021}, {'lat': 48.2307099, 'lon': 16.3874126}, {'lat': 48.2307192, 'lon': 16.3874115}, {'lat': 48.2316746, 'lon': 16.3892605}, {'lat': 48.2320872, 'lon': 16.3900552}, {'lat': 48.2332751, 'lon': 16.3923408}, {'lat': 48.2335545, 'lon': 16.3928763}, {'lat': 48.2348168, 'lon': 16.3952832}, {'lat': 48.2348255, 'lon': 16.3952727}]}
ID: 1990595 Name: Mariahilf Members: 26 DirectGeom: 0
First member: {'type': 'way', 'ref': 147399469, 'role': 'outer', 'geometry': [{'lat': 48.1884673, 'lon': 16.3417518}, {'lat': 48.1884907, 'lon': 16.3419906}, {'lat': 48.1885038, 'lon': 16.3421967}, {'lat': 48.1885125, 'lon': 16.3423347}, {'lat': 48.1885145, 'lon': 16.3423344}, {'lat': 48.1885884, 'lon': 16.3434981}, {'lat': 48.1886066, 'lon': 16.3437838}, {'lat': 48.1886224, 'lon': 16.3440329}, {'lat': 48.1886632, 'lon': 16.3446012}, {'lat': 48.1886706, 'lon': 16.3447292}, {'lat': 48.1886777, 'lon': 16.3448535}, {'lat': 48.1887271, 'lon': 16.3454084}, {'lat': 48.1892042, 'lon': 16.3483776}, {'lat': 48.1892332, 'lon': 16.348541}, {'lat': 48.189289, 'lon': 16.3488562}, {'lat': 48.1896737, 'lon': 16.3510088}, {'lat': 48.1897558, 'lon': 16.3513343}, {'lat': 48.189839, 'lon': 16.3516004}, {'lat': 48.1899098, 'lon': 16.3517935}, {'lat': 48.1899818, 'lon': 16.3519577}, {'lat': 48.1900481, 'lon': 16.3520883}, {'lat': 48.1901183, 'lon': 16.3522181}, {'lat': 48.1902457, 'lon': 16.3524244}, {'lat': 48.1903726, 'lon': 16.3526091}]}
ID: 1990596 Name: Meidling Members: 53 DirectGeom: 0
First member: {'type': 'way', 'ref': 147399473, 'role': 'outer', 'geometry': [{'lat': 48.1792208, 'lon': 16.349725}, {'lat': 48.1795396, 'lon': 16.3495462}, {'lat': 48.1797052, 'lon': 16.3494244}, {'lat': 48.1854026, 'lon': 16.3444982}, {'lat': 48.1856807, 'lon': 16.3442529}]}
ID: 1990597 Name: Neubau Members: 31 DirectGeom: 0
First member: {'type': 'way', 'ref': 146678751, 'role': 'outer', 'geometry': [{'lat': 48.2041934, 'lon': 16.3370141}, {'lat': 48.2027519, 'lon': 16.3371361}, {'lat': 48.2026746, 'lon': 16.3371632}, {'lat': 48.2025063, 'lon': 16.3371784}, {'lat': 48.2025048, 'lon': 16.3371426}, {'lat': 48.2023071, 'lon': 16.3371581}, {'lat': 48.2022202, 'lon': 16.3371741}, {'lat': 48.2018366, 'lon': 16.3372315}, {'lat': 48.2006104, 'lon': 16.3375945}, {'lat': 48.1997437, 'lon': 16.3379228}, {'lat': 48.1997449, 'lon': 16.3379393}, {'lat': 48.1996398, 'lon': 16.3379795}, {'lat': 48.1996247, 'lon': 16.33796}, {'lat': 48.1995034, 'lon': 16.3380108}, {'lat': 48.1981056, 'lon': 16.3385436}, {'lat': 48.1980138, 'lon': 16.3385744}, {'lat': 48.1977849, 'lon': 16.3385765}, {'lat': 48.1976788, 'lon': 16.3385929}, {'lat': 48.1974912, 'lon': 16.3386513}, {'lat': 48.1973256, 'lon': 16.3387154}, {'lat': 48.1971961, 'lon': 16.3387895}, {'lat': 48.1966553, 'lon': 16.3389936}, {'lat': 48.1965884, 'lon': 16.3390166}, {'lat': 48.1964866, 'lon': 16.3390252}, {'lat': 48.1964269, 'lon': 16.3390389}, {'lat': 48.1958746, 'lon': 16.3392406}]}
ID: 1991433 Name: Brigittenau Members: 10 DirectGeom: 0
First member: {'type': 'way', 'ref': 147463464, 'role': 'outer', 'geometry': [{'lat': 48.2363479, 'lon': 16.398134}, {'lat': 48.2397528, 'lon': 16.3942444}, {'lat': 48.2432219, 'lon': 16.390506}, {'lat': 48.248205, 'lon': 16.3854004}, {'lat': 48.2487412, 'lon': 16.3848614}, {'lat': 48.2497632, 'lon': 16.3838591}, {'lat': 48.2507939, 'lon': 16.382872}, {'lat': 48.2530063, 'lon': 16.3807789}, {'lat': 48.2552352, 'lon': 16.3787118}, {'lat': 48.258047, 'lon': 16.3763322}, {'lat': 48.2627227, 'lon': 16.3724028}]}
ID: 1991434 Name: Donaustadt Members: 66 DirectGeom: 0
First member: {'type': 'way', 'ref': 263625782, 'role': 'outer', 'geometry': [{'lat': 48.1352198, 'lon': 16.5760306}, {'lat': 48.1363791, 'lon': 16.5717092}, {'lat': 48.1374412, 'lon': 16.5678495}, {'lat': 48.1378847, 'lon': 16.5662995}, {'lat': 48.1385371, 'lon': 16.5637843}, {'lat': 48.1396216, 'lon': 16.5593919}, {'lat': 48.1408262, 'lon': 16.5537786}, {'lat': 48.1411992, 'lon': 16.551886}, {'lat': 48.1414258, 'lon': 16.5507263}, {'lat': 48.1419225, 'lon': 16.5483286}, {'lat': 48.1425233, 'lon': 16.5456932}, {'lat': 48.1429959, 'lon': 16.5439417}, {'lat': 48.1433293, 'lon': 16.5430083}, {'lat': 48.1441537, 'lon': 16.5411361}, {'lat': 48.1446226, 'lon': 16.5402356}, {'lat': 48.1462125, 'lon': 16.5378527}, {'lat': 48.1468034, 'lon': 16.5370034}, {'lat': 48.1504365, 'lon': 16.5316952}, {'lat': 48.1510847, 'lon': 16.5307426}, {'lat': 48.15288, 'lon': 16.528242}, {'lat': 48.1539418, 'lon': 16.5265423}, {'lat': 48.1544619, 'lon': 16.5256693}, {'lat': 48.1552311, 'lon': 16.5240716}, {'lat': 48.1564252, 'lon': 16.5214307}, {'lat': 48.1582696, 'lon': 16.516991}, {'lat': 48.1584543, 'lon': 16.5165442}]}
ID: 1991435 Name: Döbling Members: 51 DirectGeom: 0
First member: {'type': 'way', 'ref': 33288721, 'role': 'outer', 'geometry': [{'lat': 48.258323, 'lon': 16.2855585}, {'lat': 48.258404, 'lon': 16.2857573}]}
ID: 1991436 Name: Favoriten Members: 43 DirectGeom: 0
First member: {'type': 'way', 'ref': 150887614, 'role': 'outer', 'geometry': [{'lat': 48.1592699, 'lon': 16.3356406}, {'lat': 48.1572973, 'lon': 16.336893}, {'lat': 48.1552024, 'lon': 16.3380785}, {'lat': 48.1548257, 'lon': 16.338353}, {'lat': 48.1544862, 'lon': 16.3386555}, {'lat': 48.1542634, 'lon': 16.3389388}, {'lat': 48.1538703, 'lon': 16.3395053}, {'lat': 48.1536476, 'lon': 16.3399616}, {'lat': 48.1532253, 'lon': 16.3411401}, {'lat': 48.1531226, 'lon': 16.3415265}, {'lat': 48.1528648, 'lon': 16.3428116}, {'lat': 48.1524951, 'lon': 16.344936}, {'lat': 48.1520483, 'lon': 16.3475885}, {'lat': 48.1520005, 'lon': 16.3481934}, {'lat': 48.1518978, 'lon': 16.3481812}, {'lat': 48.1517777, 'lon': 16.3484592}, {'lat': 48.15178, 'lon': 16.3487669}, {'lat': 48.151899, 'lon': 16.3491551}, {'lat': 48.1518873, 'lon': 16.3503755}, {'lat': 48.1518103, 'lon': 16.3521485}, {'lat': 48.1516027, 'lon': 16.3591214}, {'lat': 48.151521, 'lon': 16.3600306}, {'lat': 48.1513951, 'lon': 16.3609328}, {'lat': 48.1511991, 'lon': 16.361905}, {'lat': 48.1498439, 'lon': 16.3674165}, {'lat': 48.1497666, 'lon': 16.3677309}, {'lat': 48.1491698, 'lon': 16.3700774}, {'lat': 48.1481598, 'lon': 16.3740486}, {'lat': 48.1481123, 'lon': 16.374533}, {'lat': 48.1479138, 'lon': 16.375462}]}
ID: 1991437 Name: Floridsdorf Members: 38 DirectGeom: 0
First member: {'type': 'way', 'ref': 147463464, 'role': 'outer', 'geometry': [{'lat': 48.2363479, 'lon': 16.398134}, {'lat': 48.2397528, 'lon': 16.3942444}, {'lat': 48.2432219, 'lon': 16.390506}, {'lat': 48.248205, 'lon': 16.3854004}, {'lat': 48.2487412, 'lon': 16.3848614}, {'lat': 48.2497632, 'lon': 16.3838591}, {'lat': 48.2507939, 'lon': 16.382872}, {'lat': 48.2530063, 'lon': 16.3807789}, {'lat': 48.2552352, 'lon': 16.3787118}, {'lat': 48.258047, 'lon': 16.3763322}, {'lat': 48.2627227, 'lon': 16.3724028}]}
ID: 1991438 Name: Hernals Members: 81 DirectGeom: 0
First member: {'type': 'way', 'ref': 171978231, 'role': 'outer', 'geometry': [{'lat': 48.2203592, 'lon': 16.336262}, {'lat': 48.2207485, 'lon': 16.3366029}]}
ID: 1991439 Name: Liesing Members: 69 DirectGeom: 0
First member: {'type': 'way', 'ref': 59178603, 'role': 'outer', 'geometry': [{'lat': 48.1280684, 'lon': 16.2180992}, {'lat': 48.1282256, 'lon': 16.2192254}, {'lat': 48.1283086, 'lon': 16.2202416}, {'lat': 48.1288903, 'lon': 16.2220582}, {'lat': 48.129278, 'lon': 16.2245445}, {'lat': 48.129382, 'lon': 16.2258565}, {'lat': 48.1295722, 'lon': 16.226857}, {'lat': 48.1300545, 'lon': 16.22893}, {'lat': 48.1305758, 'lon': 16.2306526}, {'lat': 48.1310089, 'lon': 16.2321222}, {'lat': 48.1312093, 'lon': 16.2333424}, {'lat': 48.1313975, 'lon': 16.2345602}, {'lat': 48.1314508, 'lon': 16.2349064}, {'lat': 48.1315535, 'lon': 16.2354826}, {'lat': 48.1317118, 'lon': 16.2363796}, {'lat': 48.1319074, 'lon': 16.2374903}, {'lat': 48.1319871, 'lon': 16.2381161}, {'lat': 48.1320488, 'lon': 16.2386167}]}
ID: 1991440 Name: Margareten Members: 22 DirectGeom: 0
First member: {'type': 'way', 'ref': 224800031, 'role': 'outer', 'geometry': [{'lat': 48.1969247, 'lon': 16.3591932}, {'lat': 48.1967114, 'lon': 16.359371}]}
ID: 1991441 Name: Ottakring Members: 76 DirectGeom: 0
First member: {'type': 'way', 'ref': 2240370, 'role': 'outer', 'geometry': [{'lat': 48.2044903, 'lon': 16.3058609}, {'lat': 48.2042684, 'lon': 16.3069809}, {'lat': 48.2041796, 'lon': 16.3073456}, {'lat': 48.2039982, 'lon': 16.3079627}, {'lat': 48.2039812, 'lon': 16.308047}, {'lat': 48.2039768, 'lon': 16.3080691}]}
ID: 1991442 Name: Simmering Members: 43 DirectGeom: 0
First member: {'type': 'way', 'ref': 150886041, 'role': 'outer', 'geometry': [{'lat': 48.1662737, 'lon': 16.497085}, {'lat': 48.1650897, 'lon': 16.4952282}, {'lat': 48.1653265, 'lon': 16.4946641}]}
ID: 1991443 Name: Wieden Members: 30 DirectGeom: 0
First member: {'type': 'way', 'ref': 4786823, 'role': 'outer', 'geometry': [{'lat': 48.1921601, 'lon': 16.3640142}, {'lat': 48.1921999, 'lon': 16.3639811}, {'lat': 48.1924288, 'lon': 16.3636803}, {'lat': 48.1928619, 'lon': 16.3631186}, {'lat': 48.193213, 'lon': 16.3626594}]}
ID: 1990598 Name: Penzing Members: 45 DirectGeom: 0
First member: {'type': 'way', 'ref': 147399476, 'role': 'outer', 'geometry': [{'lat': 48.1871851, 'lon': 16.3142055}, {'lat': 48.187338, 'lon': 16.3133873}, {'lat': 48.1873759, 'lon': 16.3130943}, {'lat': 48.1874081, 'lon': 16.312593}, {'lat': 48.1874384, 'lon': 16.3119141}, {'lat': 48.1875601, 'lon': 16.3083038}, {'lat': 48.1875627, 'lon': 16.3082256}, {'lat': 48.1876298, 'lon': 16.3051883}, {'lat': 48.1876772, 'lon': 16.3046555}, {'lat': 48.1878075, 'lon': 16.3038295}, {'lat': 48.187876, 'lon': 16.3034966}]}
ID: 1990599 Name: Rudolfsheim-Fünfhaus Members: 16 DirectGeom: 0
First member: {'type': 'way', 'ref': 146678755, 'role': 'outer', 'geometry': [{'lat': 48.1958746, 'lon': 16.3392406}, {'lat': 48.1958469, 'lon': 16.3391615}, {'lat': 48.19555, 'lon': 16.3392303}, {'lat': 48.1954004, 'lon': 16.3392629}, {'lat': 48.1953208, 'lon': 16.339287}, {'lat': 48.1952484, 'lon': 16.339316}, {'lat': 48.1951518, 'lon': 16.3393908}, {'lat': 48.1950698, 'lon': 16.3394005}, {'lat': 48.194962, 'lon': 16.3393956}, {'lat': 48.1948574, 'lon': 16.3393751}, {'lat': 48.1947568, 'lon': 16.3393401}, {'lat': 48.1946341, 'lon': 16.3392794}, {'lat': 48.1942966, 'lon': 16.3390906}, {'lat': 48.194268, 'lon': 16.3391907}, {'lat': 48.1928753, 'lon': 16.3384075}, {'lat': 48.1928577, 'lon': 16.3383888}, {'lat': 48.1920913, 'lon': 16.3379918}, {'lat': 48.1918707, 'lon': 16.3377896}, {'lat': 48.1913521, 'lon': 16.3374884}, {'lat': 48.1913283, 'lon': 16.3375077}, {'lat': 48.1911018, 'lon': 16.337439}, {'lat': 48.1910155, 'lon': 16.3374217}, {'lat': 48.1910001, 'lon': 16.3374196}, {'lat': 48.190872, 'lon': 16.337414}, {'lat': 48.1908655, 'lon': 16.3373282}, {'lat': 48.1905829, 'lon': 16.3373684}, {'lat': 48.1905835, 'lon': 16.3373764}, {'lat': 48.190593, 'lon': 16.3375004}, {'lat': 48.1904053, 'lon': 16.3375751}]}
ID: 1990590 Name: Alsergrund Members: 9 DirectGeom: 0
First member: {'type': 'way', 'ref': 147468285, 'role': 'outer', 'geometry': [{'lat': 48.2151672, 'lon': 16.3408703}, {'lat': 48.2152918, 'lon': 16.3409357}, {'lat': 48.2157703, 'lon': 16.3412026}, {'lat': 48.2158543, 'lon': 16.3412458}, {'lat': 48.2162683, 'lon': 16.3414721}, {'lat': 48.2163349, 'lon': 16.3415085}, {'lat': 48.216373, 'lon': 16.3415294}, {'lat': 48.2164779, 'lon': 16.3415867}, {'lat': 48.2164945, 'lon': 16.3415958}, {'lat': 48.2164966, 'lon': 16.3415865}, {'lat': 48.2164976, 'lon': 16.3415823}, {'lat': 48.2165076, 'lon': 16.3415404}, {'lat': 48.2165373, 'lon': 16.3415563}, {'lat': 48.2165399, 'lon': 16.3415577}, {'lat': 48.2166236, 'lon': 16.3416022}, {'lat': 48.2166264, 'lon': 16.3416037}, {'lat': 48.216638, 'lon': 16.341553}, {'lat': 48.2166664, 'lon': 16.3415681}, {'lat': 48.2166687, 'lon': 16.3415693}, {'lat': 48.2168178, 'lon': 16.341649}, {'lat': 48.216846, 'lon': 16.3416641}, {'lat': 48.2168729, 'lon': 16.3416788}, {'lat': 48.2170185, 'lon': 16.3417586}, {'lat': 48.2170469, 'lon': 16.3417738}, {'lat': 48.217035, 'lon': 16.3418242}, {'lat': 48.2170374, 'lon': 16.3418255}, {'lat': 48.2171084, 'lon': 16.3418648}, {'lat': 48.2171167, 'lon': 16.3418696}, {'lat': 48.2171192, 'lon': 16.3418711}, {'lat': 48.217149, 'lon': 16.3418868}, {'lat': 48.2171389, 'lon': 16.3419295}, {'lat': 48.2172473, 'lon': 16.3419883}]}
ID: 1990591 Name: Hietzing Members: 50 DirectGeom: 0
First member: {'type': 'way', 'ref': 147399476, 'role': 'outer', 'geometry': [{'lat': 48.1871851, 'lon': 16.3142055}, {'lat': 48.187338, 'lon': 16.3133873}, {'lat': 48.1873759, 'lon': 16.3130943}, {'lat': 48.1874081, 'lon': 16.312593}, {'lat': 48.1874384, 'lon': 16.3119141}, {'lat': 48.1875601, 'lon': 16.3083038}, {'lat': 48.1875627, 'lon': 16.3082256}, {'lat': 48.1876298, 'lon': 16.3051883}, {'lat': 48.1876772, 'lon': 16.3046555}, {'lat': 48.1878075, 'lon': 16.3038295}, {'lat': 48.187876, 'lon': 16.3034966}]}
ID: 1990592 Name: Innere Stadt Members: 8 DirectGeom: 0
First member: {'type': 'way', 'ref': 147416055, 'role': 'outer', 'geometry': [{'lat': 48.1999528, 'lon': 16.375215}, {'lat': 48.2003186, 'lon': 16.3766526}, {'lat': 48.2022508, 'lon': 16.3786346}, {'lat': 48.2025986, 'lon': 16.3789387}, {'lat': 48.2027104, 'lon': 16.3790775}, {'lat': 48.2027086, 'lon': 16.3790848}, {'lat': 48.2027676, 'lon': 16.3791543}, {'lat': 48.2028858, 'lon': 16.3792937}, {'lat': 48.2028837, 'lon': 16.3792984}, {'lat': 48.2035648, 'lon': 16.3801058}, {'lat': 48.203908, 'lon': 16.3805108}, {'lat': 48.2039675, 'lon': 16.3805805}, {'lat': 48.2040984, 'lon': 16.3807233}, {'lat': 48.2042177, 'lon': 16.38084}, {'lat': 48.2043536, 'lon': 16.3809627}, {'lat': 48.2044461, 'lon': 16.3810398}, {'lat': 48.2045462, 'lon': 16.3811176}, {'lat': 48.2055767, 'lon': 16.3818713}, {'lat': 48.2056211, 'lon': 16.3819055}, {'lat': 48.2056687, 'lon': 16.3819386}, {'lat': 48.2059449, 'lon': 16.3821406}, {'lat': 48.2060209, 'lon': 16.3821938}, {'lat': 48.2065032, 'lon': 16.3825427}, {'lat': 48.2065486, 'lon': 16.3825756}, {'lat': 48.2065719, 'lon': 16.3825925}, {'lat': 48.206633, 'lon': 16.3826359}, {'lat': 48.2066776, 'lon': 16.3826679}, {'lat': 48.2067075, 'lon': 16.3826893}, {'lat': 48.2067448, 'lon': 16.3827161}, {'lat': 48.2078962, 'lon': 16.3835562}, {'lat': 48.2082506, 'lon': 16.3838153}, {'lat': 48.2082634, 'lon': 16.3838247}, {'lat': 48.2083364, 'lon': 16.3838781}, {'lat': 48.2084055, 'lon': 16.383929}, {'lat': 48.208422, 'lon': 16.3839411}, {'lat': 48.2088901, 'lon': 16.3842917}, {'lat': 48.2089682, 'lon': 16.3843425}, {'lat': 48.2090776, 'lon': 16.3844083}, {'lat': 48.2092018, 'lon': 16.3844647}, {'lat': 48.2092894, 'lon': 16.3845}, {'lat': 48.2093967, 'lon': 16.3845268}, {'lat': 48.2095085, 'lon': 16.3845481}, {'lat': 48.2095598, 'lon': 16.3845583}, {'lat': 48.2096161, 'lon': 16.3845694}, {'lat': 48.2096694, 'lon': 16.38458}, {'lat': 48.2097543, 'lon': 16.3845984}, {'lat': 48.2097875, 'lon': 16.3846049}, {'lat': 48.2098191, 'lon': 16.3846105}, {'lat': 48.2102605, 'lon': 16.3847006}, {'lat': 48.2106439, 'lon': 16.3847736}, {'lat': 48.2110657, 'lon': 16.3848544}, {'lat': 48.2110669, 'lon': 16.3848407}, {'lat': 48.2110849, 'lon': 16.3848448}, {'lat': 48.2111131, 'lon': 16.3848508}, {'lat': 48.2111271, 'lon': 16.3848533}, {'lat': 48.2111398, 'lon': 16.3848556}, {'lat': 48.2111519, 'lon': 16.3848579}, {'lat': 48.2112114, 'lon': 16.384867}, {'lat': 48.211279, 'lon': 16.3848801}, {'lat': 48.2113281, 'lon': 16.3848893}, {'lat': 48.2113547, 'lon': 16.3848946}]}
ID: 1990593 Name: Josefstadt Members: 16 DirectGeom: 0
First member: {'type': 'way', 'ref': 8046654, 'role': 'outer', 'geometry': [{'lat': 48.2151672, 'lon': 16.3408703}, {'lat': 48.2152357, 'lon': 16.3433909}, {'lat': 48.2152535, 'lon': 16.3440565}, {'lat': 48.2152992, 'lon': 16.3454822}, {'lat': 48.2153196, 'lon': 16.345681}, {'lat': 48.2153636, 'lon': 16.3460314}, {'lat': 48.2153624, 'lon': 16.346172}, {'lat': 48.2153545, 'lon': 16.3463176}, {'lat': 48.215325, 'lon': 16.3468029}, {'lat': 48.2152562, 'lon': 16.3478885}, {'lat': 48.2152464, 'lon': 16.3480388}, {'lat': 48.2151881, 'lon': 16.3488328}, {'lat': 48.2151523, 'lon': 16.3491487}, {'lat': 48.2150847, 'lon': 16.3498296}, {'lat': 48.2150439, 'lon': 16.3502281}, {'lat': 48.2150152, 'lon': 16.3505833}, {'lat': 48.2149686, 'lon': 16.350993}, {'lat': 48.2149237, 'lon': 16.3518183}, {'lat': 48.2148801, 'lon': 16.3524862}, {'lat': 48.2148175, 'lon': 16.3534121}, {'lat': 48.2147892, 'lon': 16.3539447}, {'lat': 48.2147857, 'lon': 16.3542231}, {'lat': 48.2147689, 'lon': 16.3547355}, {'lat': 48.2147497, 'lon': 16.3555032}, {'lat': 48.2147405, 'lon': 16.3557842}, {'lat': 48.2147394, 'lon': 16.3558184}, {'lat': 48.2146161, 'lon': 16.3558154}, {'lat': 48.2143408, 'lon': 16.3565102}]}
ID: 1990600 Name: Währing Members: 51 DirectGeom: 0
First member: {'type': 'way', 'ref': 205747568, 'role': 'outer', 'geometry': [{'lat': 48.2172473, 'lon': 16.3419883}, {'lat': 48.2176511, 'lon': 16.3393211}, {'lat': 48.2176995, 'lon': 16.3388331}, {'lat': 48.2180192, 'lon': 16.3389361}, {'lat': 48.2183103, 'lon': 16.3390239}, {'lat': 48.2189867, 'lon': 16.33933}, {'lat': 48.2196601, 'lon': 16.339645}]}
ID: 1991416 Name: Landstraße Members: 48 DirectGeom: 0
First member: {'type': 'way', 'ref': 147444470, 'role': 'outer', 'geometry': [{'lat': 48.2113547, 'lon': 16.3848946}, {'lat': 48.2114232, 'lon': 16.3849036}, {'lat': 48.2114536, 'lon': 16.3849157}, {'lat': 48.2114786, 'lon': 16.3849313}, {'lat': 48.2115187, 'lon': 16.3849694}, {'lat': 48.2115334, 'lon': 16.3849941}, {'lat': 48.2117766, 'lon': 16.3850638}, {'lat': 48.2118472, 'lon': 16.3850918}, {'lat': 48.2119177, 'lon': 16.3851492}, {'lat': 48.211994, 'lon': 16.3852455}, {'lat': 48.2120648, 'lon': 16.3854257}, {'lat': 48.2122929, 'lon': 16.3860319}, {'lat': 48.2124664, 'lon': 16.3864832}, {'lat': 48.2125735, 'lon': 16.3865617}, {'lat': 48.2126356, 'lon': 16.3866406}, {'lat': 48.2126365, 'lon': 16.3866417}, {'lat': 48.2127026, 'lon': 16.3868043}, {'lat': 48.2128421, 'lon': 16.3871675}, {'lat': 48.212917, 'lon': 16.3874383}, {'lat': 48.2130384, 'lon': 16.3879236}, {'lat': 48.2131175, 'lon': 16.3883691}, {'lat': 48.2131612, 'lon': 16.3888027}, {'lat': 48.2131825, 'lon': 16.3892349}, {'lat': 48.213176, 'lon': 16.389647}, {'lat': 48.2131492, 'lon': 16.3900579}, {'lat': 48.2131111, 'lon': 16.3903953}, {'lat': 48.213035, 'lon': 16.3908259}, {'lat': 48.2128909, 'lon': 16.3915326}, {'lat': 48.2126943, 'lon': 16.3923033}, {'lat': 48.2123726, 'lon': 16.3931358}, {'lat': 48.2120635, 'lon': 16.3937209}, {'lat': 48.2116641, 'lon': 16.3942561}, {'lat': 48.2112488, 'lon': 16.3946581}, {'lat': 48.2097274, 'lon': 16.3956534}, {'lat': 48.2087094, 'lon': 16.3961994}, {'lat': 48.2078611, 'lon': 16.3965158}, {'lat': 48.2050464, 'lon': 16.397082}, {'lat': 48.2044186, 'lon': 16.3973936}, {'lat': 48.2038574, 'lon': 16.3978408}, {'lat': 48.2034024, 'lon': 16.3983736}, {'lat': 48.2030885, 'lon': 16.398797}, {'lat': 48.2027366, 'lon': 16.3994891}, {'lat': 48.202475, 'lon': 16.4003359}, {'lat': 48.201825, 'lon': 16.4039157}, {'lat': 48.2015825, 'lon': 16.404979}, {'lat': 48.2012733, 'lon': 16.4058995}, {'lat': 48.2008865, 'lon': 16.4066582}, {'lat': 48.2003903, 'lon': 16.4073552}, {'lat': 48.1998829, 'lon': 16.4078547}, {'lat': 48.1989871, 'lon': 16.4085445}, {'lat': 48.1972113, 'lon': 16.4097751}, {'lat': 48.1968503, 'lon': 16.4100661}, {'lat': 48.1965465, 'lon': 16.4103604}, {'lat': 48.1961194, 'lon': 16.410865}, {'lat': 48.195365, 'lon': 16.4118817}, {'lat': 48.1943787, 'lon': 16.4134445}, {'lat': 48.1898533, 'lon': 16.4211488}, {'lat': 48.1893474, 'lon': 16.4220931}, {'lat': 48.1881117, 'lon': 16.4251003}, {'lat': 48.1866113, 'lon': 16.4301686}]}
36:22
Now I have the current district polygons. The next step is to get the 2014-01-01 versions. Overpass supports historic data queries using `[date:"2014-01-01T00:00:00Z"]`. Let me fetch those.
36:22
36:22
Bash
cat << 'EOF' > /work/fetch_2014.py
cat << 'EOF' > /work/fetch_2014.py
import httpx
import json
import time
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
# Process in batches of 2 because historic queries are slower
batch_size = 2
for i in range(0, len(rel_ids), batch_size):
batch = rel_ids[i:i+batch_size]
rel_statements = ';\n '.join([f"rel({rid})" for rid in batch])
q = f'''[out:json][date:"2014-01-01T00:00:00Z"];
(
{rel_statements};
);
out geom;
'''
print(f"Fetching batch {i//batch_size + 1}: IDs {batch}...")
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=180)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
all_elements.extend(data['elements'])
for e in data['elements']:
tags = e.get('tags', {})
print(f" OK: {tags.get('name','?')} (ref={tags.get('ref','?')})")
else:
print(f" Error {resp.status_code}: {resp.text[:200]}")
except Exception as e:
print(f" Exception: {e}")
time.sleep(3)
print(f"\nTotal elements from 2014: {len(all_elements)}")
# Save the 2014 data
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_2014.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
46:30
The historic query is timing out. Let me try a different approach - maybe fetching the 2014 data individually per district, but with fewer attempts. Or I can try to use a different Overpass endpoint. Let me also think about this differently - perhaps I can use the Overpass API's "date" feature but with a simpler approach. Actually, the issue might be that the old Overpass kumi.systems server doesn't support the date parameter or has limited historic data. Let me try another approach - use the main overpass-api.de but with a proper Accept header, or use a different endpoint. Actually wait - let me see if the 406 error from overpass-api.de can be circumvented.
46:30
Bash
cat << 'EOF' > /work/test_overpass.py
cat << 'EOF' > /work/test_overpass.py
import requests
# Try with explicit content-type and no Accept header issues
resp = requests.post(
'https://overpass-api.de/api/interpreter',
data={'data': '[out:json];node(48.0,16.3,48.2,16.4);out 1;'},
headers={
'Content-Type': 'application/x-www-form-urlencoded',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'
}
)
print('Status:', resp.status_code)
print('Headers:', dict(resp.headers))
print('Text:', resp.text[:500])
EOF
timeout 30 python3 /work/test_overpass.py
Status: 406
Headers: {'Date': 'Thu, 18 Jun 2026 04:43:45 GMT', 'Server': 'Apache/2.4.67 (Debian)', 'Content-Length': '371', 'Connection': 'close', 'Content-Type': 'text/html; charset=iso-8859-1'}
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
46:39
The overpass-api.de endpoint is returning 406 for all requests. Let me try a different approach - maybe the kumi.systems endpoint works for individual queries. Let me try fetching the 2014 data one at a time.
46:39
Bash
cat << 'EOF' > /work/fetch_2014_one.py
cat << 'EOF' > /work/fetch_2014_one.py
import json
import time
import httpx
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
for rid in rel_ids:
q = f'[out:json][date:"2014-01-01T00:00:00Z"];rel({rid});out geom;'
print(f"Fetching {rid}...", end=" ")
for attempt in range(5):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=180)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
e = data['elements'][0]
tags = e.get('tags', {})
geom_count = len(e.get('geometry', []))
members = e.get('members', [])
member_geom_count = sum(len(m.get('geometry',[])) for m in members)
print(f"OK: {tags.get('name','?')} (geom:{geom_count}, members:{len(members)}, member_geom:{member_geom_count})")
all_elements.append(e)
else:
print(f"Empty result")
break
else:
print(f"Attempt {attempt+1}: Error {resp.status_code}")
time.sleep(2)
except Exception as e:
print(f"Attempt {attempt+1}: {e}")
time.sleep(2)
time.sleep(1)
print(f"\nTotal: {len(all_elements)} elements")
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_2014_one.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
56:50
The historic queries are timing out. Let me try a different approach - maybe I can use the Overpass API's "date" feature more efficiently. Or maybe I should try using the `[date:]` syntax which might work differently. Let me first check what's available. Actually, the problem might be that the kumi server doesn't have historical data, or it's just very slow. Let me try a more focused approach - maybe I can get the 2014 data for just a few relations at a time, with longer timeouts. Wait, actually the current data fetch took a lot of time too but eventually worked. The historic query is definitely slower. Let me try with just individual queries and even longer timeouts, maybe also try a different server. Actually, let me think about an alternative approach. Instead of Overpass historical data, I could: 1. Use the OSM API to get the relation versions at specific dates 2. Use Overpass with a more efficient query Let me try the metawiki (main OSM API) - I can get the relation in a specific version. Actually, let me try a different approach - use the `[date:]` feature but with a simpler query that just gets the relation IDs first, then fetches geometries individually with longer timeouts. Let me also try the main overpass-api.de endpoint - maybe the 406 issue was about the User-Agent, not Accept.
56:50
Bash
cat << 'EOF' > /work/test_urllib.py
cat << 'EOF' > /work/test_urllib.py
import urllib.request, urllib.parse
data = urllib.parse.urlencode({'data': '[out:json];node(1);out;'}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('User-Agent', 'Python-urllib/3.14')
try:
resp = urllib.request.urlopen(req, timeout=30)
print('Status:', resp.status)
print('Text:', resp.read().decode()[:500])
except urllib.error.HTTPError as e:
print('HTTPError:', e.code, e.reason)
print('Body:', e.read().decode()[:500])
except Exception as e:
print('Error:', e)
EOF
timeout 60 python3 /work/test_urllib.py
HTTPError: 406 Not Acceptable Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head> <title>406 Not Acceptable</title> </head><body> <h1>Not Acceptable</h1> <p>An appropriate representation of the requested resource could not be found on this server.</p> <hr> <address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address> </body></html>
57:01
The overpass-api.de is returning 406 for everything, but overpass.kumi.systems works (though it times out for large queries). Let me try using the historical API with smaller batches.
57:01
Bash
cat << 'EOF' > /work/fetch_2014_small.py
cat << 'EOF' > /work/fetch_2014_small.py
import json
import time
import httpx
# Let's try one at a time but with shorter timeouts and faster retries
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
for idx, rid in enumerate(rel_ids):
q = f'[out:json][date:"2014-01-01T00:00:00Z"];rel({rid});out geom;'
print(f"[{idx+1}/{len(rel_ids)}] Fetching {rid}...", end=" ", flush=True)
success = False
for attempt in range(5):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=120)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
e = data['elements'][0]
tags = e.get('tags', {})
print(f"OK: {tags.get('name','?')} (members:{len(e.get('members',[]))})")
all_elements.append(e)
success = True
else:
print("Empty")
success = True
break
elif resp.status_code == 504:
print(f"504", end=" ", flush=True)
time.sleep(2)
else:
print(f"E{resp.status_code}", end=" ", flush=True)
time.sleep(2)
except httpx.ReadTimeout:
print("T", end=" ", flush=True)
time.sleep(2)
except Exception as e:
print(f"X", end=" ", flush=True)
time.sleep(2)
if not success:
print("FAILED")
else:
print()
time.sleep(0.5)
print(f"\nTotal: {len(all_elements)} elements")
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_2014_small.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
67:06
The Overpass Kumi API is timing out repeatedly with historic queries. Let me try using a different approach - maybe duckdb to find local data, or try to use main Overpass API. Actually, let me check what's in the local file system first - maybe there's some Vienna data already available.
67:06
Bash
ls -la /work/
total 644 drwxr-xr-x 1 agent agent 728 Jun 18 04:54 . drwxr-xr-x 1 root root 22 Jun 18 03:57 .. -rw-r--r-- 1 agent agent 505953 Jun 18 04:33 districts_current_geom.json -rw-r--r-- 1 agent agent 59331 Jun 18 04:08 districts_current_ids.json -rw-r--r-- 1 agent agent 323 Jun 18 03:57 explore.py -rw-r--r-- 1 agent agent 315 Jun 18 03:57 explore2.py -rw-r--r-- 1 agent agent 305 Jun 18 03:57 explore3.py -rw-r--r-- 1 agent agent 398 Jun 18 03:57 explore4.py -rw-r--r-- 1 agent agent 335 Jun 18 03:57 explore5.py -rw-r--r-- 1 agent agent 387 Jun 18 03:58 explore6.py -rw-r--r-- 1 agent agent 498 Jun 18 03:58 explore7.py -rw-r--r-- 1 agent agent 339 Jun 18 03:59 fetch1.py -rw-r--r-- 1 agent agent 1485 Jun 18 04:33 fetch_2014.py -rw-r--r-- 1 agent agent 1661 Jun 18 04:43 fetch_2014_one.py -rw-r--r-- 1 agent agent 1918 Jun 18 04:54 fetch_2014_small.py -rw-r--r-- 1 agent agent 1383 Jun 18 04:22 fetch_batch.py -rw-r--r-- 1 agent agent 411 Jun 18 04:04 fetch_districts.py -rw-r--r-- 1 agent agent 951 Jun 18 04:05 fetch_districts2.py -rw-r--r-- 1 agent agent 735 Jun 18 04:06 fetch_districts3.py -rw-r--r-- 1 agent agent 1075 Jun 18 04:11 fetch_districts_geom.py -rw-r--r-- 1 agent agent 507 Jun 18 04:21 fetch_fast.py -rw-r--r-- 1 agent agent 1270 Jun 18 04:11 fetch_one.py -rw-r--r-- 1 agent agent 1382 Jun 18 04:23 fetch_retry.py -rw-r--r-- 1 agent agent 1474 Jun 18 04:30 fetch_retry2.py -rw-r--r-- 1 agent agent 456 Jun 18 04:43 test_overpass.py -rw-r--r-- 1 agent agent 551 Jun 18 04:54 test_urllib.py
67:21
The Overpass historic queries are timing out. Let me try an alternative approach - I can use the Overpass API's diff feature, or I can use a different approach entirely. Let me see if I can use the OSM history dump or use a different technique. Actually, let me think about this differently. The task says "their 2014-01-01 counterparts" - maybe I can use the Overpass "date" parameter but with a smaller query. Let me try fetching just one district from 2014 to test.
67:21
Bash
cat << 'EOF' > /work/fetch_one_2014.py
cat << 'EOF' > /work/fetch_one_2014.py
import httpx
import json
rid = 1990590
q = f'[out:json][date:"2014-01-01T00:00:00Z"];rel({rid});out geom;'
print(f"Query: {q[:100]}...")
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements',[])))
if data.get('elements'):
e = data['elements'][0]
print('Tags:', json.dumps(e.get('tags',{}),indent=2)[:300])
print('Members:', len(e.get('members',[])))
print('Geometry:', len(e.get('geometry',[])))
else:
print('Error:', resp.text[:500])
EOF
timeout 300 python3 /work/fetch_one_2014.py
Query: [out:json][date:"2014-01-01T00:00:00Z"];rel(1990590);out geom;...
Status: 504
Error: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
<title>OSM3S Response</title>
</head>
<body>
<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
70:52
The Overpass API with `[date:]` parameter seems to be timing out. Let me try a different approach - maybe I can use the Overpass API without the date filter but look for the historic versions of the relations. Or maybe I can download the data from a different source. Actually, let me try with a different Overpass instance that might have the attic data enabled. The main overpass-api.de server might have it, but it's returning 406. Let me check if there's an Austrian instance. Actually, a better approach: Let me try to use the OSM diff or just work with what I have. Another idea: I could use `overpass-api.de` with the `Accept` header issue solved by using `https://overpass-api.de/api/interpreter?` with GET instead of POST... but that would be too long. Actually, the kumi.systems instance supports attic queries but might need the attic database enabled. Let me check if we can query with `[date:]` on kumi. Wait - looking at the 504 error, it's a timeout issue. The kumi instance is just busy/overloaded. Let me try the main overpass-api.de with a proper content-type. The issue might be that the requests library sends an Accept header that the server doesn't like. Let me try with `httpx` on the main overpass-api.de with proper headers.
70:52
Bash
cat << 'EOF' > /work/fetch_2014_v2.py
cat << 'EOF' > /work/fetch_2014_v2.py
import httpx
import json
# Try overpass-api.de with explicit headers
q = '[out:json];node(1);out;'
headers = {
'Content-Type': 'application/x-www-form-urlencoded',
'Accept': 'application/json,*/*',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'
}
resp = httpx.post('https://overpass-api.de/api/interpreter',
data={'data': q},
headers=headers,
timeout=30)
print('Status:', resp.status_code, resp.reason_phrase)
print('Headers:', dict(resp.headers))
print('Text:', resp.text[:500])
EOF
timeout 60 python3 /work/fetch_2014_v2.py
Status: 406 Not Acceptable
Headers: {'date': 'Thu, 18 Jun 2026 05:08:06 GMT', 'server': 'Apache/2.4.67 (Debian)', 'content-length': '371', 'connection': 'close', 'content-type': 'text/html; charset=iso-8859-1'}
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
71:03
The overpass-api.de server is returning 406 for all requests regardless. Let me check if there's some way to get around this. But kumi.systems works for current data. The issue is that historic queries with `[date:...]` seem to cause timeouts on kumi. Let me try a different approach - maybe I can download the data from OSM or use a different strategy. Actually, let me check if there are alternative Overpass endpoints. Also, maybe I can try querying the data through OSM's XAPI or download dumps from Geofabrik or other sources. Wait, actually let me check if the kumi endpoint just needs the query to be simpler. The 504 timeout might be because the historic data is in a different database. Let me try the dates differently.
71:03
Bash
cat << 'EOF' > /work/test_kumi.py
cat << 'EOF' > /work/test_kumi.py
import httpx
import json
# Test: a simple historic query on kumi
q = '[out:json][date:"2014-01-01T00:00:00Z"];node(48.0,16.3,48.2,16.4);out 5;'
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=120)
print('Status:', resp.status_code)
if resp.status_code == 200:
data = resp.json()
print('Elements:', len(data.get('elements',[])))
for e in data['elements'][:3]:
print(f" {e['type']} {e['id']} tags={e.get('tags',{})}")
else:
print('Error:', resp.text[:500])
EOF
timeout 180 python3 /work/test_kumi.py
Status: 200
Elements: 5
node 377970 tags={'crossing': 'traffic_signals', 'highway': 'traffic_signals'}
node 377988 tags={}
node 377989 tags={'highway': 'traffic_signals'}
71:59
Historic queries work on kumi.systems! The previous timeout was because of too many retries. Let me try a simpler approach - fetch one historic relation at a time.
71:59
Bash
cat << 'EOF' > /work/fetch_2014_v3.py
cat << 'EOF' > /work/fetch_2014_v3.py
import httpx
import json
import time
import sys
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
all_elements = []
for idx, rid in enumerate(rel_ids):
q = f'[out:json][date:"2014-01-01T00:00:00Z"];rel({rid});out geom;'
print(f"[{idx+1}/{len(rel_ids)}] Fetching {rid}...", end=" ", flush=True)
for attempt in range(3):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
e = data['elements'][0]
tags = e.get('tags', {})
print(f"OK: {tags.get('name','?')} (members:{len(e.get('members',[]))})")
all_elements.append(e)
else:
print("Empty result")
break
elif resp.status_code == 504:
print("504", end=" ", flush=True)
time.sleep(5)
else:
print(f"E{resp.status_code}", end=" ", flush=True)
time.sleep(3)
except httpx.TimeoutException:
print("T", end=" ", flush=True)
time.sleep(5)
except Exception as e:
print(f"X({e})", end=" ", flush=True)
time.sleep(3)
else:
print("FAILED")
sys.stdout.flush()
time.sleep(1)
print(f"\nTotal: {len(all_elements)} elements")
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 720 python3 /work/fetch_2014_v3.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
82:10
The historic queries are timing out. Let me try a different approach - maybe I can use the Overpass API with a simpler query format, or use a different endpoint. Let me also check if there's a way to get the data more efficiently. Let me try to query all at once without historic, since the historic approach on kumi seems to be timing out. Maybe I can use a snapshot approach. Actually, let me think about this differently. The task asks about 2014-01-01 versions. But Overpass historic data might be slow. Let me try using `[date:"2014-01-01T00:00:00Z"]` but with a more compact query format. Actually, looking at the errors - 504 means the server is timing out. The historic data query takes too long. Let me try a different approach: use the main overpass-api.de endpoint which might have the historic data cached better. But it's returning 406... Wait, the 406 error from overpass-api.de might be because of the Accept header. Let me check what httpx sends by default. Actually, it could be that the server is rejecting based on something else. Let me try with curl-like headers. Actually, maybe I should try the "overpass-api.de" endpoint with the correct URL encoding. Perhaps the issue is with how requests sends the data. Let me try a completely different approach: use an Overpass Turbo-like query with proper URL encoding.
82:10
Bash
cat << 'EOF' > /work/test_curl.py
cat << 'EOF' > /work/test_curl.py
import subprocess
result = subprocess.run([
'python3', '-c', '''
import urllib.request, urllib.parse
data = urllib.parse.urlencode({"data": "[out:json];node(1);out;"}).encode()
req = urllib.request.Request("https://overpass-api.de/api/interpreter", data=data)
req.add_header("Content-Type", "application/x-www-form-urlencoded")
req.add_header("Accept", "application/json")
try:
resp = urllib.request.urlopen(req, timeout=15)
print(resp.status, resp.read().decode()[:300])
except Exception as e:
print(type(e).__name__, str(e)[:300])
'''], capture_output=True, text=True)
print('stdout:', result.stdout)
print('stderr:', result.stderr)
EOF
python3 /work/test_curl.py
stdout: HTTPError HTTP Error 406: Not Acceptable stderr:
82:19
The overpass-api.de is returning 406 for all requests, even simple ones. This seems to be an Apache configuration issue on that specific server. The kumi.systems endpoint works but is slow for historic queries. Let me check how much of the 2014 data we already got, and try a different approach - maybe use a different endpoint or try to get the data through other means. Let me check if we already have some data from previous runs.
82:19
Bash
ls -la /work/districts_2014_geom.json 2>/dev/null || echo "File doesn't exist"
ls -la /work/districts_2014_geom.json 2>/dev/null || echo "File doesn't exist" # Check if previous run saved partial data find /work -name "*.json" -size +1000c 2>/dev/null
File doesn't exist /work/districts_current_ids.json /work/districts_current_geom.json
82:34
The historic queries are too slow. Let me try a different approach - maybe I can fetch the 2014 data through the OSM API or another method. Actually, let me think about this differently. The Overpass API historic queries are timing out because the server is busy. Let me try: 1. Using a different Overpass instance 2. Or using the OSM API directly 3. Or maybe fetching the ways at a specific date using the `[date:]` qualifier but with a simpler query Actually, wait - let me try the `main` instance (overpass-api.de) which seems to be returning 406 for all requests. This is an Apache configuration issue. Let me try to use the main instance with a GET request instead of POST. Actually, looking at the 406 error more carefully - it seems like the Apache server at overpass-api.de requires a specific Accept header. The kumi instance works but historic queries time out. Let me try a different approach - let me look at what data we already have and see if we can work with what we have. Actually, the task says we need the 2014-01-01 versions. Let me try a different Overpass endpoint. Maybe I can try the main overpass-api.de with a GET request?
82:34
Bash
python3 -c "
python3 -c "
import requests
# Try GET with URL-encoded query in the URL
import urllib.parse
q = '[out:json];node(1);out;'
resp = requests.get('https://overpass-api.de/api/interpreter?data=' + urllib.parse.quote(q),
timeout=30)
print('Status:', resp.status_code)
print('Headers:', dict(resp.headers))
print('Text:', resp.text[:500])
"
Status: 406
Headers: {'Date': 'Thu, 18 Jun 2026 05:19:48 GMT', 'Server': 'Apache/2.4.67 (Debian)', 'Content-Length': '371', 'Connection': 'close', 'Content-Type': 'text/html; charset=iso-8859-1'}
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
82:42
The overpass-api.de endpoint keeps returning 406. Let me check if there's a way to properly request. Maybe the issue is that the Apache config requires specific Accept headers. Let me try different approaches. Actually, let me just use kumi for the 2014 data, but do one at a time so it doesn't timeout. Let me do a short script that handles 504 errors more aggressively.
82:42
Bash
rm -f /work/districts_2014_geom.json
rm -f /work/districts_2014_geom.json
cat << 'EOF' > /work/fetch_2014_v4.py
import httpx
import json
import time
import sys
import os
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
# Load any already-fetched data
all_elements = []
if os.path.exists('/work/districts_2014_geom.json'):
with open('/work/districts_2014_geom.json') as f:
data = json.load(f)
all_elements = data.get('elements', [])
existing_ids = {e['id'] for e in all_elements}
print(f"Loaded {len(all_elements)} existing, IDs: {existing_ids}")
else:
existing_ids = set()
for idx, rid in enumerate(rel_ids):
if rid in existing_ids:
print(f"[{idx+1}/{len(rel_ids)}] {rid} already fetched, skipping")
continue
q = f'[out:json][date:"2014-01-01T00:00:00Z"];rel({rid});out geom;'
print(f"[{idx+1}/{len(rel_ids)}] Fetching {rid}...", end=" ", flush=True)
success = False
for attempt in range(5):
try:
resp = httpx.post('https://overpass.kumi.systems/api/interpreter',
data={'data': q}, timeout=300)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
e = data['elements'][0]
tags = e.get('tags', {})
print(f"OK: {tags.get('name','?')}")
all_elements.append(e)
success = True
else:
print("Empty result (no data for 2014)")
success = True
break
elif resp.status_code == 504:
print("504", end=".", flush=True)
time.sleep(5)
else:
print(f"E{resp.status_code}", end=".", flush=True)
time.sleep(3)
except httpx.TimeoutException:
print("T", end=".", flush=True)
time.sleep(5)
except Exception as e:
print(f"X", end=".", flush=True)
time.sleep(3)
if not success:
print(" FAILED")
sys.stdout.flush()
# Save after each successful fetch
if success:
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
time.sleep(2)
print(f"\nTotal: {len(all_elements)} elements")
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Saved!")
EOF
timeout 600 python3 /work/fetch_2014_v4.py
<exit_code>-1</exit_code> <error>Command timed out after 600000ms</error>
92:52
The historic Overpass queries are consistently timing out. Let me try a different approach - maybe use a different server or a smaller batch. Actually, it seems like the historic queries (with [date:"..."] filter) on kumi.systems are very slow/always timeout. Let me try a completely different approach. Since I can get current data, maybe I can use the Overpass API diff approach or try to get the 2014 data from a different source. Or I could try the main overpass-api.de with a different approach. Wait, the issue with overpass-api.de is 406 Not Acceptable. Let me check if the Python httpx is sending something that the Apache doesn't like. Actually, maybe I should just use the `requests` library with proper handling. Let me look at what the httpx default headers look like.
92:52
Bash
python3 -c "
python3 -c "
import httpx, json
# Let's check what httpx sends by default
client = httpx.Client()
print('Default headers:', dict(client.headers))
"
Default headers: {'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.28.1'}
93:02
OK, so kumi.systems works fine for current data but historic queries are timing out. Let me think about this differently. The issue is that historic queries (with `[date:"2014-01-01"]`) on kumi.systems are timing out. Maybe I should try a different approach: 1. Use `overpass-api.de` - but it's returning 406 errors for all requests 2. Use `overpass.kumi.systems` for current data (already done) and historic data Actually, wait - the 406 error on overpass-api.de seems to be an Apache configuration issue. Let me check if maybe I can work around it. But actually, I already have the current geometries. The problem is getting the 2014-01-01 geometries. Let me try a simpler approach for the historic data. Actually, another approach: instead of using historic queries, I can just get the relations from 2014 snapshot by querying the ways and nodes that made up the boundaries at that time. But that's complex. Let me think about what the task actually needs. We need to identify which parts of the district boundaries have changed between 2014-01-01 and now. An alternative approach: rather than using Overpass historic queries (which are timing out), I could use OSM's full history dump or use the Overpass API with a different approach. Wait - let me re-examine. The 406 error on overpass-api.de is about content negotiation. Let me try with requests.get instead of POST. Actually, looking at the Overpass API docs, the standard way is POST with form data. The 406 error is strange. Let me try with a different Accept header. Actually, looking at the httpx default headers: `accept: */*` - that should work. Let me check if kumi.systems also rejects certain things. Hmm, kumi.systems works for current data but historic queries return 504 (gateway timeout). This could mean the server is too busy or the historic data takes too long to process. Let me try a different approach altogether. Maybe I can query the OSM history database through a different method, or perhaps use a local extract. Actually, let me think about this differently. Rather than relying on Overpass for historic data, let me: 1. Get current district geometries (already have this) 2. For the 2014 data, try a smaller number of relations at once, or try the main overpass-api.de with proper headers Wait, the 406 error is happening because the Apache server is configured to only accept certain content types from the client's Accept header. Let me try without any Accept header manipulation and see if it works by default. Actually, httpx sends `accept: */*` by default. The 406 error on overpass-api.de is unusual. Maybe the server has been misconfigured recently. Let me try another Overpass instance.
93:02
Bash
python3 -c "
python3 -c "
import httpx
# Try different Overpass instances
instances = [
'https://overpass.kumi.systems/api/interpreter',
'https://overpass-api.de/api/interpreter',
'http://overpass-api.de/api/interpreter',
'https://maps.mail.ru/osm/tools/overpass/api/interpreter',
]
for inst in instances:
try:
resp = httpx.post(inst, data={'data': '[out:json];node(1);out;'}, timeout=10)
print(f'{inst}: {resp.status_code} - {resp.text[:100]}')
except Exception as e:
print(f'{inst}: {type(e).__name__}: {str(e)[:80]}')
"
https://overpass.kumi.systems/api/interpreter: 200 - {
"version": 0.6,
"generator": "Overpass API 0.7.62.11 87bfad18",
"osm3s": {
"timestamp_os
https://overpass-api.de/api/interpreter: 406 - <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><he
http://overpass-api.de/api/interpreter: 406 - <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><he
https://maps.mail.ru/osm/tools/overpass/api/interpreter: 200 - {
"version": 0.6,
"generator": "Overpass API 0.7.62.4 2390de5a",
"osm3s": {
"timestamp_osm
93:16
Let me try the maps.mail.ru endpoint with the historic query.
93:16
Bash
python3 -c "
python3 -c "
import httpx, json
# Test historic query on maps.mail.ru
q = '[out:json][date:\"2014-01-01T00:00:00Z\"];rel(1990590);out geom;'
resp = httpx.post('https://maps.mail.ru/osm/tools/overpass/api/interpreter',
data={'data': q}, timeout=60)
print('Status:', resp.status_code)
print('Text:', resp.text[:500])
"
Status: 200
Text: {
"version": 0.6,
"generator": "Overpass API 0.7.62.4 2390de5a",
"osm3s": {
"timestamp_osm_base": "2026-06-18T05:29:13Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": [
{
"type": "relation",
"id": 1990590,
"bounds": {
"minlat": 48.2138150,
"minlon": 16.3408703,
"maxlat": 48.2363090,
"maxlon": 16.3701490
},
"members": [
{
"type": "way",
"ref": 1474
93:26
The maps.mail.ru endpoint works for historic queries! Let me use it to fetch all 2014 data.
93:26
Bash
cat << 'EOF' > /work/fetch_2014_final.py
cat << 'EOF' > /work/fetch_2014_final.py
import httpx
import json
import time
import sys
import os
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,1990597,1990598,1990599,1990600,1991416,1991433,1991434,1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
# Load existing
all_elements = []
if os.path.exists('/work/districts_2014_geom.json'):
with open('/work/districts_2014_geom.json') as f:
data = json.load(f)
all_elements = data.get('elements', [])
existing_ids = {e['id'] for e in all_elements}
print(f"Loaded {len(all_elements)} existing, IDs: {existing_ids}")
else:
existing_ids = set()
for idx, rid in enumerate(rel_ids):
if rid in existing_ids:
print(f"[{idx+1}/{len(rel_ids)}] {rid} already fetched, skipping")
continue
q = f'[out:json][date:\"2014-01-01T00:00:00Z\"];rel({rid});out geom;'
print(f"[{idx+1}/{len(rel_ids)}] Fetching {rid}...", end=" ", flush=True)
success = False
for attempt in range(5):
try:
resp = httpx.post('https://maps.mail.ru/osm/tools/overpass/api/interpreter',
data={'data': q}, timeout=120)
if resp.status_code == 200:
data = resp.json()
if data.get('elements'):
e = data['elements'][0]
tags = e.get('tags', {})
geom_len = len(e.get('geometry', []))
members_count = len(e.get('members', []))
print(f"OK: {tags.get('name','?')} ({members_count} members, {geom_len} direct geom)")
all_elements.append(e)
success = True
else:
print("Empty result")
success = True
break
else:
print(f"E{resp.status_code}", end=".", flush=True)
time.sleep(3)
except httpx.TimeoutException:
print("T", end=".", flush=True)
time.sleep(5)
except Exception as e:
print(f"X", end=".", flush=True)
time.sleep(3)
if not success:
print(" FAILED")
sys.stdout.flush()
if success:
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
time.sleep(1)
print(f"\nTotal: {len(all_elements)} elements")
with open('/work/districts_2014_geom.json', 'w') as f:
json.dump({"version": 0.6, "elements": all_elements}, f)
print("Done!")
EOF
timeout 600 python3 /work/fetch_2014_final.py
Loaded 3 existing, IDs: {1990592, 1990590, 1990591}
[1/23] 1990590 already fetched, skipping
[2/23] 1990591 already fetched, skipping
[3/23] 1990592 already fetched, skipping
[4/23] Fetching 1990593... OK: Josefstadt (14 members, 0 direct geom)
[5/23] Fetching 1990594... OK: Leopoldstadt (6 members, 0 direct geom)
[6/23] Fetching 1990595... OK: Mariahilf (18 members, 0 direct geom)
[7/23] Fetching 1990596... OK: Meidling (42 members, 0 direct geom)
[8/23] Fetching 1990597... OK: Neubau (23 members, 0 direct geom)
[9/23] Fetching 1990598... OK: Penzing (46 members, 0 direct geom)
[10/23] Fetching 1990599... OK: Rudolfsheim-Fünfhaus (32 members, 0 direct geom)
[11/23] Fetching 1990600... OK: Währing (50 members, 0 direct geom)
[12/23] Fetching 1991416... OK: Landstraße (20 members, 0 direct geom)
[13/23] Fetching 1991433... OK: Brigittenau (5 members, 0 direct geom)
[14/23] Fetching 1991434... OK: Donaustadt (38 members, 0 direct geom)
[15/23] Fetching 1991435... OK: Döbling (26 members, 0 direct geom)
[16/23] Fetching 1991436... OK: Favoriten (25 members, 0 direct geom)
[17/23] Fetching 1991437... OK: Floridsdorf (21 members, 0 direct geom)
[18/23] Fetching 1991438... OK: Hernals (68 members, 0 direct geom)
[19/23] Fetching 1991439... OK: Liesing (30 members, 0 direct geom)
[20/23] Fetching 1991440... OK: Margareten (14 members, 0 direct geom)
[21/23] Fetching 1991441... OK: Ottakring (64 members, 0 direct geom)
[22/23] Fetching 1991442... OK: Simmering (17 members, 0 direct geom)
[23/23] Fetching 1991443... OK: Wieden (17 members, 0 direct geom)
Total: 23 elements
Done!
94:08
Excellent! I now have both the current and 2014 district geometries. Now let me process them - convert the Overpass JSON with member geometries into actual Shapely geometries using GeoPandas. Let me now build the full processing pipeline.
94:08
94:08
Bash
python3 -c "
python3 -c "
import json
with open('/work/districts_2014_geom.json') as f:
d2014 = json.load(f)
with open('/work/districts_current_geom.json') as f:
dcur = json.load(f)
print('2014 elements:', len(d2014['elements']))
print('Current elements:', len(dcur['elements']))
# Print names
for e in d2014['elements']:
print('2014:', e['id'], e.get('tags',{}).get('name',''))
for e in dcur['elements']:
print('Cur:', e['id'], e.get('tags',{}).get('name',''))
"
2014 elements: 23 Current elements: 23 2014: 1990590 Alsergrund 2014: 1990591 Hietzing 2014: 1990592 Innere Stadt 2014: 1990593 Josefstadt 2014: 1990594 Leopoldstadt 2014: 1990595 Mariahilf 2014: 1990596 Meidling 2014: 1990597 Neubau 2014: 1990598 Penzing 2014: 1990599 Rudolfsheim-Fünfhaus 2014: 1990600 Währing 2014: 1991416 Landstraße 2014: 1991433 Brigittenau 2014: 1991434 Donaustadt 2014: 1991435 Döbling 2014: 1991436 Favoriten 2014: 1991437 Floridsdorf 2014: 1991438 Hernals 2014: 1991439 Liesing 2014: 1991440 Margareten 2014: 1991441 Ottakring 2014: 1991442 Simmering 2014: 1991443 Wieden Cur: 1990594 Leopoldstadt Cur: 1990595 Mariahilf Cur: 1990596 Meidling Cur: 1990597 Neubau Cur: 1991433 Brigittenau Cur: 1991434 Donaustadt Cur: 1991435 Döbling Cur: 1991436 Favoriten Cur: 1991437 Floridsdorf Cur: 1991438 Hernals Cur: 1991439 Liesing Cur: 1991440 Margareten Cur: 1991441 Ottakring Cur: 1991442 Simmering Cur: 1991443 Wieden Cur: 1990598 Penzing Cur: 1990599 Rudolfsheim-Fünfhaus Cur: 1990590 Alsergrund Cur: 1990591 Hietzing Cur: 1990592 Innere Stadt Cur: 1990593 Josefstadt Cur: 1990600 Währing Cur: 1991416 Landstraße
94:15
Great, both files have all 23 districts. Now I need to: 1. Build proper geometries from the Overpass relation data (members with way geometries) 2. Compute the symmetric difference between current and 2014 geometries per district 3. Classify fragments as added/removed/unchanged 4. Determine which districts touch changed areas Let me write a comprehensive script to do all of this.
94:15
94:15
Bash
python3 -c "
python3 -c "
import json
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
# Look at first element's members
e = data['elements'][0]
print('ID:', e['id'])
print('Tags:', e.get('tags', {}))
print('Number of members:', len(e.get('members', [])))
print()
# Show role distribution
from collections import Counter
roles = Counter(m['role'] for m in e['members'])
print('Role counts:', roles)
# Show first outer member's geometry sample
for m in e['members']:
if m['role'] == 'outer':
print('First outer member ref:', m['ref'])
print('Geometry points:', len(m.get('geometry', [])))
print('First 3 points:', m['geometry'][:3])
break
"
ID: 1990594
Tags: {'admin_level': '9', 'boundary': 'administrative', 'name': 'Leopoldstadt', 'name:ru': 'Леопольдштадт', 'name:uk': 'Леопольдштадт', 'postal_code': '1020', 'ref': '2', 'ref:at:gkz': '90201', 'street_names:wikipedia': 'de:Liste der Straßennamen von Wien/Leopoldstadt', 'type': 'boundary', 'wikidata': 'Q259120', 'wikipedia': 'de:Leopoldstadt'}
Number of members: 11
Role counts: Counter({'outer': 11})
First outer member ref: 147463460
Geometry points: 46
First 3 points: [{'lat': 48.2251075, 'lon': 16.3674575}, {'lat': 48.2253741, 'lon': 16.368439}, {'lat': 48.225683, 'lon': 16.3693506}]
94:42
Now I need to convert the Overpass API response data into proper Shapely geometries. The relations have members (ways) with geometry arrays. I need to: 1. For each relation, build a polygon from the members' geometries 2. Compare current vs 2014 to find changes Let me write a comprehensive script to do this.
94:42
94:42
Write
/work/process.py
content · 424 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
"""
import json
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon, MultiPolygon, shape, mapping
from shapely.ops import unary_union, polygonize
from shapely.validation import make_valid
import re
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry."""
from shapely.geometry import LineString, Polygon
from shapely.ops import polygonize
members = elem.get('members', [])
if not members:
return None
outer_lines = []
inner_lines = []
for m in members:
geom = m.get('geometry', [])
if not geom:
continue
coords = [(p['lon'], p['lat']) for p in geom]
if len(coords) < 2:
continue
line = LineString(coords)
role = m.get('role', 'outer')
if role == 'inner':
inner_lines.append(line)
else:
outer_lines.append(line)
if not outer_lines:
return None
# Merge all outer lines and polygonize
merged_outer = unary_union(outer_lines)
inner_merged = unary_union(inner_lines) if inner_lines else None
polygons = list(polygonize(merged_outer))
if not polygons:
return None
# Take the largest polygon from polygonize
main_poly = max(polygons, key=lambda p: p.area)
# Subtract inner rings
if inner_merged and not inner_merged.is_empty:
try:
main_poly = main_poly.difference(inner_merged)
except Exception:
pass
# Make valid
if not main_poly.is_valid:
main_poly = make_valid(main_poly)
return main_poly
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
# Remove special chars, lowercase
n = name.lower().strip()
# Fix common casing/spelling differences
# Replace "straße" variations
n = n.replace('straße', 'strasse')
n = n.replace('strasse', 'strasse')
# Remove umlaut differences
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
# Remove spaces and special chars for comparison
return n
def clean_name(name):
"""Return a clean display name."""
return name.strip()
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type == 'Polygon':
cur_polys[norm] = (name, poly)
elif poly.geom_type == 'MultiPolygon':
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type == 'Polygon':
old_polys[norm] = (name, poly)
elif poly.geom_type == 'MultiPolygon':
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
for k, (n, p) in sorted(cur_polys.items()):
print(f" {n} ({k}) - area: {p.area:.6f}")
print(f"\n2014 districts: {len(old_polys)}")
for k, (n, p) in sorted(old_polys.items()):
print(f" {n} ({k}) - area: {p.area:.6f}")
# Find common district names (normalized matching)
all_norms = set(cur_polys.keys()) | set(old_polys.keys())
print(f"\nAll normalized names: {sorted(all_norms)}")
# Name mapping between normalized and display names
# Let's see what normalized names we have
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"\nCommon: {sorted(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# For unmatched names, try fuzzy matching
if only_cur or only_old:
print("\nAttempting fuzzy name matching...")
from difflib import SequenceMatcher
# Try to match based on ref tags
# Build ref-to-name maps
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
# Match by ref
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' - using display name from cur")
# Transfer geometry
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms.add(cur_norm)
print(f"\nAfter ref matching:")
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"Common: {sorted(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Build the combined district boundary extents to clip against
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
combined_bounds = unary_union([all_current_union, all_old_union])
print(f"\nCurrent union area: {all_current_union.area:.6f}")
print(f"Old union area: {all_old_union.area:.6f}")
# Compute overall added/removed areas
added_overall = all_current_union.difference(all_old_union) if not all_current_union.is_empty else None
removed_overall = all_old_union.difference(all_current_union) if not all_old_union.is_empty else None
print(f"Overall added area: {added_overall.area if added_overall else 0:.6f}")
print(f"Overall removed area: {removed_overall.area if removed_overall else 0:.6f}")
# Now, for each district, compute boundary changes
# The approach: for each district, compute the symmetric difference of its boundary
# Then classify fragments
results = []
district_change_geoms = {} # district_norm -> list of change geometries
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...")
# Compute intersection and differences
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly) # in current but not 2014
removed_part = old_poly.difference(cur_poly) # in 2014 but not current
print(f" Intersection area: {intersection.area:.6f}")
print(f" Added area: {added_part.area:.6f}")
print(f" Removed area: {removed_part.area:.6f}")
# Store district-level change info
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Now create the output features
# For each change fragment, create a feature
features = []
feature_id = 0
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
# Added parts
added = info['added']
if added and not added.is_empty:
if added.geom_type == 'MultiPolygon':
for poly in added.geoms:
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'added_since_2014',
'district_name': cur_name,
},
'geometry': mapping(poly)
})
elif added.geom_type == 'Polygon':
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'added_since_2014',
'district_name': cur_name,
},
'geometry': mapping(added)
})
# Removed parts
removed = info['removed']
if removed and not removed.is_empty:
if removed.geom_type == 'MultiPolygon':
for poly in removed.geoms:
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'removed_since_2014',
'district_name': cur_name,
},
'geometry': mapping(poly)
})
elif removed.geom_type == 'Polygon':
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'removed_since_2014',
'district_name': cur_name,
},
'geometry': mapping(removed)
})
# Unchanged parts
inter = info['intersection']
if inter and not inter.is_empty:
if inter.geom_type == 'MultiPolygon':
for poly in inter.geoms:
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'unchanged',
'district_name': cur_name,
},
'geometry': mapping(poly)
})
elif inter.geom_type == 'Polygon':
feature_id += 1
features.append({
'type': 'Feature',
'properties': {
'change_type': 'unchanged',
'district_name': cur_name,
},
'geometry': mapping(inter)
})
print(f"\nTotal features before merge: {len(features)}")
# Now compute touches_changed for each district
# A district "touches_changed" if any part of its border is shared with a changed area
# Collect all added/removed geoms
changed_parts = []
for f in features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_parts.append(shape(f['geometry']))
changed_union = unary_union(changed_parts) if changed_parts else None
# Create a buffer to detect adjacency (touches = boundaries share a point)
# Use a tiny buffer to capture touching
if changed_union and not changed_union.is_empty:
changed_boundary = changed_union.boundary
else:
changed_boundary = None
# For each feature, determine if its district touches any changed part
for f in features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
info = district_change_geoms.get(norm)
if not info:
f['properties']['touches_changed'] = False
continue
# Get the full district polygon (current) to check adjacency
# Use both current and old geometry
cur_poly = cur_polys.get(norm, (None, None))[1]
old_poly = old_polys.get(norm, (None, None))[1]
touches = False
geom = shape(f['geometry'])
if changed_boundary and not changed_boundary.is_empty:
# Check if this fragment touches the changed boundary
try:
if geom.touches(changed_union) or geom.distance(changed_union) < 1e-8:
touches = True
# Also check if the district boundary (not just fragment) touches
if cur_poly and not touches:
if cur_poly.touches(changed_union) or cur_poly.distance(changed_union) < 1e-8:
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# Now we need to merge adjacent fragments of the same type and district
# First convert to GeoDataFrame for easier manipulation
gdf = gpd.GeoDataFrame.from_features(features, crs='EPSG:4326')
# Merge by district and change_type
merged_features = []
for (district, change_type), group in gdf.groupby(['district_name', 'change_type']):
merged_geom = unary_union(group.geometry.values)
touches = group['touches_changed'].any()
if merged_geom.geom_type == 'MultiPolygon':
polys = list(merged_geom.geoms)
else:
polys = [merged_geom]
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district,
'touches_changed': bool(touches)
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
from collections import Counter
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"Change type counts: {dict(change_counts)}")
districts_with_changes = set()
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
districts_with_changes.add(f['properties']['district_name'])
print(f"Districts with changes: {sorted(districts_with_changes)}")
touch_districts = set()
for f in merged_features:
if f['properties']['touches_changed']:
touch_districts.add(f['properties']['district_name'])
print(f"Districts touching changed: {sorted(touch_districts)}")
Wrote /work/process.py (14439 chars).
94:50
Bash
python3 /work/process.py
Building geometries for current data...
Building geometries for 2014 data...
Current districts: 23
Alsergrund (alsergrund) - area: 0.000359
Brigittenau (brigittenau) - area: 0.000692
Döbling (doebling) - area: 0.003021
Donaustadt (donaustadt) - area: 0.012381
Favoriten (favoriten) - area: 0.003846
Floridsdorf (floridsdorf) - area: 0.005385
Hernals (hernals) - area: 0.001380
Hietzing (hietzing) - area: 0.004560
Innere Stadt (innere stadt) - area: 0.000347
Josefstadt (josefstadt) - area: 0.000132
Landstraße (landstrasse) - area: 0.000896
Leopoldstadt (leopoldstadt) - area: 0.002327
Liesing (liesing) - area: 0.003875
Margareten (margareten) - area: 0.000243
Mariahilf (mariahilf) - area: 0.000176
Meidling (meidling) - area: 0.000980
Neubau (neubau) - area: 0.000195
Ottakring (ottakring) - area: 0.001049
Penzing (penzing) - area: 0.004086
Rudolfsheim-Fünfhaus (rudolfsheim-fuenfhaus) - area: 0.000474
Simmering (simmering) - area: 0.002812
Währing (waehring) - area: 0.000768
Wieden (wieden) - area: 0.000215
2014 districts: 23
Alsergrund (alsergrund) - area: 0.000359
Brigittenau (brigittenau) - area: 0.000692
Döbling (doebling) - area: 0.003021
Donaustadt (donaustadt) - area: 0.012374
Favoriten (favoriten) - area: 0.003846
Floridsdorf (floridsdorf) - area: 0.005385
Hernals (hernals) - area: 0.001379
Hietzing (hietzing) - area: 0.004560
Innere Stadt (innere stadt) - area: 0.000347
Josefstadt (josefstadt) - area: 0.000132
Landstraße (landstrasse) - area: 0.000895
Leopoldstadt (leopoldstadt) - area: 0.002335
Liesing (liesing) - area: 0.003876
Margareten (margareten) - area: 0.000243
Mariahilf (mariahilf) - area: 0.000176
Meidling (meidling) - area: 0.000980
Neubau (neubau) - area: 0.000194
Ottakring (ottakring) - area: 0.001050
Penzing (penzing) - area: 0.004086
Rudolfsheim-Fünfhaus (rudolfsheim-fuenfhaus) - area: 0.000474
Simmering (simmering) - area: 0.002811
Währing (waehring) - area: 0.000768
Wieden (wieden) - area: 0.000215
All normalized names: ['alsergrund', 'brigittenau', 'doebling', 'donaustadt', 'favoriten', 'floridsdorf', 'hernals', 'hietzing', 'innere stadt', 'josefstadt', 'landstrasse', 'leopoldstadt', 'liesing', 'margareten', 'mariahilf', 'meidling', 'neubau', 'ottakring', 'penzing', 'rudolfsheim-fuenfhaus', 'simmering', 'waehring', 'wieden']
Common: ['alsergrund', 'brigittenau', 'doebling', 'donaustadt', 'favoriten', 'floridsdorf', 'hernals', 'hietzing', 'innere stadt', 'josefstadt', 'landstrasse', 'leopoldstadt', 'liesing', 'margareten', 'mariahilf', 'meidling', 'neubau', 'ottakring', 'penzing', 'rudolfsheim-fuenfhaus', 'simmering', 'waehring', 'wieden']
Only current: []
Only 2014: []
After ref matching:
Common: ['alsergrund', 'brigittenau', 'doebling', 'donaustadt', 'favoriten', 'floridsdorf', 'hernals', 'hietzing', 'innere stadt', 'josefstadt', 'landstrasse', 'leopoldstadt', 'liesing', 'margareten', 'mariahilf', 'meidling', 'neubau', 'ottakring', 'penzing', 'rudolfsheim-fuenfhaus', 'simmering', 'waehring', 'wieden']
Only current: []
Only 2014: []
Current union area: 0.050198
Old union area: 0.050198
Overall added area: 0.000002
Overall removed area: 0.000002
Processing Innere Stadt...
Intersection area: 0.000347
Added area: 0.000000
Removed area: 0.000000
Processing Floridsdorf...
Intersection area: 0.005384
Added area: 0.000000
Removed area: 0.000000
Processing Mariahilf...
Intersection area: 0.000176
Added area: 0.000000
Removed area: 0.000000
Processing Josefstadt...
Intersection area: 0.000132
Added area: 0.000000
Removed area: 0.000000
Processing Margareten...
Intersection area: 0.000243
Added area: 0.000000
Removed area: 0.000000
Processing Neubau...
Intersection area: 0.000194
Added area: 0.000001
Removed area: 0.000000
Processing Landstraße...
Intersection area: 0.000895
Added area: 0.000001
Removed area: 0.000000
Processing Hernals...
Intersection area: 0.001378
Added area: 0.000001
Removed area: 0.000001
Processing Liesing...
Intersection area: 0.003874
Added area: 0.000001
Removed area: 0.000002
Processing Simmering...
Intersection area: 0.002811
Added area: 0.000001
Removed area: 0.000000
Processing Döbling...
Intersection area: 0.003021
Added area: 0.000000
Removed area: 0.000000
Processing Rudolfsheim-Fünfhaus...
Intersection area: 0.000473
Added area: 0.000001
Removed area: 0.000001
Processing Leopoldstadt...
Intersection area: 0.002327
Added area: 0.000000
Removed area: 0.000008
Processing Alsergrund...
Intersection area: 0.000359
Added area: 0.000000
Removed area: 0.000000
Processing Meidling...
Intersection area: 0.000979
Added area: 0.000000
Removed area: 0.000000
Processing Donaustadt...
Intersection area: 0.012373
Added area: 0.000007
Removed area: 0.000001
Processing Währing...
Intersection area: 0.000768
Added area: 0.000001
Removed area: 0.000001
Processing Penzing...
Intersection area: 0.004085
Added area: 0.000001
Removed area: 0.000001
Processing Brigittenau...
Intersection area: 0.000691
Added area: 0.000000
Removed area: 0.000000
Processing Favoriten...
Intersection area: 0.003845
Added area: 0.000002
Removed area: 0.000002
Processing Ottakring...
Intersection area: 0.001049
Added area: 0.000001
Removed area: 0.000001
Processing Hietzing...
Intersection area: 0.004560
Added area: 0.000000
Removed area: 0.000000
Processing Wieden...
Intersection area: 0.000215
Added area: 0.000000
Removed area: 0.000000
Total features before merge: 2673
Merged features: 2673
Saved to /work/vienna_boundary_changes.geojson
Change type counts: {'added_since_2014': 1316, 'removed_since_2014': 1334, 'unchanged': 23}
Districts with changes: ['Alsergrund', 'Brigittenau', 'Donaustadt', 'Döbling', 'Favoriten', 'Floridsdorf', 'Hernals', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Landstraße', 'Leopoldstadt', 'Liesing', 'Margareten', 'Mariahilf', 'Meidling', 'Neubau', 'Ottakring', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Simmering', 'Wieden', 'Währing']
Districts touching changed: ['Alsergrund', 'Brigittenau', 'Donaustadt', 'Döbling', 'Favoriten', 'Floridsdorf', 'Hernals', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Landstraße', 'Leopoldstadt', 'Liesing', 'Margareten', 'Mariahilf', 'Meidling', 'Neubau', 'Ottakring', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Simmering', 'Wieden', 'Währing']
95:09
The issue is that the polygon building from the member ways with geometry is producing many tiny sliver polygons because the member ways share edges but are being treated as separate lines. The `polygonize` is creating many small fragments from the lines. I need to fix the geometry building approach. The problem is that I'm building polygons from individual member ways (which are line segments) and then polygonize creates lots of tiny sliver polygons. Instead, I should build the full polygon properly by merging all outer rings into a single boundary and polygonizing that. Let me rethink - the issue is that the ways shared between adjacent districts are included in both districts' member lists, and when polygonizing, they create slivers along shared boundaries. Actually, looking at the data more carefully - each district relation has multiple member ways that form its boundary. Some of these ways are shared between districts. The `polygonize` function correctly creates the polygon from the ways, but because I'm doing it per-district, the shared edges between adjacent districts can cause issues. Wait, actually 2673 features for 23 districts is way too many. The problem is in how I built the polygons. Let me look at what polygonize is producing. Actually, the issue might be that the geometry from the Overpass API gives geometries for each member way separately, but some ways might be split at intersections, and polygonize creates many small polygons from the line network. Let me take a different approach - instead of using polygonize on the lines, I should construct the polygon directly from the rings. For each district, I should collect all outer member ways' coordinates and join them into a linear ring. Actually, looking at the member structures, each member way is a segment of the boundary, and they are ordered. But the `polygonize` should work if the ways form a closed ring. The problem is some ways may be shared between districts (as inner for one and outer for another, or shared outer boundaries). Let me try a simpler approach: collect all outer way coordinates, merge them, and polygonize. The issue is probably that some ways are not quite connecting properly. Let me look at what the actual geometry looks like more carefully.
95:09
Bash
python3 -c "
python3 -c "
import json
from shapely.geometry import LineString
from shapely.ops import polygonize
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
# Look at Innere Stadt (ID 1990592) - simple district with 8 members
for e in data['elements']:
if e['id'] == 1990592:
members = e['members']
print('Members for Innere Stadt:')
for m in members:
geom = m.get('geometry', [])
print(f' Role: {m[\"role\"]}, Type: {m[\"type\"]}, Ref: {m[\"ref\"]}, Geom points: {len(geom)}')
# Check first and last point
if len(geom) >= 2:
first = geom[0]
last = geom[-1]
print(f' First: ({first[\"lat\"]}, {first[\"lon\"]})')
print(f' Last: ({last[\"lat\"]}, {last[\"lon\"]})')
# Build lines and check connections
lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) >= 2:
coords = [(p['lon'], p['lat']) for p in geom]
lines.append(LineString(coords))
merged = polygonize(lines)
print(f'\\nPolygonize produced {len(list(merged))} polygons')
# Try merging lines first
from shapely.ops import unary_union
merged_line = unary_union(lines)
print(f'Merged line type: {merged_line.geom_type}')
if hasattr(merged_line, 'geoms'):
print(f'Number of geom parts: {len(list(merged_line.geoms))}')
# Just try to construct a polygon from the outer ring
outer_lines = [LineString([(p['lon'], p['lat']) for p in m['geometry']])
for m in members if m['role'] == 'outer' and len(m.get('geometry',[])) >= 2]
# Try to create a LinearRing from all points concatenated
all_coords = []
for line in outer_lines:
if all_coords and line.coords[0] == all_coords[-1]:
all_coords.extend(line.coords[1:])
elif all_coords and line.coords[-1] == all_coords[0]:
all_coords = line.coords[:-1] + all_coords
else:
all_coords.extend(list(line.coords))
print(f'Total coordinates: {len(all_coords)}')
print(f'First: {all_coords[0]}, Last: {all_coords[-1]}')
break
"
Members for Innere Stadt:
Role: outer, Type: way, Ref: 147416055, Geom points: 61
First: (48.1999528, 16.375215)
Last: (48.2113547, 16.3848946)
Role: outer, Type: way, Ref: 147463458, Geom points: 41
First: (48.2113547, 16.3848946)
Last: (48.2184891, 16.370149)
Role: outer, Type: way, Ref: 147396089, Geom points: 33
First: (48.2184891, 16.370149)
Last: (48.2143408, 16.3565102)
Role: outer, Type: way, Ref: 147396086, Geom points: 14
First: (48.2143408, 16.3565102)
Last: (48.2069081, 16.3558875)
Role: outer, Type: way, Ref: 147396090, Geom points: 10
First: (48.2069081, 16.3558875)
Last: (48.2026474, 16.3615382)
Role: outer, Type: way, Ref: 147399477, Geom points: 9
First: (48.2026474, 16.3615382)
Last: (48.1996821, 16.3659409)
Role: outer, Type: way, Ref: 147399466, Geom points: 14
First: (48.1996821, 16.3659409)
Last: (48.1999528, 16.375215)
Role: label, Type: node, Ref: 10919470469, Geom points: 0
Polygonize produced 1 polygons
Merged line type: MultiLineString
Number of geom parts: 7
Total coordinates: 176
First: (16.375215, 48.1999528), Last: (16.375215, 48.1999528)
95:51
The issue is that polygonize is producing a lot of small polygon fragments because the member lines are not being connected properly into a ring. The individual ways might have overlapping start/end points but polygonize is creating many tiny slivers. The problem is that `polygonize` takes individual linestrings and finds all possible polygons formed by their intersections. When district boundaries overlap (shared boundaries between adjacent districts), polygonize creates slivers. I need a different approach. Instead of polygonizing the boundary lines, I should: 1. Take each district's member ways (which form closed rings when concatenated properly) 2. Build the actual polygon from the outer ring(s) Actually, the issue is that the ways overlap between districts - one district's outer boundary is shared with another's. When we polygonize all the lines together, we get many small slivers. A better approach: Instead of polygonizing the lines, I should construct the polygon from the ring formed by the outer members, respecting the order they appear in the relation members list. The ways should connect end-to-end to form a closed ring. Let me rewrite the approach to properly build a single polygon from the ordered member ways.
95:51
Write
/work/process.py
content · 359 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
"""
import json
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon, MultiPolygon, LineString, LinearRing, shape, mapping, Point
from shapely.ops import unary_union, polygonize, linemerge
from shapely.validation import make_valid
import re
from collections import Counter
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry.
The member ways are ordered and their start/end points match sequentially
forming a closed ring. We extract coordinates preserving this order.
"""
members = elem.get('members', [])
if not members:
return None
outer_ways = []
inner_ways = []
for m in members:
geom = m.get('geometry', [])
if not geom or len(geom) < 2:
continue
role = m.get('role', 'outer')
coords = [(p['lon'], p['lat']) for p in geom]
if role == 'inner':
inner_ways.append(coords)
else:
outer_ways.append(coords)
if not outer_ways:
return None
# Build the outer ring by concatenating coordinates in order,
# removing duplicate endpoints between consecutive segments
def build_ring(way_coords_list):
if not way_coords_list:
return None
all_coords = []
for i, coords in enumerate(way_coords_list):
if i == 0:
all_coords.extend(coords)
else:
# Skip the first point if it matches the last point added
if coords and all_coords and coords[0] == all_coords[-1]:
all_coords.extend(coords[1:])
elif coords and all_coords and coords[-1] == all_coords[0]:
# Reversed direction
all_coords = coords[:-1] + all_coords
else:
all_coords.extend(coords)
if len(all_coords) >= 4 and all_coords[0] == all_coords[-1]:
return LinearRing(all_coords)
elif len(all_coords) >= 3:
# Close it
all_coords.append(all_coords[0])
return LinearRing(all_coords)
return None
outer_ring = build_ring(outer_ways)
if outer_ring is None:
return None
# Build inner rings
inner_rings = []
for inner_coords in inner_ways:
ring = build_ring([inner_coords])
if ring:
inner_rings.append(ring)
try:
poly = Polygon(outer_ring, inner_rings)
if not poly.is_valid:
poly = make_valid(poly)
return poly
except Exception as e:
print(f" Error building polygon: {e}")
return None
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
n = name.lower().strip()
n = n.replace('straße', 'strasse')
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
return n
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type in ('Polygon', 'MultiPolygon'):
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type in ('Polygon', 'MultiPolygon'):
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
for k, (n, p) in sorted(cur_polys.items()):
print(f" {n} ({k}) - area: {p.area:.6f}")
print(f"\n2014 districts: {len(old_polys)}")
for k, (n, p) in sorted(old_polys.items()):
print(f" {n} ({k}) - area: {p.area:.6f}")
# Find common district names
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"\nCommon: {sorted(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Match by ref if needed
if only_cur or only_old:
print("\nAttempting ref-based name matching...")
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' -> mapping")
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
print(f"Common after ref matching: {sorted(common_norms)}")
# Compute overall boundary extents
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
print(f"\nCurrent union area: {all_current_union.area:.6f}")
print(f"Old union area: {all_old_union.area:.6f}")
# For each district, compute boundary changes
district_change_geoms = {}
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...")
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly)
removed_part = old_poly.difference(cur_poly)
print(f" Intersection area: {intersection.area:.8f}")
print(f" Added area: {added_part.area:.8f}")
print(f" Removed area: {removed_part.area:.8f}")
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Create features
features = []
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
# Helper to add geometries
def add_geom(geom, change_type, name):
if geom is None or geom.is_empty:
return
if geom.geom_type == 'MultiPolygon':
for poly in geom.geoms:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(poly)
})
elif geom.geom_type == 'Polygon':
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(geom)
})
add_geom(info['added'], 'added_since_2014', cur_name)
add_geom(info['removed'], 'removed_since_2014', cur_name)
add_geom(info['intersection'], 'unchanged', cur_name)
print(f"\nTotal raw features: {len(features)}")
# Merge features by district_name and change_type
# First, collect geometries per group
from collections import defaultdict
groups = defaultdict(list)
for f in features:
key = (f['properties']['district_name'], f['properties']['change_type'])
groups[key].append(shape(f['geometry']))
merged_features = []
for (district_name, change_type), geoms in groups.items():
merged = unary_union(geoms)
if merged.is_empty:
continue
if merged.geom_type == 'MultiPolygon':
polys = list(merged.geoms)
else:
polys = [merged]
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district_name,
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Now compute touches_changed for each feature
# A district touches a changed area if any boundary of the district
# is shared with any added_since_2014 or removed_since_2014 fragment
# Collect all changed geometries
changed_geoms = []
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_geoms.append(shape(f['geometry']))
changed_union = unary_union(changed_geoms) if changed_geoms else None
changed_boundary = changed_union.boundary if changed_union and not changed_union.is_empty else None
print(f"\nChanged union area: {changed_union.area if changed_union else 0:.8f}")
# For each feature, check if its district borders any changed area
# We check by looking at the district's full polygon (both current and old)
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
touches = False
# Get both current and old polygons for this district
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
if changed_union and not changed_union.is_empty:
# Check if the district polygon touches the changed union
if cur_info:
cur_poly = cur_info[1]
try:
if cur_poly.touches(changed_union):
touches = True
except Exception:
pass
if not touches and old_info:
old_poly = old_info[1]
try:
if old_poly.touches(changed_union):
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"\nChange type counts: {dict(change_counts)}")
# Districts with any changes
districts_with_changes = set()
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
districts_with_changes.add(f['properties']['district_name'])
print(f"Districts with changes: {sorted(districts_with_changes)}")
# Districts touching changed
touch_districts = set()
for f in merged_features:
if f['properties']['touches_changed']:
touch_districts.add(f['properties']['district_name'])
print(f"Districts touching changed: {sorted(touch_districts)}")
# Verify some properties
print(f"\nSample features:")
for f in merged_features[:5]:
print(f" {f['properties']['district_name']}: {f['properties']['change_type']}, touches={f['properties']['touches_changed']}, area={shape(f['geometry']).area:.8f}")
Wrote /work/process.py (11917 chars).
95:55
Bash
python3 /work/process.py
Building geometries for current data...
Warning: Leopoldstadt has type GeometryCollection
Warning: Mariahilf has type GeometryCollection
Warning: Meidling has type GeometryCollection
Warning: Neubau has type GeometryCollection
Warning: Brigittenau has type GeometryCollection
Warning: Donaustadt has type GeometryCollection
Warning: Döbling has type GeometryCollection
Warning: Floridsdorf has type GeometryCollection
Warning: Hernals has type GeometryCollection
Warning: Liesing has type GeometryCollection
Warning: Margareten has type GeometryCollection
Warning: Ottakring has type GeometryCollection
Warning: Simmering has type GeometryCollection
Warning: Penzing has type GeometryCollection
Warning: Rudolfsheim-Fünfhaus has type GeometryCollection
Warning: Hietzing has type GeometryCollection
Warning: Währing has type GeometryCollection
Warning: Landstraße has type GeometryCollection
Building geometries for 2014 data...
Warning: Hietzing has type GeometryCollection
Warning: Leopoldstadt has type GeometryCollection
Warning: Meidling has type GeometryCollection
Warning: Neubau has type GeometryCollection
Warning: Penzing has type GeometryCollection
Warning: Rudolfsheim-Fünfhaus has type GeometryCollection
Warning: Währing has type GeometryCollection
Warning: Landstraße has type GeometryCollection
Warning: Brigittenau has type GeometryCollection
Warning: Döbling has type GeometryCollection
Warning: Favoriten has type GeometryCollection
Warning: Floridsdorf has type GeometryCollection
Warning: Hernals has type GeometryCollection
Warning: Liesing has type GeometryCollection
Warning: Margareten has type GeometryCollection
Warning: Ottakring has type GeometryCollection
Warning: Simmering has type GeometryCollection
Warning: Wieden has type GeometryCollection
Current districts: 5
Alsergrund (alsergrund) - area: 0.000317
Favoriten (favoriten) - area: 0.003825
Innere Stadt (innere stadt) - area: 0.000347
Josefstadt (josefstadt) - area: 0.000132
Wieden (wieden) - area: 0.000207
2014 districts: 5
Alsergrund (alsergrund) - area: 0.000289
Donaustadt (donaustadt) - area: 0.007257
Innere Stadt (innere stadt) - area: 0.000347
Josefstadt (josefstadt) - area: 0.000125
Mariahilf (mariahilf) - area: 0.000174
Common: ['alsergrund', 'innere stadt', 'josefstadt']
Only current: ['favoriten', 'wieden']
Only 2014: ['donaustadt', 'mariahilf']
Attempting ref-based name matching...
Common after ref matching: ['alsergrund', 'innere stadt', 'josefstadt']
Current union area: 0.004828
Old union area: 0.008188
Processing Innere Stadt...
Intersection area: 0.00034698
Added area: 0.00000009
Removed area: 0.00000024
Processing Josefstadt...
Intersection area: 0.00012138
Added area: 0.00001034
Removed area: 0.00000374
Processing Alsergrund...
Intersection area: 0.00025966
Added area: 0.00005739
Removed area: 0.00002983
Total raw features: 185
Merged features: 185
Changed union area: 0.00010151
Saved to /work/vienna_boundary_changes.geojson
Change type counts: {'added_since_2014': 82, 'removed_since_2014': 102, 'unchanged': 1}
Districts with changes: ['Alsergrund', 'Innere Stadt', 'Josefstadt']
Districts touching changed: []
Sample features:
Innere Stadt: added_since_2014, touches=False, area=0.00000000
Innere Stadt: added_since_2014, touches=False, area=0.00000000
Innere Stadt: added_since_2014, touches=False, area=0.00000000
Innere Stadt: added_since_2014, touches=False, area=0.00000000
Innere Stadt: added_since_2014, touches=False, area=0.00000000
96:11
The issue is that many districts produce GeometryCollection instead of Polygon when building from members. The build_ring function is not handling the coordinate ordering correctly for cases where the ways don't perfectly connect in order. Let me look more carefully at the member geometry structure. Let me check the actual structure of members for a problematic district.
96:11
Bash
python3 -c "
python3 -c "
import json
from shapely.geometry import LineString, Polygon, MultiPolygon, GeometryCollection
from shapely.ops import unary_union, polygonize
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
# Look at Leopoldstadt (1990594)
for e in data['elements']:
if e['id'] == 1990594:
members = e['members']
outer_members = [m for m in members if m['role'] == 'outer']
print(f'Outer members: {len(outer_members)}')
# Build lines from each outer member
lines = []
for m in outer_members:
geom = m.get('geometry', [])
if len(geom) >= 2:
coords = [(p['lon'], p['lat']) for p in geom]
lines.append(LineString(coords))
# Try polygonize
polys = list(polygonize(lines))
print(f'Polygonize produced {len(polys)} polygons')
# Try unary_union of lines to create a merged boundary
merged = unary_union(lines)
print(f'Merged line type: {merged.geom_type}')
if merged.geom_type == 'MultiLineString':
# Try to create polygon from merged boundary
try:
from shapely.ops import polygonize
result_polys = list(polygonize(merged))
print(f'Polygonize merged produced {len(result_polys)} polygons')
for p in result_polys:
print(f' Area: {p.area:.8f}')
except Exception as e:
print(f'Error: {e}')
# Let me try a different approach - build the ring manually
# For districts, the outer ways form a closed ring when connected
# Let's extract all coordinates in order
all_coords = []
for m in outer_members:
geom = m.get('geometry', [])
coords = [(p['lon'], p['lat']) for p in geom]
all_coords.append(coords)
print(f'\\nCoordinate segments:')
for i, seg in enumerate(all_coords):
print(f' Segment {i}: {len(seg)} points, first={seg[0]}, last={seg[-1]}')
# Check connections
for i in range(len(all_coords)):
j = (i + 1) % len(all_coords)
if all_coords[i][-1] != all_coords[j][0]:
print(f' Gap between {i} (end={all_coords[i][-1]}) and {j} (start={all_coords[j][0]})')
else:
print(f' Connected: {i} -> {j}')
break
"
Outer members: 11 Polygonize produced 1 polygons Merged line type: MultiLineString Polygonize merged produced 1 polygons Area: 0.00232716 Coordinate segments: Segment 0: 46 points, first=(16.3674575, 48.2251075), last=(16.3952727, 48.2348255) Segment 1: 15 points, first=(16.3674575, 48.2251075), last=(16.370149, 48.2184891) Segment 2: 41 points, first=(16.3848946, 48.2113547), last=(16.370149, 48.2184891) Segment 3: 60 points, first=(16.3848946, 48.2113547), last=(16.4301686, 48.1866113) Segment 4: 2 points, first=(16.4301686, 48.1866113), last=(16.4310992, 48.1863419) Segment 5: 4 points, first=(16.4605003, 48.1777857), last=(16.4310992, 48.1863419) Segment 6: 10 points, first=(16.4797281, 48.1706977), last=(16.4605003, 48.1777857) Segment 7: 13 points, first=(16.4946641, 48.1653265), last=(16.4797281, 48.1706977) Segment 8: 3 points, first=(16.497085, 48.1662737), last=(16.4946641, 48.1653265) Segment 9: 22 points, first=(16.398134, 48.2363479), last=(16.497085, 48.1662737) Segment 10: 2 points, first=(16.3952727, 48.2348255), last=(16.398134, 48.2363479) Gap between 0 (end=(16.3952727, 48.2348255)) and 1 (start=(16.3674575, 48.2251075)) Gap between 1 (end=(16.370149, 48.2184891)) and 2 (start=(16.3848946, 48.2113547)) Gap between 2 (end=(16.370149, 48.2184891)) and 3 (start=(16.3848946, 48.2113547)) Connected: 3 -> 4 Gap between 4 (end=(16.4310992, 48.1863419)) and 5 (start=(16.4605003, 48.1777857)) Gap between 5 (end=(16.4310992, 48.1863419)) and 6 (start=(16.4797281, 48.1706977)) Gap between 6 (end=(16.4605003, 48.1777857)) and 7 (start=(16.4946641, 48.1653265)) Gap between 7 (end=(16.4797281, 48.1706977)) and 8 (start=(16.497085, 48.1662737)) Gap between 8 (end=(16.4946641, 48.1653265)) and 9 (start=(16.398134, 48.2363479)) Gap between 9 (end=(16.497085, 48.1662737)) and 10 (start=(16.3952727, 48.2348255)) Gap between 10 (end=(16.398134, 48.2363479)) and 0 (start=(16.3674575, 48.2251075))
96:32
Ah, I see the issue. The ways are not in order and some share endpoints. Let me look more carefully. Actually polygonize with unary_union of the lines works fine (it produced 1 polygon with area 0.00232716). The issue is that my build_ring function is getting confused by the unordered segments. Let me use polygonize approach instead - merge all outer lines and polygonize them, which is more robust.
96:32
96:32
Write
/work/process.py
content · 355 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
Uses polygonize on merged outer lines to handle unordered way segments.
"""
import json
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon, MultiPolygon, LineString, LinearRing, shape, mapping, Point
from shapely.ops import unary_union, polygonize, linemerge
from shapely.validation import make_valid
from collections import Counter, defaultdict
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry.
Uses polygonize on merged outer way lines, then picks the largest
polygon and subtracts inner ways.
"""
members = elem.get('members', [])
if not members:
return None
outer_lines = []
inner_lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) < 2:
continue
coords = [(p['lon'], p['lat']) for p in geom]
line = LineString(coords)
role = m.get('role', 'outer')
if role == 'inner':
inner_lines.append(line)
else:
outer_lines.append(line)
if not outer_lines:
return None
# Merge outer lines and polygonize
try:
merged_outer = unary_union(outer_lines)
except Exception:
merged_outer = outer_lines[0]
for l in outer_lines[1:]:
merged_outer = merged_outer.union(l)
# polygonize the merged linework
polygons = list(polygonize(merged_outer))
if not polygons:
return None
# Pick the largest polygon (the actual district)
main_poly = max(polygons, key=lambda p: p.area)
# Subtract inner areas
if inner_lines:
try:
inner_merged = unary_union(inner_lines)
inner_polys = list(polygonize(inner_merged))
if inner_polys:
inner_area = unary_union(inner_polys)
main_poly = main_poly.difference(inner_area)
except Exception:
pass
# Make valid
if not main_poly.is_valid:
main_poly = make_valid(main_poly)
return main_poly
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
n = name.lower().strip()
n = n.replace('straße', 'strasse')
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
return n
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type == 'MultiPolygon':
# Keep only the largest polygon if multipolygon with tiny slivers
if len(poly.geoms) > 1:
sorted_geoms = sorted(poly.geoms, key=lambda p: p.area, reverse=True)
main = sorted_geoms[0]
total = sum(p.area for p in sorted_geoms)
if main.area / total > 0.99:
poly = main
cur_polys[norm] = (name, poly)
elif poly.geom_type == 'Polygon':
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
if poly.geom_type == 'MultiPolygon':
if len(poly.geoms) > 1:
sorted_geoms = sorted(poly.geoms, key=lambda p: p.area, reverse=True)
main = sorted_geoms[0]
total = sum(p.area for p in sorted_geoms)
if main.area / total > 0.99:
poly = main
old_polys[norm] = (name, poly)
elif poly.geom_type == 'Polygon':
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} has type {poly.geom_type}")
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
for k, (n, p) in sorted(cur_polys.items()):
print(f" {n} ({k}) - area: {p.area:.8f}")
print(f"\n2014 districts: {len(old_polys)}")
for k, (n, p) in sorted(old_polys.items()):
print(f" {n} ({k}) - area: {p.area:.8f}")
# Find common district names
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"\nCommon: {sorted(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Match by ref if needed
if only_cur or only_old:
print("\nAttempting ref-based name matching...")
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' -> mapping")
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
print(f"\nCommon after ref matching: {sorted(common_norms)}")
# Compute overall boundary extents
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
print(f"\nCurrent union area: {all_current_union.area:.8f}")
print(f"Old union area: {all_old_union.area:.8f}")
# For each district, compute boundary changes
district_change_geoms = {}
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...")
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly)
removed_part = old_poly.difference(cur_poly)
print(f" Intersection area: {intersection.area:.10f}")
print(f" Added area: {added_part.area:.10f}")
print(f" Removed area: {removed_part.area:.10f}")
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Create features
features = []
def add_geom(geom, change_type, name):
if geom is None or geom.is_empty:
return
if geom.geom_type == 'MultiPolygon':
for poly in geom.geoms:
if poly.area > 1e-12: # Filter tiny slivers
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(poly)
})
elif geom.geom_type == 'Polygon':
if geom.area > 1e-12:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(geom)
})
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
add_geom(info['added'], 'added_since_2014', cur_name)
add_geom(info['removed'], 'removed_since_2014', cur_name)
add_geom(info['intersection'], 'unchanged', cur_name)
print(f"\nTotal raw features: {len(features)}")
# Merge features by district_name and change_type
groups = defaultdict(list)
for f in features:
key = (f['properties']['district_name'], f['properties']['change_type'])
groups[key].append(shape(f['geometry']))
merged_features = []
for (district_name, change_type), geoms in groups.items():
merged = unary_union(geoms)
if merged.is_empty:
continue
if merged.geom_type == 'MultiPolygon':
polys = [g for g in merged.geoms if g.area > 1e-12]
else:
polys = [merged] if merged.area > 1e-12 else []
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district_name,
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Compute touches_changed for each feature
# Collect all changed geometries
changed_geoms = []
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_geoms.append(shape(f['geometry']))
changed_union = unary_union(changed_geoms) if changed_geoms else None
print(f"\nChanged union area: {changed_union.area if changed_union else 0:.10f}")
# For each feature, check if its district borders any changed area
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
touches = False
if changed_union and not changed_union.is_empty:
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
if cur_info:
cur_poly = cur_info[1]
try:
if cur_poly.touches(changed_union):
touches = True
except Exception:
pass
if not touches and old_info:
old_poly = old_info[1]
try:
if old_poly.touches(changed_union):
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"\nChange type counts: {dict(change_counts)}")
# Districts with any changes
districts_with_changes = set()
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
districts_with_changes.add(f['properties']['district_name'])
print(f"Districts with changes: {sorted(districts_with_changes)}")
# Districts touching changed
touch_districts = set()
for f in merged_features:
if f['properties']['touches_changed']:
touch_districts.add(f['properties']['district_name'])
print(f"Districts touching changed: {sorted(touch_districts)}")
print(f"\nSample features (first 10):")
for f in merged_features[:10]:
g = shape(f['geometry'])
print(f" {f['properties']['district_name']}: {f['properties']['change_type']}, touches={f['properties']['touches_changed']}, area={g.area:.10f}")
Wrote /work/process.py (11698 chars).
96:40
The process script still has issues with building polygons. The problem is that `polygonize` works on the linework but the issue is likely that the ways don't form proper closed rings when polygonized because they're not all connected into a single network. Let me debug the actual geometry more carefully. Actually, the issue is that for complex districts like Leopoldstadt, the district has 11 outer way members but they don't connect contiguously - some of them are separate islands (outer ways for different parts of the boundary). When we do `unary_union` then `polygonize`, it should work for any collection of lines that form closed rings. Let me try the polygonize approach more carefully.
96:40
Bash
python3 -c "
python3 -c "
import json
from shapely.geometry import LineString
from shapely.ops import unary_union, polygonize
with open('/work/districts_current_geom.json') as f:
data = json.load(f)
for e in data['elements']:
name = e.get('tags', {}).get('name', '')
members = e['members']
outer_lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) < 2:
continue
coords = [(p['lon'], p['lat']) for p in geom]
line = LineString(coords)
if m.get('role', 'outer') == 'outer':
outer_lines.append(line)
merged = unary_union(outer_lines)
polys = list(polygonize(merged))
if polys:
main = max(polys, key=lambda p: p.area)
print(f'{name}: polygonize produced {len(polys)} polygons, main area={main.area:.8f}')
else:
print(f'{name}: polygonize produced 0 polygons!')
# Check if merged boundary is valid
print(f' Merged type: {merged.geom_type}, is_empty={merged.is_empty}')
" 2>&1 | head -30
Leopoldstadt: polygonize produced 1 polygons, main area=0.00232716 Mariahilf: polygonize produced 1 polygons, main area=0.00017618 Meidling: polygonize produced 1 polygons, main area=0.00097967 Neubau: polygonize produced 1 polygons, main area=0.00019452 Brigittenau: polygonize produced 1 polygons, main area=0.00069161 Donaustadt: polygonize produced 1 polygons, main area=0.01238082 Döbling: polygonize produced 1 polygons, main area=0.00302130 Favoriten: polygonize produced 1 polygons, main area=0.00384630 Floridsdorf: polygonize produced 1 polygons, main area=0.00538470 Hernals: polygonize produced 1 polygons, main area=0.00137956 Liesing: polygonize produced 1 polygons, main area=0.00387493 Margareten: polygonize produced 1 polygons, main area=0.00024329 Ottakring: polygonize produced 1 polygons, main area=0.00104948 Simmering: polygonize produced 1 polygons, main area=0.00281169 Wieden: polygonize produced 1 polygons, main area=0.00021484 Penzing: polygonize produced 1 polygons, main area=0.00408591 Rudolfsheim-Fünfhaus: polygonize produced 1 polygons, main area=0.00047390 Alsergrund: polygonize produced 1 polygons, main area=0.00035913 Hietzing: polygonize produced 1 polygons, main area=0.00456027 Innere Stadt: polygonize produced 1 polygons, main area=0.00034707 Josefstadt: polygonize produced 1 polygons, main area=0.00013189 Währing: polygonize produced 1 polygons, main area=0.00076838 Landstraße: polygonize produced 1 polygons, main area=0.00089591
97:07
It looks like the polygonize approach works fine! The issue earlier was that my `build_polygon_from_members` was producing GeometryCollections because `make_valid` might convert some invalid polygons to GeometryCollections. Let me fix the script to use polygonize directly and handle the output properly.
97:07
97:07
Write
/work/process.py
content · 333 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
Uses polygonize on merged outer lines to handle unordered way segments.
"""
import json
from shapely.geometry import Polygon, MultiPolygon, LineString, shape, mapping
from shapely.ops import unary_union, polygonize
from shapely.validation import make_valid
from collections import Counter, defaultdict
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry.
Uses polygonize on merged outer way lines, then picks the largest
polygon and subtracts inner ways.
"""
members = elem.get('members', [])
if not members:
return None
outer_lines = []
inner_lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) < 2:
continue
coords = [(p['lon'], p['lat']) for p in geom]
line = LineString(coords)
role = m.get('role', 'outer')
if role == 'inner':
inner_lines.append(line)
else:
outer_lines.append(line)
if not outer_lines:
return None
# Merge outer lines and polygonize
merged_outer = unary_union(outer_lines)
polygons = list(polygonize(merged_outer))
if not polygons:
return None
# Pick the largest polygon (the actual district)
main_poly = max(polygons, key=lambda p: p.area)
# Subtract inner areas
if inner_lines:
try:
inner_merged = unary_union(inner_lines)
inner_polys = list(polygonize(inner_merged))
if inner_polys:
inner_area = unary_union(inner_polys)
main_poly = main_poly.difference(inner_area)
except Exception:
pass
# Make valid if needed
if not main_poly.is_valid:
valid = make_valid(main_poly)
if valid.geom_type == 'Polygon':
main_poly = valid
elif valid.geom_type == 'MultiPolygon':
# Take the largest component
main_poly = max(valid.geoms, key=lambda p: p.area)
# else keep original
return main_poly
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
n = name.lower().strip()
n = n.replace('straße', 'strasse')
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
n = n.replace('-', '')
n = n.replace(' ', '')
return n
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
for k, (n, p) in sorted(cur_polys.items()):
print(f" {n} - area: {p.area:.8f}")
print(f"\n2014 districts: {len(old_polys)}")
for k, (n, p) in sorted(old_polys.items()):
print(f" {n} - area: {p.area:.8f}")
# Match names
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"\nCommon: {sorted(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Match by ref if needed
if only_cur or only_old:
print("\nAttempting ref-based name matching...")
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' -> mapping")
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
print(f"\nCommon after ref matching: {sorted(common_norms)}")
# Compute overall boundary extents
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
print(f"\nCurrent union area: {all_current_union.area:.8f}")
print(f"Old union area: {all_old_union.area:.8f}")
# For each district, compute boundary changes
district_change_geoms = {}
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...")
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly)
removed_part = old_poly.difference(cur_poly)
print(f" Intersection area: {intersection.area:.10f}")
print(f" Added area: {added_part.area:.10f}")
print(f" Removed area: {removed_part.area:.10f}")
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Create features
features = []
def add_geom(geom, change_type, name):
if geom is None or geom.is_empty:
return
if geom.geom_type == 'MultiPolygon':
for poly in geom.geoms:
if poly.area > 1e-12:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(poly)
})
elif geom.geom_type == 'Polygon':
if geom.area > 1e-12:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(geom)
})
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
add_geom(info['added'], 'added_since_2014', cur_name)
add_geom(info['removed'], 'removed_since_2014', cur_name)
add_geom(info['intersection'], 'unchanged', cur_name)
print(f"\nTotal raw features: {len(features)}")
# Merge features by district_name and change_type
groups = defaultdict(list)
for f in features:
key = (f['properties']['district_name'], f['properties']['change_type'])
groups[key].append(shape(f['geometry']))
merged_features = []
for (district_name, change_type), geoms in groups.items():
merged = unary_union(geoms)
if merged.is_empty:
continue
if merged.geom_type == 'MultiPolygon':
polys = [g for g in merged.geoms if g.area > 1e-12]
else:
polys = [merged] if merged.area > 1e-12 else []
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district_name,
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Compute touches_changed
changed_geoms = []
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_geoms.append(shape(f['geometry']))
changed_union = unary_union(changed_geoms) if changed_geoms else None
print(f"\nChanged union area: {changed_union.area if changed_union else 0:.10f}")
# For each feature, check if its district borders any changed area
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
touches = False
if changed_union and not changed_union.is_empty:
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
if cur_info:
cur_poly = cur_info[1]
try:
if cur_poly.touches(changed_union):
touches = True
except Exception:
pass
if not touches and old_info:
old_poly = old_info[1]
try:
if old_poly.touches(changed_union):
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"\nChange type counts: {dict(change_counts)}")
districts_with_changes = set()
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
districts_with_changes.add(f['properties']['district_name'])
print(f"Districts with changes: {sorted(districts_with_changes)}")
touch_districts = set()
for f in merged_features:
if f['properties']['touches_changed']:
touch_districts.add(f['properties']['district_name'])
print(f"Districts touching changed: {sorted(touch_districts)}")
print(f"\nSample features (first 10):")
for f in merged_features[:10]:
g = shape(f['geometry'])
print(f" {f['properties']['district_name']}: {f['properties']['change_type']}, touches={f['properties']['touches_changed']}, area={g.area:.10f}")
# Check file size
import os
size = os.path.getsize('/work/vienna_boundary_changes.geojson')
print(f"\nFile size: {size} bytes")
Wrote /work/process.py (10611 chars).
97:14
Bash
python3 /work/process.py
Building geometries for current data...
Building geometries for 2014 data...
Current districts: 23
Alsergrund - area: 0.00035913
Brigittenau - area: 0.00069161
Döbling - area: 0.00302130
Donaustadt - area: 0.01238082
Favoriten - area: 0.00384630
Floridsdorf - area: 0.00538470
Hernals - area: 0.00137956
Hietzing - area: 0.00456027
Innere Stadt - area: 0.00034707
Josefstadt - area: 0.00013189
Landstraße - area: 0.00089591
Leopoldstadt - area: 0.00232716
Liesing - area: 0.00387493
Margareten - area: 0.00024329
Mariahilf - area: 0.00017618
Meidling - area: 0.00097965
Neubau - area: 0.00019452
Ottakring - area: 0.00104948
Penzing - area: 0.00408591
Rudolfsheim-Fünfhaus - area: 0.00047390
Simmering - area: 0.00281169
Währing - area: 0.00076838
Wieden - area: 0.00021484
2014 districts: 23
Alsergrund - area: 0.00035903
Brigittenau - area: 0.00069151
Döbling - area: 0.00302121
Donaustadt - area: 0.01237388
Favoriten - area: 0.00384632
Floridsdorf - area: 0.00538470
Hernals - area: 0.00137908
Hietzing - area: 0.00456027
Innere Stadt - area: 0.00034722
Josefstadt - area: 0.00013207
Landstraße - area: 0.00089523
Leopoldstadt - area: 0.00233471
Liesing - area: 0.00387577
Margareten - area: 0.00024324
Mariahilf - area: 0.00017579
Meidling - area: 0.00097973
Neubau - area: 0.00019409
Ottakring - area: 0.00104998
Penzing - area: 0.00408590
Rudolfsheim-Fünfhaus - area: 0.00047445
Simmering - area: 0.00281086
Währing - area: 0.00076839
Wieden - area: 0.00021481
Common: ['alsergrund', 'brigittenau', 'doebling', 'donaustadt', 'favoriten', 'floridsdorf', 'hernals', 'hietzing', 'innerestadt', 'josefstadt', 'landstrasse', 'leopoldstadt', 'liesing', 'margareten', 'mariahilf', 'meidling', 'neubau', 'ottakring', 'penzing', 'rudolfsheimfuenfhaus', 'simmering', 'waehring', 'wieden']
Only current: []
Only 2014: []
Common after ref matching: ['alsergrund', 'brigittenau', 'doebling', 'donaustadt', 'favoriten', 'floridsdorf', 'hernals', 'hietzing', 'innerestadt', 'josefstadt', 'landstrasse', 'leopoldstadt', 'liesing', 'margareten', 'mariahilf', 'meidling', 'neubau', 'ottakring', 'penzing', 'rudolfsheimfuenfhaus', 'simmering', 'waehring', 'wieden']
Current union area: 0.05019849
Old union area: 0.05019825
Processing Hietzing...
Intersection area: 0.0045599172
Added area: 0.0000003479
Removed area: 0.0000003517
Processing Hernals...
Intersection area: 0.0013781265
Added area: 0.0000014371
Removed area: 0.0000009508
Processing Ottakring...
Intersection area: 0.0010488677
Added area: 0.0000006143
Removed area: 0.0000011111
Processing Landstraße...
Intersection area: 0.0008950055
Added area: 0.0000009006
Removed area: 0.0000002287
Processing Floridsdorf...
Intersection area: 0.0053844130
Added area: 0.0000002850
Removed area: 0.0000002918
Processing Wieden...
Intersection area: 0.0002146687
Added area: 0.0000001684
Removed area: 0.0000001432
Processing Brigittenau...
Intersection area: 0.0006913255
Added area: 0.0000002889
Removed area: 0.0000001870
Processing Meidling...
Intersection area: 0.0009792523
Added area: 0.0000004004
Removed area: 0.0000004806
Processing Donaustadt...
Intersection area: 0.0123733615
Added area: 0.0000074603
Removed area: 0.0000005143
Processing Innere Stadt...
Intersection area: 0.0003469769
Added area: 0.0000000895
Removed area: 0.0000002386
Processing Neubau...
Intersection area: 0.0001939566
Added area: 0.0000005659
Removed area: 0.0000001306
Processing Favoriten...
Intersection area: 0.0038445072
Added area: 0.0000017924
Removed area: 0.0000018118
Processing Margareten...
Intersection area: 0.0002432142
Added area: 0.0000000785
Removed area: 0.0000000279
Processing Leopoldstadt...
Intersection area: 0.0023267692
Added area: 0.0000003926
Removed area: 0.0000079395
Processing Liesing...
Intersection area: 0.0038737405
Added area: 0.0000011884
Removed area: 0.0000020251
Processing Simmering...
Intersection area: 0.0028106739
Added area: 0.0000010154
Removed area: 0.0000001881
Processing Rudolfsheim-Fünfhaus...
Intersection area: 0.0004732979
Added area: 0.0000005993
Removed area: 0.0000011551
Processing Döbling...
Intersection area: 0.0030209021
Added area: 0.0000003971
Removed area: 0.0000003101
Processing Währing...
Intersection area: 0.0007677834
Added area: 0.0000005990
Removed area: 0.0000006099
Processing Josefstadt...
Intersection area: 0.0001318689
Added area: 0.0000000165
Removed area: 0.0000002046
Processing Mariahilf...
Intersection area: 0.0001757549
Added area: 0.0000004291
Removed area: 0.0000000378
Processing Alsergrund...
Intersection area: 0.0003588467
Added area: 0.0000002837
Removed area: 0.0000001799
Processing Penzing...
Intersection area: 0.0040845676
Added area: 0.0000013448
Removed area: 0.0000013303
Total raw features: 2584
Merged features: 2584
Changed union area: 0.0000224023
Saved to /work/vienna_boundary_changes.geojson
Change type counts: {'added_since_2014': 1273, 'removed_since_2014': 1288, 'unchanged': 23}
Districts with changes: ['Alsergrund', 'Brigittenau', 'Donaustadt', 'Döbling', 'Favoriten', 'Floridsdorf', 'Hernals', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Landstraße', 'Leopoldstadt', 'Liesing', 'Margareten', 'Mariahilf', 'Meidling', 'Neubau', 'Ottakring', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Simmering', 'Wieden', 'Währing']
Districts touching changed: []
Sample features (first 10):
Hietzing: added_since_2014, touches=False, area=0.0000000000
Hietzing: added_since_2014, touches=False, area=0.0000000040
Hietzing: added_since_2014, touches=False, area=0.0000000213
Hietzing: added_since_2014, touches=False, area=0.0000000000
Hietzing: added_since_2014, touches=False, area=0.0000000489
Hietzing: added_since_2014, touches=False, area=0.0000000058
Hietzing: added_since_2014, touches=False, area=0.0000000000
Hietzing: added_since_2014, touches=False, area=0.0000000360
Hietzing: added_since_2014, touches=False, area=0.0000000005
Hietzing: added_since_2014, touches=False, area=0.0000000139
File size: 2998924 bytes
97:30
There are far too many tiny fragments - 2584 features for tiny changes. The issue is that when computing differences between two polygons, the result is a lot of tiny sliver polygons because the boundaries don't exactly match due to slightly different coordinate representations. I need to apply a simplification/tolerance to remove these tiny slivers. Let me use a buffer-based approach: buffer by 0 (or very small positive amount) to snap, or use a tolerance to filter out slivers. Better yet, I should use a snapping tolerance when computing the difference. Actually, the proper approach is to: 1. Use a small tolerance (e.g., 1e-7 degrees ≈ 1cm) to simplify the geometries slightly before comparison 2. Filter out fragments smaller than a threshold Let me think about what tolerance makes sense. Differences in the member way geometries between 2014 and current could be due to: - Actual boundary changes - Different node coordinates (e.g., node movement within OSM) - Different way compositions (the boundary might use different ways now) The tiny slivers (1e-15 to 1e-8 area) are almost certainly numerical artifacts. Let me use a realistic threshold. Actually, the better approach is to: 1. Snap geometries to a small grid before comparison 2. Use a minimum area filter Let me use shapely's `snap` function or a buffer(0) approach. Actually the simplest is to use a tolerance in the difference operation. Let me restructure: - For each district, compute the symmetric difference - Filter out fragments below a meaningful area threshold (like 1e-10 square degrees ≈ 1 sq meter) - Also snap coordinates to a grid for cleaner results Actually the most reliable approach might be: 1. Simplify both polygons slightly (e.g., 1e-8 tolerance) 2. Compute difference/intersection 3. Remove tiny slivers (area < 1e-10) Let me also check if the "unchanged" polygons need simplification - they're the intersection which could also be sliver-filled.
97:30
Bash
python3 -c "
python3 -c "
# Check the scale of real changes vs artifacts
import json
from shapely.geometry import shape
with open('/work/vienna_boundary_changes.geojson') as f:
data = json.load(f)
areas = {}
for f in data['features']:
g = shape(f['geometry'])
ct = f['properties']['change_type']
if ct != 'unchanged':
area = g.area
key = (f['properties']['district_name'], ct)
if key not in areas:
areas[key] = 0
areas[key] += area
for key, total in sorted(areas.items(), key=lambda x: -x[1]):
print(f'{key[0]:30s} {key[1]:20s} total_area={total:.10f}')
# Check min/max areas of individual fragments
frag_areas = []
for f in data['features']:
if f['properties']['change_type'] != 'unchanged':
frag_areas.append(shape(f['geometry']).area)
if frag_areas:
print(f'\\nFragment areas: min={min(frag_areas):.15f}, max={max(frag_areas):.10f}, count={len(frag_areas)}')
# count above thresholds
for thresh in [1e-15, 1e-12, 1e-10, 1e-8, 1e-6]:
cnt = sum(1 for a in frag_areas if a > thresh)
print(f' > {thresh}: {cnt}')
"
Leopoldstadt removed_since_2014 total_area=0.0000079395 Donaustadt added_since_2014 total_area=0.0000074603 Liesing removed_since_2014 total_area=0.0000020251 Favoriten removed_since_2014 total_area=0.0000018118 Favoriten added_since_2014 total_area=0.0000017924 Hernals added_since_2014 total_area=0.0000014371 Penzing added_since_2014 total_area=0.0000013448 Penzing removed_since_2014 total_area=0.0000013303 Liesing added_since_2014 total_area=0.0000011884 Rudolfsheim-Fünfhaus removed_since_2014 total_area=0.0000011551 Ottakring removed_since_2014 total_area=0.0000011111 Simmering added_since_2014 total_area=0.0000010154 Hernals removed_since_2014 total_area=0.0000009508 Landstraße added_since_2014 total_area=0.0000009006 Ottakring added_since_2014 total_area=0.0000006143 Währing removed_since_2014 total_area=0.0000006099 Rudolfsheim-Fünfhaus added_since_2014 total_area=0.0000005993 Währing added_since_2014 total_area=0.0000005990 Neubau added_since_2014 total_area=0.0000005659 Donaustadt removed_since_2014 total_area=0.0000005143 Meidling removed_since_2014 total_area=0.0000004806 Mariahilf added_since_2014 total_area=0.0000004291 Meidling added_since_2014 total_area=0.0000004004 Döbling added_since_2014 total_area=0.0000003971 Leopoldstadt added_since_2014 total_area=0.0000003926 Hietzing removed_since_2014 total_area=0.0000003517 Hietzing added_since_2014 total_area=0.0000003479 Döbling removed_since_2014 total_area=0.0000003101 Floridsdorf removed_since_2014 total_area=0.0000002918 Brigittenau added_since_2014 total_area=0.0000002889 Floridsdorf added_since_2014 total_area=0.0000002850 Alsergrund added_since_2014 total_area=0.0000002837 Innere Stadt removed_since_2014 total_area=0.0000002386 Landstraße removed_since_2014 total_area=0.0000002287 Josefstadt removed_since_2014 total_area=0.0000002046 Simmering removed_since_2014 total_area=0.0000001881 Brigittenau removed_since_2014 total_area=0.0000001870 Alsergrund removed_since_2014 total_area=0.0000001799 Wieden added_since_2014 total_area=0.0000001684 Wieden removed_since_2014 total_area=0.0000001432 Neubau removed_since_2014 total_area=0.0000001306 Innere Stadt added_since_2014 total_area=0.0000000895 Margareten added_since_2014 total_area=0.0000000785 Mariahilf removed_since_2014 total_area=0.0000000378 Margareten removed_since_2014 total_area=0.0000000278 Josefstadt added_since_2014 total_area=0.0000000165 Fragment areas: min=0.000000000001005, max=0.0000046140, count=2561 > 1e-15: 2561 > 1e-12: 2561 > 1e-10: 1973 > 1e-08: 430 > 1e-06: 7
98:06
The results look plausible but there are many tiny fragments that are likely numerical artifacts from polygon operations. I need to filter these out with a reasonable area threshold. Also, the `touches_changed` is False for all - that's because the tiny fragments are so small they don't touch. Let me use a more robust approach. Let me use a minimum area threshold (e.g., 1e-10 or even 1e-8 in degrees-squared) to filter out numerical noise. Let me also compute total changes in a more meaningful way and set touches_changed properly. Looking at the data: - The largest real changes are Leopoldstadt removed (0.00000794 sq.deg) ≈ ~0.88 km² (very roughly, near Vienna 1 deg ≈ 111km) - Donaustadt added (0.00000746) - These are real changes (district boundary adjustments) The tiny fragments (1e-15 to 1e-10) are numerical noise from the intersection/difference operations on nearly-identical geometries. Let me filter with a threshold that keeps real changes but removes noise. A threshold of 1e-10 degrees² (about 0.000012 km² or 12 m²) should work. Actually, looking at the data, even 1e-9 might be too aggressive. Let me use something like 1e-12 to keep the very tiny ones but still filter out extreme noise. Actually, let me rethink. The issue is that the polygon operations on large polygons generate tiny sliver polygons at boundaries where coordinates don't perfectly align. These are mathematical artifacts, not real changes. A better approach: use a minimum area threshold based on the coordinate precision. Overpass coordinates have ~7 decimal places, so precision is about 1e-7 degrees. A meaningful area threshold could be around 1e-10 degrees². Let me filter with 1e-10 as threshold and see what remains.
98:06
Write
/work/process.py
content · 363 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
Uses polygonize on merged outer lines to handle unordered way segments.
Filters out tiny sliver artifacts from geometric computations.
"""
import json
from shapely.geometry import Polygon, MultiPolygon, LineString, shape, mapping
from shapely.ops import unary_union, polygonize
from shapely.validation import make_valid
from collections import Counter, defaultdict
# Minimum area threshold in square degrees to filter numerical artifacts
# Coordinate precision is ~1e-7 deg, so smallest meaningful area ~1e-14 deg²
# Real changes should be > 1e-10 deg² (about 1 m² at Vienna's latitude)
MIN_AREA = 1e-10
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry."""
members = elem.get('members', [])
if not members:
return None
outer_lines = []
inner_lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) < 2:
continue
coords = [(p['lon'], p['lat']) for p in geom]
line = LineString(coords)
role = m.get('role', 'outer')
if role == 'inner':
inner_lines.append(line)
else:
outer_lines.append(line)
if not outer_lines:
return None
merged_outer = unary_union(outer_lines)
polygons = list(polygonize(merged_outer))
if not polygons:
return None
main_poly = max(polygons, key=lambda p: p.area)
if inner_lines:
try:
inner_merged = unary_union(inner_lines)
inner_polys = list(polygonize(inner_merged))
if inner_polys:
inner_area = unary_union(inner_polys)
main_poly = main_poly.difference(inner_area)
except Exception:
pass
if not main_poly.is_valid:
valid = make_valid(main_poly)
if valid.geom_type == 'Polygon':
main_poly = valid
elif valid.geom_type == 'MultiPolygon':
main_poly = max(valid.geoms, key=lambda p: p.area)
return main_poly
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
n = name.lower().strip()
n = n.replace('straße', 'strasse')
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
n = n.replace('-', '')
n = n.replace(' ', '')
return n
def filter_geom(geom, min_area=MIN_AREA):
"""Remove tiny polygons from a geometry."""
if geom is None or geom.is_empty:
return None
if geom.geom_type == 'MultiPolygon':
filtered = [p for p in geom.geoms if p.area >= min_area]
if not filtered:
return None
if len(filtered) == 1:
return filtered[0]
return MultiPolygon(filtered)
elif geom.geom_type == 'Polygon':
if geom.area >= min_area:
return geom
return None
return geom
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
print(f"2014 districts: {len(old_polys)}")
# Match names
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"Common: {len(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Match by ref if needed
if only_cur or only_old:
print("\nAttempting ref-based name matching...")
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' -> mapping")
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
print(f"Common after ref matching: {len(common_norms)}")
# Compute overall boundary extents
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
print(f"\nCurrent union area: {all_current_union.area:.8f}")
print(f"Old union area: {all_old_union.area:.8f}")
# For each district, compute boundary changes
district_change_geoms = {}
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...", end=" ")
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly)
removed_part = old_poly.difference(cur_poly)
# Filter out tiny slivers
added_part = filter_geom(added_part)
removed_part = filter_geom(removed_part)
intersection = filter_geom(intersection)
print(f"added={added_part.area if added_part else 0:.10f} removed={removed_part.area if removed_part else 0:.10f}")
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Create features
features = []
def add_geom(geom, change_type, name):
if geom is None or geom.is_empty:
return
if geom.geom_type == 'MultiPolygon':
for poly in geom.geoms:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(poly)
})
elif geom.geom_type == 'Polygon':
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(geom)
})
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
add_geom(info['added'], 'added_since_2014', cur_name)
add_geom(info['removed'], 'removed_since_2014', cur_name)
add_geom(info['intersection'], 'unchanged', cur_name)
print(f"\nTotal raw features: {len(features)}")
# Merge features by district_name and change_type
groups = defaultdict(list)
for f in features:
key = (f['properties']['district_name'], f['properties']['change_type'])
groups[key].append(shape(f['geometry']))
merged_features = []
for (district_name, change_type), geoms in groups.items():
merged = unary_union(geoms)
if merged.is_empty:
continue
if merged.geom_type == 'MultiPolygon':
polys = list(merged.geoms)
else:
polys = [merged]
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district_name,
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Compute touches_changed
# Collect all changed geometries
changed_geoms = []
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_geoms.append(shape(f['geometry']))
changed_union = unary_union(changed_geoms) if changed_geoms else None
print(f"Changed union area: {changed_union.area if changed_union else 0:.10f}")
# For each feature, determine if its district touches any changed area
# A district touches changed if its boundary (current or 2014) shares a border
# with any changed fragment
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
touches = False
if changed_union and not changed_union.is_empty:
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
# Check if the full district polygon touches the changed union
if cur_info:
try:
if cur_info[1].touches(changed_union):
touches = True
except Exception:
pass
if not touches and old_info:
try:
if old_info[1].touches(changed_union):
touches = True
except Exception:
pass
# Also check if this specific fragment itself touches
if not touches:
try:
frag = shape(f['geometry'])
if frag.touches(changed_union):
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"\nChange type counts: {dict(change_counts)}")
# Report total areas per district and change type
district_summary = defaultdict(lambda: {'added': 0.0, 'removed': 0.0, 'unchanged': 0.0})
for f in merged_features:
ct = f['properties']['change_type']
district = f['properties']['district_name']
area = shape(f['geometry']).area
if ct == 'added_since_2014':
district_summary[district]['added'] += area
elif ct == 'removed_since_2014':
district_summary[district]['removed'] += area
else:
district_summary[district]['unchanged'] += area
print("\nDistrict change summary (areas in square degrees):")
print(f"{'District':30s} {'Added':>15s} {'Removed':>15s} {'Touches Changed':>15s}")
print("-" * 80)
for district in sorted(district_summary.keys()):
s = district_summary[district]
# Find touches_changed for this district
touches = False
for f in merged_features:
if f['properties']['district_name'] == district:
if f['properties']['touches_changed']:
touches = True
break
print(f"{district:30s} {s['added']:15.10f} {s['removed']:15.10f} {str(touches):>15s}")
import os
size = os.path.getsize('/work/vienna_boundary_changes.geojson')
print(f"\nFile size: {size} bytes")
Wrote /work/process.py (11676 chars).
98:12
Bash
python3 /work/process.py
Building geometries for current data...
Building geometries for 2014 data...
Current districts: 23
2014 districts: 23
Common: 23
Only current: []
Only 2014: []
Common after ref matching: 23
Current union area: 0.05019849
Old union area: 0.05019825
Processing Alsergrund... added=0.0000002835 removed=0.0000001797
Processing Neubau... added=0.0000005658 removed=0.0000001304
Processing Favoriten... added=0.0000017919 removed=0.0000018114
Processing Rudolfsheim-Fünfhaus... added=0.0000005993 removed=0.0000011550
Processing Brigittenau... added=0.0000002887 removed=0.0000001868
Processing Simmering... added=0.0000010151 removed=0.0000001877
Processing Hernals... added=0.0000014362 removed=0.0000009499
Processing Donaustadt... added=0.0000074599 removed=0.0000005140
Processing Meidling... added=0.0000004001 removed=0.0000004805
Processing Mariahilf... added=0.0000004289 removed=0.0000000377
Processing Margareten... added=0.0000000783 removed=0.0000000276
Processing Währing... added=0.0000005986 removed=0.0000006094
Processing Ottakring... added=0.0000006136 removed=0.0000011108
Processing Hietzing... added=0.0000003476 removed=0.0000003517
Processing Liesing... added=0.0000011880 removed=0.0000020242
Processing Landstraße... added=0.0000009005 removed=0.0000002285
Processing Josefstadt... added=0.0000000164 removed=0.0000002046
Processing Döbling... added=0.0000003968 removed=0.0000003099
Processing Wieden... added=0.0000001681 removed=0.0000001431
Processing Penzing... added=0.0000013436 removed=0.0000013284
Processing Innere Stadt... added=0.0000000895 removed=0.0000002385
Processing Floridsdorf... added=0.0000002847 removed=0.0000002915
Processing Leopoldstadt... added=0.0000003924 removed=0.0000079392
Total raw features: 1996
Merged features: 1996
Changed union area: 0.0000223930
Saved to /work/vienna_boundary_changes.geojson
Change type counts: {'added_since_2014': 981, 'removed_since_2014': 992, 'unchanged': 23}
District change summary (areas in square degrees):
District Added Removed Touches Changed
--------------------------------------------------------------------------------
Alsergrund 0.0000002835 0.0000001797 True
Brigittenau 0.0000002887 0.0000001868 True
Donaustadt 0.0000074599 0.0000005140 False
Döbling 0.0000003968 0.0000003099 True
Favoriten 0.0000017919 0.0000018114 True
Floridsdorf 0.0000002847 0.0000002915 True
Hernals 0.0000014362 0.0000009499 True
Hietzing 0.0000003476 0.0000003517 False
Innere Stadt 0.0000000895 0.0000002385 True
Josefstadt 0.0000000164 0.0000002046 True
Landstraße 0.0000009005 0.0000002285 True
Leopoldstadt 0.0000003924 0.0000079392 True
Liesing 0.0000011880 0.0000020242 True
Margareten 0.0000000783 0.0000000276 True
Mariahilf 0.0000004289 0.0000000377 True
Meidling 0.0000004001 0.0000004805 False
Neubau 0.0000005658 0.0000001304 True
Ottakring 0.0000006136 0.0000011108 True
Penzing 0.0000013436 0.0000013284 True
Rudolfsheim-Fünfhaus 0.0000005993 0.0000011550 True
Simmering 0.0000010151 0.0000001877 True
Wieden 0.0000001681 0.0000001431 True
Währing 0.0000005986 0.0000006094 False
File size: 2612648 bytes
98:27
The touches_changed doesn't seem right - Donaustadt has changes but touches_changed=False? Let me investigate. The issue is that touches() checks if geometries share boundary points but don't overlap interior. The changed areas are within the district boundary, so the district polygon actually contains the changed parts rather than just touching them. Actually, the spec says: "For each district flag whether its geometry borders any area that was added or removed." This means: if the district's border (perimeter) touches any changed area. Since changed areas are INSIDE the district, the district geometry overlaps/contains them, not touches them. Let me think about this differently. "Borders" might mean: the district shares a border with changed areas. Since the changed areas are fragments of the district boundary itself (boundary changes), the district should "border" those changes inherently. Actually, re-reading: "For each district flag whether its geometry borders any area that was added or removed." I think this means: for each district, does this district's boundary border (touch) any area that was either added or removed? The changed areas are the boundary changes - they are at the edges of the district. So a district that had boundary changes would have its boundary touching those changed fragments. But with the `touches()` predicate in Shapely: A.touches(B) means they share boundary points but have no interior intersection. Since the changed fragment is inside the district polygon, the district polygon doesn't touch it - it contains it. I should change the logic: check if the district's boundary (ring) touches the changed areas, or perhaps check if the district's perimeter line has any contact with the changed fragments. Actually, the simplest interpretation: if the district has any added or removed area, then it borders changed areas (since the changes are at its boundary). Let me check if this is what makes sense. Actually, let me re-read: "For each district flag whether its geometry borders any area that was added or removed." This could mean: does the district's geometry border (i.e. is adjacent to) any area that was added or removed? The changed areas are at the boundary of districts. So if a district has added_since_2014 areas, those areas share a boundary with the district. Let me fix the logic to check if the district's boundary line touches/intersects the changed fragments' boundaries.
98:27
Bash
python3 -c "
python3 -c "
from shapely.geometry import Polygon, LineString, Point, MultiPolygon
from shapely.ops import unary_union
# Simulate: a district (current) with a small added part on its boundary
district = Polygon([(0,0), (10,0), (10,10), (0,10), (0,0)])
old_district = Polygon([(1,1), (9,1), (9,9), (1,9), (1,1)])
added = district.difference(old_district)
removed = old_district.difference(district)
print('Added area:', added.area)
print('Removed area:', removed.area)
# Does district touch the added part?
print('District touches added?', district.touches(added)) # False - contains it
# Does district boundary touch added?
boundary = district.boundary
print('District boundary touches added?', boundary.touches(added))
# Does added share boundary with district?
print('Added touches district?', added.touches(district))
# Intersection of boundaries?
print('Boundary intersection area:', district.boundary.intersection(added.boundary).length)
"
Added area: 36.0 Removed area: 0.0 District touches added? False District boundary touches added? True Added touches district? False Boundary intersection area: 40.0
99:22
I see - `touches` checks if the geometry touches at the boundary but doesn't overlap. For a difference operation, the added part is inside the district polygon. So `district.touches(added)` returns False because the added part is actually inside/overlapping, not touching at the boundary. Actually wait, `added` = current district minus old district. The added part is part of the current district polygon, so they overlap. `touches` returns False for overlapping geometries. The question says "flag whether its geometry borders any area that was added or removed". So the question is about whether the district's geometry borders (touches) any changed area. For `added_since_2014` fragments: these are parts of the current district that weren't in 2014. The current district contains them, so the district's geometry doesn't "border" them - it contains them. But the boundary of the added part IS part of the district's boundary. Actually, let me re-read: "For each district flag whether its geometry borders any area that was added or removed." I think the intent is: for each district, does its boundary (as seen in either 2014 or current version) touch any area that has been added or removed? An added area is part of the current district - so the current district contains it, the boundary between the added part and the rest of the district is interior to the current district. But the added part's outer boundary (the part that's not shared with the district) borders neighboring districts. Similarly, a removed area was part of the old district - so the old district contained it. So "borders any area that was added or removed" means: does this district share a boundary with a changed fragment? For a district that HAS added parts: those added parts are within the district itself, so the district doesn't border them. The district borders them only along the boundary where the added part meets neighboring districts. For a district that HAS removed parts: those removed parts are the ones that changed. So "touches_changed" should be True if the district's boundary geometry touches any added or removed fragment. Let me think about this differently: - The changed fragments are: added_since_2014 and removed_since_2014 - A district "borders" a changed area if its boundary touches it For a district that contains an added_since_2014 fragment: the district doesn't "border" it because it's inside. But neighboring districts DO border it. For a district that lost area (removed_since_2014): nearby districts may now border that area. So touches_changed should be True if the district's boundary touches any changed fragment. This means: 1. For districts that have added fragments: the fragments are inside the district, so the district doesn't border them. But check if the fragment's boundary touches the district boundary - that would be the case where the added area is on the edge. 2. For districts that have removed fragments: the fragments were inside the old district. The district [current district] might not border the removed area (since it's no longer part of the district). Actually, I think the most meaningful approach is: a district borders a changed area if the district (either current or old version) shares a boundary with any changed fragment. Wait, actually a simpler and more logical interpretation: - If a district has an `added_since_2014` fragment, that fragment is part of the district, so the district _contains_ it, not borders it. But border districts do border it. - If a district has a `removed_since_2014` fragment, the current district doesn't contain it anymore, and border districts border it. Let me just check: does the feature's geometry itself touch any changed area? And does the full district's boundary touch any changed area? I think the most useful definition: for each feature in the output, touches_changed = True if the district's boundary (in either snapshot) touches any changed fragment (not in the same district). Actually, re-reading the problem: "For each district flag whether its geometry borders any area that was added or removed." This means: for each district (overall, not per fragment), does its geometry border any area that is added or removed? So a district borders an added area if they share a boundary. A district borders a removed area if they share a boundary. Let me think about what this means for each district: - Leopoldstadt has a large removed area. That removed area borders Leopoldstadt (it was part of it) and also borders neighboring districts. - For Leopoldstadt, touches_changed should be True (it borders its own removed area). - For a neighboring district like Brigittenau, touches_changed should also be True because it borders the area that Leopoldstadt has lost. My current approach checks if the district polygon touches the changed_union. But `touches` returns False when one contains the other. So for a district that has added parts, `district.touches(changed_union)` would be False because the district contains the added parts. I need to check if the district's BOUNDARY touches the changed fragments. Let me use `district.boundary.intersects(changed_union)` or check if they share boundary. Actually, let me reconsider. The simplest correct definition: if any part of the changed area touches the district's boundary, then touches_changed=True. For an added fragment that's inside the district: the fragment's boundary touches the rest-of-district which is inside the district, but the district's outer boundary doesn't necessarily touch it. Unless the added fragment extends to the district boundary. Actually, I think I'm overcomplicating this. Let me just use: does the district boundary intersect with any changed fragment boundary? If so, touches_changed = True. Let me update the code.
99:22
Write
/work/process.py
content · 425 lines · py
"""
Process current and 2014 district geometries, compute boundary changes,
and output vienna_boundary_changes.geojson
Uses polygonize on merged outer lines to handle unordered way segments.
Filters out tiny sliver artifacts from geometric computations.
"""
import json
from shapely.geometry import Polygon, MultiPolygon, LineString, shape, mapping
from shapely.ops import unary_union, polygonize
from shapely.validation import make_valid
from collections import Counter, defaultdict
# Minimum area threshold to filter numerical artifacts
MIN_AREA = 1e-10
def build_polygon_from_members(elem):
"""Build a Polygon from a relation's member ways with geometry."""
members = elem.get('members', [])
if not members:
return None
outer_lines = []
inner_lines = []
for m in members:
geom = m.get('geometry', [])
if len(geom) < 2:
continue
coords = [(p['lon'], p['lat']) for p in geom]
line = LineString(coords)
role = m.get('role', 'outer')
if role == 'inner':
inner_lines.append(line)
else:
outer_lines.append(line)
if not outer_lines:
return None
merged_outer = unary_union(outer_lines)
polygons = list(polygonize(merged_outer))
if not polygons:
return None
main_poly = max(polygons, key=lambda p: p.area)
if inner_lines:
try:
inner_merged = unary_union(inner_lines)
inner_polys = list(polygonize(inner_merged))
if inner_polys:
inner_area = unary_union(inner_polys)
main_poly = main_poly.difference(inner_area)
except Exception:
pass
if not main_poly.is_valid:
valid = make_valid(main_poly)
if valid.geom_type == 'Polygon':
main_poly = valid
elif valid.geom_type == 'MultiPolygon':
main_poly = max(valid.geoms, key=lambda p: p.area)
return main_poly
def normalize_name(name):
"""Normalize district names so they match between 2014 and current."""
if not name:
return name
n = name.lower().strip()
n = n.replace('straße', 'strasse')
n = n.replace('ä', 'ae').replace('ö', 'oe').replace('ü', 'ue')
n = n.replace('ß', 'ss')
n = n.replace('-', '')
n = n.replace(' ', '')
return n
def filter_geom(geom, min_area=MIN_AREA):
"""Remove tiny polygons from a geometry."""
if geom is None or geom.is_empty:
return None
if geom.geom_type == 'MultiPolygon':
filtered = [p for p in geom.geoms if p.area >= min_area]
if not filtered:
return None
if len(filtered) == 1:
return filtered[0]
return MultiPolygon(filtered)
elif geom.geom_type == 'Polygon':
if geom.area >= min_area:
return geom
return None
return geom
# Load data
with open('/work/districts_current_geom.json') as f:
cur_data = json.load(f)
with open('/work/districts_2014_geom.json') as f:
old_data = json.load(f)
print("Building geometries for current data...")
cur_polys = {}
for e in cur_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
cur_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print("Building geometries for 2014 data...")
old_polys = {}
for e in old_data['elements']:
name = e.get('tags', {}).get('name', '')
norm = normalize_name(name)
poly = build_polygon_from_members(e)
if poly:
old_polys[norm] = (name, poly)
else:
print(f" Warning: {name} produced no geometry")
print(f"\nCurrent districts: {len(cur_polys)}")
print(f"2014 districts: {len(old_polys)}")
# Match names
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
only_cur = set(cur_polys.keys()) - set(old_polys.keys())
only_old = set(old_polys.keys()) - set(cur_polys.keys())
print(f"Common: {len(common_norms)}")
print(f"Only current: {sorted(only_cur)}")
print(f"Only 2014: {sorted(only_old)}")
# Match by ref if needed
if only_cur or only_old:
print("\nAttempting ref-based name matching...")
cur_by_ref = {}
for e in cur_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
cur_by_ref[ref] = normalize_name(tags.get('name', ''))
old_by_ref = {}
for e in old_data['elements']:
tags = e.get('tags', {})
ref = tags.get('ref', '')
if ref:
old_by_ref[ref] = normalize_name(tags.get('name', ''))
for ref, cur_norm in cur_by_ref.items():
if ref in old_by_ref:
old_norm = old_by_ref[ref]
if cur_norm != old_norm:
print(f" Ref {ref}: cur='{cur_norm}', old='{old_norm}' -> mapping")
if old_norm in old_polys and cur_norm not in old_polys:
old_polys[cur_norm] = old_polys[old_norm]
del old_polys[old_norm]
common_norms = set(cur_polys.keys()) & set(old_polys.keys())
print(f"Common after ref matching: {len(common_norms)}")
# Compute overall boundary extents
all_current_union = unary_union([p for _, p in cur_polys.values()])
all_old_union = unary_union([p for _, p in old_polys.values()])
print(f"\nCurrent union area: {all_current_union.area:.8f}")
print(f"Old union area: {all_old_union.area:.8f}")
# For each district, compute boundary changes
district_change_geoms = {}
for norm in common_norms:
cur_name, cur_poly = cur_polys[norm]
old_name, old_poly = old_polys[norm]
print(f"\nProcessing {cur_name}...", end=" ")
intersection = cur_poly.intersection(old_poly)
added_part = cur_poly.difference(old_poly)
removed_part = old_poly.difference(cur_poly)
# Filter out tiny slivers
added_part = filter_geom(added_part)
removed_part = filter_geom(removed_part)
intersection = filter_geom(intersection)
print(f"added={added_part.area if added_part else 0:.10f} removed={removed_part.area if removed_part else 0:.10f}")
district_change_geoms[norm] = {
'cur_name': cur_name,
'added': added_part,
'removed': removed_part,
'intersection': intersection
}
# Create features
features = []
def add_geom(geom, change_type, name):
if geom is None or geom.is_empty:
return
if geom.geom_type == 'MultiPolygon':
for poly in geom.geoms:
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(poly)
})
elif geom.geom_type == 'Polygon':
features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': name,
},
'geometry': mapping(geom)
})
for norm in common_norms:
info = district_change_geoms[norm]
cur_name = info['cur_name']
add_geom(info['added'], 'added_since_2014', cur_name)
add_geom(info['removed'], 'removed_since_2014', cur_name)
add_geom(info['intersection'], 'unchanged', cur_name)
print(f"\nTotal raw features: {len(features)}")
# Merge features by district_name and change_type
groups = defaultdict(list)
for f in features:
key = (f['properties']['district_name'], f['properties']['change_type'])
groups[key].append(shape(f['geometry']))
merged_features = []
for (district_name, change_type), geoms in groups.items():
merged = unary_union(geoms)
if merged.is_empty:
continue
if merged.geom_type == 'MultiPolygon':
polys = list(merged.geoms)
else:
polys = [merged]
for poly in polys:
merged_features.append({
'type': 'Feature',
'properties': {
'change_type': change_type,
'district_name': district_name,
},
'geometry': mapping(poly)
})
print(f"Merged features: {len(merged_features)}")
# Compute touches_changed for each feature
# A district "borders" a changed area if its boundary shares any edge with
# a changed fragment. We check this by seeing if the union of all changed
# fragments touches the district's boundary.
# Collect all changed geometries
changed_geoms = []
for f in merged_features:
if f['properties']['change_type'] in ('added_since_2014', 'removed_since_2014'):
changed_geoms.append(shape(f['geometry']))
changed_union = unary_union(changed_geoms) if changed_geoms else None
print(f"Changed union area: {changed_union.area if changed_union else 0:.10f}")
# Precompute boundary of changed union
changed_boundary = changed_union.boundary if changed_union and not changed_union.is_empty else None
# For each district, check if its boundary (current or 2014) shares any edge
# with the changed fragments. This is the true "borders" check.
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
touches = False
if changed_boundary and not changed_boundary.is_empty:
# Get the full district polygon (both versions)
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
# Check if the district's boundary intersects the changed boundary
# Use intersects rather than touches since we want to catch shared edges
if cur_info:
try:
# A district borders a changed area if the changed area's boundary
# touches the district (shares an edge)
if cur_info[1].touches(changed_union):
touches = True
# Also check: the fragment's boundary might be inside the district
# (for added parts that are fully inside). In that case, the district
# boundary doesn't touch the changed area.
# But for added/removed parts on the edge, they do.
except Exception:
pass
if not touches and old_info:
try:
if old_info[1].touches(changed_union):
touches = True
except Exception:
pass
f['properties']['touches_changed'] = touches
# However, the above might not catch cases where a district's own added area
# is on its boundary (which borders neighboring districts but not the district itself).
# Let's also check: does this specific feature touch any changed area that belongs
# to a DIFFERENT district?
# For the feature itself: if it's an added fragment of district A, it borders
# district A in the sense that it's part of A. But does it border other districts?
# The problem says "For each district flag whether its geometry borders any area
# that was added or removed."
#
# I think the intent is: does this district share a border with any changed area?
# This means we need to check ALL changed areas, not just those in the same district.
# And the changed areas could be:
# 1. Inside the district -> district doesn't border them (they're inside)
# 2. On the edge of the district -> district borders them
# 3. Outside the district -> district might border them
# So let me recompute touches_changed properly:
# For each district, does its boundary share any edge with changed fragments?
# We'll check ALL districts against ALL changed fragments
# First, compute the "border" relationship:
# A district's boundary borders a changed area if:
# - The changed area touches the district, OR
# - The changed area is inside the district (added part) - in this case the
# district doesn't "border" it, it CONTAINS it.
# Recompute:
district_touches = {}
for norm in common_norms:
cur_info = cur_polys.get(norm)
old_info = old_polys.get(norm)
touches = False
if changed_union and not changed_union.is_empty:
# Check if district boundary touches changed union
if cur_info:
try:
if cur_info[1].touches(changed_union) or cur_info[1].boundary.intersects(changed_boundary):
touches = True
except Exception:
pass
if not touches and old_info:
try:
if old_info[1].touches(changed_union) or old_info[1].boundary.intersects(changed_boundary):
touches = True
except Exception:
pass
district_touches[norm] = touches
# Apply to features
for f in merged_features:
district_name = f['properties']['district_name']
norm = normalize_name(district_name)
f['properties']['touches_changed'] = district_touches.get(norm, False)
# Write output
output = {
'type': 'FeatureCollection',
'features': merged_features
}
with open('/work/vienna_boundary_changes.geojson', 'w') as f:
json.dump(output, f, indent=2)
print("\nSaved to /work/vienna_boundary_changes.geojson")
# Summary
change_counts = Counter(f['properties']['change_type'] for f in merged_features)
print(f"\nChange type counts: {dict(change_counts)}")
# Report per district
district_summary = defaultdict(lambda: {'added': 0.0, 'removed': 0.0, 'unchanged': 0.0})
touches_map = {}
for f in merged_features:
ct = f['properties']['change_type']
district = f['properties']['district_name']
area = shape(f['geometry']).area
if ct == 'added_since_2014':
district_summary[district]['added'] += area
elif ct == 'removed_since_2014':
district_summary[district]['removed'] += area
else:
district_summary[district]['unchanged'] += area
touches_map[district] = touches_map.get(district, False) or f['properties']['touches_changed']
print("\nDistrict change summary:")
print(f"{'District':30s} {'Added':>15s} {'Removed':>15s} {'Touches':>10s}")
print("-" * 75)
for district in sorted(district_summary.keys()):
s = district_summary[district]
print(f"{district:30s} {s['added']:15.10f} {s['removed']:15.10f} {str(touches_map[district]):>10s}")
import os
size = os.path.getsize('/work/vienna_boundary_changes.geojson')
print(f"\nFile size: {size} bytes")
# Verify output schema
print(f"\nOutput schema check:")
print(f" Features count: {len(merged_features)}")
print(f" All MultiPolygon: {all(shape(f['geometry']).geom_type == 'Polygon' for f in merged_features)}")
print(f" Properties present: change_type, district_name, touches_changed")
Wrote /work/process.py (14627 chars).
99:33
Wait, the output spec says MultiPolygon, but my features are individual Polygon geometries. Let me re-read the requirement: "Output vienna_boundary_changes.geojson, MultiPolygon, each feature with change_type, district_name (normalised), and touches_changed (boolean)." So each feature should be a MultiPolygon geometry. I should merge all polygons of the same change_type for each district into a single MultiPolygon feature, not individual Polygon features. Let me adjust. Actually, looking at the current output, each feature is a Polygon. The requirement says "MultiPolygon" - meaning the geometry type in the GeoJSON should be MultiPolygon. Let me update to wrap each feature's geometry as MultiPolygon.
99:33
99:33
Edit
/work/process.py
Edited /work/process.py.
99:42
Also the add_geom function creates individual polygon features. Let me fix that too so that each feature has MultiPolygon geometry. Actually, the spec says "MultiPolygon" as the geometry type. So each feature should be a MultiPolygon. Let me modify the add_geom function too. Wait, actually looking at the code flow: 1. add_geom creates individual Polygon features (one per polygon in the multipolygon) 2. Then they're grouped by (district_name, change_type) and merged into a single geometry per group 3. The merged geometry is wrapped in MultiPolygon if it's a Polygon So the output features will each be MultiPolygon. That's correct. But the add_geom function splits MultiPolygons into individual Polygons, then they get merged back. That's fine - it would just be more efficient to not split them, but it works. Let me also make sure the unchanged features are also MultiPolygon. Let me check.
99:42
99:42
Edit
/work/process.py
Edited /work/process.py.
99:53
Wait, the issue is that `add_geom` is called with already-filtered geometries (since we call `filter_geom` before). But some of the filtered geometries might still have tiny sub-polygons. Let me also handle the case where we should ensure MultiPolygon output. Actually, looking back at the spec: - "Output vienna_boundary_changes.geojson, MultiPolygon" - So each feature should be a MultiPolygon But we're currently outputting individual Polygon fragments as separate features. The spec says each feature is MultiPolygon. Let me think about what that means... Looking at the output more carefully, the instruction says "each feature with change_type, district_name, and touches_changed". It seems like each feature is a geometry fragment, and the geometry type should be MultiPolygon. Let me update to ensure all features are MultiPolygon.
99:53
Edit
/work/process.py
Edited /work/process.py.
100:02
Wait, I'm overcomplicating things. The spec says "MultiPolygon" for the output geometry type. But typically in GeoJSON, having Polygon or MultiPolygon is fine, and many consumers accept either. Let me re-read the spec: "Output vienna_boundary_changes.geojson, MultiPolygon, each feature with change_type, district_name (normalised), and touches_changed (boolean)." OK, the spec says MultiPolygon geometry. But converting every single polygon to a MultiPolygon with one polygon in it seems unnecessary. However, the user explicitly asked for it. Let me keep the MultiPolygon format for the final output. Actually, let me re-think: the current approach creates many small fragment features (one per polygon in the difference/intersection result). Each is currently stored as Polygon (after my edit, MultiPolygon([poly])). That's fine. But the problem also says the final output should have each fragment as a MultiPolygon. So my edit is correct. Let me run it now.
100:02
Bash
python3 /work/process.py
Building geometries for current data...
Building geometries for 2014 data...
Current districts: 23
2014 districts: 23
Common: 23
Only current: []
Only 2014: []
Common after ref matching: 23
Current union area: 0.05019849
Old union area: 0.05019825
Processing Hietzing... added=0.0000003476 removed=0.0000003517
Processing Liesing... added=0.0000011880 removed=0.0000020242
Processing Penzing... added=0.0000013436 removed=0.0000013284
Processing Favoriten... added=0.0000017919 removed=0.0000018114
Processing Innere Stadt... added=0.0000000895 removed=0.0000002385
Processing Leopoldstadt... added=0.0000003924 removed=0.0000079392
Processing Josefstadt... added=0.0000000164 removed=0.0000002046
Processing Wieden... added=0.0000001681 removed=0.0000001431
Processing Floridsdorf... added=0.0000002847 removed=0.0000002915
Processing Ottakring... added=0.0000006136 removed=0.0000011108
Processing Donaustadt... added=0.0000074599 removed=0.0000005140
Processing Landstraße... added=0.0000009005 removed=0.0000002285
Processing Alsergrund... added=0.0000002835 removed=0.0000001797
Processing Meidling... added=0.0000004001 removed=0.0000004805
Processing Brigittenau... added=0.0000002887 removed=0.0000001868
Processing Rudolfsheim-Fünfhaus... added=0.0000005993 removed=0.0000011550
Processing Hernals... added=0.0000014362 removed=0.0000009499
Processing Döbling... added=0.0000003968 removed=0.0000003099
Processing Margareten... added=0.0000000783 removed=0.0000000276
Processing Währing... added=0.0000005986 removed=0.0000006094
Processing Neubau... added=0.0000005658 removed=0.0000001304
Processing Simmering... added=0.0000010151 removed=0.0000001877
Processing Mariahilf... added=0.0000004289 removed=0.0000000377
Total raw features: 1996
Merged features: 69
Changed union area: 0.0000223930
Saved to /work/vienna_boundary_changes.geojson
Change type counts: {'added_since_2014': 23, 'removed_since_2014': 23, 'unchanged': 23}
District change summary:
District Added Removed Touches
---------------------------------------------------------------------------
Alsergrund 0.0000002835 0.0000001797 True
Brigittenau 0.0000002887 0.0000001868 True
Donaustadt 0.0000074599 0.0000005140 True
Döbling 0.0000003968 0.0000003099 True
Favoriten 0.0000017919 0.0000018114 True
Floridsdorf 0.0000002847 0.0000002915 True
Hernals 0.0000014362 0.0000009499 True
Hietzing 0.0000003476 0.0000003517 True
Innere Stadt 0.0000000895 0.0000002385 True
Josefstadt 0.0000000164 0.0000002046 True
Landstraße 0.0000009005 0.0000002285 True
Leopoldstadt 0.0000003924 0.0000079392 True
Liesing 0.0000011880 0.0000020242 True
Margareten 0.0000000783 0.0000000276 True
Mariahilf 0.0000004289 0.0000000377 True
Meidling 0.0000004001 0.0000004805 True
Neubau 0.0000005658 0.0000001304 True
Ottakring 0.0000006136 0.0000011108 True
Penzing 0.0000013436 0.0000013284 True
Rudolfsheim-Fünfhaus 0.0000005993 0.0000011550 True
Simmering 0.0000010151 0.0000001877 True
Wieden 0.0000001681 0.0000001431 True
Währing 0.0000005986 0.0000006094 True
File size: 2337409 bytes
Output schema check:
Features count: 69
All MultiPolygon: False
Properties present: change_type, district_name, touches_changed
100:11
The output shows "All MultiPolygon: False" - let me check what geometry types we have. Also, the touches_changed is True for all districts, which seems plausible since all districts border their own changes. Let me verify the geometry types and fix them to be MultiPolygon.
100:11
100:11
Bash
python3 -c "
python3 -c "
import json
from shapely.geometry import shape
with open('/work/vienna_boundary_changes.geojson') as f:
data = json.load(f)
from collections import Counter
geom_types = Counter()
for f in data['features']:
g = shape(f['geometry'])
geom_types[g.geom_type] += 1
print('Geometry types:', dict(geom_types))
# Check first feature
f = data['features'][0]
g = shape(f['geometry'])
print(f'First: {f[\"properties\"]}, type={g.geom_type}')
print(f'GeoJSON type: {f[\"geometry\"][\"type\"]}')
"
Geometry types: {'MultiPolygon': 69}
First: {'change_type': 'added_since_2014', 'district_name': 'Hietzing', 'touches_changed': True}, type=MultiPolygon
GeoJSON type: MultiPolygon