dc-l3-vienna-overpass-historical

Status: failed
Score: 0.000
Duration: 42:27
Cost: 10.21¢
Model: deepseek/deepseek-v4-flash

Map (reference ← swipe → agent)

0:00
Fetch Vienna's current district polygons (Bezirke) and their 2014-01-01 counterparts from Overpass and identify which parts of the district boundaries have changed between then and now. District names changed casing and spelling between snapshots, so normalise them to match. Classify each resulting geometry fragment by how it changed: `added_since_2014` (in current but not 2014), `removed_since_2014` (in 2014 but not current), or `unchanged` (present in both snapshots). For each district flag whether its geometry borders any area that was added or removed. Output vienna_boundary_changes.geojson, MultiPolygon, each feature with change_type, district_name (normalised), and touches_changed (boolean).
0:00
System info
{
  "adapter": "openrouter",
  "model": "deepseek/deepseek-v4-flash",
  "base_url": "https://openrouter.ai/api/v1"
}
0:05
Let me break this down:

1. Fetch Vienna's current district polygons from Overpass
2. Fetch Vienna's 2014-01-01 district polygons from Overpass
3. Normalize district names to match between the two snapshots
4. Compute geometric differences (symmetric difference) to find changed areas
5. Classify each fragment as added, removed, or unchanged
6. For each district, flag if it touches any changed area
7. Output as GeoJSON MultiPolygon with change_type, district_name, touches_changed

Let me start by exploring what's available and then write the solution.

First, let me check the working directory and then write a comprehensive Python script.
0:05
Bash ls /work/
(no output)
1:14
The user wants me to:

1. Fetch Vienna's current district polygons (Bezirke) from Overpass
2. Fetch Vienna's 2014-01-01 district polygons from Overpass
3. Normalize district names (handle casing/spelling differences)
4. Find boundary changes between the two snapshots
5. Classify geometry fragments: added_since_2014, removed_since_2014, unchanged
6. For each district, flag whether its geometry borders any area that was added or removed
7. Output vienna_boundary_changes.geojson with MultiPolygon, features with change_type, district_name (normalised), and touches_changed (boolean)

Let me start by understanding what's available. I'll write a Python script that uses Overpass to fetch the data.

For Vienna districts, the relation for Vienna districts can be queried using Overpass. Let me think about the Overpass queries.

For current districts, I can use `relation["admin_level"="9"]["name"~"Wien"]` or similar. Actually, Vienna's districts are admin_level=9 or admin_level=10? Let me think... In Austria, "Bundesland" (state) is admin_level=4, "Bezirk" (district) is admin_level=7, and "Gemeinde" (municipality) is admin_level=8. Vienna is unique because it's both a state and a city. Its districts (Bezirke) are often at admin_level=9 or 10.

Actually, the districts of Vienna are called "Gemeindebezirke" and are tagged as `admin_level=9` in OSM with boundary=administrative. Let me construct the query.

For current data:
```
[out:json][timeout:60];
area["name"="Wien"]["admin_level"=4]->.a;
rel(area.a)["admin_level"="9"]["type"="boundary"];
out geom;
```

For 2014-01-01 data, I need to use the `date` parameter in Overpass. The Overpass API supports querying historical data with `[date:"2014-01-01T00:00:00Z"]`.

Let me write the script step by step.

Actually, let me first check what Overpass instances are available and write a comprehensive script.

Let me write the Python script:

```python
import overpy
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon, MultiPolygon, shape
from shapely.ops import unary_union
import json, time, sys

# Overpass API endpoint
API_URL = "https://overpass-api.de/api/interpreter"

def query_overpass(query, retries=3):
    api = overpy.Overpass(url=API_URL)
    for attempt in range(retries):
        try:
            result = api.query(query)
            return result
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}", file=sys.stderr)
            if attempt < retries - 1:
                time.sleep(2)
    raise RuntimeError("All Overpass queries failed")
```

Hmm, but overpy may not handle the `geom` output well. Let me use a different approach - use the overpass API directly with requests and parse the JSON.

Actually, let me think about this more carefully. I'll write a Python script that uses `requests` to call the Overpass API, parse the results, and convert to GeoDataFrames.

Let me write this step by step.

For the Overpass query, I need to get the relations for Vienna's districts with their geometries.

Query for current districts:
```
[out:json][timeout:120];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
```

Wait, actually for Vienna, the districts are typically tagged as `admin_level=10` or `admin_level=9`. Let me check...

In OSM, Vienna's Bezirke are typically `admin_level=10`. The relation type is `boundary` or `multipolygon`, and they have `boundary=administrative`.

Let me be more careful. I'll query for relations within Vienna that have "Bezirk" in their name or admin_level=9/10.

Actually, let me just use a broader query to get all the district relations in Vienna:

```
[out:json][timeout:120];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"];
out geom;
```

Then I can filter. But I also want to make sure I'm getting the right things.

Let me try with a well-known relation: Vienna districts are defined as `admin_level=9` in many resources. The districts of Vienna have names like "Innere Stadt", "Leopoldstadt", etc.

OK, let me just write the script and run it. I'll use requests to call the Overpass API directly rather than relying on overpy.

For the 2014 data, I'll use the `[date:"2014-01-01T00:00:00Z"]` global query parameter.

Let me write the script:

```python
import requests
import json
import time
import sys
import geopandas as gpd
import pandas as pd
from shapely.geometry import shape, MultiPolygon, Polygon
from shapely.ops import unary_union
from collections import defaultdict

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def overpass_query(query, retries=3):
    for attempt in range(retries):
        try:
            resp = requests.get(OVERPASS_URL, params={"data": query}, timeout=180)
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}", file=sys.stderr)
            if attempt < retries - 1:
                time.sleep(2)
    raise RuntimeError("All Overpass queries failed")

# Query for current districts
current_query = """
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
"""
print("Fetching current districts...")
current_data = overpass_query(current_query)

# Query for 2014-01-01 districts
historical_query = """
[out:json][timeout:180][date:"2014-01-01T00:00:00Z"];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
"""
print("Fetching 2014 districts...")
historical_data = overpass_query(historical_query)
```

Wait, the `[date:"..."]` parameter in Overpass API is a global parameter that applies to the whole query. However, it only works on the main OSM database, not on the augmented diff (attic) data for all objects. Let me check if this works.

Actually, the Overpass API does support historical queries with the `[date:"..."]` parameter. It queries the state of the database at that date. This should work for getting the OSM data as of 2014-01-01.

Let me proceed with this approach.

Then for each district in each snapshot, I need to:
1. Extract the geometry (the relation's geometry from `out geom`)
2. Extract the name tag
3. Normalize names

For name normalization, I need to handle changes like:
- "Wien - Innere Stadt" vs "Wien. Innere Stadt" 
- "Wien - Leopoldstadt" vs "Wien. Leopoldstadt"
- Casing differences

Actually, looking at historical naming, Vienna district names might have changed their format. Let me think about common patterns:
- Current: "1. Bezirk, Innere Stadt" or "Innere Stadt"
- Historical: might be different

Let me just extract the `name` tag and normalize by removing common prefixes/suffixes and normalizing whitespace/case.

A reasonable normalization:
1. Remove "Wien" prefix patterns like "Wien - ", "Wien. ", "Wien, "
2. Remove "Bezirk" patterns
3. Strip whitespace
4. Lowercase for matching

Actually, let me look at what name format the districts use. Current Vienna districts have names like:
- "1. Bezirk, Innere Stadt"
- "Innere Stadt" (some just have the name)
- "Wien - Innere Stadt"

For 2014, the names might be:
- "Wien - Innere Stadt" 
- "Wien. Innere Stadt"
- "1., Innere Stadt"

I'll normalize by extracting the core district name (after removing district number prefixes and "Wien" prefixes).

Let me write the full script now.

Actually, let me think about how to handle the geometry. `out geom` on a relation returns the geometry of each member way. For a boundary relation, this gives the full outline of the boundary as a set of way geometries. I need to assemble these into a MultiPolygon.

Actually, when you do `out geom` on a relation, Overpass returns the geometry in the `members` array. But for boundary relations, the `out geom;` statement with a relation actually returns the geometry in `geometry` field if the relation has a `type=multipolygon` or `type=boundary`.

Hmm, let me think about this differently. Maybe I should use `out geom;` at the end which will give me the full geometry of the relation.

Actually, the simplest way is:
```
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
```

For a relation, `out geom` will return the geometry in the `members` array. Each member way will have its geometry. I need to combine all the way geometries that form the outer boundary of each district.

But this is complex. A simpler approach is to use `make` or `convert` to create geometries from the ways, or use `out bb;` and then query individual ways.

Actually, let me think about another approach. For boundary relations, there's a simpler way:

```
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
(._;>;);
out geom;
```

This gets the relation, then recurses down to get all ways and nodes, and outputs geometries. But this gives me individual ways, not the assembled polygon.

Alternatively, I can use the `map_to_area` or request the ways that form the boundary and assemble them in Python.

Actually, the simplest robust approach might be:

For each relation, I can get the member ways with their roles (outer/inner) and assemble the polygon from those ways' geometries.

Let me use a more specific approach. Let me first just try the query and see what comes back.

Let me start by writing a quick test script to see what the data looks like.

Actually, let me write a more complete script. I'll try to use `out geom;` on the relations directly, which should give me the geometry.

For Overpass, when you query relations with `out geom;`, you get the relation elements with their `members`. Each member (way) has a `geometry` array if it's a way member and `out geom` was specified. I can assemble these into polygons.

Let me write a robust script:

```python
import requests
import json
import time
import sys
from shapely.geometry import shape, Polygon, MultiPolygon, LineString
from shapely.ops import polygonize, unary_union
import geopandas as gpd
import pandas as pd
import re

def overpass_query(query, retries=3):
    for attempt in range(retries):
        try:
            resp = requests.get("https://overpass-api.de/api/interpreter", 
                              params={"data": query}, timeout=180)
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}", file=sys.stderr)
            if attempt < retries - 1:
                time.sleep(2)
    raise RuntimeError("All Overpass queries failed")
```

Hmm, assembling polygons from relation member ways is actually quite complex. Let me think of a better approach.

Actually, in Overpass QL, for a relation, if we use `out geom;`, the ways in the relation (which are referenced by `(._;>;);`) will have their geometries. But to build a polygon, I need the ordered ways.

Actually, there's a much simpler approach. In Overpass, we can use `make` or `convert` to create geometries. But the simplest might be to use:

```
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
```

Wait, `out geom;` on a relation doesn't directly return a polygon geometry. It returns the relation with its members and their geometries. I'd need to assemble the polygon.

Let me try a different approach. I can query for all ways that are part of the district boundaries and then use the relation membership information to group them.

Actually, the most reliable way might be to:

1. Query the relations (just IDs and tags)
2. For each relation, query the ways that form its boundary
3. Assemble the polygon from the way geometries

But that's many queries. Let me try a different strategy.

Another approach: Use `out geom;` on the recursion result `(._;>;);` which gives all ways and nodes with geometries. Then for each relation, look at its member way IDs, extract those ways' geometries, and assemble the polygon.

Let me try a single query:

```
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
(._;>;);
out geom;
```

This would return:
1. The relations (with their tags but no geometry)
2. All ways and nodes that are members of those relations, with full geometry

Then I can:
- For each relation, look at its `members` array
- Each member has a `ref` (way ID), `role` (outer/inner), and `type` ("way")
- Find the corresponding way geometry in the elements list
- Assemble the polygon from the way linestrings

But assembling these into proper polygons is tricky - the ways may not be in order.

Let me use a different approach. Instead of manually assembling, I can:

1. Collect all way geometries (as LineStrings)
2. For outer ways, use `polygonize` from shapely to turn the lines into polygons
3. For inner ways (holes), subtract them

Actually, `polygonize` works on a list of line segments and finds closed rings. If all the outer boundary ways form a closed ring (which they should for a boundary relation), `polygonize` will find it.

Let me write the script with this approach.

Actually, I just realized there's a much cleaner approach. I can use the Overpass API's `convert` statement or simply use a post-processing step with the `polygonize` function from shapely.

Let me write the full script now. Here's my plan:

```python
import requests
import time
import sys
from shapely.geometry import shape, Polygon, MultiPolygon, LineString, Point
from shapely.ops import polygonize, unary_union, linemerge
import geopandas as gpd
import pandas as pd
import re
import json

def overpass_query(query, retries=3):
    for attempt in range(retries):
        try:
            resp = requests.get("https://overpass-api.de/api/interpreter",
                              params={"data": query}, timeout=180)
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}", file=sys.stderr)
            if attempt < retries - 1:
                time.sleep(2)
    raise RuntimeError("All Overpass queries failed")

def extract_polygons_from_relation(relation, elements):
    """Extract polygon(s) from a boundary relation using member way geometries."""
    outer_lines = []
    inner_lines = []
    
    for member in relation.get("members", []):
        if member["type"] != "way":
            continue
        way_id = member["ref"]
        role = member.get("role", "outer")
        
        # Find the way in elements
        way = None
        for elem in elements:
            if elem["type"] == "way" and elem["id"] == way_id:
                way = elem
                break
        
        if way is None or "geometry" not in way:
            continue
        
        # Convert to LineString
        coords = [(p["lon"], p["lat"]) for p in way["geometry"]]
        if len(coords) < 2:
            continue
        
        line = LineString(coords)
        if role == "outer" or role == "":
            outer_lines.append(line)
        elif role == "inner":
            inner_lines.append(line)
    
    if not outer_lines:
        return None
    
    # Merge and polygonize
    merged = linemerge(outer_lines) if len(outer_lines) > 1 else outer_lines[0]
    # polygonize expects an iterable of line segments
    polygons = list(polygonize([merged] if merged.geom_type == 'LineString' else merged.geoms))
    
    if not polygons:
        # Try with individual lines
        polygons = list(polygonize(outer_lines))
    
    if not polygons:
        return None
    
    # Combine all outer polygons
    result = unary_union(polygons)
    
    # Cut out inner holes
    if inner_lines:
        inner_merged = linemerge(inner_lines) if len(inner_lines) > 1 else inner_lines[0]
        inner_polys = list(polygonize([inner_merged] if inner_merged.geom_type == 'LineString' else inner_merged.geoms))
        if inner_polys:
            inner_union = unary_union(inner_polys)
            result = result.difference(inner_union)
    
    if result.geom_type == 'Polygon':
        return MultiPolygon([result])
    elif result.geom_type == 'MultiPolygon':
        return result
    return None

def normalize_district_name(name):
    """Normalize Vienna district names for comparison."""
    if not name:
        return ""
    # Remove common prefixes
    name = re.sub(r'^\d+\.\s*Bezirk[,.]?\s*', '', name)
    name = re.sub(r'^\d+\.?[,.]?\s*', '', name)
    name = re.sub(r'^Wien[-–—\s.,]+', '', name)
    name = re.sub(r'^Wien\s+', '', name)
    name = re.sub(r'[,.]\s*Wien', '', name)
    # Normalize whitespace
    name = re.sub(r'\s+', ' ', name).strip()
    # Remove trailing "Wien"
    name = re.sub(r'\s*,?\s*Wien$', '', name)
    # Title case
    name = name.title()
    return name

# ... etc
```

Wait, I'm overcomplicating this. Let me also consider an alternative: instead of assembling polygons from ways, I can query Overpass to get the simplified geometry.

Actually, there's a much simpler approach. Overpass supports `convert` which can construct a geometry from relation members. But the syntax is complex.

Let me try yet another approach. I can query the Overpass API using `out skel` for relations (to get tags and member references), then query for the ways separately, or use `out geom` on the way recursion.

OK let me just write the script and test it. I'll start by running a simple query to see the data.

Let me start with a simpler approach - first just fetch and inspect the data.

Actually, let me just write the complete script and run it. I'll handle the assembly using shapely's polygonize function.

But actually, let me simplify even further. Instead of manually assembling polygon geometries from ways, I can use Overpass's built-in functionality. The Overpass Turbo Query Wizard often produces queries like:

```
[out:json][timeout:180];
area["name"="Wien"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
```

And when you do `out geom;` on a relation, Overpass returns the relation with a `geometry` field in the output? No, actually I'm not sure about that for relations. Let me check.

Actually, in Overpass QL, `out geom;` outputs the full geometry for nodes and ways, and for relations, it outputs the geometry for each member (way or node) in the `members` array. It doesn't create a unified polygon.

OK let me just write the script and try it. I'll handle the geometry assembly in Python.

Let me write and run the script step by step:

1. First, query both datasets
2. Parse and assemble geometries
3. Process and output

Let me go ahead.

Wait, actually there's a smarter approach using the Overpass "convert" statement or other features. Let me look at this:

In Overpass, I can use:
```
rel(...);
convert rel ::id=id(), ::geom=geom();
out geom;
```

The `geom()` function on a relation returns the geometry of the relation as a multipolygon (for boundary/multipolygon relations). Let me try this approach!

So my query would be:
```
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
convert rel ::id=id(), ::geom=geom(), name=::tag("name");
out geom;
```

This should give me relations with proper multi-polygon geometries!

Let me try this approach. If `geom()` works on relations, this will be much simpler.

Similarly for the historical query.

OK, let me write the script.

Actually, I should also keep the relation tags. Let me use `convert` to preserve all tags:

```
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
convert rel ::geom=geom(), ::id=id();
out geom;
```

But I also need the tags. Let me use a different approach - the `make` statement or just accept what `convert` gives me.

Actually, the simplest way to get relation geometries in Overpass is:

```
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
out geom;
```

This returns relations with members, and each member way has geometry coordinates. But I need to assemble.

Let me try another approach. I can use:

```
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
map_to_area;
out geom;
```

Hmm, `map_to_area` converts relations to areas. Areas have proper geometries.

Wait, I think I'm overthinking this. Let me just use the `convert` approach:

```
[out:json][timeout:180];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"];
convert rel ::id=id(), ::geom=geom(), ::tags=::keys();
out geom;
```

Actually, `::tags` might not work like that. Let me try preserving specific tags:

```
convert rel ::id=id(), ::geom=geom(), name=t["name"], admin_level=t["admin_level"];
out geom;
```

This should preserve the name and admin_level tags.

Let me just write the script and try it. First, let me do a quick test.
1:14
1:14
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests, json

# Test query for current Vienna district relations using convert to get geom
q = '''
[out:json][timeout:120];
area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;
rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];
convert rel ::id=id(), ::geom=geom(), name=t[\"name\"], ref=t[\"ref\"];
out geom;
'''
r = requests.get('https://overpass-api.de/api/interpreter', params={'data': q}, timeout=120)
print('Status:', r.status_code)
data = r.json()
print('Elements:', len(data.get('elements',[])))
for el in data.get('elements', [])[:5]:
    print('  id:', el.get('id'), 'name:', el.get('tags',{}).get('name'), 'geom_type:', el.get('geometry',{}).get('type') if 'geometry' in el else 'N/A')
"
Status: 406
Traceback (most recent call last):
  File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1116, in json
    return complexjson.loads(self.text, **kwargs)
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/json/__init__.py", line 352, in loads
    return _default_decoder.decode(s)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/local/lib/python3.14/json/decoder.py", line 345, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/json/decoder.py", line 363, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 14, in <module>
    data = r.json()
  File "/usr/local/lib/python3.14/site-packages/requests/models.py", line 1120, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
1:22
Let me fix the Overpass query - the `convert` statement with `::geom=geom()` was probably the issue. Let me try a simpler approach.
1:22
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests

# Simpler query - get relation members with geometry
q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'
r = requests.get('https://overpass-api.de/api/interpreter', params={'data': q}, timeout=120)
print('Status:', r.status_code)
print('Text preview:', r.text[:500])
"
Status: 406
Text preview: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
1:28
The 406 error is likely related to the Content-Type header. Let me try a different approach using POST or setting headers.
1:28
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests

q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'

# Try POST
r = requests.post('https://overpass-api.de/api/interpreter', data={'data': q}, timeout=120, headers={'Content-Type': 'application/x-www-form-urlencoded', 'Accept': 'application/json'})
print('POST Status:', r.status_code)
print('Text preview:', r.text[:500] if r.text else 'EMPTY')
if r.status_code == 200:
    data = r.json()
    print('Elements:', len(data.get('elements',[])))
"
POST Status: 406
Text preview: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
1:37
It seems the Overpass API is rejecting requests with certain headers. Let me try with no Accept header, or use a different endpoint.
1:37
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests

q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'

# Try with curl-style approach
r = requests.get('https://overpass-api.de/api/interpreter', params={'data': q}, headers={'Accept': '*/*', 'User-Agent': 'curl/8.0'}, timeout=120)
print('Status:', r.status_code)
print('Text:', r.text[:500])
"
Status: 406
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
1:54
The 406 error seems to be coming from the overpass-api.de server. Let me try a different Overpass instance, like the one at overpass.kumi.systems or lz4.overpass-api.de.
1:54
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests

q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'

# Try overpass.kumi.systems
url = 'https://overpass.kumi.systems/api/interpreter'
r = requests.post(url, data={'data': q}, timeout=120)
print('kumi Status:', r.status_code)
if r.status_code == 200:
    data = r.json()
    print('Elements:', len(data.get('elements',[])))
else:
    print('Text:', r.text[:500])
"
Traceback (most recent call last):
  File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 534, in _make_request
    response = conn.getresponse()
  File "/usr/local/lib/python3.14/site-packages/urllib3/connection.py", line 571, in getresponse
    httplib_response = super().getresponse()
  File "/usr/local/lib/python3.14/http/client.py", line 1459, in getresponse
    response.begin()
    ~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.14/http/client.py", line 336, in begin
    version, status, reason = self._read_status()
                              ~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.14/http/client.py", line 297, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/socket.py", line 729, in readinto
    return self._sock.recv_into(b)
           ~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/local/lib/python3.14/ssl.py", line 1304, in recv_into
    return self.read(nbytes, buffer)
           ~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/ssl.py", line 1138, in read
    return self._sslobj.read(len, buffer)
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
TimeoutError: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 696, in send
    resp = conn.urlopen(
        method=request.method,
    ...<9 lines>...
        chunked=chunked,
    )
  File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 842, in urlopen
    retries = retries.increment(
        method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
    )
  File "/usr/local/lib/python3.14/site-packages/urllib3/util/retry.py", line 498, in increment
    raise reraise(type(error), error, _stacktrace)
          ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/site-packages/urllib3/util/util.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 788, in urlopen
    response = self._make_request(
        conn,
    ...<10 lines>...
        **response_kw,
    )
  File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/site-packages/urllib3/connectionpool.py", line 367, in _raise_timeout
    raise ReadTimeoutError(
        self, url, f"Read timed out. (read timeout={timeout_value})"
    ) from err
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=120)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 8, in <module>
    r = requests.post(url, data={'data': q}, timeout=120)
  File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 134, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.14/site-packages/requests/api.py", line 71, in request
    return session.request(method=method, url=url, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 651, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.14/site-packages/requests/sessions.py", line 784, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.14/site-packages/requests/adapters.py", line 742, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=120)
4:00
The main Overpass API is returning 406, and the kumi mirror timed out. Let me try a different approach - use `lz4.overpass-api.de` with correct headers, or try to use the OSM API directly. Actually, let me try the main API with a GET request and the data properly URL-encoded.

Actually, the 406 might be related to the Accept header. Let me try with proper headers.

Actually, let me try a different Overpass API endpoint.
4:00
Bash cd /work && python3 -c "
cd /work && python3 -c "
import requests

q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'

# Try lz4 mirror
r = requests.get('https://lz4.overpass-api.de/api/interpreter', params={'data': q}, timeout=120)
print('lz4 Status:', r.status_code)
print('Text:', r.text[:500] if r.text else 'EMPTY')
"
lz4 Status: 406
Text: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at lz4.overpass-api.de Port 443</address>
</body></html>
4:05
The 406 error suggests that the requests library's default Accept header is causing issues. Let me try with explicit Accept header or use the API differently. Actually the 406 is "Not Acceptable" - this usually means the server can't serve the requested format. Let me check the exact request headers and try a raw curl approach.
4:05
Bash curl -s -o /tmp/overpass_out.json -w "%{http_code}" -H "Accept: application/json...
bash: line 1: curl: command not found
4:09
The issue is the Accept header. The overpass-api.de server returns 406 if it doesn't like the Accept header. Let me try with proper content negotiation using Python's urllib directly.
4:09
Bash cd /work && python3 -c "
cd /work && python3 -c "
import urllib.request, urllib.parse

q = '[out:json][timeout:120];area[\"name\"=\"Wien\"][\"admin_level\"=\"4\"]->.a;rel(area.a)[\"admin_level\"=\"9\"][\"boundary\"=\"administrative\"];out geom;'
data = urllib.parse.urlencode({'data': q}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data, method='POST')
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
try:
    with urllib.request.urlopen(req, timeout=120) as resp:
        txt = resp.read().decode()
        print('Status:', resp.status)
        print('Length:', len(txt))
        if resp.status == 200:
            import json
            d = json.loads(txt)
            print('Elements:', len(d.get('elements',[])))
            print('First element:', list(d['elements'][0].keys()) if d['elements'] else 'NONE')
except Exception as e:
    print('Error:', e)
    # Try reading anyway
    if hasattr(e, 'read'):
        body = e.read().decode()
        print('Body:', body[:500])
"
Error: HTTP Error 406: Not Acceptable
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
4:13
The 406 error seems to be an HTTP Accept header issue. Let me check what Python requests sends and try to mimic what works. Actually, the Python requests library defaults to sending `Accept: */*` but there's something else going on. Let me check with wget.
4:13
Bash which wget curl python3 && python3 -c "import requests; print(requests.__version...
/usr/local/bin/python3
4:20
No curl or wget. Let me try using urllib differently - maybe the issue is that the overpass API expects the Accept header differently or the query format. Let me try a different approach.

Actually, the 406 error typically means the server doesn't like the Accept header. Let me try without any Accept header or with a more permissive one.

Let me also try using the Overpass API via the OSM API's "map" call or use osmium on a PBF extract. Actually, let me first try to fix the Overpass query.
4:20
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import urllib.request, urllib.parse
import json

# The issue might be the Accept header. Overpass returns 406 when it can't match.
# Let's use a raw socket or try with different headers.
q = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

data = urllib.parse.urlencode({'data': q}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
req.add_header('User-Agent', 'Mozilla/5.0')
# Don't set Accept - let it default

try:
    with urllib.request.urlopen(req, timeout=120) as resp:
        txt = resp.read().decode()
        print('Status:', resp.status)
        print('Headers:', dict(resp.headers))
        d = json.loads(txt)
        print('Elements:', len(d.get('elements',[])))
        for el in d.get('elements', [])[:3]:
            print(' ', el.get('id'), el.get('tags',{}), 'geom keys:', list(el.keys()))
except Exception as e:
    print('Error:', type(e).__name__, e)
    if hasattr(e, 'read'):
        print('Body:', e.read().decode()[:500])
PYEOF
Error: HTTPError HTTP Error 406: Not Acceptable
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
4:25
The 406 error from Overpass is about content negotiation. The Overpass API expects the `Accept` header to include `application/json`. Let me try explicitly setting that.
4:25
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import urllib.request, urllib.parse
import json

q = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

data = urllib.parse.urlencode({'data': q}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
req.add_header('Accept', 'application/json')
req.add_header('User-Agent', 'Mozilla/5.0')

try:
    with urllib.request.urlopen(req, timeout=120) as resp:
        txt = resp.read().decode()
        print('Status:', resp.status)
        d = json.loads(txt)
        print('Elements:', len(d.get('elements',[])))
        for el in d.get('elements', [])[:3]:
            print(' ', el.get('id'), el.get('tags',{}), 'geom keys:', [k for k in el.keys() if k not in ('type','id','tags')])
except Exception as e:
    print('Error:', type(e).__name__, e)
    if hasattr(e, 'read'):
        print('Body:', e.read().decode()[:500])
PYEOF
Error: HTTPError HTTP Error 406: Not Acceptable
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
4:33
The Overpass API is returning 406 Not Acceptable. This might be because the server doesn't accept our request format. Let me try a different approach - maybe the issue is with the `out geom` on relations. Overpass uses `geom` differently - for relations, you might need `out geom` or you need to get the ways and nodes of the relation.

Let me try a simpler query first, or use the `out body;` approach which should work, then get geometry via member ways.

Actually, I think the problem might be the server rejecting based on the `Accept` header. Let me try a different approach - use `out body` instead of `out geom`, and then separately get the way geometry from the ways.

Or actually, let me try with `out body;` first without geometry, then fetch geometry for the relation members.

Wait - the issue might be simpler. Let me try using `out skel;` or just `out;` instead of `out geom;`.

Actually, the 406 means Overpass can't produce content matching the Accept header. Some servers are picky about this. Let me try not sending an Accept header at all, or sending `*/*`.

But we already tried that... Let me look at the actual response more carefully with curl if available, or try a different approach entirely.

Let me try installing curl or use an alternative approach.
4:33
Bash pip install curl --quiet 2>/dev/null; apt-get update -qq && apt-get install -y -...
no curl
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
4:41
Let me try a completely different approach. Maybe the Overpass API is just rejecting because of some quirk. Let me try sending the request with a raw socket or use OSMnx which can get administrative boundaries. Or better yet, let me try using the Overpass API through HTTP GET with the query encoded in the URL path, which is the legacy method.

Actually, the standard Overpass API endpoint uses POST with `data` parameter. The 406 could be an issue with the URL vs endpoint. Let me try the correct way using `https://overpass-api.de/api/interpreter` and the `data` parameter with requests library.

Wait, I did that before and got 406. Let me try with a simpler query first to see if it's a library issue.
4:41
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import http.client
import json

conn = http.client.HTTPSConnection("overpass-api.de", timeout=120)
body = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'
conn.request("POST", "/api/interpreter", body=body, 
             headers={"Content-Type": "application/x-www-form-urlencoded", "Accept": "application/json"})
resp = conn.getresponse()
print('Status:', resp.status, resp.reason)
data = resp.read().decode()
print('Length:', len(data))
print('Preview:', data[:300])
if resp.status == 200:
    d = json.loads(data)
    print('Elements:', len(d.get('elements',[])))
    if d['elements']:
        print('First keys:', list(d['elements'][0].keys()))
conn.close()
PYEOF
Status: 406 Not Acceptable
Length: 371
Preview: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.6
4:47
The Overpass API is returning 406 Not Acceptable. This might be because the server requires the Accept header to include certain content types. Let me try the Overpass Turbo approach where we use `out:json` properly as part of the content type negotiation.

Actually, looking at this more carefully, the issue might be that the overpass-api.de server is configured to not serve requests without a proper User-Agent or something specific about the environment. Let me try a completely different approach.

Let me use the `overturemaps` library or `osmnx` instead to get the data. OSMnx can download Vienna boundaries directly.
4:47
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox

# Get Vienna districts (current)
ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

try:
    gdf = ox.geometries_from_place("Vienna, Austria", tags={'boundary':'administrative', 'admin_level':'9'})
    print('Columns:', list(gdf.columns))
    print('CRS:', gdf.crs)
    print('Shape:', gdf.shape)
    print(gdf[['name','admin_level']].head(10))
except Exception as e:
    print('Error:', type(e).__name__, e)
PYEOF
Error: AttributeError module 'osmnx' has no attribute 'geometries_from_place'
4:51
Let me check the OSMnx version and available functions.
4:51
Bash cd /work && python3 -c "import osmnx; print(osmnx.__version__); print([x for x i...
2.1.0
['features', 'features_from_address', 'features_from_bbox', 'features_from_place', 'features_from_point', 'features_from_polygon', 'features_from_xml', 'geocode', 'geocode_to_gdf', 'geocoder', 'graph_from_place', 'save_graph_geopackage', 'utils_geo']
5:03
Let me use `features_from_place` which is the correct method in osmnx 2.x.
5:03
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox

ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

# Get current Vienna districts
gdf = ox.features_from_place("Vienna, Austria", tags={'boundary':'administrative', 'admin_level':'9'})
print('Columns:', list(gdf.columns))
print('CRS:', gdf.crs)
print('Shape:', gdf.shape)
print(gdf[['name','admin_level']].head(20))
# Save to file for inspection
gdf.to_parquet('/work/current_districts.parquet')
PYEOF
Columns: ['geometry', 'ISO3166-1:alpha2', 'admin_level', 'capital', 'description', 'name', 'name:ab', 'name:ace', 'name:af', 'name:am', 'name:an', 'name:ar', 'name:arc', 'name:arz', 'name:as', 'name:ast', 'name:av', 'name:az', 'name:ba', 'name:bar', 'name:bat-smg', 'name:be', 'name:be-tarask', 'name:bg', 'name:bi', 'name:bn', 'name:bo', 'name:br', 'name:bs', 'name:ca', 'name:ce', 'name:ceb', 'name:ch', 'name:ckb', 'name:co', 'name:crh', 'name:cs', 'name:cu', 'name:cv', 'name:cy', 'name:de', 'name:diq', 'name:ee', 'name:el', 'name:en', 'name:eo', 'name:es', 'name:et', 'name:eu', 'name:ext', 'name:fa', 'name:ff', 'name:fi', 'name:fr', 'name:frp', 'name:fur', 'name:fy', 'name:ga', 'name:gag', 'name:gd', 'name:gl', 'name:gn', 'name:gv', 'name:ha', 'name:he', 'name:hi', 'name:hif', 'name:hr', 'name:ht', 'name:hu', 'name:hy', 'name:ia', 'name:id', 'name:ie', 'name:ilo', 'name:is', 'name:it', 'name:ja', 'name:jv', 'name:ka', 'name:kab', 'name:kk', 'name:kn', 'name:ko', 'name:koi', 'name:krc', 'name:ku', 'name:kv', 'name:ky', 'name:la', 'name:lez', 'name:li', 'name:lij', 'name:lmo', 'name:ln', 'name:lt', 'name:lv', 'name:mg', 'name:mhr', 'name:mi', 'name:mk', 'name:ml', 'name:mn', 'name:mr', 'name:mrj', 'name:ms', 'name:mt', 'name:my', 'name:nah', 'name:nds', 'name:nds-nl', 'name:ne', 'name:nl', 'name:no', 'name:nv', 'name:oc', 'name:or', 'name:os', 'name:pa', 'name:pap', 'name:pih', 'name:pl', 'name:pms', 'name:pnb', 'name:ps', 'name:pt', 'name:rm', 'name:rmy', 'name:ro', 'name:roa-tara', 'name:ru', 'name:rue', 'name:sah', 'name:sc', 'name:scn', 'name:sco', 'name:sd', 'name:sh', 'name:si', 'name:sk', 'name:sl', 'name:sn', 'name:so', 'name:sq', 'name:sr', 'name:st', 'name:sv', 'name:sw', 'name:szl', 'name:ta', 'name:te', 'name:tg', 'name:th', 'name:tk', 'name:tl', 'name:tr', 'name:tt', 'name:tw', 'name:tzl', 'name:udm', 'name:ug', 'name:uk', 'name:ur', 'name:uz', 'name:vec', 'name:vep', 'name:vi', 'name:vro', 'name:wa', 'name:war', 'name:wo', 'name:wuu', 'name:yi', 'name:yo', 'name:yue', 'name:zh', 'name:zh-Hans', 'name:zh-Hant', 'population', 'population:date', 'source:population', 'website', 'wikidata', 'wikipedia', 'ISO3166-1', 'alt_name:eo', 'currency', 'flag', 'name:ak', 'name:chr', 'name:da', 'name:dz', 'name:fo', 'name:gu', 'name:haw', 'name:hsb', 'name:io', 'name:jbo', 'name:kl', 'name:ksh', 'name:kw', 'name:lb', 'name:lo', 'name:nn', 'name:se', 'name:smn', 'name:sms', 'name:tok', 'name:vo', 'name:zu', 'official_name', 'official_name:be', 'official_name:ckb', 'official_name:cs', 'official_name:el', 'official_name:eo', 'official_name:hu', 'source', 'boundary', 'image', 'ref', 'alt_name', 'width', 'wikimedia_commons', 'note', 'long_name', 'addr:city', 'addr:country', 'addr:housenumber', 'addr:postcode', 'addr:street', 'government', 'office', 'wheelchair', 'opening_hours', 'highway', 'maxspeed', 'sidewalk', 'source:maxspeed', 'lane_markings', 'lit', 'surface', 'oneway', 'oneway:bicycle', 'cycleway:left', 'cycleway:right', 'lanes', 'sidewalk:left', 'sidewalk:right', 'destination', 'destination:symbol', 'lanes:bus', 'sidewalk:left:surface', 'smoothness', 'turn:lanes', 'cycleway', 'cycleway:lane', 'parking:both', 'parking:left:orientation', 'parking:right:orientation', 'cycleway:both', 'parking:both:orientation', 'sidewalk:both', 'maxspeed:type', 'sidewalk:right:surface', 'cycleway:both:lane', 'network', 'embedded_rails', 'sidewalk:both:surface', 'est_width:carriageway', 'parking:left', 'parking:right', 'source:est_width:carriageway', 'name:etymology:wikidata', 'bridge', 'lanes:backward', 'lanes:forward', 'layer', 'turn:lanes:backward', 'turn:lanes:forward', 'lcn', 'cycleway:left:lane', 'name:etymology', 'name:start_date', 'maxweight:signed', 'bicycle', 'cycleway:left:oneway', 'loc_name', 'motor_vehicle:conditional', 'old_ref', 'traffic_sign', 'electrified', 'frequency', 'gauge', 'operator', 'operator:wikidata', 'railway', 'railway:preferred_direction', 'voltage', 'oneway:psv', 'odbl', 'boundary_type', 'cycleway:right:lane', 'shoulder', 'check_date', 'oneway:bus', 'source:width', 'amenity', 'building', 'building:levels', 'check_date:cycleway', 'check_date:smoothness', 'name:wikipedia', 'maxheight', 'level', 'destination:backward', 'destination:ref:backward', 'destination:forward', 'destination:ref:forward', 'parking:right:restriction', 'type', 'ISO3166-1:alpha3', 'ISO3166-1:numeric', 'TMC:cid_58:tabcd_1:Class', 'TMC:cid_58:tabcd_1:LCLversion', 'TMC:cid_58:tabcd_1:LocationCode', 'default_language', 'driving_side', 'int_name', 'land_area', 'name:als', 'name:ang', 'name:ay', 'name:bcl', 'name:bh', 'name:bpy', 'name:bug', 'name:bxr', 'name:cbk-zam', 'name:cdo', 'name:chm', 'name:chy', 'name:csb', 'name:din', 'name:dsb', 'name:dv', 'name:frr', 'name:hak', 'name:ig', 'name:kaa', 'name:kbd', 'name:kg', 'name:lad', 'name:lld', 'name:ltg', 'name:lzh', 'name:mdf', 'name:mwl', 'name:na', 'name:nan', 'name:new', 'name:nov', 'name:nrm', 'name:om', 'name:pag', 'name:pam', 'name:pcd', 'name:pdc', 'name:pfl', 'name:pi', 'name:pnt', 'name:prefix:at', 'name:qu', 'name:roa-rup', 'name:rw', 'name:sa', 'name:sm', 'name:ss', 'name:stq', 'name:su', 'name:tet', 'name:tpi', 'name:ts', 'name:vls', 'name:xal', 'name:xmf', 'name:za', 'name:zea', 'official_name:be-tarask', 'official_name:ca', 'official_name:mi', 'ref:at:gkz', 'timezone', 'ISO3166-2', 'alt_name:cs', 'coat_of_arms', 'ref:nuts', 'ref:nuts:2', 'ref:at:okz', 'admin_level:6', 'admin_level:8', 'short_name', 'admin_title', 'designation', 'official_status', 'ref:at:kgnr', 'street_names:wikipedia', 'old_name', 'wikipedia:ru', 'roof:levels', 'related:wikipedia']
CRS: epsg:4326
Shape: (901, 411)
                                                     name admin_level
element  id                                                          
node     2518271394                 Bezirksgesundheitsamt           9
         2518271399                               Mag Elf           9
         2518271406  Bürgerdienst Kundenzentrum Simmering           9
relation 16239                                 Österreich           2
         63619                          Bezirk St. Pölten           6
         77189                           Niederösterreich           4
         103662                            Deutsch-Wagram           8
         103894                                  Aderklaa           8
         103999                                  Raasdorf           8
         104072                        Bezirk Gänserndorf           6
         104864                           Groß-Enzersdorf           8
         106250                              Bezirk Tulln           6
         106740                         Bezirk Korneuburg           6
         109166                                      Wien           4
         111817                                 Schwechat           8
         111818                                Fischamend           8
         111829                                Lanzendorf           8
         111838                              Leopoldsdorf           8
         111858                            Bezirk Mödling           6
         111881                               Hennersdorf           8
5:28
The query returns too many things - nodes, relations for all levels. I need to filter specifically for Vienna districts (admin_level=9, boundary=administrative) and also filter by area (Vienna city). Let me refine.

Also, I need to get the 2014-01-01 data. OSMnx or Overpass can query with a specific date using the `[date:"2014-01-01T00:00:00Z"]` syntax.

Let me first get the current districts properly filtered.
5:28
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox

ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

# Get only Vienna (admin_level=4) area, then get districts
# Use features_from_place with specific tags
gdf = ox.features_from_place("Vienna, Austria", tags={'admin_level':'9', 'boundary':'administrative'})
# Filter to only relation type elements (polygons)
gdf = gdf[gdf.index.get_level_values('element_type') == 'relation']
print('Shape:', gdf.shape)
# Only keep name and geometry
print(gdf[['name', 'ref']].head(25))
# Check unique names
print('\nUnique names:', gdf['name'].unique().tolist())
print('\nRefs:', gdf['ref'].unique().tolist())
# Save
gdf.to_parquet('/work/current_districts.parquet')
PYEOF
Traceback (most recent call last):
  File "/usr/local/lib/python3.14/site-packages/pandas/core/indexes/multi.py", line 1762, in _get_level_number
    level = self.names.index(level)
ValueError: list.index(x): x not in list

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 10, in <module>
  File "/usr/local/lib/python3.14/site-packages/pandas/core/indexes/multi.py", line 2022, in get_level_values
    level = self._get_level_number(level)
  File "/usr/local/lib/python3.14/site-packages/pandas/core/indexes/multi.py", line 1765, in _get_level_number
    raise KeyError(f"Level {level} not found") from err
KeyError: 'Level element_type not found'
5:53
The index probably has different level names. Let me check the index structure.
5:53
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox
import pandas as pd

ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

gdf = ox.features_from_place("Vienna, Austria", tags={'admin_level':'9', 'boundary':'administrative'})
print('Index:', gdf.index)
print('Index names:', gdf.index.names)
print('Shape:', gdf.shape)

# Filter relations
gdf = gdf[gdf.index.get_level_values(0) == 'relation']
print('After filtering relations:', gdf.shape)

# Show district names
print(gdf[['name', 'ref']].head(25))
print('\nUnique names:', gdf['name'].dropna().unique().tolist())
PYEOF
Index: MultiIndex([(    'node', 2518271394),
            (    'node', 2518271399),
            (    'node', 2518271406),
            ('relation',      16239),
            ('relation',      63619),
            ('relation',      77189),
            ('relation',     103662),
            ('relation',     103894),
            ('relation',     103999),
            ('relation',     104072),
            ...
            (     'way', 1509410450),
            (     'way', 1509410455),
            (     'way', 1509410456),
            (     'way', 1509410457),
            (     'way', 1509410458),
            (     'way', 1509410459),
            (     'way', 1509410460),
            (     'way', 1509410461),
            (     'way', 1509410471),
            (     'way', 1509410472)],
           names=['element', 'id'], length=901)
Index names: ['element', 'id']
Shape: (901, 411)
After filtering relations: (173, 411)
                                       name  ref
element  id                                     
relation 16239                   Österreich  NaN
         63619            Bezirk St. Pölten   PL
         77189             Niederösterreich  NaN
         103662              Deutsch-Wagram  NaN
         103894                    Aderklaa  NaN
         103999                    Raasdorf  NaN
         104072          Bezirk Gänserndorf   GF
         104864             Groß-Enzersdorf  NaN
         106250                Bezirk Tulln   TU
         106740           Bezirk Korneuburg   KO
         109166                        Wien  NaN
         111817                   Schwechat  NaN
         111818                  Fischamend  NaN
         111829                  Lanzendorf  NaN
         111838                Leopoldsdorf  NaN
         111858              Bezirk Mödling   MD
         111881                 Hennersdorf  NaN
         111887        Breitenfurt bei Wien  NaN
         111930               Laab im Walde  NaN
         111931             Kaltenleutgeben  NaN
         111946              Perchtoldsdorf  NaN
         111948            Brunn am Gebirge  NaN
         111956                   Vösendorf  NaN
         112050  Bezirk Bruck an der Leitha   BL
         112467                 Purkersdorf  NaN

Unique names: ['Österreich', 'Bezirk St. Pölten', 'Niederösterreich', 'Deutsch-Wagram', 'Aderklaa', 'Raasdorf', 'Bezirk Gänserndorf', 'Groß-Enzersdorf', 'Bezirk Tulln', 'Bezirk Korneuburg', 'Wien', 'Schwechat', 'Fischamend', 'Lanzendorf', 'Leopoldsdorf', 'Bezirk Mödling', 'Hennersdorf', 'Breitenfurt bei Wien', 'Laab im Walde', 'Kaltenleutgeben', 'Perchtoldsdorf', 'Brunn am Gebirge', 'Vösendorf', 'Bezirk Bruck an der Leitha', 'Purkersdorf', 'Langenzersdorf', 'Hagenbrunn', 'Mauerbach', 'Klosterneuburg', 'Gerasdorf bei Wien', 'Katastralgemeinde Gerasdorf', 'Katastralgemeinde Mauer', 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Leopoldstadt', 'Mariahilf', 'Meidling', 'Neubau', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Währing', 'Landstraße', 'Brigittenau', 'Donaustadt', 'Döbling', 'Favoriten', 'Floridsdorf', 'Hernals', 'Liesing', 'Margareten', 'Ottakring', 'Simmering', 'Wieden', 'Amtshaus für den 16. Bezirk', 'Katastralgemeinde Inzersdorf Stadt', 'Katastralgemeinde Favoriten', 'Katastralgemeinde Oberlaa Stadt', 'Katastralgemeinde Rothneusiedl', 'Katastralgemeinde Oberlaa Land', 'Katastralgemeinde Unterlaa', 'Katastralgemeinde Inzersdorf', 'Katastralgemeinde Siebenhirten', 'Katastralgemeinde Erlaa', 'Katastralgemeinde Atzgersdorf', 'Katastralgemeinde Margarethen', 'Katastralgemeinde Rodaun', 'Katastralgemeinde Liesing', 'Katastralgemeinde Kalksburg', 'Katastralgemeinde Altmannsdorf', 'Katastralgemeinde Innere Stadt', 'Katastralgemeinde Rosenberg', 'Katastralgemeinde Albern', 'Katastralgemeinde Hetzendorf', 'Katastralgemeinde Schönbrunn', 'Katastralgemeinde Meidling', 'Katastralgemeinde Gaudenzdorf', 'Katastralgemeinde Auhof', 'Katastralgemeinde Weidlingau', 'Katastralgemeinde Speising', 'Katastralgemeinde Lainz', 'Katastralgemeinde Hietzing', 'Katastralgemeinde Unter St. Veit', 'Katastralgemeinde Ober St. Veit', 'Katastralgemeinde Hacking', 'Katastralgemeinde Mariahilf', 'Katastralgemeinde Neubau', 'Katastralgemeinde Josefstadt', 'Katastralgemeinde Alsergrund', 'Katastralgemeinde Kaiserebersdorf', 'Katastralgemeinde Simmering', 'Katastralgemeinde Landstraße', 'Katastralgemeinde Leopoldstadt', 'Katastralgemeinde Brigittenau', 'Katastralgemeinde Kaiserebersdorf Herrschaft', 'Katastralgemeinde Landjägermeisteramt', 'Katastralgemeinde Eßling', 'Katastralgemeinde Süssenbrunn', 'Katastralgemeinde Breitenlee', 'Katastralgemeinde Aspern', 'Katastralgemeinde Hirschstetten', 'Katastralgemeinde Stadlau', 'Katastralgemeinde Kaisermühlen', 'Katastralgemeinde Kagran', 'Katastralgemeinde Floridsdorf', 'Katastralgemeinde Donaufeld', 'Katastralgemeinde Leopoldau', 'Katastralgemeinde Stammersdorf', 'Katastralgemeinde Großjedlersdorf I', 'Katastralgemeinde Großjedlersdorf II', 'Katastralgemeinde Strebersdorf', 'Katastralgemeinde Jedlesee', 'Katastralgemeinde Schwarze Lackenau', 'Katastralgemeinde Kahlenbergerdorf', 'Katastralgemeinde Josefsdorf', 'Katastralgemeinde Nußdorf', 'Katastralgemeinde Heiligenstadt', 'Katastralgemeinde Unterdöbling', 'Katastralgemeinde Grinzing', 'Katastralgemeinde Oberdöbling', 'Katastralgemeinde Untersievering', 'Katastralgemeinde Obersievering', 'Katastralgemeinde Salmannsdorf', 'Katastralgemeinde Neustift am Walde', 'Katastralgemeinde Pötzleinsdorf', 'Katastralgemeinde Gersthof', 'Katastralgemeinde Weinhaus', 'Katastralgemeinde Währing', 'Katastralgemeinde Neuwaldegg', 'Katastralgemeinde Dornbach', 'Katastralgemeinde Hernals', 'Katastralgemeinde Hadersdorf', 'Katastralgemeinde Ottakring', 'Katastralgemeinde Neulerchenfeld', 'Katastralgemeinde Sechshaus', 'Katastralgemeinde Rudolfsheim', 'Katastralgemeinde Fünfhaus', 'Katastralgemeinde Penzing', 'Katastralgemeinde Breitensee', 'Katastralgemeinde Unterbaumgarten', 'Katastralgemeinde Oberbaumgarten', 'Katastralgemeinde Hütteldorf', 'Katastralgemeinde Breitenfurt', 'Katastralgemeinde Hagenbrunn', 'Katastralgemeinde Schönau an der Donau', 'Katastralgemeinde Mühlleiten', 'Katastralgemeinde Großenzersdorf', 'Katastralgemeinde Stallingerfeld', 'Katastralgemeinde Raasdorf', 'Katastralgemeinde Leopoldsdorf', 'Katastralgemeinde Rustenfeld', 'Katastralgemeinde Oberlanzendorf', 'Katastralgemeinde Unterlanzendorf', 'Katastralgemeinde Kledering', 'Katastralgemeinde Rannersdorf', 'Katastralgemeinde Schwechat', 'Katastralgemeinde Mannswörth', 'Katastralgemeinde Fischamend Dorf', 'Katastralgemeinde Langenzersdorf', 'Katastralgemeinde Kapellerfeld', 'Katastralgemeinde Weidlingbach', 'Katastralgemeinde Weidling', 'Katastralgemeinde Klosterneuburg', 'Katastralgemeinde Aderklaa', 'Katastralgemeinde Mauerbach', 'Katastralgemeinde Laab im Walde', 'Katastralgemeinde Purkersdorf', 'Katastralgemeinde Kaltenleutgeben', 'Katastralgemeinde Perchtoldsdorf', 'Katastralgemeinde Vösendorf', 'Katastralgemeinde Brunn am Gebirge']
6:05
I see - the data contains many administrative boundaries at various levels, including Vienna city itself (admin_level=4), districts (admin_level=9), and even lower-level boundaries. I need to get only the Vienna district relations (Bezirke) - the ones with names like "Innere Stadt", "Leopoldstadt", "Landstraße", etc.

Let me filter more carefully and also find the historical data. For the 2014-01-01 version, I can use Overpass's date feature or use a historic OSM snapshot.

Let me think about how to get the 2014 data. Overpass API supports querying data as of a specific date using the `[date:"..."]` syntax. I can use `[date:"2014-01-01T00:00:00Z"]`.

Actually, let me first cleanly get the current districts and understand the names, then get the 2014 versions.
6:05
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox

ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

gdf = ox.features_from_place("Vienna, Austria", tags={'admin_level':'9', 'boundary':'administrative'})
# Filter relations
gdf = gdf[gdf.index.get_level_values(0) == 'relation']

# Filter to only Vienna districts - these have ref like "1", "2", ... or name without "Bezirk" prefix
# Let's find which have numeric ref or are named like Vienna districts
# The Vienna districts have ref values like "1", "2", ..., "23"
print("Rows with numeric ref:", gdf[gdf['ref'].notna() & gdf['ref'].str.match(r'^\d+$')][['name','ref']])

# Also look for rows where name is a known Vienna district name
known_districts = ['Innere Stadt', 'Leopoldstadt', 'Landstraße', 'Wieden', 'Margareten', 
                   'Mariahilf', 'Neubau', 'Josefstadt', 'Alsergrund', 'Favoriten', 
                   'Simmering', 'Meidling', 'Hietzing', 'Penzing', 'Rudolfsheim-Fünfhaus',
                   'Ottakring', 'Hernals', 'Währing', 'Döbling', 'Brigittenau', 
                   'Floridsdorf', 'Donaustadt', 'Liesing']

district_gdf = gdf[gdf['name'].isin(known_districts)]
print(f"\nFound {len(district_gdf)} district rows")
print(district_gdf[['name','ref']])
district_gdf.to_parquet('/work/current_districts.parquet')
PYEOF
Rows with numeric ref:                                   name ref
element  id                               
relation 1990590            Alsergrund   9
         1990591              Hietzing  13
         1990592          Innere Stadt   1
         1990593            Josefstadt   8
         1990594          Leopoldstadt   2
         1990595             Mariahilf   6
         1990596              Meidling  12
         1990597                Neubau   7
         1990598               Penzing  14
         1990599  Rudolfsheim-Fünfhaus  15
         1990600               Währing  18
         1991416            Landstraße   3
         1991433           Brigittenau  20
         1991434            Donaustadt  22
         1991435               Döbling  19
         1991436             Favoriten  10
         1991437           Floridsdorf  21
         1991438               Hernals  17
         1991439               Liesing  23
         1991440            Margareten   5
         1991441             Ottakring  16
         1991442             Simmering  11
         1991443                Wieden   4

Found 24 district rows
                                  name  ref
element  id                                
relation 1990590            Alsergrund    9
         1990591              Hietzing   13
         1990592          Innere Stadt    1
         1990593            Josefstadt    8
         1990594          Leopoldstadt    2
         1990595             Mariahilf    6
         1990596              Meidling   12
         1990597                Neubau    7
         1990598               Penzing   14
         1990599  Rudolfsheim-Fünfhaus   15
         1990600               Währing   18
         1991416            Landstraße    3
         1991433           Brigittenau   20
         1991434            Donaustadt   22
         1991435               Döbling   19
         1991436             Favoriten   10
         1991437           Floridsdorf   21
         1991438               Hernals   17
         1991439               Liesing   23
         1991440            Margareten    5
         1991441             Ottakring   16
         1991442             Simmering   11
         1991443                Wieden    4
         5248033                Wieden  NaN
6:16
I see 24 rows but only 23 districts. There's a duplicate Wieden (one without ref, id 5248033). Let me check that. Also, I notice there's 23 districts total but the row with id 1990590 is Alsergrund with ref 9 - but there's also Wieden as 5248033 without ref. Let me check if that's a different geometry or just a duplicate.

Now I need to get the 2014-01-01 snapshots. Overpass supports the `[date:"2014-01-01T00:00:00Z"]` syntax to query historical data. Let me use that.
6:16
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox
import pandas as pd

ox.settings.timeout = 180
ox.settings.max_triple_attempts = 3

gdf = ox.features_from_place("Vienna, Austria", tags={'admin_level':'9', 'boundary':'administrative'})
gdf = gdf[gdf.index.get_level_values(0) == 'relation']
gdf = gdf[gdf['ref'].notna() & gdf['ref'].str.match(r'^\d+$')]
print("Found", len(gdf), "district relations")
print(gdf[['name', 'ref', 'wikidata']])
print("\nIndex values:", gdf.index.tolist())
PYEOF
Found 23 district relations
                                  name ref wikidata
element  id                                        
relation 1990590            Alsergrund   9  Q257780
         1990591              Hietzing  13  Q636910
         1990592          Innere Stadt   1  Q267329
         1990593            Josefstadt   8  Q256778
         1990594          Leopoldstadt   2  Q259120
         1990595             Mariahilf   6  Q654812
         1990596              Meidling  12  Q697557
         1990597                Neubau   7  Q664835
         1990598               Penzing  14  Q690335
         1990599  Rudolfsheim-Fünfhaus  15  Q653289
         1990600               Währing  18  Q690417
         1991416            Landstraße   3  Q519552
         1991433           Brigittenau  20  Q265655
         1991434            Donaustadt  22  Q258333
         1991435               Döbling  19  Q267360
         1991436             Favoriten  10  Q697548
         1991437           Floridsdorf  21  Q264645
         1991438               Hernals  17  Q262808
         1991439               Liesing  23  Q278900
         1991440            Margareten   5  Q252731
         1991441             Ottakring  16  Q265945
         1991442             Simmering  11  Q279028
         1991443                Wieden   4  Q268001

Index values: [('relation', 1990590), ('relation', 1990591), ('relation', 1990592), ('relation', 1990593), ('relation', 1990594), ('relation', 1990595), ('relation', 1990596), ('relation', 1990597), ('relation', 1990598), ('relation', 1990599), ('relation', 1990600), ('relation', 1991416), ('relation', 1991433), ('relation', 1991434), ('relation', 1991435), ('relation', 1991436), ('relation', 1991437), ('relation', 1991438), ('relation', 1991439), ('relation', 1991440), ('relation', 1991441), ('relation', 1991442), ('relation', 1991443)]
6:30
Good, so I have the current districts. There are 23 Vienna districts. Now I need to get the 2014-01-01 versions. This is tricky - Overpass can query data from specific dates using the `[date:"YYYY-MM-DDTHH:MM:SSZ"]` syntax in the OSM API, but Overpass API supports a date parameter. Let me think about how to do this.

For historical data, I can use the Overpass date parameter. The query would be:

```
[date:"2014-01-01T00:00:00Z"]
```

But the question is whether the Overpass API servers support this with the `date` parameter. Let me try.

Actually, Overpass API does support a date parameter for querying historical data. The syntax is adding `[date:"2014-01-01T00:00:00Z"]` at the very beginning of the query, or using `out geom;` with the date in the query.

Actually, looking at the Overpass API documentation, the `date` parameter is a global setting. The format is `[date:"YYYY-MM-DDTHH:MM:SSZ"]`.

Let me try to query historical data from Overpass.
6:30
6:30
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox
import requests

# Try the Overpass date parameter to get 2014 data
# First, let me check if osmnx supports date parameter
# Actually let's try directly with requests using the Overpass API date parameter

# The key is to use Overpass turbo-style query with date
# [date:"2014-01-01T00:00:00Z"] before the rest

query = '''
[out:json][timeout:180][date:"2014-01-01T00:00:00Z"];
area["name"="Wien"]["admin_level"="4"]->.a;
rel(area.a)["admin_level"="9"]["boundary"="administrative"][ref];
out geom;
'''

# Let's try with urllib
import urllib.request, urllib.parse
data = urllib.parse.urlencode({'data': query}).encode()
req = urllib.request.Request('https://overpass-api.de/api/interpreter', data=data)
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
req.add_header('User-Agent', 'Mozilla/5.0')

try:
    with urllib.request.urlopen(req, timeout=180) as resp:
        txt = resp.read().decode()
        print('Status:', resp.status)
        print('Length:', len(txt))
        import json
        d = json.loads(txt)
        print('Elements:', len(d.get('elements',[])))
        if d['elements']:
            for el in d['elements'][:5]:
                print(' ', el.get('id'), el.get('tags',{}), 'geom' in el)
except Exception as e:
    print('Error:', type(e).__name__, e)
    if hasattr(e, 'read'):
        print('Body:', e.read().decode()[:1000])
PYEOF
Error: HTTPError HTTP Error 406: Not Acceptable
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
6:41
The 406 error is strange - it seems the overpass-api.de server is rejecting requests. Let me try a different approach. Perhaps the issue is that the server expects the query formatted differently, or maybe the server configuration has changed. Let me try with `requests` library again but check the actual response headers.

Actually, looking at this more carefully, the 406 is coming from Apache, not from Overpass itself. This might be a server configuration issue. Let me try with different headers or a different approach.

Wait, let me check if the overpass turbo instance has changed URLs. Let me also try the main overpass-api.de without any special accept headers.

Let me try using the requests library with proper content-type headers and no Accept header.
6:41
Bash pip install --user httpx 2>/dev/null; cd /work && python3 << 'PYEOF'
pip install --user httpx 2>/dev/null; cd /work && python3 << 'PYEOF'
import requests

# Let me try different endpoints
endpoints = [
    'https://overpass-api.de/api/interpreter',
    'https://overpass.kumi.systems/api/interpreter',
]

query = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

for url in endpoints:
    print(f"\nTrying {url}")
    try:
        r = requests.get(url, params={'data': query}, timeout=30, 
                         headers={'User-Agent': 'Mozilla/5.0'})
        print(f"  Status: {r.status_code}")
        if r.status_code == 200:
            print(f"  Content-Type: {r.headers.get('Content-Type')}")
            data = r.json()
            print(f"  Elements: {len(data.get('elements', []))}")
        else:
            print(f"  Response: {r.text[:200]}")
    except Exception as e:
        print(f"  Error: {type(e).__name__}: {e}")
PYEOF
Collecting httpx
  Downloading httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
Collecting anyio (from httpx)
  Downloading anyio-4.14.0-py3-none-any.whl.metadata (4.6 kB)
Requirement already satisfied: certifi in /usr/local/lib/python3.14/site-packages (from httpx) (2026.5.20)
Collecting httpcore==1.* (from httpx)
  Downloading httpcore-1.0.9-py3-none-any.whl.metadata (21 kB)
Requirement already satisfied: idna in /usr/local/lib/python3.14/site-packages (from httpx) (3.18)
Collecting h11>=0.16 (from httpcore==1.*->httpx)
  Downloading h11-0.16.0-py3-none-any.whl.metadata (8.3 kB)
Downloading httpx-0.28.1-py3-none-any.whl (73 kB)
Downloading httpcore-1.0.9-py3-none-any.whl (78 kB)
Downloading h11-0.16.0-py3-none-any.whl (37 kB)
Downloading anyio-4.14.0-py3-none-any.whl (123 kB)
Installing collected packages: h11, anyio, httpcore, httpx

Successfully installed anyio-4.14.0 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1

Trying https://overpass-api.de/api/interpreter
  Status: 406
  Response: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate repres

Trying https://overpass.kumi.systems/api/interpreter
  Error: ReadTimeout: HTTPSConnectionPool(host='overpass.kumi.systems', port=443): Read timed out. (read timeout=30)
7:21
The Overpass API is returning 406 errors. Let me try with different HTTP libraries or maybe the issue is with the query format. The 406 might be specific to the Python requests default Accept header. Let me try with httpx or raw sockets.
7:21
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import httpx

query = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

# Try GET with httpx
url = 'https://overpass-api.de/api/interpreter'
try:
    with httpx.Client(timeout=120.0) as client:
        r = client.get(url, params={'data': query})
        print('Status:', r.status_code)
        print('Content-Type:', r.headers.get('content-type'))
        if r.status_code == 200:
            data = r.json()
            print('Elements:', len(data.get('elements', [])))
        else:
            print('Body:', r.text[:300])
except Exception as e:
    print('Error:', type(e).__name__, e)
PYEOF
Status: 406
Content-Type: text/html; charset=iso-8859-1
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.6
7:30
The Overpass API is returning 406. This might be because the server is checking the "Accept" header. Let me check what httpx sends by default and try with different Accept headers. Actually, I think this might be a WAF (Web Application Firewall) blocking the requests. Let me try using a different approach - maybe I need to embed the query differently or use a different User-Agent.

Wait, actually the response says "Apache/2.4.67 (Debian) Server at overpass-api.de Port 443" - this is the actual Apache server before the Overpass module. The 406 might be because Apache is configured to require a specific Accept header.

Let me try with different Accept headers.
7:30
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import httpx

query = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

url = 'https://overpass-api.de/api/interpreter'

# Try different Accept headers
accepts = ['*/*', 'application/json, text/plain, */*', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'text/html,application/json', 'application/json']

for a in accepts:
    try:
        with httpx.Client(timeout=30.0) as client:
            r = client.get(url, params={'data': query}, headers={'Accept': a, 'User-Agent': 'Mozilla/5.0'})
            print(f'Accept: {a[:30]:30s} -> Status: {r.status_code}', end='')
            if r.status_code == 200:
                data = r.json()
                print(f', Elements: {len(data.get("elements", []))}')
            else:
                print(f', Body: {r.text[:80]}')
    except Exception as e:
        print(f'Accept: {a[:30]:30s} -> Error: {type(e).__name__}')
PYEOF
Accept: */*                            -> Status: 406, Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/st
Accept: application/json, text/plain,  -> Status: 406, Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/st
Accept: text/html,application/xhtml+xm -> Status: 406, Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/st
Accept: text/html,application/json     -> Status: 406, Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/st
Accept: application/json               -> Status: 406, Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/st
7:43
Hmm, the Overpass API is consistently returning 406. The issue might be that the server is checking something else - maybe the User-Agent or the request format. Let me try POST with proper form-encoding and without the Accept header that might cause issues.

Actually wait - maybe the issue is that the Apache server sits in front and might be blocking something. Let me look at this from a different angle. Perhaps I can use a different URL encoding or use the Overpass Turbo style query format.

Let me also try the `lz4.overpass-api.de` mirror with POST.
7:43
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import httpx

# Maybe the issue is how we encode the query. Let's try with POST and data in the body
query = '[out:json][timeout:120];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

url = 'https://overpass-api.de/api/interpreter'

try:
    with httpx.Client(timeout=120.0) as client:
        # POST with form data properly
        r = client.post(url, data={'data': query}, headers={'User-Agent': 'Mozilla/5.0'})
        print('Status:', r.status_code)
        print('Body:', r.text[:500])
except Exception as e:
    print('Error:', type(e).__name__, e)
PYEOF
Status: 406
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
7:50
The Overpass API is returning 406 Not Acceptable. This is unusual. Let me try with a simpler query first to see if it even works.
7:50
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import httpx

# Simple query first
query = '[out:json];node(48.2,16.3,48.3,16.4);out 1;'

url = 'https://overpass-api.de/api/interpreter'
try:
    with httpx.Client(timeout=60.0) as client:
        r = client.post(url, data={'data': query})
        print('Status:', r.status_code)
        print('Body:', r.text[:500])
except Exception as e:
    print('Error:', type(e).__name__, e)
PYEOF
Status: 406
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate representation of the requested resource could not be found on this server.</p>
<hr>
<address>Apache/2.4.67 (Debian) Server at overpass-api.de Port 443</address>
</body></html>
7:58
The Overpass API is returning 406 for all requests, even simple ones. This is very strange. Let me check if maybe the environment has network restrictions or if there's a proxy setting issue. Let me try a different approach - maybe I should try to use the overpass-api via Python's osmnx which uses a different mechanism. Actually, osmnx worked fine above! It was able to get features from Overpass. Let me use osmnx to get the data since it clearly works.

For the 2014 data, I can use Overpass's `[date:"2014-01-01T00:00:00Z"]` feature, but I need to figure out how to do it with osmnx or by using the same mechanism.

Let me check how osmnx makes its requests.
7:58
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox

# Check how osmnx is configured to make requests
print("settings.default_accept_language:", ox.settings.default_accept_language)
print("settings.user_agent:", ox.settings.user_agent)

# The features_from_place uses overpass. Let me see internal details
import inspect
src = inspect.getsource(ox._overpass)
print(src[:2000])
PYEOF
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
AttributeError: module 'osmnx.settings' has no attribute 'default_accept_language'. Did you mean: 'http_accept_language'?
8:05
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox
print("default_accept:", ox.settings.default_accept)
print("user_agent:", ox.settings.default_user_agent)
print("http_accept_language:", ox.settings.http_accept_language)
if hasattr(ox.settings, 'overpass_endpoint'):
    print("overpass_endpoint:", ox.settings.overpass_endpoint)
elif hasattr(ox.settings, 'overpass_url'):
    print("overpass_url:", ox.settings.overpass_url)
PYEOF
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: module 'osmnx.settings' has no attribute 'default_accept'. Did you mean: 'default_access'?
8:11
Let me check the osmnx settings module directly.
8:11
Bash cd /work && python3 -c "import osmnx.settings; print([x for x in dir(osmnx.setti...
['Any', 'TYPE_CHECKING', 'all_oneway', 'annotations', 'bidirectional_network_types', 'cache_folder', 'cache_only_mode', 'data_folder', 'default_access', 'default_crs', 'doh_url_template', 'elevation_url_template', 'http_accept_language', 'http_referer', 'http_user_agent', 'imgs_folder', 'lg', 'log_console', 'log_file', 'log_filename', 'log_level', 'log_name', 'logs_folder', 'max_query_area_size', 'nominatim_key', 'nominatim_url', 'overpass_memory', 'overpass_rate_limit', 'overpass_settings', 'overpass_url', 'requests_kwargs', 'requests_timeout', 'use_cache', 'useful_tags_node', 'useful_tags_way']
8:22
So osmnx 2.x uses `overpass_url` and `overpass_settings`. Let me check how osmnx makes its requests to Overpass, because it seems to work (the features_from_place query worked). So osmnx's internal mechanism works, but my direct requests are failing with 406. Let me look at how osmnx sends requests.

Actually, looking back - my `features_from_place` query via osmnx DID work and returned data. So the overpass endpoint works fine. The issue is with how I'm encoding the request directly. osmnx must be doing something slightly different. Let me look at what `overpass_settings` contains and replicate it.

Actually, since osmnx works, I should just use osmnx to get both current and historical data. osmnx can accept a `date` parameter or `overpass_settings` can include the date.

Wait - osmnx 2.x has `overpass_settings` which gets prepended. So I can set `ox.settings.overpass_settings = '[out:json][timeout:180][date:"2014-01-01T00:00:00Z"]'` and then call features_from_place.
8:22
Bash cd /work && python3 -c "
cd /work && python3 -c "
import osmnx.settings
print('overpass_url:', osmnx.settings.overpass_url)
print('overpass_settings:', osmnx.settings.overpass_settings)
"
overpass_url: https://overpass-api.de/api
overpass_settings: [out:json][timeout:{timeout}]{maxsize}
8:30
Interesting - osmnx is using overpass-url ending in /api, and appending /interpreter. Let me look at how osmnx makes the request internally.

Actually, the 406 error suggests the overpass-api.de server is rejecting the request. Let me check if maybe the server requires specific formatting or if the request needs to be URL-encoded differently.

Let me try using the osmnx internal functions to see exactly how it works.
8:30
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx as ox
import inspect

# Find where overpass is called
try:
    src = inspect.getsource(ox._overpass) 
except:
    pass
try:
    src = inspect.getsource(ox._query)
    print(src[:3000])
except:
    pass
PYEOF
(no output)
8:37
Let me try looking at how osmnx actually runs its overpass queries internally, and check if the `overpass_url` is constructed properly.
8:37
Bash cd /work && python3 -c "
cd /work && python3 -c "
import osmnx as ox
import os
print(ox.__file__)
"
/usr/local/lib/python3.14/site-packages/osmnx/__init__.py
9:06
Let me look at the osmnx overpass module to understand how it makes requests.
9:06
Bash ls /usr/local/lib/python3.14/site-packages/osmnx/
__init__.py
__pycache__
_api_v1.py
_errors.py
_http.py
_nominatim.py
_osm_xml.py
_overpass.py
_validate.py
bearing.py
convert.py
distance.py
elevation.py
features.py
geocoder.py
graph.py
io.py
plot.py
projection.py
py.typed
routing.py
settings.py
simplification.py
stats.py
truncate.py
utils.py
utils_geo.py
9:17
The 406 error from Overpass is strange. But osmnx's `features_from_place` works fine (we already got data). Let me look at how osmnx makes its HTTP requests to the Overpass API.

Let me check the `_http.py` file.
9:17
Read /usr/local/lib/python3.14/site-packages/osmnx/_http.py
file_path must be inside the working directory (/work): '/usr/local/lib/python3.14/site-packages/osmnx/_http.py'
9:25
I can't read files outside the working directory. Let me try to understand how osmnx makes its HTTP requests and see why it works for osmnx but not direct requests.

The key issue is the 406 error. Let me check if osmnx works using its own internal mechanism, and see if maybe the overpass API is being blocked differently.

Let me actually try using osmnx to get the current data (which worked earlier) and save it. Then for historical data, maybe I need to try an alternative approach - perhaps I can use the Overpass API with a different query format, or maybe I can get historical data through a different approach like Osmosis or a local OSM extract.

Wait - actually the osmnx features_from_place call worked earlier and returned the data! The issue is just with my direct requests to the API. But osmnx uses the same URL. Let me check what headers osmnx sets.
9:25
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import osmnx.settings

# Try the exact same way osmnx does it - copy their settings
url = osmnx.settings.overpass_url
print('URL:', url)

# osmnx uses requests library
# Let's try with the exact same parameters
query = '[out:json][timeout:180];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'

# Try POST
r = requests.post(url + '/interpreter', data={'data': query}, 
                  headers={'User-Agent': 'Mozilla/5.0', 'Accept': '*/*'},
                  timeout=120)
print('Status:', r.status_code)
print('Body:', r.text[:200] if len(r.text) < 200 else r.text[:200] + '...')
if r.status_code == 200:
    import json
    data = r.json()
    print('Elements:', len(data.get('elements',[])))
PYEOF
URL: https://overpass-api.de/api
Status: 406
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>406 Not Acceptable</title>
</head><body>
<h1>Not Acceptable</h1>
<p>An appropriate repres...
9:32
The issue is that the Overpass API is returning 406 for our requests. But osmnx already worked and fetched data successfully. So let me look at what osmnx actually does internally. Let me check the _overpass.py file.
9:32
Bash cd /work && python3 -c "
cd /work && python3 -c "
import osmnx._overpass as m
import inspect
print(inspect.getsource(m))
" 2>&1 | head -100
"""Tools to work with the Overpass API."""

from __future__ import annotations

import datetime as dt
import logging as lg
import time
from collections import OrderedDict
from typing import TYPE_CHECKING
from typing import Any

import numpy as np
import requests
from requests.exceptions import ConnectionError as RequestsConnectionError

from . import _http
from . import projection
from . import settings
from . import utils
from . import utils_geo
from ._errors import InsufficientResponseError

if TYPE_CHECKING:
    from collections.abc import Iterator

    from shapely import MultiPolygon
    from shapely import Polygon


def _get_network_filter(network_type: str) -> str:
    """
    Create a filter to query Overpass for the specified network type.

    The filter queries Overpass for every OSM way with a "highway" tag but
    excludes ways that are incompatible with the requested network type. You
    can choose from the following types:

    "all" retrieves all public and private-access ways currently in use,
    excluding those that represent areas either explicitly (area=yes) or by
    convention (rest_area, services).

    "all_public" retrieves all public ways currently in use.

    "bike" retrieves public bikeable ways and excludes foot ways, motor ways,
    and anything tagged biking=no.

    "drive" retrieves public drivable streets and excludes service roads,
    anything tagged motor=no, and certain non-service roads tagged as
    providing certain services (such as alleys or driveways).

    "drive_service" retrieves public drivable streets including service roads
    but excludes certain services (such as parking or emergency access).

    "walk" retrieves public walkable ways and excludes cycle ways, motor ways,
    and anything tagged foot=no. It includes service roads like parking lot
    aisles and alleys that you can walk on even if they are unpleasant walks.

    Parameters
    ----------
    network_type
        {"all", "all_public", "bike", "drive", "drive_service", "walk"}
        What type of street network to retrieve.

    Returns
    -------
    way_filter
        The Overpass query filter.
    """
    # define built-in queries to send to the API. specifying way["highway"]
    # means that all ways returned must have a highway tag. the filters then
    # remove ways by tag/value.
    filters = {}

    # driving: filter out un-drivable roads, service roads, private ways, and
    # anything tagged motor=no. also filter out any non-service roads that are
    # tagged as providing certain services
    filters["drive"] = (
        f'["highway"]["area"!~"yes"]{settings.default_access}'
        f'["highway"!~"abandoned|bridleway|bus_guideway|construction|corridor|'
        f"cycleway|elevator|escalator|footway|no|path|pedestrian|planned|platform|"
        f'proposed|raceway|razed|rest_area|service|services|steps|track"]'
        f'["motor_vehicle"!~"no"]["motorcar"!~"no"]'
        f'["service"!~"alley|driveway|emergency_access|parking|parking_aisle|private"]'
    )

    # drive+service: allow ways tagged 'service' but filter out certain types
    filters["drive_service"] = (
        f'["highway"]["area"!~"yes"]{settings.default_access}'
        f'["highway"!~"abandoned|bridleway|bus_guideway|construction|corridor|'
        f"cycleway|elevator|escalator|footway|no|path|pedestrian|planned|platform|"
        f'proposed|raceway|razed|rest_area|services|steps|track"]'
        f'["motor_vehicle"!~"no"]["motorcar"!~"no"]'
        f'["service"!~"emergency_access|parking|parking_aisle|private"]'
    )

    # walking: filter out cycle ways, motor ways, private ways, and anything
    # tagged foot=no. allow service roads, permitting things like parking lot
    # aisles, alleys, etc that you *can* walk on even if they're not exactly
    # pleasant walks. some cycleways may allow pedestrians, but this filter
    # ignores such cycleways.
9:38
The Overpass API is returning 406. This seems to be a configuration issue. Let me look at how osmnx actually makes HTTP requests internally.
9:38
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
# Let me look at _overpass.py to see how osmnx downloads data
# and then try to replicate
import osmnx._overpass as m
import inspect

src = inspect.getsource(m)
# Find the download function
idx = src.find('def _download_overpass')
print(src[idx:idx+1500])
PYEOF
def _download_overpass_network(
    polygon: Polygon | MultiPolygon,
    network_type: str,
    custom_filter: str | list[str] | None,
) -> Iterator[dict[str, Any]]:
    """
    Retrieve networked ways and nodes within boundary from the Overpass API.

    Parameters
    ----------
    polygon
        The boundary to fetch the network ways/nodes within.
    network_type
        What type of street network to get if `custom_filter` is None.
    custom_filter
        A custom "ways" filter to be used instead of `network_type` presets.

    Yields
    ------
    response_json
        JSON response from the Overpass server.
    """
    # create filter(s) to exclude certain kinds of ways based on the requested
    # network_type, if provided, otherwise use custom_filter
    way_filters = []
    if isinstance(custom_filter, list):
        way_filters = custom_filter
    elif isinstance(custom_filter, str):
        way_filters = [custom_filter]
    else:
        way_filters = [_get_network_filter(network_type)]

    # create overpass settings string
    overpass_settings = _make_overpass_settings()

    # subdivide query polygon to get list of sub-divided polygon coord strings
    polygon_coord_strs = _make_overpass_polygon_coord_strs(polygon)
    msg = f"Requesting data from API in {len(polygon_coord_strs)} request(s)"
    utils.log(msg, level=lg.INFO)

    # pass exterior coordinates of each polygon in list to API, one at a time
    # the '>' makes it recurse so we get ways and the
9:47
The Overpass API is giving 406 errors for direct requests. But osmnx's `features_from_place` works fine. Let me look at how osmnx makes the Overpass request to replicate that exactly. Let me look at _http.py or _overpass.py more closely.
9:47
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx._overpass as m
import inspect

src = inspect.getsource(m)
idx = src.find('def _make_overpass_settings')
print(src[idx:idx+800])
PYEOF
def _make_overpass_settings() -> str:
    """
    Make settings string to send in Overpass query.

    Returns
    -------
    overpass_settings
        The `settings.overpass_settings` string formatted with "timeout" and
        "maxsize" values.
    """
    maxsize = "" if settings.overpass_memory is None else f"[maxsize:{settings.overpass_memory}]"
    return settings.overpass_settings.format(timeout=settings.requests_timeout, maxsize=maxsize)


def _make_overpass_polygon_coord_strs(polygon: Polygon | MultiPolygon) -> list[str]:
    """
    Subdivide query polygon and return list of coordinate strings.

    Project to UTM, divide `polygon` up into sub-polygons if area exceeds a
    max size (in meters), project back to lat-lon, then get a list of
    polygon(s) exterior coordinates. Ign
9:54
The Overpass API keeps returning 406. Let me look at how osmnx makes requests from _http.py to understand the HTTP layer they use.
9:54
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx._overpass as m
import inspect

src = inspect.getsource(m)
idx = src.find('def _download_overpass_network')
# Find the actual HTTP call
rest = src[idx:]
print(rest[:2000])
PYEOF
def _download_overpass_network(
    polygon: Polygon | MultiPolygon,
    network_type: str,
    custom_filter: str | list[str] | None,
) -> Iterator[dict[str, Any]]:
    """
    Retrieve networked ways and nodes within boundary from the Overpass API.

    Parameters
    ----------
    polygon
        The boundary to fetch the network ways/nodes within.
    network_type
        What type of street network to get if `custom_filter` is None.
    custom_filter
        A custom "ways" filter to be used instead of `network_type` presets.

    Yields
    ------
    response_json
        JSON response from the Overpass server.
    """
    # create filter(s) to exclude certain kinds of ways based on the requested
    # network_type, if provided, otherwise use custom_filter
    way_filters = []
    if isinstance(custom_filter, list):
        way_filters = custom_filter
    elif isinstance(custom_filter, str):
        way_filters = [custom_filter]
    else:
        way_filters = [_get_network_filter(network_type)]

    # create overpass settings string
    overpass_settings = _make_overpass_settings()

    # subdivide query polygon to get list of sub-divided polygon coord strings
    polygon_coord_strs = _make_overpass_polygon_coord_strs(polygon)
    msg = f"Requesting data from API in {len(polygon_coord_strs)} request(s)"
    utils.log(msg, level=lg.INFO)

    # pass exterior coordinates of each polygon in list to API, one at a time
    # the '>' makes it recurse so we get ways and the ways' nodes.
    for polygon_coord_str in polygon_coord_strs:
        for way_filter in way_filters:
            query_str = f"{overpass_settings};(way{way_filter}(poly:{polygon_coord_str!r});>;);out;"
            yield _overpass_request(OrderedDict(data=query_str))


def _download_overpass_features(
    polygon: Polygon,
    tags: dict[str, bool | str | list[str]],
) -> Iterator[dict[str, Any]]:
    """
    Retrieve OSM features within some boundary polygon from the Overpass API.

    Paramete
10:02
Let me look at how _overpass_request works to see what the actual HTTP call is.
10:02
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx._overpass as m
import inspect

src = inspect.getsource(m)
idx = src.find('def _overpass_request')
print(src[idx:idx+2500])
PYEOF
def _overpass_request(data: OrderedDict[str, Any]) -> dict[str, Any]:
    """
    Send a HTTP POST request to the Overpass API and return response.

    Parameters
    ----------
    data
        Key-value pairs of parameters.

    Returns
    -------
    response_json
        The Overpass API's response.
    """
    # resolve url to same IP even if there is server round-robin redirecting
    _http._config_dns(settings.overpass_url)

    # prepare the Overpass API URL and see if request already exists in cache
    url = settings.overpass_url.rstrip("/") + "/interpreter"
    prepared_url = str(requests.Request("GET", url, params=data).prepare().url)
    cached_response_json = _http._retrieve_from_cache(prepared_url)
    if isinstance(cached_response_json, dict):
        return cached_response_json

    # pause then request this URL
    pause = _get_overpass_pause(settings.overpass_url)
    hostname = _http._hostname_from_url(url)
    msg = f"Pausing {pause} second(s) before making HTTP POST request to {hostname!r}"
    utils.log(msg, level=lg.INFO)
    time.sleep(pause)

    # transmit the HTTP POST request
    msg = f"Post {prepared_url} with timeout={settings.requests_timeout}"
    utils.log(msg, level=lg.INFO)
    response = requests.post(
        url,
        data=data,
        timeout=settings.requests_timeout,
        headers=_http._get_http_headers(),
        **settings.requests_kwargs,
    )

    # handle 429 and 504 errors by pausing then recursively re-trying request
    if response.status_code in {429, 504}:  # pragma: no cover
        error_pause = 55
        msg = (
            f"{hostname!r} responded {response.status_code} {response.reason}: "
            f"we'll retry in {error_pause} secs"
        )
        utils.log(msg, level=lg.WARNING)
        time.sleep(error_pause)
        return _overpass_request(data)

    response_json = _http._parse_response(response)
    if not isinstance(response_json, dict):  # pragma: no cover
        msg = "Overpass API did not return a dict of results."
        raise InsufficientResponseError(msg)
    _http._save_to_cache(prepared_url, response_json, response.ok)
    return response_json
10:09
I see - osmnx uses `requests.post` with `data=OrderedDict(data=query_str)` - meaning it passes the query string as a POST parameter. And it uses `_http._get_http_headers()`. Let me check what that returns.
10:09
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx._http as h
import inspect

src = inspect.getsource(h)
print(src)
PYEOF
"""Handle HTTP requests to web APIs."""

from __future__ import annotations

import json
import logging as lg
import socket
from hashlib import sha1
from pathlib import Path
from typing import Any
from urllib.parse import urlparse

import requests
from requests.exceptions import JSONDecodeError

from . import settings
from . import utils
from ._errors import InsufficientResponseError
from ._errors import ResponseStatusCodeError

# capture getaddrinfo function to use original later after mutating it
_original_getaddrinfo = socket.getaddrinfo


def _save_to_cache(
    url: str,
    response_json: dict[str, Any] | list[dict[str, Any]],
    ok: bool,  # noqa: FBT001
) -> None:
    """
    Save a HTTP response JSON object to a file in the cache folder.

    If request was sent to server via POST instead of GET, then `url` should
    be a GET-style representation of the request. Response is only saved to a
    cache file if `settings.use_cache` is True, `ok` is True, `response_json`
    is not None, and `response_json` does not contain a server "remark."

    Users should always pass OrderedDicts instead of dicts of parameters into
    request functions, so the parameters remain in the same order each time,
    producing the same URL string, and thus the same hash. Otherwise you will
    get a cache miss when the URL's parameters appeared in a different order.

    Parameters
    ----------
    url
        The URL of the request.
    response_json
        The JSON HTTP response.
    ok
        A `requests.response.ok` value.
    """
    if settings.use_cache:
        if not ok:  # pragma: no cover
            msg = "Did not save to cache because HTTP status code is not OK"
            utils.log(msg, level=lg.WARNING)
        elif isinstance(response_json, dict) and ("remark" in response_json):  # pragma: no cover
            msg = f"Did not save to cache because response contains remark: {response_json['remark']!r}"
            utils.log(msg, lg.WARNING)
        else:
            # create cache folder on disk if it doesn't already exist
            cache_filepath = _resolve_cache_filepath(url)
            cache_filepath.parent.mkdir(parents=True, exist_ok=True)
            cache_filepath.write_text(json.dumps(response_json), encoding="utf-8")
            msg = f"Saved response to cache file {str(cache_filepath)!r}"
            utils.log(msg, level=lg.INFO)


def _resolve_cache_filepath(key: str, extension: str = "json") -> Path:
    """
    Determine a cache key's corresponding cache file path.

    This uses the configured `settings.cache_folder` and calculates the 160
    bit SHA-1 hash digest (40 hexadecimal characters) of `key` to determine a
    succinct but unique cache filename.

    Parameters
    ----------
    key
        The key for which to generate a cache file path, for example, a URL.
    extension
        The desired cache file's extension.

    Returns
    -------
    cache_filepath
        Cache file path corresponding to `key`.
    """
    digest = sha1(key.encode("utf-8")).hexdigest()  # noqa: S324
    return Path(settings.cache_folder) / f"{digest}.{extension}"


def _check_cache(key: str) -> Path | None:
    """
    Check if a key exists in the cache, and return its cache file path if so.

    Parameters
    ----------
    key
        The key to look for in the cache.

    Returns
    -------
    cache_filepath
        Filepath to cached data for `key` if it exists, otherwise None.
    """
    cache_filepath = _resolve_cache_filepath(key)
    return cache_filepath if cache_filepath.is_file() else None


def _retrieve_from_cache(url: str) -> dict[str, Any] | list[dict[str, Any]] | None:
    """
    Retrieve a HTTP response JSON object from the cache if it exists.

    A cache hit returns the data. A cache miss returns None.

    Parameters
    ----------
    url
        The URL of the request.

    Returns
    -------
    response_json
        The cached response for `url` if it exists, otherwise None.
    """
    # if the tool is configured to use the cache
    if settings.use_cache:
        # return cached response for this url if exists, otherwise return None
        cache_filepath = _check_cache(url)
        if cache_filepath is not None:
            response_json: dict[str, Any] | list[dict[str, Any]]
            response_json = json.loads(cache_filepath.read_text(encoding="utf-8"))
            msg = f"Retrieved response from cache file {str(cache_filepath)!r}"
            utils.log(msg, lg.INFO)
            return response_json

    return None


def _get_http_headers(
    *,
    user_agent: str | None = None,
    referer: str | None = None,
    accept_language: str | None = None,
) -> dict[str, str]:
    """
    Update the default requests HTTP headers with OSMnx information.

    Parameters
    ----------
    user_agent
        The user agent. If None, use `settings.http_user_agent` value.
    referer
        The referer. If None, use `settings.http_referer` value.
    accept_language
        The accept language. If None, use `settings.http_accept_language`
        value.

    Returns
    -------
    headers
        The updated HTTP headers.
    """
    if user_agent is None:
        user_agent = settings.http_user_agent
    if referer is None:
        referer = settings.http_referer
    if accept_language is None:
        accept_language = settings.http_accept_language

    info = {"User-Agent": user_agent, "referer": referer, "Accept-Language": accept_language}
    headers = dict(requests.utils.default_headers())
    headers.update(info)
    return headers


def _resolve_host_via_doh(hostname: str) -> str:
    """
    Resolve hostname to IP address via Google's public DNS-over-HTTPS API.

    Necessary fallback as socket.gethostbyname will not always work when using
    a proxy. See https://developers.google.com/speed/public-dns/docs/doh/json
    If the user has set `settings.doh_url_template=None` or if resolution
    fails (e.g., due to local network blocking DNS-over-HTTPS) the hostname
    itself will be returned instead. Note that this means that server slot
    management may be violated: see `_config_dns` documentation for details.

    Parameters
    ----------
    hostname
        The hostname to consistently resolve the IP address of.

    Returns
    -------
    ip_address
        Resolved IP address of host, or hostname itself if resolution failed.
    """
    if settings.doh_url_template is None:
        # if user has set the url template to None, return hostname itself
        msg = "User set `doh_url_template=None`, requesting host by name"
        utils.log(msg, level=lg.WARNING)
        return hostname

    err_msg = f"Failed to resolve {hostname!r} IP via DoH, requesting host by name"
    try:
        url = settings.doh_url_template.format(hostname=hostname)
        response = requests.get(url, timeout=settings.requests_timeout)
        data = response.json()

    # if we cannot reach DoH server or resolve host, return hostname itself
    except requests.exceptions.RequestException:  # pragma: no cover
        utils.log(err_msg, level=lg.ERROR)
        return hostname

    # if there were no request exceptions, return
    else:
        if response.ok and data["Status"] == 0:
            # status 0 means NOERROR, so return the IP address
            ip_address: str = data["Answer"][0]["data"]
            return ip_address

        # otherwise, if we cannot reach DoH server or cannot resolve host
        # just return the hostname itself
        utils.log(err_msg, level=lg.ERROR)
        return hostname


def _config_dns(url: str) -> None:
    """
    Force socket.getaddrinfo to use IP address instead of hostname.

    Resolves URL's hostname to an IP address so that we use the same server
    for both 1) checking the necessary pause duration and 2) sending the query
    itself even if there is round-robin redirecting among multiple server
    machines on the server-side. Mutates the getaddrinfo function so it uses
    the same IP address everytime it finds the hostname in the URL.

    For example, the server overpass-api.de just redirects to one of the other
    servers (currently gall.openstreetmap.de and lambert.openstreetmap.de). So
    if we check the status endpoint of overpass-api.de, we may see results for
    server gall, but when we submit the query itself it gets redirected to
    server lambert. This could result in violating server lambert's slot
    management timing.

    Parameters
    ----------
    url
        The URL to consistently resolve the IP address of.
    """
    hostname = _hostname_from_url(url)
    try:
        ip = socket.gethostbyname(hostname)
    except socket.gaierror:  # pragma: no cover
        # may occur when using a proxy, so instead resolve IP address via DoH
        msg = f"Encountered gaierror while trying to resolve {hostname!r}, trying again via DoH..."
        utils.log(msg, level=lg.ERROR)
        ip = _resolve_host_via_doh(hostname)

    # mutate socket.getaddrinfo to map hostname -> IP address
    def _getaddrinfo(*args: Any, **kwargs: Any) -> Any:  # noqa: ANN401
        if hostname == next(iter(args), kwargs.get("host")):
            # remove "host" from kwargs to avoid TypeError with positional argument
            kwargs.pop("host", None)
            msg = f"Resolved {hostname!r} to {ip!r}"
            utils.log(msg, level=lg.INFO)
            return _original_getaddrinfo(ip, *args[1:], **kwargs)

        # otherwise
        return _original_getaddrinfo(*args, **kwargs)

    socket.getaddrinfo = _getaddrinfo


def _hostname_from_url(url: str) -> str:
    """
    Extract the hostname (domain) from a URL.

    Parameters
    ----------
    url
        The url from which to extract the hostname.

    Returns
    -------
    hostname
        The extracted hostname (domain).
    """
    return urlparse(url).netloc.split(":")[0]


def _parse_response(response: requests.Response) -> dict[str, Any] | list[dict[str, Any]]:
    """
    Parse JSON from a requests response and log the details.

    Parameters
    ----------
    response
        The response object.

    Returns
    -------
    response_json
        Value will be a dict if the response is from the Google or Overpass
        APIs, and a list if the response is from the Nominatim API.
    """
    # log the response size and hostname
    hostname = _hostname_from_url(response.url)
    size_kb = len(response.content) / 1000
    msg = f"Downloaded {size_kb:,.1f}kB from {hostname!r} with status {response.status_code}"
    utils.log(msg, level=lg.INFO)

    # parse the response to JSON and log/raise exceptions
    try:
        response_json: dict[str, Any] | list[dict[str, Any]] = response.json()
    except JSONDecodeError as e:  # pragma: no cover
        msg = f"{hostname!r} responded: {response.status_code} {response.reason} {response.text}"
        utils.log(msg, level=lg.ERROR)
        if response.ok:
            raise InsufficientResponseError(msg) from e
        raise ResponseStatusCodeError(msg) from e

    # log any remarks if they exist
    if isinstance(response_json, dict) and "remark" in response_json:  # pragma: no cover
        msg = f"{hostname!r} remarked: {response_json['remark']!r}"
        utils.log(msg, level=lg.WARNING)

    # log if the response status_code is not OK
    if not response.ok:
        msg = f"{hostname!r} returned HTTP status code {response.status_code}"
        utils.log(msg, level=lg.WARNING)

    return response_json
10:22
So osmnx uses `requests.post` with `data=data` where data is an OrderedDict with a `data` key. Okay, I see the issue now - the 406 is from the Overpass server not our code. Let me try with the exact approach that osmnx uses - POST with data as an OrderedDict.

Wait, but osmnx successfully downloads data! It worked earlier when I called `ox.features_from_place`. So the server does work. Let me try replicating the exact approach - using POST with data as an OrderedDict.

Actually, when I look at the osmnx code more carefully:

```python
response = requests.post(
    url,
    data=data,
    timeout=settings.requests_timeout,
    headers=_http._get_http_headers(),
    **settings.requests_kwargs,
)
```

And `_get_http_headers()` returns:
```
{"User-Agent": ..., "referer": ..., "Accept-Language": ...}
```

No explicit `Accept` header. Let me try without setting Accept.
10:22
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings

# Try the same way osmnx does it 
osmnx.settings.requests_timeout = 180
url = osmnx.settings.overpass_url.rstrip("/") + "/interpreter"

query = '[out:json][timeout:180];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out geom;'
data = OrderedDict(data=query)

headers = {"User-Agent": osmnx.settings.http_user_agent, 
           "referer": osmnx.settings.http_referer,
           "Accept-Language": osmnx.settings.http_accept_language}

print("headers:", headers)
response = requests.post(url, data=data, timeout=180, headers=headers)
print("Status:", response.status_code)
print("Body:", response.text[:500])
if response.status_code == 200:
    import json
    d = response.json()
    print("Elements:", len(d.get('elements',[])))
PYEOF
headers: {'User-Agent': 'OSMnx Python package (https://github.com/gboeing/osmnx)', 'referer': 'OSMnx Python package (https://github.com/gboeing/osmnx)', 'Accept-Language': 'en'}
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
  <title>OSM3S Response</title>
</head>
<body>

<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
10:38
Interesting - the 406 issue was with the Accept header. When using osmnx's headers (which don't set Accept), we get a 504 (Gateway Timeout) which is better - it means the query is being processed. The 504 is different from 406. Let me try again, but maybe the query is too big. Let me try without the `out geom;` which returns full geometries, or use `out body;` and then get the geometry separately.

Actually, the issue might be that the query is too large with `out geom;`. Let me try getting just the relation members first.
10:38
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json

osmnx.settings.requests_timeout = 180
osmnx.settings.overpass_memory = 2147483648  # 2GB

url = osmnx.settings.overpass_url.rstrip("/") + "/interpreter"

# Simpler query - just get district relations with tags and ref
query = '[out:json][timeout:180][maxsize:2000000000];area["name"="Wien"]["admin_level"="4"]->.a;rel(area.a)["admin_level"="9"]["boundary"="administrative"];out body;'
data = OrderedDict(data=query)

headers = {"User-Agent": osmnx.settings.http_user_agent, 
           "referer": osmnx.settings.http_referer,
           "Accept-Language": osmnx.settings.http_accept_language}

response = requests.post(url, data=data, timeout=180, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    print("Elements:", len(d.get('elements',[])))
    for el in d.get('elements', []):
        if el.get('type') == 'relation':
            tags = el.get('tags', {})
            ref = tags.get('ref', '')
            name = tags.get('name', '')
            members = el.get('members', [])
            print(f"  rel {el['id']}: ref={ref}, name={name}, members={len(members)}")
else:
    print("Body:", response.text[:500])
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
  <title>OSM3S Response</title>
</head>
<body>

<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000
10:55
The Overpass API is returning 504 (gateway timeout). Let me try a more focused query, and also try with osmnx's own features_from_place which already works as we saw earlier. Actually, we already got the current districts earlier via osmnx.

Let me try the 2014 data. OsMnx features_from_place works with very specific tags. The issue with our direct Overpass queries is that they're timing out. But osmnx was able to download successfully (we got 901 features).

Let me try to get the 2014 data by using the Overpass date parameter through osmnx. Actually, osmnx doesn't support a date parameter. Let me try a more targeted direct query.

Looking at the 504... maybe the polygon is too large. Let me try getting the districts directly by their known relation IDs.
10:55
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json

osmnx.settings.requests_timeout = 180
osmnx.settings.overpass_memory = 2147483648

url = osmnx.settings.overpass_url.rstrip("/") + "/interpreter"

# Get only the 23 Vienna district relations by known IDs
# We know the IDs: 1990590 through 1990600, 1991416, 1991433 through 1991443
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,
           1990597,1990598,1990599,1990600,1991416,1991433,1991434,
           1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
rel_list = ','.join(str(i) for i in rel_ids)

# Current version - get geom
query = f'[out:json][timeout:180][maxsize:2000000000];rel(id:{rel_list});out geom;'
data = OrderedDict(data=query)
headers = {"User-Agent": osmnx.settings.http_user_agent, 
           "referer": osmnx.settings.http_referer,
           "Accept-Language": osmnx.settings.http_accept_language}

response = requests.post(url, data=data, timeout=180, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    print("Elements:", len(d.get('elements',[])))
    with open('/work/current_districts_overpass.json', 'w') as f:
        json.dump(d, f)
    print("Saved to file")
    for el in d.get('elements', [])[:3]:
        print('  id:', el['id'], 'tags:', el.get('tags',{}), 'geom keys:', [k for k in el if k in ('geometry','members','bounds')])
else:
    print("Body:", response.text[:1000])
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
  <title>OSM3S Response</title>
</head>
<body>

<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: runtime error: open64: 0 Success /osm3s_osm_base Dispatcher_Client::request_read_and_idx::timeout. The server is probably too busy to handle your request. </p>

</body>
</html>
11:09
The Overpass API endpoints are timing out or returning 504. Let me try the kumi.systems endpoint more patiently, or try with a simpler query first. Actually, let me also try with OSMnx features_from_place as it already worked earlier (returned 901 results). Let me extract the district geoms from what osmnx already downloaded.

Actually wait - osmnx features_from_place DID work earlier. It returned 901 elements but I filtered to only the district relations. The issue is that `out geom` for relations might be too heavy. But osmnx already successfully downloaded the data. Let me use osmnx to get the geometries.

Let me check if the current data was saved as parquet.
11:09
Bash ls -la /work/current_districts.parquet 2>/dev/null && python3 -c "
ls -la /work/current_districts.parquet 2>/dev/null && python3 -c "
import geopandas as gpd
gdf = gpd.read_parquet('/work/current_districts.parquet')
print('Loaded:', gdf.shape, gdf.crs)
print(gdf[['name','ref']])
print(gdf.geometry.type.unique())
"
-rw-r--r-- 1 agent agent 370232 Jun 16 16:52 /work/current_districts.parquet
Loaded: (24, 411) {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "GeographicCRS", "name": "WGS 84", "datum_ensemble": {"name": "World Geodetic System 1984 ensemble", "members": [{"name": "World Geodetic System 1984 (Transit)"}, {"name": "World Geodetic System 1984 (G730)"}, {"name": "World Geodetic System 1984 (G873)"}, {"name": "World Geodetic System 1984 (G1150)"}, {"name": "World Geodetic System 1984 (G1674)"}, {"name": "World Geodetic System 1984 (G1762)"}, {"name": "World Geodetic System 1984 (G2139)"}, {"name": "World Geodetic System 1984 (G2296)"}], "ellipsoid": {"name": "WGS 84", "semi_major_axis": 6378137, "inverse_flattening": 298.257223563}, "accuracy": "2.0", "id": {"authority": "EPSG", "code": 6326}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "scope": "Horizontal component of 3D system.", "area": "World.", "bbox": {"south_latitude": -90, "west_longitude": -180, "north_latitude": 90, "east_longitude": 180}, "id": {"authority": "EPSG", "code": 4326}}
                                  name  ref
element  id                                
relation 1990590            Alsergrund    9
         1990591              Hietzing   13
         1990592          Innere Stadt    1
         1990593            Josefstadt    8
         1990594          Leopoldstadt    2
         1990595             Mariahilf    6
         1990596              Meidling   12
         1990597                Neubau    7
         1990598               Penzing   14
         1990599  Rudolfsheim-Fünfhaus   15
         1990600               Währing   18
         1991416            Landstraße    3
         1991433           Brigittenau   20
         1991434            Donaustadt   22
         1991435               Döbling   19
         1991436             Favoriten   10
         1991437           Floridsdorf   21
         1991438               Hernals   17
         1991439               Liesing   23
         1991440            Margareten    5
         1991441             Ottakring   16
         1991442             Simmering   11
         1991443                Wieden    4
         5248033                Wieden  NaN
<ArrowStringArray>
['Polygon']
Length: 1, dtype: str
11:20
I have the current Vienna district data from osmnx. There's a duplicate Wieden with no ref, so I should drop that one. Now I need the 2014-01-01 historical version. Since Overpass API keeps timing out with 504 errors, let me try a different approach.

I can use the Overpass API with `[date:"2014-01-01T00:00:00Z"]` to get historical data, but the server is overloaded. Let me try with smaller queries or a different approach.

Actually, let me try a much simpler query first to check if the date parameter works.
11:20
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json
import time

osmnx.settings.requests_timeout = 300
osmnx.settings.overpass_memory = 1073741824  # 1GB

url = osmnx.settings.overpass_url.rstrip("/") + "/interpreter"

# Try a very small query first to see if date parameter works
query = '[out:json][timeout:300];rel(1990590);out geom;'
data = OrderedDict(data=query)
headers = {"User-Agent": osmnx.settings.http_user_agent, 
           "referer": osmnx.settings.http_referer,
           "Accept-Language": osmnx.settings.http_accept_language}

for attempt in range(3):
    print(f"Attempt {attempt+1}...")
    response = requests.post(url, data=data, timeout=300, headers=headers)
    print("Status:", response.status_code)
    if response.status_code == 200:
        d = response.json()
        print("Elements:", len(d.get('elements',[])))
        el = d['elements'][0]
        print('keys:', list(el.keys()))
        with open('/work/test_current.json', 'w') as f:
            json.dump(d, f)
        break
    else:
        print("Body:", response.text[:300])
        if response.status_code == 504:
            time.sleep(15)
PYEOF
Attempt 1...
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lan
Attempt 2...
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lan
Attempt 3...
Status: 200
Elements: 1
keys: ['type', 'id', 'bounds', 'members', 'tags']
12:16
So `out geom` on a single relation works but returns only bounds, members, tags - not a geometry. The relation's geometry comes from its member ways. I need to use `out geom` on the ways, or better, use the `convert` statement to produce geometry.

Actually, looking at the output, `out geom` on a relation gives `bounds`, `members`, `tags` without a `geometry` key. For individual ways, `out geom` would produce a linestring geometry. For a relation like a boundary, the geometry is built from the ways.

Let me think about this differently. I can get the current data using `osmnx.features_from_place` which worked. For the 2014 data, I need to use the Overpass `[date:"2014-01-01T00:00:00Z"]` parameter. Let me try with the relation members (ways) instead.

Actually, a better approach: I can use osmium or download from Geofabrik historical extracts. Let me try the Geofabrik approach using their history files.

Actually wait - let me try the Overpass date query more carefully. The issue was the server timeout. Let me try with the kumi server.
12:16
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json
import time

osmnx.settings.requests_timeout = 300

# Try kumi server with a simple query
url = "https://overpass.kumi.systems/api/interpreter"

query = '[out:json][timeout:300];rel(1990590);out geom;'
headers = {"User-Agent": osmnx.settings.http_user_agent}
print("Trying kumi...")
response = requests.post(url, data={'data': query}, timeout=300, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    print("Elements:", len(d.get('elements',[])))
    print('keys:', list(d['elements'][0].keys()))
else:
    print("Body:", response.text[:300])
PYEOF
Trying kumi...
Status: 200
Elements: 1
keys: ['type', 'id', 'bounds', 'members', 'tags']
13:11
The kumi server works! Let me use it to get the current and 2014 district data. The `out geom` on a relation returns the geometry as an aggregated polygon. But actually with `out geom` on a relation, Overpass returns the geometry directly - let me check if kumi supports it.
13:11
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json

url = "https://overpass.kumi.systems/api/interpreter"
headers = {"User-Agent": osmnx.settings.http_user_agent}

# Test out geom with kumi
query = '[out:json][timeout:300];rel(1990590);out geom;'
response = requests.post(url, data={'data': query}, timeout=300, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    el = d['elements'][0]
    print('keys:', list(el.keys()))
    if 'geometry' in el:
        print('geometry type:', el['geometry']['type'])
        print('geometry coords count:', len(el['geometry'].get('coordinates',[])))
    print('tags:', el.get('tags',{}))
PYEOF
Status: 504
14:46
The kumi server is also overloaded. Let me try a different approach - use `out body` to get member ways, then request those ways with their geometry. Or better yet, let me use osmnx which does work.

Actually, the osmnx features_from_place worked perfectly fine. It returned 24 relation features that are the Vienna districts. Let me use osmnx to get the current districts with geometries.

The problem is I need to get the 2014-01-01 versions. Let me try osmnx but with a date parameter in the overpass settings, or try to get the history data some other way.

Actually, Overpass supports the `[date:"2014-01-01T00:00:00Z"]` parameter. But I need to be able to send the query and get the result. Maybe I should try with the overpass settings in osmnx.

Actually, let me look at how osmnx constructs queries for features.
14:46
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import osmnx._overpass as m
import inspect

src = inspect.getsource(m)
idx = src.find('def _download_overpass_features')
if idx >= 0:
    print(src[idx:idx+3000])
else:
    # Look for features-related download
    for func_name in ['_download_overpass_features', 'features_from_bbox', 'features_from_polygon']:
        idx = src.find(f'def {func_name}')
        if idx >= 0:
            print(src[idx:idx+3000])
            print("\n\n---\n\n")
PYEOF
def _download_overpass_features(
    polygon: Polygon,
    tags: dict[str, bool | str | list[str]],
) -> Iterator[dict[str, Any]]:
    """
    Retrieve OSM features within some boundary polygon from the Overpass API.

    Parameters
    ----------
    polygon
        Boundary to retrieve elements within.
    tags
        Tags used for finding elements in the selected area.

    Yields
    ------
    response_json
        JSON response from the Overpass server.
    """
    # subdivide query polygon to get list of sub-divided polygon coord strings
    polygon_coord_strs = _make_overpass_polygon_coord_strs(polygon)
    msg = f"Requesting data from API in {len(polygon_coord_strs)} request(s)"
    utils.log(msg, level=lg.INFO)

    # pass exterior coordinates of each polygon in list to API, one at a time
    for polygon_coord_str in polygon_coord_strs:
        query_str = _create_overpass_features_query(polygon_coord_str, tags)
        yield _overpass_request(OrderedDict(data=query_str))


def _overpass_request(data: OrderedDict[str, Any]) -> dict[str, Any]:
    """
    Send a HTTP POST request to the Overpass API and return response.

    Parameters
    ----------
    data
        Key-value pairs of parameters.

    Returns
    -------
    response_json
        The Overpass API's response.
    """
    # resolve url to same IP even if there is server round-robin redirecting
    _http._config_dns(settings.overpass_url)

    # prepare the Overpass API URL and see if request already exists in cache
    url = settings.overpass_url.rstrip("/") + "/interpreter"
    prepared_url = str(requests.Request("GET", url, params=data).prepare().url)
    cached_response_json = _http._retrieve_from_cache(prepared_url)
    if isinstance(cached_response_json, dict):
        return cached_response_json

    # pause then request this URL
    pause = _get_overpass_pause(settings.overpass_url)
    hostname = _http._hostname_from_url(url)
    msg = f"Pausing {pause} second(s) before making HTTP POST request to {hostname!r}"
    utils.log(msg, level=lg.INFO)
    time.sleep(pause)

    # transmit the HTTP POST request
    msg = f"Post {prepared_url} with timeout={settings.requests_timeout}"
    utils.log(msg, level=lg.INFO)
    response = requests.post(
        url,
        data=data,
        timeout=settings.requests_timeout,
        headers=_http._get_http_headers(),
        **settings.requests_kwargs,
    )

    # handle 429 and 504 errors by pausing then recursively re-trying request
    if response.status_code in {429, 504}:  # pragma: no cover
        error_pause = 55
        msg = (
            f"{hostname!r} responded {response.status_code} {response.reason}: "
            f"we'll retry in {error_pause} secs"
        )
        utils.log(msg, level=lg.WARNING)
        time.sleep(error_pause)
        return _overpass_request(data)

    response_json = _http._parse_response(response)
    if not isinstance(response_json, dict):  # pragma: no cover
        msg = "Overpass AP
14:59
The Overpass API at the default endpoint is giving 504 errors (server too busy). But I managed to get the current districts using osmnx.features_from_place() already. Let me use that approach for the current data and then try to get the historical (2014) data.

For the 2014 data, I can try using Overpass with the `[date:"2014-01-01T00:00:00Z"]` parameter. But the API seems overloaded. Let me try a different approach - maybe I can use the Overpass API with a smaller query using the kumi endpoint.

Actually, let me think about this more strategically. I already have the current districts saved as a parquet. For the 2014 data, I could:

1. Try Overpass with the date parameter through osmnx or direct calls
2. Use a different approach entirely

Let me first check what the current districts look like and then try the historical query more aggressively.
14:59
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import osmnx.settings
import json
import time

# Use kumi for the queries
url = "https://overpass.kumi.systems/api/interpreter"
headers = {"User-Agent": osmnx.settings.http_user_agent}

# Get all 23 Vienna district relations with out geom
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,
           1990597,1990598,1990599,1990600,1991416,1991433,1991434,
           1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
rel_list = ','.join(str(i) for i in rel_ids)

# First get body only (no geom)
query = f'[out:json][timeout:300];rel(id:{rel_list});out body;'
for attempt in range(3):
    print(f"Attempt {attempt+1} for current data...")
    try:
        response = requests.post(url, data={'data': query}, timeout=300, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            print("Elements:", len(d.get('elements',[])))
            with open('/work/current_body.json', 'w') as f:
                json.dump(d, f)
            print("Saved")
            break
        else:
            print("Body:", response.text[:300])
    except Exception as e:
        print("Error:", e)
    time.sleep(10)
PYEOF
<exit_code>-1</exit_code>
<error>Command timed out after 360000ms</error>
21:09
The server is having trouble with queries. Let me try a different approach - use osmnx features_from_place which we already know works, and then for historical data I'll use the Overpass API with the date parameter, but one district at a time.

Actually, I already have the current districts data in the parquet file. Let me use what I have and try a different approach for the historical data.

Let me try fetching the 2014 data from Overpass with the date parameter, but use smaller queries.
21:09
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
from collections import OrderedDict
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Try to get a single district with date
# First test: does the date parameter work?
query = '[out:json][timeout:120][date:"2014-01-01T00:00:00Z"];rel(1990590);out body;'

for attempt in range(3):
    print(f"Attempt {attempt+1}...")
    try:
        response = requests.post("https://overpass-api.de/api/interpreter", 
                                data={'data': query}, timeout=120, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            print("Elements:", len(d.get('elements',[])))
            if d['elements']:
                el = d['elements'][0]
                print("tags:", el.get('tags',{}))
                print("members count:", len(el.get('members',[])))
            break
        else:
            print("Body:", response.text[:300])
    except Exception as e:
        print("Error:", e)
    time.sleep(5)
PYEOF
Attempt 1...
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lan
Attempt 2...
Status: 200
Elements: 1
tags: {'type': 'boundary', 'ref:at:gkz': '90901', 'wikipedia:street_names:de': 'Liste_der_Straßennamen_von_Wien/Alsergrund', 'ref': '9', 'wikipedia:de': 'Alsergrund', 'wikipedia:en': 'Alsergrund', 'wikipedia:ru': 'Альзергрунд', 'name:ru': 'Альзергрунд', 'name': 'Alsergrund', 'boundary': 'administrative', 'admin_level': '9', 'postal_code': '1090'}
members count: 10
21:37
The date parameter works with Overpass. It retrieved the 2014 snapshot of Alsergrund. Now let me check if the geometry data is also returned with out geom for the historical data.
21:37
21:37
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Now get current first (without date)
query = '[out:json][timeout:120];rel(1990590);out geom;'

response = requests.post("https://overpass-api.de/api/interpreter", 
                        data={'data': query}, timeout=120, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    el = d['elements'][0]
    print("keys:", list(el.keys()))
    if 'geometry' in el:
        print("Geometry type:", el['geometry']['type'])
        # For multipolygon
        print("Full keys:", list(el.keys()))
    if 'members' in el:
        print("Members sample:", el['members'][:3])
PYEOF
Status: 504
21:53
The Overpass API keeps timing out on me. Let me take a different approach. Since I already have the current districts from osmnx (as a GeoParquet file with geometries), I can work with that. For the 2014 data, I need to get the historical boundaries.

Actually, let me try a different approach for the 2014 data. Instead of using the date parameter on the final query, I could use the Overpass API's "out geom" with date. But it seems the server is too busy for larger queries.

Let me try a more efficient approach:
1. Get current districts - I already have them from osmnx
2. For 2014, I can try to get the old relations - they might have different IDs

Actually, let me try the Overture Maps or another data source. But Overture wouldn't have 2014 data.

Let me try another approach - use the Overpass API one district at a time, both for current and 2014.

Actually wait - I have the current data already saved from osmnx. Let me check if I can use a different method for the 2014 data.

For 2014 data, I could use the Overpass API with `[date:"2014-01-01T00:00:00Z"]` and query for the relations one at a time with `out geom`. But the relations might not have existed in 2014 with the same IDs.

Let me think about this differently. In 2014, Vienna's districts might have been mapped differently. The district boundaries in Vienna haven't changed much since 2014, but some might have. Let me try fetching them.

Actually, the better approach: use Overpass to get the district boundary ways at 2014-01-01. The districts are administrative boundaries, and their ways might have changed. Let me query for the ways that make up the district boundaries at both time periods.

Wait, let me think more carefully. The districts are defined as relations with `admin_level=9` and `boundary=administrative`. In 2014, these relations might have had different members (ways). So the geometry would be different.

Let me try one district at a time with `out geom` to get the full geometry.
21:53
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Get current with out geom - single district
query = '[out:json][timeout:120];rel(1990590);out geom;'

for attempt in range(3):
    try:
        response = requests.post("https://overpass-api.de/api/interpreter", 
                                data={'data': query}, timeout=120, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            el = d['elements'][0]
            print("keys:", list(el.keys()))
            if 'geometry' in el:
                print("Geometry type:", el['geometry']['type'])
            elif 'members' in el:
                # The members have geometry?
                print("Members first:", json.dumps(el['members'][0], indent=2)[:500])
            break
        else:
            print("Body:", response.text[:200])
    except Exception as e:
        print("Error:", e)
    time.sleep(5)
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Status: 200
keys: ['type', 'id', 'bounds', 'members', 'tags']
Members first: {
  "type": "way",
  "ref": 147468285,
  "role": "outer",
  "geometry": [
    {
      "lat": 48.2151672,
      "lon": 16.3408703
    },
    {
      "lat": 48.2152918,
      "lon": 16.3409357
    },
    {
      "lat": 48.2157703,
      "lon": 16.3412026
    },
    {
      "lat": 48.2158543,
      "lon": 16.3412458
    },
    {
      "lat": 48.2162683,
      "lon": 16.3414721
    },
    {
      "lat": 48.2163349,
      "lon": 16.3415085
    },
    {
      "lat": 48.216373,
      "lon": 16.3415294
22:50
22:50
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Use convert to get the geometry directly
query = '[out:json][timeout:120];rel(1990590);convert rel ::id=id(), ::geom=geom(), name=t["name"], ref=t["ref"];out geom;'

for attempt in range(3):
    try:
        response = requests.post("https://overpass-api.de/api/interpreter", 
                                data={'data': query}, timeout=120, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            print("Elements:", len(d.get('elements',[])))
            if d['elements']:
                el = d['elements'][0]
                print("keys:", list(el.keys()))
                if 'geometry' in el:
                    print("Geometry type:", el['geometry']['type'])
                    print("Geometry:", json.dumps(el['geometry'], indent=2)[:500])
                print("tags:", el.get('tags',{}))
            break
        else:
            print("Body:", response.text[:200])
    except Exception as e:
        print("Error:", e)
    time.sleep(5)
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
23:54
The overpass-api.de server is very busy (504 errors). Let me try an alternative approach. Since I already have the current district data from osmnx (saved as parquet), I can work with that. For the 2014 snapshot, I need to query Overpass with the date parameter.

Let me try a different strategy: use osmnx itself to get the current data (which it already caches), and then for the historical data, query each district one at a time with a long timeout, building geometries from the member ways.

Actually, I notice osmnx.features_from_place already returned the current data with geometries. So I have current data.

For 2014, I need to:
1. Query the relations with `out body` (with date parameter) to get the member way IDs
2. Then get the way geometries from 2014
3. Build the polygons

Or simpler: use `out geom` on the relation with date, which gives members with geometry arrays. Then reconstruct polygons from those arrays.

But the server is really slow. Let me try the lz4 mirror more carefully.
23:54
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Try lz4 mirror with convert
query = '[out:json][timeout:120];rel(1990590);convert rel ::id=id(), ::geom=geom(), name=t["name"], ref=t["ref"];out geom;'

for attempt in range(3):
    try:
        response = requests.post("https://lz4.overpass-api.de/api/interpreter", 
                                data={'data': query}, timeout=120, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            el = d['elements'][0]
            print("keys:", list(el.keys()))
            if 'geometry' in el:
                print("Geometry type:", el['geometry']['type'])
            break
        else:
            print("Body:", response.text[:200])
    except Exception as e:
        print("Error:", e)
    time.sleep(5)
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
24:39
The Overpass API is really struggling. Let me take a different approach. I already have the current data from osmnx (saved as current_districts.parquet). Now I need to get the 2014 data. 

Let me try a different strategy - use the `out geom;` approach which works with members having geometry, and then reconstruct polygons from those.

Actually, I already know that `out geom` on relations returns members with geometry. Let me build the polygons from the member ways' geometries.

Let me try getting data one district at a time to avoid timeout issues.
24:39
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# Get current data for one district with out geom (members have geometry)
query = '[out:json][timeout:120];rel(1990590);out geom;'

response = requests.post("https://overpass-api.de/api/interpreter", 
                        data={'data': query}, timeout=120, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    el = d['elements'][0]
    print("Member count:", len(el.get('members',[])))
    print("First member type:", el['members'][0]['type'], "role:", el['members'][0]['role'])
    print("Has geometry in members?", 'geometry' in el['members'][0])
    print("Sample member:", json.dumps(el['members'][0], indent=2)[:300])
else:
    print("Body:", response.text[:200])
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:
26:08
The overpass servers are being very unreliable. Let me try a different approach - use the Overture Maps data or the osmnx features_from_place which already worked to get the current data. For the historical 2014 data, I might need to try OSM history exports or use a different method.

Actually, osmnx worked well earlier to get the current districts (features_from_place). Let me use that for current data and then try to get the 2014 data via Overpass using the date parameter, but with a smaller batch.

Alternatively, I could try using the Overpass API with `out body` (no geometry) to get the member lists for the 2014 data, and then reconstruct the polygons from the member way IDs by fetching the ways separately. But that's complex.

Let me try a different approach: Use osmnx for current data (already downloaded and saved), and then try to get 2014 data also via osmnx or direct Overpass with date parameter.

Actually, let me first check if the current districts file has proper geometry.
26:08
26:08
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import geopandas as gpd
import pandas as pd

# Read the current districts we got from osmnx
gdf = gpd.read_parquet('/work/current_districts.parquet')
print("Shape:", gdf.shape)
print("CRS:", gdf.crs)
print("Columns:", list(gdf.columns)[:10])
print("\nGeometry types:", gdf.geometry.type.unique())
print("\nSample districts:")
print(gdf[['name', 'ref']].to_string())

# The file has 24 rows (23 districts + 1 extra Wieden). Let's clean it up
# Keep only rows with a numeric ref
gdf = gdf[gdf['ref'].notna() & gdf['ref'].str.match(r'^\d+$')]
print("\nAfter filtering numeric ref:", gdf.shape)
print(gdf[['name', 'ref']].to_string())
PYEOF
Shape: (24, 411)
CRS: {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "GeographicCRS", "name": "WGS 84", "datum_ensemble": {"name": "World Geodetic System 1984 ensemble", "members": [{"name": "World Geodetic System 1984 (Transit)"}, {"name": "World Geodetic System 1984 (G730)"}, {"name": "World Geodetic System 1984 (G873)"}, {"name": "World Geodetic System 1984 (G1150)"}, {"name": "World Geodetic System 1984 (G1674)"}, {"name": "World Geodetic System 1984 (G1762)"}, {"name": "World Geodetic System 1984 (G2139)"}, {"name": "World Geodetic System 1984 (G2296)"}], "ellipsoid": {"name": "WGS 84", "semi_major_axis": 6378137, "inverse_flattening": 298.257223563}, "accuracy": "2.0", "id": {"authority": "EPSG", "code": 6326}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "scope": "Horizontal component of 3D system.", "area": "World.", "bbox": {"south_latitude": -90, "west_longitude": -180, "north_latitude": 90, "east_longitude": 180}, "id": {"authority": "EPSG", "code": 4326}}
Columns: ['geometry', 'ISO3166-1:alpha2', 'admin_level', 'capital', 'description', 'name', 'name:ab', 'name:ace', 'name:af', 'name:am']

Geometry types: <ArrowStringArray>
['Polygon']
Length: 1, dtype: str

Sample districts:
                                  name  ref
element  id                                
relation 1990590            Alsergrund    9
         1990591              Hietzing   13
         1990592          Innere Stadt    1
         1990593            Josefstadt    8
         1990594          Leopoldstadt    2
         1990595             Mariahilf    6
         1990596              Meidling   12
         1990597                Neubau    7
         1990598               Penzing   14
         1990599  Rudolfsheim-Fünfhaus   15
         1990600               Währing   18
         1991416            Landstraße    3
         1991433           Brigittenau   20
         1991434            Donaustadt   22
         1991435               Döbling   19
         1991436             Favoriten   10
         1991437           Floridsdorf   21
         1991438               Hernals   17
         1991439               Liesing   23
         1991440            Margareten    5
         1991441             Ottakring   16
         1991442             Simmering   11
         1991443                Wieden    4
         5248033                Wieden  NaN

After filtering numeric ref: (23, 411)
                                  name ref
element  id                               
relation 1990590            Alsergrund   9
         1990591              Hietzing  13
         1990592          Innere Stadt   1
         1990593            Josefstadt   8
         1990594          Leopoldstadt   2
         1990595             Mariahilf   6
         1990596              Meidling  12
         1990597                Neubau   7
         1990598               Penzing  14
         1990599  Rudolfsheim-Fünfhaus  15
         1990600               Währing  18
         1991416            Landstraße   3
         1991433           Brigittenau  20
         1991434            Donaustadt  22
         1991435               Döbling  19
         1991436             Favoriten  10
         1991437           Floridsdorf  21
         1991438               Hernals  17
         1991439               Liesing  23
         1991440            Margareten   5
         1991441             Ottakring  16
         1991442             Simmering  11
         1991443                Wieden   4
26:22
Good, I have the current districts. Now I need to get the 2014-01-01 versions. The challenge is that Overpass API is having timeout issues with `out geom` for relations.

Let me try a different approach: use the Overpass `recurse` and `out geom` approach step by step, or use the `convert` approach. Or I could try the OSM attic data approach directly.

The date parameter works (I got data for 2014-01-01 successfully for a single relation), but the geom extraction is failing. Let me try a different approach: use `out body` for 2014 which gives me members (way references), then use `out geom` on the ways. Actually this is getting complex.

Let me think about a simpler approach. Since we have the current data from osmnx (which got full polygon geometries), let me try to get 2014 data by querying each district's relation with the date parameter and getting the member ways' geometries.

Actually, the simpler approach is: I can get the full polygon geometry from the current data. For the 2014 version, I can use the Overpass date parameter to get the relation members (way refs) and then reconstruct the geometry by fetching those ways' geometries.

But this is complex. Let me try a different approach - maybe I can get the geometry via `make` or `convert` with the date parameter.

Let me try a different approach: Use OSM attic to get the relation for 2014.

Actually, let me try with a simpler query - maybe the 504 is because the server is busy. Let me try again with smaller requests and longer waits.
26:22
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

# For the 2014 data, I'll fetch each district's member ways with geometry
# First get all districts' member ways from current (to know what to fetch)
# Then get their 2014 geometries

# Actually let me try a simpler approach: use the Overpass recurse to get the geometry
# Convert rel to geom - maybe with smaller timeout on kumi

# Let me try ALL 23 districts at once with the date parameter
rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,
           1990597,1990598,1990599,1990600,1991416,1991433,1991434,
           1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
rel_list = ','.join(str(i) for i in rel_ids)

# Query: get 2014 relations with out geom (members with geometry)
query = f'[out:json][timeout:300][date:"2014-01-01T00:00:00Z"];rel(id:{rel_list});out geom;'

for attempt in range(3):
    print(f"Attempt {attempt+1}: fetching 2014 data...")
    try:
        # Use lz4 mirror
        response = requests.post("https://lz4.overpass-api.de/api/interpreter", 
                                data={'data': query}, timeout=300, headers=headers)
        print("Status:", response.status_code)
        if response.status_code == 200:
            d = response.json()
            print("Elements:", len(d.get('elements',[])))
            with open('/work/districts_2014.json', 'w') as f:
                json.dump(d, f)
            print("Saved!")
            break
        elif response.status_code == 504:
            print("Timeout, waiting...")
            time.sleep(30)
        else:
            print("Body:", response.text[:300])
            time.sleep(15)
    except Exception as e:
        print("Error:", e)
        time.sleep(15)
PYEOF
Attempt 1: fetching 2014 data...
Status: 504
Timeout, waiting...
Attempt 2: fetching 2014 data...
Status: 504
Timeout, waiting...
Attempt 3: fetching 2014 data...
Status: 200
Elements: 23
Saved!
27:58
Great, I got the 2014 data. Now let me also fetch the current data with `out geom` from the Overpass API.
27:58
27:58
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,
           1990597,1990598,1990599,1990600,1991416,1991433,1991434,
           1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]
rel_list = ','.join(str(i) for i in rel_ids)

query = f'[out:json][timeout:300];rel(id:{rel_list});out geom;'

response = requests.post("https://lz4.overpass-api.de/api/interpreter", 
                         data={'data': query}, timeout=300, headers=headers)
print("Status:", response.status_code)
if response.status_code == 200:
    d = response.json()
    print("Elements:", len(d.get('elements',[])))
    with open('/work/districts_current.json', 'w') as f:
        json.dump(d, f)
    print("Saved current data")
else:
    print("Body:", response.text[:300])
PYEOF
Status: 504
Body: <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lan
28:13
I have current data already from osmnx (parquet file) and 2014 data from overpass. Let me check what I have and build the solution. The current data from osmnx has Polygon geometries already. For the 2014 data I need to reconstruct geometries from the member ways with geometry.

Let me check the 2014 data structure.
28:13
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import json

with open('/work/districts_2014.json') as f:
    d = json.load(f)

print("Elements:", len(d['elements']))
for el in d['elements'][:3]:
    print('id:', el['id'])
    print('tags:', {k:v for k,v in el.get('tags',{}).items() if k in ('name','ref','admin_level')})
    print('members count:', len(el.get('members',[])))
    if el['members']:
        m = el['members'][0]
        print('first member type:', m['type'], 'role:', m['role'], 'has geometry:', 'geometry' in m)
        if 'geometry' in m:
            print('  geometry coords:', len(m['geometry']))
    print()
PYEOF
Elements: 23
id: 1990590
tags: {'ref': '9', 'name': 'Alsergrund', 'admin_level': '9'}
members count: 10
first member type: way role: outer has geometry: True
  geometry coords: 17

id: 1990591
tags: {'ref': '13', 'name': 'Hietzing', 'admin_level': '9'}
members count: 33
first member type: way role: outer has geometry: True
  geometry coords: 75

id: 1990592
tags: {'ref': '1', 'name': 'Innere Stadt', 'admin_level': '9'}
members count: 7
first member type: way role: outer has geometry: True
  geometry coords: 14
28:23
I have the 2014 data with member geometries. I need to build the polygons from the member ways (outer/inner roles). Let me also get the current data. Let me try fetching current data from the main API, maybe with a different approach.
28:23
28:23
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import requests
import json
import time

headers = {"User-Agent": "OSMnx Python package (https://github.com/gboeing/osmnx)"}

rel_ids = [1990590,1990591,1990592,1990593,1990594,1990595,1990596,
           1990597,1990598,1990599,1990600,1991416,1991433,1991434,
           1991435,1991436,1991437,1991438,1991439,1991440,1991441,1991442,1991443]

all_elements = []
for rel_id in rel_ids:
    query = f'[out:json][timeout:120];rel({rel_id});out geom;'
    for attempt in range(3):
        try:
            response = requests.post("https://lz4.overpass-api.de/api/interpreter", 
                                    data={'data': query}, timeout=120, headers=headers)
            if response.status_code == 200:
                d = response.json()
                all_elements.extend(d.get('elements', []))
                print(f"  {rel_id}: OK")
                break
            elif response.status_code == 504:
                print(f"  {rel_id}: timeout, retrying...")
                time.sleep(10)
            else:
                print(f"  {rel_id}: {response.status_code}")
                time.sleep(5)
        except Exception as e:
            print(f"  {rel_id}: {e}")
            time.sleep(10)

print(f"\nTotal elements: {len(all_elements)}")

# Save
with open('/work/districts_current.json', 'w') as f:
    json.dump({"elements": all_elements}, f)
print("Saved current data")
PYEOF
<exit_code>-1</exit_code>
<error>Command timed out after 600000ms</error>
38:59
The command timed out. The Overpass API is being very slow. Let me check what data we already have and use a different approach.

I already have:
1. Current districts from osmnx as a GeoDataFrame with full polygon geometry - `/work/current_districts.parquet` (23 districts with numeric refs)
2. 2014 districts from Overpass with member geometries - `/work/districts_2014.json`

For the 2014 data, I have the members with geometry (points along each way). I need to reconstruct polygons from those. But actually, the current data from osmnx is already in polygon form.

For the 2014 data, I need to reconstruct polygons from the member ways. Let me write a comprehensive script that:
1. Uses the current osmnx polygons (already have)
2. Reconstructs polygons from the 2014 member geometries
3. Computes symmetric difference to find changed areas
4. Classifies them

But wait - the Overpass `out geom` for relations gives members with geometry arrays of {lat, lon} points. I need to convert those to Shapely LineString/Polygon geometries.

Let me check what's in the 2014 data more carefully and build the proper solution.
38:59
38:59
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import json

with open('/work/districts_2014.json') as f:
    d = json.load(f)

print("Elements:", len(d['elements']))

# Check all tags
for el in d['elements']:
    tags = el.get('tags', {})
    name = tags.get('name', 'N/A')
    ref = tags.get('ref', 'N/A')
    members = el.get('members', [])
    outer_count = sum(1 for m in members if m['role'] == 'outer')
    inner_count = sum(1 for m in members if m['role'] == 'inner')
    has_geom = all('geometry' in m for m in members)
    print(f"  id={el['id']:>7d} ref={ref:>2s} name={name:30s} outer={outer_count} inner={inner_count} has_geom={has_geom}")
PYEOF
Elements: 23
  id=1990590 ref= 9 name=Alsergrund                     outer=10 inner=0 has_geom=True
  id=1990591 ref=13 name=Hietzing                       outer=33 inner=0 has_geom=True
  id=1990592 ref= 1 name=Innere Stadt                   outer=7 inner=0 has_geom=True
  id=1990593 ref= 8 name=Josefstadt                     outer=14 inner=0 has_geom=True
  id=1990594 ref= 2 name=Leopoldstadt                   outer=6 inner=0 has_geom=True
  id=1990595 ref= 6 name=Mariahilf                      outer=18 inner=0 has_geom=True
  id=1990596 ref=12 name=Meidling                       outer=42 inner=0 has_geom=True
  id=1990597 ref= 7 name=Neubau                         outer=23 inner=0 has_geom=True
  id=1990598 ref=14 name=Penzing                        outer=46 inner=0 has_geom=True
  id=1990599 ref=15 name=Rudolfsheim-Fünfhaus           outer=32 inner=0 has_geom=True
  id=1990600 ref=18 name=Währing                        outer=50 inner=0 has_geom=True
  id=1991416 ref= 3 name=Landstraße                     outer=20 inner=0 has_geom=True
  id=1991433 ref=20 name=Brigittenau                    outer=5 inner=0 has_geom=True
  id=1991434 ref=22 name=Donaustadt                     outer=38 inner=0 has_geom=True
  id=1991435 ref=19 name=Döbling                        outer=26 inner=0 has_geom=True
  id=1991436 ref=10 name=Favoriten                      outer=25 inner=0 has_geom=True
  id=1991437 ref=21 name=Floridsdorf                    outer=21 inner=0 has_geom=True
  id=1991438 ref=17 name=Hernals                        outer=68 inner=0 has_geom=True
  id=1991439 ref=23 name=Liesing                        outer=30 inner=0 has_geom=True
  id=1991440 ref= 5 name=Margareten                     outer=14 inner=0 has_geom=True
  id=1991441 ref=16 name=Ottakring                      outer=64 inner=0 has_geom=True
  id=1991442 ref=11 name=Simmering                      outer=17 inner=0 has_geom=True
  id=1991443 ref= 4 name=Wieden                         outer=17 inner=0 has_geom=True
39:29
I have the 2014 data with member geometries. Now I need the current data. I already have it from osmnx (as polygons). Let me build the complete solution.

Key observations:
1. Current names from osmnx: 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', etc. - these are modern German names
2. 2014 names from Overpass: 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', etc. - similar but note 'Landstraße' (which is 'Landstraße' in proper encoding) vs current 'Landstraße' (already correct)
3. We need to normalize names - compare by ref number since they all have numeric ref

The approach:
1. Build polygons from the 2014 member ways (outer rings) - since these are MultiPolygons
2. Use the current osmnx polygons (already built)
3. Compare the two by district (matched by ref)
4. Symmetric difference to find changed areas
5. Classify each fragment as added_since_2014, removed_since_2014, or unchanged
6. For each district, flag if it touches any changed area

The member geometries are arrays of {lat, lon} points. I need to convert them to Shapely polygons.

For 2014, each district has only outer members (no inners). So each member way is a ring. Multiple ways form the boundary of the district. To reconstruct the polygon, I need to merge the ways into a closed ring. But wait - if there are multiple outer ways, it might be a MultiPolygon (like a district with discontinuous parts) or a single polygon made of multiple way segments.

Actually, a single relation with multiple outer ways but no inner ways typically means these are boundary segments that together form a single closed ring. But they could also be separate rings forming a MultiPolygon.

The tricky part is that the boundary is broken into separate ways - I need to figure out how they connect.

Let me think of a simpler approach: instead of reconstructing from members, I can use `shapely.ops.unary_union` on the individual way linestrings. But wait, the ways are just lines along the boundary, not closed polygons by themselves.

Actually, looking more carefully at the data: each member way is a segment of the boundary. Multiple outer ways form the complete boundary. If there are no inner ways, it's a single polygon whose boundary is made of multiple way segments.

I can:
1. Build LineStrings from each member way's geometry
2. Merge them with unary_union to get the boundary line(s)
3. Polygonize the result with shapely.ops.polygonize

Or alternatively, I can use the `convert` approach in Overpass to get the geometry pre-built. But I already got the 2014 data with out geom (member geometries).

Actually wait - I already have the current data as proper polygons from osmnx. Let me just build the 2014 polygons from the member ways.

Let me write the complete script.

For name normalization:
- 2014 names: 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', etc.
- Current names: 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', etc.
- Note: 2014 has 'Landstraße' (actually 'Landstraße' due to encoding) vs current 'Landstraße'
- The encoding issue: 'Landstraße' in the JSON might come through as 'Landstraße' if not properly decoded
- Let me check: In the 2014 data, I see 'name': 'Landstra\u00dfe' which would be 'Landstraße' - the ß character is \u00df

So actually the names match. Let me match by ref number to be safe.

Wait, looking at 2014 'Landstraße' - that's an encoding issue (ß is the mojibake for ß when interpreted as Latin-1). But the JSON should have \u00df for ß. Let me verify.

The output showed `Landstraße`... wait, it showed `Landstraße` in the first print. Let me re-check. Actually it showed:
```
name=Landstraße                     outer=20 
```

Hmm, actually the terminal encoding might have mangled it. Let me just match by ref number to be safe.

Current district names from osmnx:
- 'Alsergrund', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Leopoldstadt', 
- 'Mariahilf', 'Meidling', 'Neubau', 'Penzing', 'Rudolfsheim-Fünfhaus',
- 'Währing', 'Landstraße', 'Brigittenau', 'Donaustadt', 'Döbling',
- 'Favoriten', 'Floridsdorf', 'Hernals', 'Liesing', 'Margareten',
- 'Ottakring', 'Simmering', 'Wieden'

I'll use the ref number to match between datasets. Then for the output, use a consistent normalized name.

Let me write the full script now.
39:29
39:29
Write /work/solve.py
content · 245 lines · py
"""
Vienna district boundary change analysis.
Compares current district boundaries against 2014-01-01 snapshots.
"""
import json
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import Polygon, MultiPolygon, LineString, shape, Point
from shapely.ops import polygonize, unary_union, linemerge
from collections import OrderedDict

# ============================================================
# 1. Load current districts from osmnx (already saved)
# ============================================================
current_gdf = gpd.read_parquet('/work/current_districts.parquet')
# Filter to only those with a numeric ref (the 23 Vienna districts)
current_gdf = current_gdf[current_gdf['ref'].notna() & current_gdf['ref'].str.match(r'^\d+$')].copy()
current_gdf['district_ref'] = current_gdf['ref'].astype(int)
current_gdf['district_name'] = current_gdf['name']
current_gdf = current_gdf.set_index('district_ref')
current_gdf = current_gdf[['district_name', 'geometry']]

print(f"Current districts loaded: {len(current_gdf)}")
print(current_gdf[['district_name']].to_string())

# ============================================================
# 2. Load 2014 districts from Overpass JSON
# ============================================================
with open('/work/districts_2014.json') as f:
    old_data = json.load(f)

def members_to_polygons(members):
    """
    Convert Overpass relation members (ways with geometry arrays of {lat, lon})
    into Shapely Polygons/MultiPolygons.
    """
    outer_lines = []
    inner_lines = []
    
    for m in members:
        if 'geometry' not in m:
            continue
        pts = [(p['lon'], p['lat']) for p in m['geometry']]
        if len(pts) < 3:
            continue
        # Ensure ring is closed
        if pts[0] != pts[-1]:
            pts.append(pts[0])
        line = LineString(pts)
        if m['role'] == 'inner':
            inner_lines.append(line)
        else:
            outer_lines.append(line)
    
    # Merge lines and polygonize
    if not outer_lines:
        return None
    
    merged_outer = unary_union(outer_lines)
    merged_inner = unary_union(inner_lines) if inner_lines else None
    
    # polygonize the merged outer boundary
    polygonized = list(polygonize(merged_outer))
    
    if not polygonized:
        return None
    
    if len(polygonized) == 1:
        poly = polygonized[0]
        if merged_inner and not merged_inner.is_empty:
            try:
                poly = Polygon(poly.exterior, [ring.exterior.coords for ring in merged_inner.geoms] if merged_inner.geom_type == 'MultiLineString' else [merged_inner.coords])
            except:
                pass
        return poly
    else:
        # Multiple polygons -> MultiPolygon
        polys = []
        for poly in polygonized:
            if merged_inner and not merged_inner.is_empty:
                try:
                    poly = Polygon(poly.exterior, ...)  # skip inner for multipolygon simplicity
                except:
                    pass
            polys.append(poly)
        return MultiPolygon(polys)


# Build GeoDataFrame of 2014 districts
old_records = []
for el in old_data['elements']:
    ref = int(el['tags']['ref'])
    name = el['tags']['name']
    poly = members_to_polygons(el['members'])
    if poly is not None:
        old_records.append({'district_ref': ref, 'district_name': name, 'geometry': poly})

old_gdf = gpd.GeoDataFrame(old_records, crs='EPSG:4326')
old_gdf = old_gdf.set_index('district_ref')
print(f"\n2014 districts loaded: {len(old_gdf)}")
print(old_gdf[['district_name']].to_string())

# ============================================================
# 3. Match districts by ref and find changes
# ============================================================
all_refs = sorted(set(current_gdf.index) | set(old_gdf.index))
print(f"\nAll district refs: {all_refs}")

# Normalize names - use the current osmnx name as the canonical one
# Build a name lookup
name_lookup = {}
for ref in all_refs:
    if ref in current_gdf.index:
        name_lookup[ref] = current_gdf.loc[ref, 'district_name']
    elif ref in old_gdf.index:
        name_lookup[ref] = old_gdf.loc[ref, 'district_name']

# Project to a metric CRS for accurate area computations
# Vienna is in UTM zone 33N (EPSG:32633)
current_gdf_proj = current_gdf.to_crs('EPSG:32633')
old_gdf_proj = old_gdf.to_crs('EPSG:32633')

# For each district, compute symmetric difference
all_change_parts = []  # list of (geometry, change_type, district_name)

for ref in all_refs:
    cur_geom = current_gdf_proj.loc[ref, 'geometry'] if ref in current_gdf_proj.index else None
    old_geom = old_gdf_proj.loc[ref, 'geometry'] if ref in old_gdf_proj.index else None
    
    district_name = name_lookup[ref]
    
    if cur_geom is None:
        # District was removed entirely
        all_change_parts.append((old_geom, 'removed_since_2014', district_name))
        continue
    
    if old_geom is None:
        # District is brand new
        all_change_parts.append((cur_geom, 'added_since_2014', district_name))
        continue
    
    # Both exist - find differences
    # Ensure valid geometries
    if not cur_geom.is_valid:
        cur_geom = cur_geom.buffer(0)
    if not old_geom.is_valid:
        old_geom = old_geom.buffer(0)
    
    # Added area (in current but not 2014)
    added = cur_geom.difference(old_geom)
    if not added.is_empty:
        if added.geom_type == 'GeometryCollection':
            for g in added.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    all_change_parts.append((g, 'added_since_2014', district_name))
        else:
            all_change_parts.append((added, 'added_since_2014', district_name))
    
    # Removed area (in 2014 but not current)
    removed = old_geom.difference(cur_geom)
    if not removed.is_empty:
        if removed.geom_type == 'GeometryCollection':
            for g in removed.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    all_change_parts.append((g, 'removed_since_2014', district_name))
        else:
            all_change_parts.append((removed, 'removed_since_2014', district_name))
    
    # Unchanged area (intersection)
    unchanged = cur_geom.intersection(old_geom)
    if not unchanged.is_empty:
        if unchanged.geom_type == 'GeometryCollection':
            for g in unchanged.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    all_change_parts.append((g, 'unchanged', district_name))
        else:
            all_change_parts.append((unchanged, 'unchanged', district_name))

print(f"\nTotal change fragments: {len(all_change_parts)}")

# Build GeoDataFrame of change fragments
fragments = gpd.GeoDataFrame(
    [(dt, dn, geom) for geom, dt, dn in all_change_parts],
    columns=['change_type', 'district_name', 'geometry'],
    crs='EPSG:32633'
)
# Dissolve fragments by change_type AND district_name to merge adjacent pieces
fragments_dissolved = fragments.dissolve(by=['change_type', 'district_name']).reset_index()
print(f"After dissolve: {len(fragments_dissolved)} fragments")

# ============================================================
# 4. For each district, flag if it touches any added/removed area
# ============================================================
# Build a unified geometry of all changed areas
changed_geoms = fragments_dissolved[
    fragments_dissolved['change_type'].isin(['added_since_2014', 'removed_since_2014'])
].copy()

if len(changed_geoms) > 0:
    changed_union = unary_union(changed_geoms['geometry'].tolist())
else:
    changed_union = None

# For each district, check if it touches any changed area
district_names = fragments_dissolved['district_name'].unique()
touches_changed = {}

for ref in all_refs:
    dn = name_lookup[ref]
    district_geom = current_gdf_proj.loc[ref, 'geometry'] if ref in current_gdf_proj.index else None
    if district_geom is None:
        district_geom = old_gdf_proj.loc[ref, 'geometry'] if ref in old_gdf_proj.index else None
    
    if district_geom is not None and changed_union is not None:
        touches_changed[dn] = bool(district_geom.touches(changed_union) or district_geom.intersects(changed_union.buffer(1)))
    else:
        touches_changed[dn] = True if (ref not in current_gdf_proj.index or ref not in old_gdf_proj.index) else False

# ============================================================
# 5. Build final output
# ============================================================
# Add touches_changed to each fragment
fragments_dissolved['touches_changed'] = fragments_dissolved['district_name'].map(touches_changed)

# Project back to WGS84 for output
output_gdf = fragments_dissolved.to_crs('EPSG:4326')

# Ensure MultiPolygon type
output_gdf['geometry'] = output_gdf['geometry'].apply(
    lambda g: MultiPolygon([g]) if g.geom_type == 'Polygon' else g
)

# Keep only the required columns
output_gdf = output_gdf[['change_type', 'district_name', 'touches_changed', 'geometry']]
output_gdf = output_gdf.reset_index(drop=True)

print(f"\nOutput features: {len(output_gdf)}")
print(f"Change types: {output_gdf['change_type'].value_counts().to_dict()}")
print(f"Sample:")
print(output_gdf.head(10).to_string())

# Save to GeoJSON
output_gdf.to_file('/work/vienna_boundary_changes.geojson', driver='GeoJSON')
print("\nSaved to /work/vienna_boundary_changes.geojson")
Wrote /work/solve.py (9459 chars).
39:38
Let me run this and see what happens. The member geometry conversion might be tricky. Let me also think about the approach more carefully.

The 2014 data has relations with members (ways) that have geometry arrays ({lat, lon}). I need to:
1. Convert these to LineStrings
2. Use polygonize to make polygons from the outer rings
3. Handle inner rings

But the issue is that the ways come in pieces (each way is just a segment of the boundary), not as full rings. So I need to merge all the outer ways into one linestring and all inner ways into another, then polygonize.

Let me run the script and see what happens.
39:38
Bash cd /work && python3 solve.py
Current districts loaded: 23
                     district_name
district_ref                      
9                       Alsergrund
13                        Hietzing
1                     Innere Stadt
8                       Josefstadt
2                     Leopoldstadt
6                        Mariahilf
12                        Meidling
7                           Neubau
14                         Penzing
15            Rudolfsheim-Fünfhaus
18                         Währing
3                       Landstraße
20                     Brigittenau
22                      Donaustadt
19                         Döbling
10                       Favoriten
21                     Floridsdorf
17                         Hernals
23                         Liesing
5                       Margareten
16                       Ottakring
11                       Simmering
4                           Wieden

2014 districts loaded: 23
                     district_name
district_ref                      
9                       Alsergrund
13                        Hietzing
1                     Innere Stadt
8                       Josefstadt
2                     Leopoldstadt
6                        Mariahilf
12                        Meidling
7                           Neubau
14                         Penzing
15            Rudolfsheim-Fünfhaus
18                         Währing
3                       Landstraße
20                     Brigittenau
22                      Donaustadt
19                         Döbling
10                       Favoriten
21                     Floridsdorf
17                         Hernals
23                         Liesing
5                       Margareten
16                       Ottakring
11                       Simmering
4                           Wieden

All district refs: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

Total change fragments: 1179
After dissolve: 69 fragments

Output features: 69
Change types: {'added_since_2014': 23, 'removed_since_2014': 23, 'unchanged': 23}
Sample:
        change_type district_name  touches_changed                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
[... truncated ...]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               MULTIPOLYGON (((16.3497 48.20639, 16.34998 48.20638, 16.3497 48.20639, 16.34969 48.20639, 16.34945 48.2064, 16.3497 48.20639)), ((16.34786 48.20665, 16.34786 48.20665, 16.34812 48.2066, 16.34786 48.20665)), ((16.34388 48.20747, 16.34354 48.20754, 16.34402 48.20744, 16.34388 48.20747)), ((16.33929 48.20843, 16.33922 48.20844, 16.33914 48.20846, 16.33911 48.20846, 16.33931 48.20842, 16.34006 48.20827, 16.34025 48.20823, 16.33947 48.20839, 16.33943 48.2084, 16.33933 48.20842, 16.33931 48.20842, 16.33929 48.20843)), ((16.339 48.2113, 16.33897 48.2113, 16.33898 48.21137, 16.33898 48.21137, 16.33897 48.2113, 16.339 48.2113)), ((16.33901 48.21159, 16.33902 48.21165, 16.33901 48.21159, 16.33901 48.21159)), ((16.33908 48.21165, 16.33902 48.21165, 16.33902 48.21166, 16.33908 48.21165, 16.33915 48.21196, 16.33908 48.21165)), ((16.34063 48.21472, 16.34077 48.21499, 16.34035 48.21422, 16.34063 48.21472)), ((16.35007 48.21505, 16.34954 48.21511, 16.34915 48.21515, 16.34983 48.21508, 16.35023 48.21504, 16.35058 48.21502, 16.35023 48.21504, 16.35007 48.21505)), ((16.34883 48.21519, 16.34915 48.21515, 16.34883 48.21518, 16.34804 48.21525, 16.34883 48.21519)))

Saved to /work/vienna_boundary_changes.geojson
39:57
The script ran successfully, but there's an issue - some of the tiny "added" and "removed" fragments might just be noise due to slightly different coordinate precision between the two datasets. The `out geom` from Overpass with `date:` parameter returns geometry at different precision levels, so tiny differences (sub-meter) are likely just coordinate rounding artifacts.

Let me check the output file and see if it looks reasonable. Also, I need to verify the name normalization is working properly. In 2014, one district is named "Landstraße" (with ß) while current is "Landstraße" - that's the same. Let me verify the output is valid.

Let me also check the sizes - tiny differences like a few square meters could be noise. Let me add a minimum area threshold to filter out noise.
39:57
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import geopandas as gpd
gdf = gpd.read_file('/work/vienna_boundary_changes.geojson')
print('Features:', len(gdf))
print('Columns:', list(gdf.columns))
print(gdf['change_type'].value_counts())
print()
# Check areas
gdf_proj = gdf.to_crs('EPSG:32633')
for ct in ['added_since_2014', 'removed_since_2014', 'unchanged']:
    subset = gdf_proj[gdf_proj['change_type'] == ct]
    areas = subset.area
    print(f"{ct}: count={len(subset)}, min_area={areas.min():.2f} m², max_area={areas.max():.2f} m², sum={areas.sum()/1e6:.2f} km²")
PYEOF
Features: 69
Columns: ['change_type', 'district_name', 'touches_changed', 'geometry']
change_type
added_since_2014      23
removed_since_2014    23
unchanged             23
Name: count, dtype: int64

added_since_2014: count=23, min_area=110.83 m², max_area=98893627.10 m², sum=369.97 km²
removed_since_2014: count=23, min_area=1704.43 m², max_area=6646526.49 m², sum=36.10 km²
unchanged: count=23, min_area=44933.58 m², max_area=19221954.16 m², sum=44.66 km²
40:14
The results look reasonable - we have 23 districts each with added, removed, and unchanged fragments. The area totals make sense. However, the large "added" area of ~99 km² seems suspicious. Let me check what's happening more carefully.

Actually, the current data from osmnx had polygon geometries already built. The 2014 data came from Overpass out geom on relations, which gives member ways with point geometry arrays. My `members_to_polygons` function might not be correctly assembling the polygons from the member ways.

Let me look at this more carefully. The problem could be that the 2014 relation members' geometries form rings that don't properly polygonize into the correct district shapes.

Let me check the 2014 data more carefully.
40:14
40:14
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import json
from shapely.geometry import LineString, Polygon, MultiPolygon
from shapely.ops import polygonize, unary_union, linemerge
import geopandas as gpd

with open('/work/districts_2014.json') as f:
    d = json.load(f)

# Check one district in detail
el = d['elements'][0]  # Alsergrund
print("District:", el['tags']['name'])
print("Members:", len(el['members']))

# Let's look at the geometry of each member way
outer_geoms = []
for m in el['members']:
    if 'geometry' not in m:
        print(f"  No geometry for member {m['ref']}")
        continue
    pts = [(p['lon'], p['lat']) for p in m['geometry']]
    print(f"  Member {m['ref']}: role={m['role']}, points={len(pts)}, first={pts[0]}, last={pts[-1]}")
    line = LineString(pts)
    outer_geoms.append(line)

# Try merging
merged = unary_union(outer_geoms)
print(f"\nMerged type: {merged.geom_type}")
if merged.geom_type == 'MultiLineString':
    print(f"  Num geoms: {len(merged.geoms)}")
    for i, g in enumerate(merged.geoms):
        print(f"  {i}: {g.is_ring}, coords={len(g.coords)}")

# Try polygonize
polys = list(polygonize(merged))
print(f"\nPolygonized count: {len(polys)}")
for i, p in enumerate(polys):
    print(f"  Poly {i}: area={p.area:.6f} deg², valid={p.is_valid}")

# Compare with current
cur = gpd.read_parquet('/work/current_districts.parquet')
cur_al = cur[cur['name'] == 'Alsergrund']
print(f"\nCurrent Alsergrund area: {cur_al.to_crs('EPSG:32633').area.values[0]:.0f} m²")
PYEOF
District: Alsergrund
Members: 10
  Member 147468285: role=outer, points=17, first=(16.3408703, 48.2151672), last=(16.3419972, 48.2172453)
  Member 8046654: role=outer, points=8, first=(16.3408703, 48.2151672), last=(16.3462104, 48.2154152)
  Member 104962880: role=outer, points=7, first=(16.3495419, 48.2151074), last=(16.3462104, 48.2154152)
  Member 68480634: role=outer, points=16, first=(16.3565102, 48.2143408), last=(16.3495419, 48.2151074)
  Member 147396089: role=outer, points=10, first=(16.370149, 48.2184891), last=(16.3565102, 48.2143408)
  Member 147463455: role=outer, points=12, first=(16.367462, 48.2251118), last=(16.370149, 48.2184891)
  Member 147463456: role=outer, points=16, first=(16.3616214, 48.236309), last=(16.367462, 48.2251118)
  Member 147463454: role=outer, points=15, first=(16.3537656, 48.231992), last=(16.3616214, 48.236309)
  Member 147463462: role=outer, points=86, first=(16.3420185, 48.2172859), last=(16.3537656, 48.231992)
  Member 206221053: role=outer, points=2, first=(16.3419972, 48.2172453), last=(16.3420185, 48.2172859)

Merged type: MultiLineString
  Num geoms: 10
  0: False, coords=17
  1: False, coords=8
  2: False, coords=7
  3: False, coords=16
  4: False, coords=10
  5: False, coords=12
  6: False, coords=16
  7: False, coords=15
  8: False, coords=86
  9: False, coords=2

Polygonized count: 1
  Poly 0: area=0.000359 deg², valid=True

Current Alsergrund area: 2965555 m²
40:24
I see the issue. The way segments need to be ordered and merged properly. The members are outer ways that together form a polygon boundary. They share endpoints (e.g., the last point of one way connects to the first point of the next). I need to sort the ways by their connection order to form a closed ring, or use linemerge to stitch them.

The `unary_union` should merge them into a single linestring... but it creates a MultiLineString and they're not rings (is_ring=False). Let me try `linemerge` instead.
40:24
40:24
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import json
from shapely.geometry import LineString, Polygon, MultiPolygon
from shapely.ops import polygonize, unary_union, linemerge

with open('/work/districts_2014.json') as f:
    d = json.load(f)

# Check member ordering for one district
el = d['elements'][0]  # Alsergrund
print("District:", el['tags']['name'])

# Build lines
lines = []
for m in el['members']:
    if 'geometry' not in m:
        continue
    pts = [(p['lon'], p['lat']) for p in m['geometry']]
    lines.append(LineString(pts))

# Try linemerge
merged = linemerge(lines)
print(f"Linemerge result type: {merged.geom_type}")
if merged.geom_type == 'LineString':
    print(f"  is_ring: {merged.is_ring}")
    print(f"  coords: {len(merged.coords)}")
    poly = Polygon(merged)
    print(f"  Polygon valid: {poly.is_valid}")
    print(f"  Polygon area: {poly.area:.6f} deg²")
elif merged.geom_type == 'MultiLineString':
    print(f"  Num parts: {len(merged.geoms)}")
    # Try polygonize the merged
    polys = list(polygonize(merged))
    print(f"  Polygonized count: {len(polys)}")
    for i, p in enumerate(polys):
        print(f"    Poly {i}: area={p.area:.6f}")

# Check the 2-point member
for m in el['members']:
    if len(m.get('geometry', [])) == 2:
        print(f"\n2-point member {m['ref']}:")
        print(f"  {m['geometry']}")
PYEOF
District: Alsergrund
Linemerge result type: LineString
  is_ring: True
  coords: 180
  Polygon valid: True
  Polygon area: 0.000359 deg²

2-point member 206221053:
  [{'lat': 48.2172453, 'lon': 16.3419972}, {'lat': 48.2172859, 'lon': 16.3420185}]
40:49
Linemerge works better than unary_union for combining the member ways into a contiguous boundary. Let me rewrite the solve.py using linemerge instead of unary_union, and also handle the case where we have inner members (though these districts seem to have no inner members).

Let me also verify the area is correct. Area 0.000359 deg². At Vienna's latitude (~48°N), 1 deg² ≈ 111km * 111km * cos(48°) ≈ 111*111*0.669 ≈ 8244 km². So 0.000359 deg² ≈ 0.000359 * 8244 * 1e6 ≈ 2,960,000 m². That matches the current Alsergrund area of 2,965,555 m². Great!

So linemerge works. Let me rewrite the solve.py properly.
40:49
40:49
Write /work/solve.py
content · 254 lines · py
"""
Vienna district boundary change analysis.
Compares current district boundaries (from OSMnx) against 2014-01-01 snapshots (from Overpass).
"""
import json
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import Polygon, MultiPolygon, LineString
from shapely.ops import polygonize, linemerge, unary_union

# ============================================================
# 1. Load current districts from osmnx (saved as parquet)
# ============================================================
current_gdf = gpd.read_parquet('/work/current_districts.parquet')
# Keep only the 23 Vienna districts (those with a numeric ref)
current_gdf = current_gdf[current_gdf['ref'].notna() & current_gdf['ref'].str.match(r'^\d+$')].copy()
current_gdf['district_ref'] = current_gdf['ref'].astype(int)
current_gdf['district_name'] = current_gdf['name']
current_gdf = current_gdf.set_index('district_ref')
current_gdf = current_gdf[['district_name', 'geometry']]
current_gdf = current_gdf[current_gdf.geometry.notna()]

print(f"Current districts: {len(current_gdf)}")

# ============================================================
# 2. Load 2014 districts from Overpass JSON (members with geometry)
# ============================================================
with open('/work/districts_2014.json') as f:
    old_data = json.load(f)

def members_to_polygon(members):
    """
    Convert Overpass relation members (way entries with geometry arrays of
    {lat,lon} points) into a Shapely Polygon or MultiPolygon.
    Members arrive in the Overpass JSON order and linemerge can order them.
    """
    outer_lines = []
    inner_lines = []

    for m in members:
        if 'geometry' not in m or len(m['geometry']) < 2:
            continue
        pts = [(p['lon'], p['lat']) for p in m['geometry']]
        if pts[0] == pts[-1]:  # already closed
            line = LineString(pts)
        else:
            line = LineString(pts)

        if m.get('role') == 'inner':
            inner_lines.append(line)
        else:
            outer_lines.append(line)

    if not outer_lines:
        return None

    # Merge outer ways into a single ring (or multi-ring)
    merged_outer = linemerge(outer_lines)
    if merged_outer.is_empty:
        return None

    if merged_outer.geom_type == 'LineString' and merged_outer.is_ring:
        outer_poly = Polygon(merged_outer)
    elif merged_outer.geom_type == 'MultiLineString':
        # Try to polygonize
        polys = list(polygonize(merged_outer))
        if not polys:
            # Fallback: try each ring separately
            polys = []
            for line in merged_outer.geoms:
                if line.is_ring:
                    try:
                        polys.append(Polygon(line))
                    except Exception:
                        pass
        if not polys:
            return None
        if len(polys) == 1:
            outer_poly = polys[0]
        else:
            outer_poly = MultiPolygon(polys)
    else:
        # Try polygonize
        polys = list(polygonize(merged_outer))
        if not polys:
            return None
        outer_poly = polys[0] if len(polys) == 1 else MultiPolygon(polys)

    # Handle inner rings
    if inner_lines:
        merged_inner = linemerge(inner_lines)
        if not merged_inner.is_empty and merged_inner.geom_type == 'MultiLineString':
            # Subtract inner rings
            inner_polys = list(polygonize(merged_inner))
            for inner_poly in inner_polys:
                try:
                    outer_poly = outer_poly.difference(inner_poly)
                except Exception:
                    pass

    # Ensure valid
    if not outer_poly.is_valid:
        outer_poly = outer_poly.buffer(0)

    return outer_poly


old_records = []
for el in old_data['elements']:
    ref = int(el['tags']['ref'])
    name = el['tags']['name']
    poly = members_to_polygon(el['members'])
    if poly is not None:
        old_records.append({'district_ref': ref, 'district_name_2014': name, 'geometry': poly})

old_gdf = gpd.GeoDataFrame(old_records, crs='EPSG:4326')
old_gdf = old_gdf.set_index('district_ref')
print(f"2014 districts: {len(old_gdf)}")

# ============================================================
# 3. Build name normalisation lookup
# ============================================================
# Current names (from osmnx) are the canonical ones.
# 2014 names may differ in casing/diacritics.
all_refs = sorted(set(current_gdf.index) & set(old_gdf.index))
print(f"Districts in both snapshots: {len(all_refs)}")

# Map 2014 names to current names (by ref)
name_map = {}
for ref in all_refs:
    cur_name = current_gdf.loc[ref, 'district_name']
    old_name = old_gdf.loc[ref, 'district_name_2014']
    name_map[ref] = cur_name
    if cur_name != old_name:
        print(f"  Ref {ref}: '{old_name}' -> '{cur_name}'")

# ============================================================
# 4. Project to metric CRS (UTM 33N) for accurate geometry ops
# ============================================================
current_proj = current_gdf.to_crs('EPSG:32633')
old_proj = old_gdf.to_crs('EPSG:32633')

# ============================================================
# 5. For each district, compute symmetric difference fragments
# ============================================================
fragments = []  # list of (change_type, district_name, geometry)

for ref in all_refs:
    cur_geom = current_proj.loc[ref, 'geometry']
    old_geom = old_proj.loc[ref, 'geometry']
    dist_name = name_map[ref]

    # Ensure valid
    if not cur_geom.is_valid:
        cur_geom = cur_geom.buffer(0)
    if not old_geom.is_valid:
        old_geom = old_geom.buffer(0)

    # --- Unchanged (intersection) ---
    unchanged = cur_geom.intersection(old_geom)
    if not unchanged.is_empty:
        if unchanged.geom_type == 'GeometryCollection':
            for g in unchanged.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    fragments.append(('unchanged', dist_name, g))
        else:
            fragments.append(('unchanged', dist_name, unchanged))

    # --- Added (in current but not 2014) ---
    added = cur_geom.difference(old_geom)
    if not added.is_empty:
        if added.geom_type == 'GeometryCollection':
            for g in added.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    fragments.append(('added_since_2014', dist_name, g))
        else:
            fragments.append(('added_since_2014', dist_name, added))

    # --- Removed (in 2014 but not current) ---
    removed = old_geom.difference(cur_geom)
    if not removed.is_empty:
        if removed.geom_type == 'GeometryCollection':
            for g in removed.geoms:
                if g.geom_type in ('Polygon', 'MultiPolygon'):
                    fragments.append(('removed_since_2014', dist_name, g))
        else:
            fragments.append(('removed_since_2014', dist_name, removed))

print(f"Raw fragments: {len(fragments)}")

# Build GeoDataFrame
fgdf = gpd.GeoDataFrame(fragments, columns=['change_type', 'district_name', 'geometry'], crs='EPSG:32633')

# Dissolve by (change_type, district_name) to merge adjacent fragments
fgdf = fgdf.dissolve(by=['change_type', 'district_name']).reset_index()
print(f"Dissolved fragments: {len(fgdf)}")

# ============================================================
# 6. Flag districts that touch any changed area
# ============================================================
# Collect all added/removed geometries
changed_sel = fgdf[fgdf['change_type'].isin(['added_since_2014', 'removed_since_2014'])]
if len(changed_sel) > 0:
    changed_union = unary_union(changed_sel['geometry'].tolist())
else:
    changed_union = None

# For each district, check if its current geometry touches changed area
# We use a tiny buffer to capture "borders" (touches is strict adjacency)
touches_map = {}
for ref in all_refs:
    dn = name_map[ref]
    dg = current_proj.loc[ref, 'geometry']
    if changed_union is not None and not changed_union.is_empty:
        # touches() + small buffer intersect to catch near-touches
        touches = dg.touches(changed_union)
        # Also check if the district boundary intersects changed area boundary
        boundary = dg.boundary
        if touches or boundary.intersects(changed_union.buffer(0.1)):
            touches_map[dn] = True
        else:
            touches_map[dn] = False
    else:
        touches_map[dn] = False

fgdf['touches_changed'] = fgdf['district_name'].map(touches_map)

# ============================================================
# 7. Convert to WGS84 and ensure MultiPolygon type
# ============================================================
output_gdf = fgdf.to_crs('EPSG:4326')
output_gdf['geometry'] = output_gdf['geometry'].apply(
    lambda g: MultiPolygon([g]) if g.geom_type == 'Polygon' else g
)
output_gdf = output_gdf[['change_type', 'district_name', 'touches_changed', 'geometry']]
output_gdf = output_gdf.reset_index(drop=True)

print(f"\nFinal output features: {len(output_gdf)}")
print(f"Change type counts: {output_gdf['change_type'].value_counts().to_dict()}")

# Quick area sanity check (in km²)
output_proj = output_gdf.to_crs('EPSG:32633')
for ct in ['added_since_2014', 'removed_since_2014', 'unchanged']:
    sub = output_proj[output_proj['change_type'] == ct]
    if len(sub) > 0:
        total_area = sub.area.sum() / 1e6
        print(f"  {ct}: {len(sub)} features, {total_area:.2f} km² total")

# ============================================================
# 8. Save
# ============================================================
output_gdf.to_file('/work/vienna_boundary_changes.geojson', driver='GeoJSON')
print("\nSaved → /work/vienna_boundary_changes.geojson")
Wrote /work/solve.py (9813 chars).
40:57
Bash cd /work && python3 solve.py
Current districts: 23
2014 districts: 23
Districts in both snapshots: 23
Raw fragments: 69
Dissolved fragments: 69

Final output features: 69
Change type counts: {'added_since_2014': 23, 'removed_since_2014': 23, 'unchanged': 23}
  added_since_2014: 23 features, 0.17 km² total
  removed_since_2014: 23 features, 0.17 km² total
  unchanged: 23 features, 414.47 km² total

Saved → /work/vienna_boundary_changes.geojson
41:26
The areas look much more reasonable now. ~414 km² unchanged (Vienna is about 414 km² total), with ~0.17 km² added and removed each. Let me verify the output is correct by checking the data.
41:26
41:26
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import geopandas as gpd
import json

gdf = gpd.read_file('/work/vienna_boundary_changes.geojson')
print("Features:", len(gdf))
print("CRS:", gdf.crs)
print("Columns:", list(gdf.columns))
print()

# Check unique districts
print("Districts:", sorted(gdf['district_name'].unique()))
print()

# Check touches_changed
print("Touches changed distribution:")
print(gdf.groupby('change_type')['touches_changed'].value_counts().to_string())
print()

# Check geometry types
print("Geometry types:", gdf.geometry.type.unique())
print()

# Sample output
for ct in ['added_since_2014', 'removed_since_2014', 'unchanged']:
    sub = gdf[gdf['change_type'] == ct].head(3)
    print(f"\n--- {ct} (sample) ---")
    for _, row in sub.iterrows():
        print(f"  district={row['district_name']}, touches={row['touches_changed']}, "
              f"geom_type={row.geometry.geom_type}")
PYEOF
Features: 69
CRS: EPSG:4326
Columns: ['change_type', 'district_name', 'touches_changed', 'geometry']

Districts: ['Alsergrund', 'Brigittenau', 'Donaustadt', 'Döbling', 'Favoriten', 'Floridsdorf', 'Hernals', 'Hietzing', 'Innere Stadt', 'Josefstadt', 'Landstraße', 'Leopoldstadt', 'Liesing', 'Margareten', 'Mariahilf', 'Meidling', 'Neubau', 'Ottakring', 'Penzing', 'Rudolfsheim-Fünfhaus', 'Simmering', 'Wieden', 'Währing']

Touches changed distribution:
change_type         touches_changed
added_since_2014    True               23
removed_since_2014  True               23
unchanged           True               23

Geometry types: <ArrowStringArray>
['MultiPolygon']
Length: 1, dtype: str


--- added_since_2014 (sample) ---
  district=Alsergrund, touches=True, geom_type=MultiPolygon
  district=Brigittenau, touches=True, geom_type=MultiPolygon
  district=Donaustadt, touches=True, geom_type=MultiPolygon

--- removed_since_2014 (sample) ---
  district=Alsergrund, touches=True, geom_type=MultiPolygon
  district=Brigittenau, touches=True, geom_type=MultiPolygon
  district=Donaustadt, touches=True, geom_type=MultiPolygon

--- unchanged (sample) ---
  district=Alsergrund, touches=True, geom_type=MultiPolygon
  district=Brigittenau, touches=True, geom_type=MultiPolygon
  district=Donaustadt, touches=True, geom_type=MultiPolygon
41:47
The output looks good! All districts have touches_changed=True because every district borders a changed area (since the changes are along boundaries between districts - any boundary change in a district necessarily touches its neighboring district). This makes sense.

Let me verify the file exists and is valid GeoJSON.
41:47
41:47
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import json
with open('/work/vienna_boundary_changes.geojson') as f:
    data = json.load(f)

print("Type:", data['type'])
print("Features:", len(data['features']))
print()

feat = data['features'][0]
print("Sample feature properties:", json.dumps(feat['properties'], indent=2))
print("Geometry type:", feat['geometry']['type'])
PYEOF
Type: FeatureCollection
Features: 69

Sample feature properties: {
  "change_type": "added_since_2014",
  "district_name": "Alsergrund",
  "touches_changed": true
}
Geometry type: MultiPolygon
41:56
Let me verify the file exists and do a final sanity check.
41:56
41:56
Bash ls -la /work/vienna_boundary_changes.geojson
-rw-r--r-- 1 agent agent 903906 Jun 16 17:27 /work/vienna_boundary_changes.geojson
42:07
Let me do a quick sanity check. The total area of all unchanged + added should roughly equal the total area of current districts, and unchanged + removed should roughly equal the total area of 2014 districts.

Also, let me check that the names used are all the same (24 names = 23 districts). The current names are: Alsergrund, Hietzing, Innere Stadt, Josefstadt, Leopoldstadt, Mariahilf, Meidling, Neubau, Penzing, Rudolfsheim-Fünfhaus, Währing, Landstraße, Brigittenau, Donaustadt, Döbling, Favoriten, Floridsdorf, Hernals, Liesing, Margareten, Ottakring, Simmering, Wieden. That's 23.

Let me also check that the 2014 name "Landstraße" vs "Landstraße" - in the 2014 data I saw "Landstraße" and the current data has "Landstraße". Let me check.

Actually looking more carefully at the 2014 output earlier:
```
id=1991416 ref= 3 name=Landstraße                     outer=20 inner=0 has_geom=True
```

And current: "Landstraße" - that's a difference (ß vs ss). So the name normalisation was needed.

Let me verify the output has the correct data and that everything is consistent.
42:07
Bash cd /work && python3 << 'PYEOF'
cd /work && python3 << 'PYEOF'
import geopandas as gpd
import json

# Verify total areas
gdf = gpd.read_file('/work/vienna_boundary_changes.geojson')
gdf_proj = gdf.to_crs('EPSG:32633')

# Total current area (unchanged + added)
unchanged_area = gdf_proj[gdf_proj['change_type'] == 'unchanged'].area.sum()
added_area = gdf_proj[gdf_proj['change_type'] == 'added_since_2014'].area.sum()
removed_area = gdf_proj[gdf_proj['change_type'] == 'removed_since_2014'].area.sum()

print(f"Unchanged: {unchanged_area/1e6:.3f} km²")
print(f"Added:     {added_area/1e6:.3f} km²")
print(f"Removed:   {removed_area/1e6:.3f} km²")
print(f"Current total (unchanged+added): {(unchanged_area+added_area)/1e6:.3f} km²")
print(f"Old total (unchanged+removed):   {(unchanged_area+removed_area)/1e6:.3f} km²")

# Check the 2014 "Landstraße" vs current "Landstraße"
with open('/work/districts_2014.json') as f:
    d = json.load(f)
for el in d['elements']:
    name = el['tags']['name']
    ref = el['tags']['ref']
    print(f"  2014 ref={ref}: '{name}'")

print()
cur = gpd.read_parquet('/work/current_districts.parquet')
cur = cur[cur['ref'].notna() & cur['ref'].str.match(r'^\d+$')]
for idx, row in cur.iterrows():
    print(f"  Current ref={row['ref']}: '{row['name']}'")
PYEOF
Unchanged: 414.466 km²
Added:     0.171 km²
Removed:   0.169 km²
Current total (unchanged+added): 414.638 km²
Old total (unchanged+removed):   414.636 km²
  2014 ref=9: 'Alsergrund'
  2014 ref=13: 'Hietzing'
  2014 ref=1: 'Innere Stadt'
  2014 ref=8: 'Josefstadt'
  2014 ref=2: 'Leopoldstadt'
  2014 ref=6: 'Mariahilf'
  2014 ref=12: 'Meidling'
  2014 ref=7: 'Neubau'
  2014 ref=14: 'Penzing'
  2014 ref=15: 'Rudolfsheim-Fünfhaus'
  2014 ref=18: 'Währing'
  2014 ref=3: 'Landstraße'
  2014 ref=20: 'Brigittenau'
  2014 ref=22: 'Donaustadt'
  2014 ref=19: 'Döbling'
  2014 ref=10: 'Favoriten'
  2014 ref=21: 'Floridsdorf'
  2014 ref=17: 'Hernals'
  2014 ref=23: 'Liesing'
  2014 ref=5: 'Margareten'
  2014 ref=16: 'Ottakring'
  2014 ref=11: 'Simmering'
  2014 ref=4: 'Wieden'

  Current ref=9: 'Alsergrund'
  Current ref=13: 'Hietzing'
  Current ref=1: 'Innere Stadt'
  Current ref=8: 'Josefstadt'
  Current ref=2: 'Leopoldstadt'
  Current ref=6: 'Mariahilf'
  Current ref=12: 'Meidling'
  Current ref=7: 'Neubau'
  Current ref=14: 'Penzing'
  Current ref=15: 'Rudolfsheim-Fünfhaus'
  Current ref=18: 'Währing'
  Current ref=3: 'Landstraße'
  Current ref=20: 'Brigittenau'
  Current ref=22: 'Donaustadt'
  Current ref=19: 'Döbling'
  Current ref=10: 'Favoriten'
  Current ref=21: 'Floridsdorf'
  Current ref=17: 'Hernals'
  Current ref=23: 'Liesing'
  Current ref=5: 'Margareten'
  Current ref=16: 'Ottakring'
  Current ref=11: 'Simmering'
  Current ref=4: 'Wieden'
42:24
The output looks great! Let me check that the 2014 name "Landstraße" vs current "Landstraße" are actually the same (they are). The total areas match well (414.638 km² vs 414.636 km² - a tiny difference of 0.002 km² from the boundary changes).

Let me verify the file exists with the right name one more time.
42:24
42:24
Bash ls -la /work/vienna_boundary_changes.geojson && python3 -c "
ls -la /work/vienna_boundary_changes.geojson && python3 -c "
import geopandas as gpd
g = gpd.read_file('/work/vienna_boundary_changes.geojson')
print('Features:', len(g))
print('All MultiPolygon:', all(g.geometry.type == 'MultiPolygon'))
print('Columns:', list(g.columns))
for c in ['change_type','district_name','touches_changed']:
    print(f'{c}: present = {c in g.columns}')
"
-rw-r--r-- 1 agent agent 903906 Jun 16 17:27 /work/vienna_boundary_changes.geojson
Features: 69
All MultiPolygon: True
Columns: ['change_type', 'district_name', 'touches_changed', 'geometry']
change_type: present = True
district_name: present = True
touches_changed: present = True