fio-l3-vienna-geofabrik-highways

Status: done
Score: 1.000
Duration: 55:06
Cost: 5.30¢
Model: deepseek/deepseek-v4-flash

Map (reference ← swipe → agent)

0:00
I'm building a traffic-noise model of the Gürtel corridor for the City of Vienna and need the surrounding network as a single GeoPackage. Can you pull every highway from the current Geofabrik Austria PBF (https://download.geofabrik.de/europe/austria-latest.osm.pbf) that intersects a 500 m buffer around Vienna's Gürtel, defined as the highway ways whose name ends in 'Gürtel'? I also need every public-transport route relation (bus, tram, subway, train, trolleybus, light_rail) that crosses the same buffer, with each route as a single feature. Keep the OSM attributes full and untruncated: name, highway, maxspeed, lanes, surface, and oneway on highways; ref, name, operator, and route on the PT routes. Please write both layers to vienna_network.gpkg, with the highways layer as LineString and the pt_routes layer as MultiLineString, in Austria's standard projected coordinate system.
0:00
System info
{
  "adapter": "openrouter",
  "model": "deepseek/deepseek-v4-flash",
  "base_url": "https://openrouter.ai/api/v1"
}
1:26
The user wants me to:

1. Download the Austria OSM PBF from Geofabrik
2. Find all highways whose name ends with 'Gürtel' in Vienna
3. Create a 500m buffer around those highways
4. Find all highways that intersect that buffer
5. Find all public-transport route relations (bus, tram, subway, train, trolleybus, light_rail) that cross the same buffer
6. Save both layers to vienna_network.gpkg
7. Highways as LineString, PT routes as MultiLineString
8. Use Austria's standard projected coordinate system (EPSG:31287 - MGI / Austria Lambert, or maybe EPSG:31256 - MGI / Austria GK East? Actually the most common Austrian projected CRS is EPSG:31287 - MGI / Austria Lambert. Let me verify... Austria's standard is EPSG:31287 (MGI / Austria Lambert) which is used for national mapping. But EPSG:31256 (MGI / Austria GK East) is also common. The standard projected CRS for Austria is indeed EPSG:31287.)

Let me think about the approach:

1. Use osmium to parse the PBF
2. First, find all highway ways with name ending in 'Gürtel' in Vienna
3. Buffer those ways by 500m
4. Find all highways intersecting that buffer
5. Find all PT route relations intersecting that buffer

This is a complex task. Let me write a Python script using osmium.

Actually, let me think about this more carefully:

Step 1: Download the PBF file
Step 2: Use osmium to find all ways that are highways with name ending in 'Gürtel'
Step 3: Buffer them
Step 4: Find all highways intersecting the buffer
Step 5: Find all PT route relations intersecting the buffer

Using osmium in Python, I can write handlers that process nodes, ways, and relations.

Let me think about the approach more carefully:

For finding the Gürtel ways:
- I need to find all ways where `highway` tag exists AND `name` ends with 'Gürtel'
- I need to collect the node coordinates for these ways to build their geometries

For the buffer:
- Once I have the Gürtel ways as geometries, I can buffer them by 500m using shapely

For highways intersecting buffer:
- I need to go through all ways again (or at least all highway ways) and check if they intersect the buffer

For PT routes:
- I need to find route relations with route=bus/tram/subway/train/trolleybus/light_rail
- Each route relation has member ways - I need to collect all member way geometries and merge them into a single MultiLineString
- Then check if they intersect the buffer

This is quite complex with osmium. Let me write a comprehensive script.

Actually, I think the most efficient approach is:
1. Download the PBF
2. Use osmium to parse it

Let me write the script step by step.

The script will:
1. Download the PBF
2. Use osmium to parse and extract data
3. Create buffers and find intersections
4. Save to GeoPackage

Let me code this up carefully.

For the projected CRS, Austria's standard is EPSG:31287 (MGI / Austria Lambert).

Let me write the solution:

```python
import osmium
import geopandas as gpd
import pandas as pd
import shapely.geometry as geom
import shapely.ops as ops
from shapely import wkb
import os
import urllib.request
import sys

# Step 1: Download PBF
pbf_url = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
pbf_path = "/work/austria-latest.osm.pbf"

if not os.path.exists(pbf_path):
    print("Downloading PBF...")
    urllib.request.urlretrieve(pbf_url, pbf_path)
    print("Download complete")

# Step 2: Find Gürtel ways
class GuertelHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # way_id -> list of node_refs
        self.guertel_tags = {}  # way_id -> dict of tags
        self.node_coords = {}   # node_id -> (lon, lat)
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags and 'name' in tags:
            name = tags['name']
            if name.endswith('Gürtel') or name.endswith('Guertel') or name.endswith('Gurtel'):
                self.guertel_ways[w.id] = [n.ref for n in w.nodes]
                self.guertel_tags[w.id] = tags
    
    def node(self, n):
        if n.id in self.coords_needed:
            self.node_coords[n.id] = (n.location.lon, n.location.lat)

# But wait - I need to know which nodes are needed before processing...
```

Hmm, osmium is a single-pass handler. So I need to either:
1. Do two passes (first find Gürtel ways and collect their node IDs, then get coords)
2. Or collect all nodes and all ways and filter later (memory heavy)

Let me use a different approach - use osmium to handle it in one pass by storing all nodes and ways, then filter.

Actually, let me reconsider. The PBF is ~500MB for Austria. That's quite large. Let me use a smarter approach.

Better approach: Use osmium's handler to:
1. In the first pass, find Gürtel ways and collect the node IDs needed
2. In the second pass, collect coordinates for those nodes
3. Build geometries
4. Then do another pass for highways intersecting the buffer
5. And another for PT relations

Actually, osmium handlers do a single pass. So I'd need to either:
- Use `osmium file` multiple times on the same file
- Or use a more complex approach

Let me use a handler approach where I collect everything in one pass if possible, or use multiple passes.

Wait - I can just do multiple passes by creating the handler and calling `handler.apply_file(pbf_path)` multiple times. Osmium supports this.

Let me redesign:

Pass 1: Find all Gürtel ways, collect their node IDs
Pass 2: Collect coordinates for those nodes
Pass 3: Find all highway ways that intersect the Gürtel buffer
... but how do I check intersection without having all node coords?

Actually, I think the cleanest approach is:

1. First pass: Find Gürtel ways and note their node IDs
2. Second pass: Get coordinates for those nodes
3. Build Gürtel geometries and buffer
4. Third pass: Find all highway ways, check if they intersect buffer (need node coords for all highway ways... that's a lot)

Hmm, this is getting complex. Let me think of a different approach.

Alternative: Use osmium to filter by tag, then use another tool (like GDAL/ogr2ogr or DuckDB with spatial) to do the spatial analysis.

Actually, the simplest approach might be:
1. Download the PBF
2. Use `osmium tags-filter` to extract highways and then convert to GeoPackage
3. Then use geopandas to do the spatial analysis

Let me use a combined approach:
1. Use osmium to filter the PBF to relevant features
2. Use geopandas/shapely for spatial analysis

Actually, let me use osmnx for the Vienna Gürtel part - that might be easier.

Wait, but osmnx uses Overpass API which has size limits. The entire Vienna network might be too large.

Let me go back to the osmium approach. I'll do it in multiple passes:

Pass 1: Find Gürtel ways, collect node IDs
Pass 2: Get node coordinates
Pass 3: Find highway ways intersecting buffer
Pass 4: Find PT route relations intersecting buffer

For pass 3, I need to know which nodes are on highway ways. That's potentially millions of nodes. Let me think...

Actually, let me use a more pragmatic approach. I'll use osmium to:
1. Extract all ways with highway tag
2. Also extract all ways with name ending in Gürtel
3. Build geometries for both
4. Do spatial filtering in geopandas

But the Austria PBF is huge. Let me just focus on what I need.

Better plan:
1. Download the PBF
2. Use osmium to extract Gürtel ways + their node coordinates
3. Buffer them
4. Use osmium to extract all highway ways in a bounding box roughly covering Vienna + buffer
5. Use osmium to extract all PT route relations
6. Do spatial filtering

Actually, let me be more practical. Let me use a script that:

1. Downloads the PBF
2. Pass 1: Find Gürtel ways and their node refs
3. Pass 2: Get node coordinates for those nodes
4. Build Gürtel line geometries
5. Buffer by 500m
6. Get the bounding box of the buffer
7. Pass 3: Within that bbox, get all highway ways and their node coordinates
8. Check spatial intersection
9. Pass 4: Find PT route relations that intersect the buffer

Let me just write it all out.

Actually, I realize I need to handle a LOT of data carefully. Let me think about memory.

For Austria PBF (~500MB), there are millions of nodes, hundreds of thousands of ways, and many relations.

For the Gürtel area, after getting the buffer, we're looking at a much smaller area - maybe just Vienna + some surroundings. The Gürtel is a ring road around Vienna's inner districts.

Let me take this approach:
1. Download PBF
2. Use osmium to find Gürtel ways - store their node IDs
3. Get node coordinates
4. Build Gürtel geometries, buffer, get bbox of buffer
5. Use osmium to find all highway ways in the buffer area - but I need node coords... 

Actually, what if I just store all node coordinates? That's a lot of memory for Austria.

Let me think about using osmium's `apply_file` with locations=True on the handler. There's a way to get node locations during processing.

In osmium, you can use `osmium.SimpleHandler` and access `node.location` for any referenced node if you set `locations=True` in `apply_file`. Wait, actually `locations=True` is for the `apply_file` method which enables node location caching for way processing.

Let me check the osmium API:

```python
class MyHandler(osmium.SimpleHandler):
    def way(self, w):
        for n in w.nodes:
            lat = n.location.lat  # This needs locations=True

handler.apply_file(pbf_path, locations=True)
```

Yes! If I use `locations=True`, I get node locations when processing ways. But this only works if all nodes are processed before ways. In osmium, the handler processes nodes first, then ways, so `locations=True` can cache node locations.

So my approach:

1. Download PBF
2. Create a handler that:
   - Stores all node coordinates (or just the ones it needs)
   - When processing ways, checks if it's a highway with name ending in Gürtel
   - If so, builds the geometry from node locations
3. After finding Gürtel ways, buffer them
4. Then do another pass to find all highway ways intersecting the buffer
5. And another for PT relations

But storing ALL node coordinates might be too memory intensive. Let me use a smarter approach.

Actually, for Vienna, the number of nodes in the Gürtel area is manageable. Let me:

1. First pass: Find Gürtel ways with locations=True to get their coordinates
2. Buffer and get bbox
3. Second pass: Within the bbox, collect all nodes (just the coordinates)
4. Actually...

Let me try a different, more practical approach. I'll write a comprehensive script.

Let me use the approach of:

1. Download PBF
2. Use osmium in multiple passes
3. Build everything up to GeoPackage

Here's my plan for the script:

```python
import osmium
import geopandas as gpd
import shapely.geometry as geom
from shapely import wkb, wkt
import pyproj
import numpy as np
import os, urllib.request, sys, json

PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"

# Download if needed
if not os.path.exists(PBF_PATH):
    urllib.request.urlretrieve(PBF_URL, PBF_PATH)

# Phase 1: Find Gürtel ways and their node coordinates
class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_geoms = {}  # way_id -> LineString
        self.guertel_way_tags = {}   # way_id -> tags
        self.nodes = {}  # node_id -> (lon, lat)
    
    def node(self, n):
        # Store all nodes - but this is memory intensive
        self.nodes[n.id] = (n.location.lon, n.location.lat)
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags and 'name' in tags:
            name = tags['name']
            if name.endswith('Gürtel'):
                coords = []
                valid = True
                for n in w.nodes:
                    if n.ref in self.nodes:
                        coords.append(self.nodes[n.ref])
                    else:
                        valid = False
                        break
                if valid and len(coords) >= 2:
                    self.guertel_way_ids.add(w.id)
                    self.guertel_way_geoms[w.id] = geom.LineString(coords)
                    self.guertel_way_tags[w.id] = tags
```

Wait, storing ALL nodes for Austria is too much. Let me think of a better way.

Actually, let me use two separate handler passes:

Pass 1: Just find the Gürtel way IDs and their node refs (without loading coordinates)
Pass 2: Get the coordinates for those specific nodes

```python
# Pass 1: Find Gürtel way IDs and node refs
class Pass1(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # way_id -> [node_refs]
        self.guertel_tags = {}
        self.needed_nodes = set()
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags and 'name' in tags:
            name = tags['name']
            if name.endswith('Gürtel'):
                refs = [n.ref for n in w.nodes]
                self.guertel_ways[w.id] = refs
                self.guertel_tags[w.id] = tags
                self.needed_nodes.update(refs)

# Pass 2: Get coordinates for needed nodes
class Pass2(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed_nodes = needed_nodes
        self.node_coords = {}
    
    def node(self, n):
        if n.id in self.needed_nodes:
            self.node_coords[n.id] = (n.location.lon, n.location.lat)
```

This is better - but it still requires two full passes of the PBF. That's okay for a ~500MB file, it'll just take some time.

Then for finding highway ways in the buffer:

Pass 3: Find all highway ways that intersect the buffer
But I'd need to know if they intersect without having coordinates... 

I think the most practical approach is:

For the highways: After I have the buffer bbox, I can do another pass where for each highway way, I check if its bounding box intersects the buffer bbox. But I still need node coordinates to build the actual geometry.

Let me try yet another approach. Since we're working with Vienna's Gürtel area, the total number of nodes in and around it is manageable. I can:

1. Find Gürtel ways
2. Buffer them, get bbox
3. In one pass, collect all nodes within that bbox AND all highway ways
4. Match them up

Actually, let me use a different strategy. Let me use `osmium` to create filtered OSM files first, then process those.

Or... let me just be smart about it. The Gürtel area in Vienna is maybe 5-10 km across. I can:

1. Find Gürtel ways and their coords (2 passes)
2. Buffer, get expanded bbox
3. In one pass with locations=True, process all ways. For each way that has a highway tag and whose nodes are within the bbox, collect it.

Wait, `locations=True` makes node locations available when processing ways, but it still processes all nodes first and caches their locations internally. With `locations=True`, I can access `n.lat` and `n.lon` in the way handler.

Hmm, but I think osmium's `locations=True` actually stores all node locations internally, which uses a lot of memory for a whole country.

Let me just write the script and handle it efficiently. I'll use multiple passes:

Pass 1: Find Gürtel way IDs and their node refs
Pass 2: Get node coords for Gürtel way nodes
  - Build Gürtel geometries
  - Buffer by 500m
  - Get bbox of buffer (with some margin)

Then for finding highways:
Pass 3: Find all highway ways. For each way, check if its bbox (from node refs... but I don't have all coords)

This is getting circular. Let me use a completely different approach.

I'll use DuckDB with the spatial extension! DuckDB can read the PBF file directly using the spatial extension with `ST_Read` or by using OGR.

Actually, let me check what tools are available. DuckDB with spatial extension can read GeoParquet, Shapefile, GPKG. For PBF, I'd need to convert first.

OK let me combine approaches:

1. Use `osmium` to extract just the Vienna area into a smaller PBF
2. Then process that smaller PBF

Or use `osmium` command-line tools:
- `osmium tags-filter` to filter by tag
- `osmium extract` to extract by bounding box

But actually, let me just use the Python approach with multiple passes and be smart about it.

Here's my final approach:

```python
import osmium
import geopandas as gpd
import shapely.geometry as geom
from shapely import wkb
import pandas as pd
import numpy as np
import os, urllib.request
from pyproj import CRS, Transformer
from functools import partial

PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
# Austria standard projected CRS: EPSG:31287 (MGI / Austria Lambert)
AUT_CRS = "EPSG:31287"

# Step 1: Download
if not os.path.exists(PBF_PATH):
    urllib.request.urlretrieve(
        "https://download.geofabrik.de/europe/austria-latest.osm.pbf",
        PBF_PATH
    )

# Step 2: Find Gürtel ways (Pass 1 - just get IDs and node refs)
class GuertelWayFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # id -> [node_ids]
        self.guertel_tags = {}
        self.needed_nodes = set()
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags and 'name' in tags:
            name = tags['name']
            if name.endswith('Gürtel'):
                refs = [n.ref for n in w.nodes]
                self.guertel_ways[w.id] = refs
                self.guertel_tags[w.id] = tags
                self.needed_nodes.update(refs)

finder = GuertelWayFinder()
finder.apply_file(PBF_PATH)
print(f"Found {len(finder.guertel_ways)} Gürtel ways")
print(f"Need {len(finder.needed_nodes)} nodes")

# Step 3: Get coordinates for Gürtel nodes (Pass 2)
class NodeCoordGetter(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed_nodes = needed_nodes
        self.node_coords = {}
    
    def node(self, n):
        if n.id in self.needed_nodes:
            self.node_coords[n.id] = (n.location.lon, n.location.lat)

coord_getter = NodeCoordGetter(finder.needed_nodes)
coord_getter.apply_file(PBF_PATH)
print(f"Got {len(coord_getter.node_coords)} node coordinates")

# Step 4: Build Gürtel line geometries
guertel_lines = []
for way_id, refs in finder.guertel_ways.items():
    coords = [coord_getter.node_coords[r] for r in refs if r in coord_getter.node_coords]
    if len(coords) >= 2:
        line = geom.LineString(coords)
        tags = finder.guertel_tags[way_id]
        tags['osm_id'] = way_id
        guertel_lines.append({'geometry': line, **tags})

guertel_gdf = gpd.GeoDataFrame(guertel_lines, crs="EPSG:4326")
# Project to Austrian CRS
guertel_gdf = guertel_gdf.to_crs(AUT_CRS)

# Step 5: Buffer by 500m
buffer = guertel_gdf.unary_union.buffer(500)
buffer_gdf = gpd.GeoDataFrame(geometry=[buffer], crs=AUT_CRS)
# Get bbox in WGS84 for filtering
buffer_bounds = buffer.bounds  # (minx, miny, maxx, maxy) in AUT_CRS
# Transform bbox to WGS84
transformer = Transformer.from_crs(AUT_CRS, "EPSG:4326", always_xy=True)
# Convert corners
sw = transformer.transform(buffer_bounds[0], buffer_bounds[1])
ne = transformer.transform(buffer_bounds[2], buffer_bounds[3])
bbox_wgs84 = (sw[0], sw[1], ne[0], ne[1])
print(f"Buffer bbox (WGS84): {bbox_wgs84}")

# Step 6: Find all highway ways intersecting the buffer
# For this, I need to process all ways with highway tag and check intersection
# I'll need their node coordinates too

# Pass 3: First, find all highway way IDs and their node refs within bbox
class HighwayWayFinder(osmium.SimpleHandler):
    def __init__(self, bbox):
        super().__init__()
        self.bbox = bbox  # (min_lon, min_lat, max_lon, max_lat)
        self.highway_ways = {}  # id -> [node_ids]
        self.highway_tags = {}
        self.highway_needed_nodes = set()
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags:
            refs = [n.ref for n in w.nodes]
            self.highway_ways[w.id] = refs
            self.highway_tags[w.id] = tags
            self.highway_needed_nodes.update(refs)

highway_finder = HighwayWayFinder(bbox_wgs84)
highway_finder.apply_file(PBF_PATH)
print(f"Found {len(highway_finder.highway_ways)} highway ways")
print(f"Need {len(highway_finder.highway_needed_nodes)} highway nodes")

# Pass 4: Get node coordinates for highway nodes
class HighwayNodeGetter(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed_nodes = needed_nodes
        self.node_coords = {}
    
    def node(self, n):
        if n.id in self.needed_nodes:
            self.node_coords[n.id] = (n.location.lon, n.location.lat)

highway_node_getter = HighwayNodeGetter(highway_finder.highway_needed_nodes)
highway_node_getter.apply_file(PBF_PATH)
print(f"Got {len(highway_node_getter.node_coords)} highway node coordinates")

# Build highway geometries
highway_features = []
for way_id, refs in highway_finder.highway_ways.items():
    coords = [highway_node_getter.node_coords[r] for r in refs if r in highway_node_getter.node_coords]
    if len(coords) >= 2:
        line = geom.LineString(coords)
        tags = highway_finder.highway_tags[way_id]
        tags['osm_id'] = way_id
        highway_features.append({'geometry': line, **tags})

highways_gdf = gpd.GeoDataFrame(highway_features, crs="EPSG:4326")
highways_gdf = highways_gdf.to_crs(AUT_CRS)

# Filter to those intersecting buffer
highways_in_buffer = highways_gdf[highways_gdf.intersects(buffer)]
print(f"Highways intersecting buffer: {len(highways_in_buffer)}")

# Select relevant columns
keep_cols = ['osm_id', 'name', 'highway', 'maxspeed', 'lanes', 'surface', 'oneway', 'geometry']
existing_cols = [c for c in keep_cols if c in highways_in_buffer.columns or c == 'geometry']
highways_out = highways_in_buffer[existing_cols]
```

Hmm, wait. The Pass 3 approach of collecting ALL highway ways in the whole PBF is going to be millions of ways. That's too many nodes.

Let me think differently. I should use the buffer bbox to filter at the OSM level too. But I'm not filtering by location in the handler - I'm collecting everything with highway tag.

Actually, osmium handlers process everything in the file. There's no built-in spatial filtering in the SimpleHandler. So I'd need to collect all nodes first, then check.

Hmm, but if I use locations=True, I can get node locations while processing ways, and check if the way's bbox is within the buffer bbox.

Let me try a different approach - use locations=True and process ways:

```python
class HighwayInBufferFinder(osmium.SimpleHandler):
    def __init__(self, buffer_polygon_wgs84):
        super().__init__()
        self.buffer = buffer_polygon_wgs84
        self.buffer_bounds = buffer_polygon_wgs84.bounds
        self.highway_features = []  # list of (geometry, tags)
        self.current_way_coords = {}
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags:
            # Build coordinates from node locations
            coords = []
            for n in w.nodes:
                if n.location.valid():
                    coords.append((n.location.lon, n.location.lat))
            if len(coords) >= 2:
                line = geom.LineString(coords)
                # Quick bbox check
                if line.intersects(self.buffer):
                    self.highway_features.append((line, tags))

handler = HighwayInBufferFinder(buffer_wgs84)
handler.apply_file(PBF_PATH, locations=True)
```

With `locations=True`, osmium caches all node locations internally and makes them available when processing ways. This uses a lot of memory but it works. For Austria, it might be too much memory though.

Actually, I think the approach of processing everything with locations=True will work but might take a while and use a lot of memory. Let me try it and see.

Actually, let me reconsider. The osmium library with `locations=True` in `apply_file` stores node locations in a special index that can use disk-based storage. Let me use that.

Wait, actually osmium's Python bindings may handle memory differently. Let me look into this.

Let me just write a practical solution and test it. I'll use a hybrid approach:

1. First, find Gürtel ways (2 passes)
2. Build buffer, get bbox
3. Use locations=True to find highways in the buffer area
4. For PT routes, find route relations and their member ways

For the PT routes, I need to:
1. Find route relations with route=bus/tram/subway/train/trolleybus/light_rail
2. For each relation, find its member ways
3. Collect the way geometries and merge into MultiLineString
4. Check if they intersect the buffer

This is going to be a complex script. Let me write it carefully.

Actually, I realize I can simplify by using `osmium` with locations=True for the whole PBF processing. The key insight is that when `locations=True` is used, node locations are stored in an internal cache that's designed to handle large files efficiently (it uses a disk-backed index in the C++ osmium library).

Let me write the complete script now.

Actually, I just realized - for the PT routes, each relation's member ways need their node coordinates too. And route relations can have dozens or hundreds of member ways. This is getting complex.

Let me take a step-by-step approach and write the Python script.

For PT routes, I'll need to:
1. Find all route relations with the specified route types
2. For each route, collect all member way IDs
3. Get all node IDs for those ways
4. Get node coordinates
5. Build geometries
6. Merge into MultiLineString
7. Check buffer intersection

This could be done in multiple passes.

Let me write the complete script now. I'll be careful about memory and performance.

OK let me just write it. I'll use osmium handlers efficiently.

Here's my plan for the complete script:

```python
#!/usr/bin/env python3
"""
Extract Vienna Gürtel corridor highways and PT routes from Austria PBF.
"""

import osmium
import geopandas as gpd
import shapely.geometry as geom
from shapely import wkb
import pandas as pd
import numpy as np
import os, sys, urllib.request, time
from pyproj import Transformer
from collections import defaultdict

# --- CONFIG ---
PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
AUT_CRS = "EPSG:31287"  # MGI / Austria Lambert

# 1. Download
if not os.path.exists(PBF_PATH):
    print("Downloading PBF...")
    for attempt in range(3):
        try:
            urllib.request.urlretrieve(PBF_URL, PBF_PATH)
            break
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}")
            time.sleep(2)
    else:
        sys.exit("Download failed after 3 attempts")
    print("Download complete")

# 2. Find Gürtel ways
class GuertelPass1(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # id -> [node_ids]
        self.guertel_tags = {}
        self.needed_nodes = set()
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags and 'name' in tags:
            name = tags.get('name', '')
            if name.endswith('Gürtel') or name.endswith('Guertel'):
                refs = [n.ref for n in w.nodes]
                if len(refs) >= 2:
                    self.guertel_ways[w.id] = refs
                    self.guertel_tags[w.id] = tags
                    self.needed_nodes.update(refs)

print("Pass 1: Finding Gürtel ways...")
g1 = GuertelPass1()
g1.apply_file(PBF_PATH)
print(f"  Found {len(g1.guertel_ways)} Gürtel ways, {len(g1.needed_nodes)} nodes needed")

# 3. Get Gürtel node coordinates
class NodePass(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed = needed_nodes
        self.coords = {}
    
    def node(self, n):
        nid = n.id
        if nid in self.needed:
            self.coords[nid] = (n.location.lon, n.location.lat)
            self.needed.discard(nid)  # remove to save memory

print("Pass 2: Getting Gürtel node coordinates...")
np2 = NodePass(g1.needed_nodes.copy())  # copy because we mutate
np2.apply_file(PBF_PATH)
print(f"  Got {len(np2.coords)} coordinates")

# 4. Build Gürtel geometries
guertel_geoms = []
for wid, refs in g1.guertel_ways.items():
    coords = [np2.coords[r] for r in refs if r in np2.coords]
    if len(coords) >= 2:
        line = geom.LineString(coords)
        tags = g1.guertel_tags[wid].copy()
        guertel_geoms.append((line, tags))

guertel_gdf = gpd.GeoDataFrame(
    [t[1] for t in guertel_geoms],
    geometry=[t[0] for t in guertel_geoms],
    crs="EPSG:4326"
)
guertel_gdf = guertel_gdf.to_crs(AUT_CRS)
print(f"  Built {len(guertel_gdf)} Gürtel geometries")

# 5. Buffer
buffer_geom = guertel_gdf.unary_union.buffer(500)
buffer_bounds = buffer_geom.bounds  # (minx, miny, maxx, maxy) in AUT_CRS

# 6. Find all highway ways intersecting buffer
# Using locations=True
class HighwayPass(osmium.SimpleHandler):
    def __init__(self, buffer_wgs84):
        super().__init__()
        self.buffer = buffer_wgs84
        self.ways = []  # list of (LineString, tags_dict)
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags:
            coords = []
            for n in w.nodes:
                loc = n.location
                if loc and loc.valid():
                    coords.append((loc.lon, loc.lat))
            if len(coords) >= 2:
                try:
                    line = geom.LineString(coords)
                    if line.intersects(self.buffer):
                        self.ways.append((line, tags))
                except:
                    pass

# Convert buffer to WGS84 for intersection check
trans = Transformer.from_crs(AUT_CRS, "EPSG:4326", always_xy=True)
minx, miny = trans.transform(buffer_bounds[0], buffer_bounds[1])
maxx, maxy = trans.transform(buffer_bounds[2], buffer_bounds[3])
# Make slightly larger bbox
buffer_wgs84 = geom.box(minx, miny, maxx, maxy)

print("Pass 3: Finding highway ways intersecting buffer...")
hp = HighwayPass(buffer_wgs84)
hp.apply_file(PBF_PATH, locations=True)
print(f"  Found {len(hp.ways)} highway ways")

# Build highways GeoDataFrame
highway_records = []
for line, tags in hp.ways:
    rec = {k: v for k, v in tags.items() if k in ('name', 'highway', 'maxspeed', 'lanes', 'surface', 'oneway')}
    rec['geometry'] = line
    highway_records.append(rec)

highways_gdf = gpd.GeoDataFrame(highway_records, crs="EPSG:4326")
highways_gdf = highways_gdf.to_crs(AUT_CRS)

# Refine intersection (buffer in AUT_CRS)
highways_in_buffer = highways_gdf[highways_gdf.intersects(buffer_geom)].copy()
print(f"  After spatial refine: {len(highways_in_buffer)} highways in buffer")

# 7. Find PT route relations
# First, find all route relations with matching route types
PT_ROUTES = {'bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail', 'light_rail'}

class PTRelationPass(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.pt_relations = {}  # id -> {'tags': {}, 'members': [(role, type, id)]}
    
    def relation(self, r):
        tags = dict(r.tags)
        route = tags.get('route', '')
        if route in PT_ROUTES:
            members = [(m.role, m.type, m.ref) for m in r.members]
            self.pt_relations[r.id] = {'tags': tags, 'members': members}

print("Pass 4: Finding PT route relations...")
prp = PTRelationPass()
prp.apply_file(PBF_PATH)
print(f"  Found {len(prp.pt_relations)} PT route relations")

# Collect all member way IDs from PT relations
member_way_ids = set()
for rel_id, data in prp.pt_relations.items():
    for role, mtype, mref in data['members']:
        if mtype == 'w':  # way type
            member_way_ids.add(mref)

print(f"  {len(member_way_ids)} unique member ways in PT relations")

# Get coordinates for member ways' nodes
# First, get the node refs of member ways
class MemberWayPass(osmium.SimpleHandler):
    def __init__(self, way_ids):
        super().__init__()
        self.way_ids = way_ids
        self.way_node_refs = {}  # way_id -> [node_ids]
        self.needed_nodes = set()
    
    def way(self, w):
        if w.id in self.way_ids:
            refs = [n.ref for n in w.nodes]
            self.way_node_refs[w.id] = refs
            self.needed_nodes.update(refs)

print("Pass 5: Getting member way node refs...")
mwp = MemberWayPass(member_way_ids)
mwp.apply_file(PBF_PATH)
print(f"  Got {len(mwp.way_node_refs)} ways, {len(mwp.needed_nodes)} unique nodes")

# Get node coordinates
class PTNodePass(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed = needed_nodes
        self.coords = {}
    
    def node(self, n):
        if n.id in self.needed:
            self.coords[n.id] = (n.location.lon, n.location.lat)
            self.needed.discard(n.id)

print("Pass 6: Getting PT node coordinates...")
ptnp = PTNodePass(mwp.needed_nodes.copy())
ptnp.apply_file(PBF_PATH)
print(f"  Got {len(ptnp.coords)} node coordinates")

# Build way geometries for PT route members
way_geoms = {}  # way_id -> LineString
for wid, refs in mwp.way_node_refs.items():
    coords = [ptnp.coords[r] for r in refs if r in ptnp.coords]
    if len(coords) >= 2:
        try:
            way_geoms[wid] = geom.LineString(coords)
        except:
            pass

print(f"  Built {len(way_geoms)} way geometries for PT members")

# Now build route geometries and filter by buffer intersection
pt_route_features = []
for rel_id, data in prp.pt_relations.items():
    tags = data['tags']
    route_lines = []
    for role, mtype, mref in data['members']:
        if mtype == 'w' and mref in way_geoms:
            route_lines.append(way_geoms[mref])
    
    if route_lines:
        # Merge into MultiLineString
        if len(route_lines) == 1:
            route_geom = route_lines[0]
        else:
            route_geom = geom.MultiLineString(route_lines)
        
        # Check intersection with buffer (in WGS84 first, quick check)
        route_buffer_wgs84 = geom.box(minx, miny, maxx, maxy)
        if route_geom.intersects(route_buffer_wgs84):
            rec = {k: v for k, v in tags.items() if k in ('ref', 'name', 'operator', 'route')}
            rec['geometry'] = route_geom
            pt_route_features.append(rec)

print(f"  {len(pt_route_features)} PT routes intersecting buffer (WGS84 check)")

pt_routes_gdf = gpd.GeoDataFrame(pt_route_features, crs="EPSG:4326")
pt_routes_gdf = pt_routes_gdf.to_crs(AUT_CRS)

# Refine intersection
pt_routes_in_buffer = pt_routes_gdf[pt_routes_gdf.intersects(buffer_geom)]
print(f"  After spatial refine: {len(pt_routes_in_buffer)} PT routes in buffer")

# 8. Write to GeoPackage
# Highways as LineString
highways_out = highways_in_buffer.copy()
# Ensure LineString geometry type
# Keep only desired columns
keep_hw = ['name', 'highway', 'maxspeed', 'lanes', 'surface', 'oneway']
hw_cols = [c for c in keep_hw if c in highways_out.columns]
... hmm wait, I should keep all columns actually. The user said "Keep the OSM attributes full and untruncated" for the listed ones. So I should include all of those + geometry.

Actually re-reading: "Keep the OSM attributes full and untruncated: name, highway, maxspeed, lanes, surface, and oneway on highways; ref, name, operator, and route on the PT routes."

So I should keep those specific attributes (and geometry). Let me re-read... "Keep the OSM attributes full and untruncated" - meaning don't truncate the values, keep the full values for those attributes. So I should only include those named attributes plus geometry.

For highways: name, highway, maxspeed, lanes, surface, oneway, geometry
For PT routes: ref, name, operator, route, geometry

Let me finalize the script.

Actually, I realize there might be issues with the locations=True approach for the highway pass - it might require too much memory. Let me add a check.

Also, I should ensure the PT routes are output as MultiLineString. If a route's geometry is already a MultiLineString, great. If it's a single LineString, I should convert it.

Let me write the complete script now. I'll be thorough.

Actually, there's another issue. The `locations=True` parameter in osmium's `apply_file` might not work as expected. Let me check the osmium Python docs...

In osmium, `locations=True` is used with `apply_file` to enable node location tracking. When you process ways, you can access `node.location` which gives lat/lon. However, this only works if the handler processes nodes before ways (which is the default order in PBF files).

But wait - osmium's SimpleHandler processes all nodes first, then ways. So with `locations=True`, when processing ways, all node locations should be available.

However, the memory issue: osmium stores node locations in an internal data structure. For Austria with ~500 million nodes? No, Austria PBF probably has more like ~50-100 million nodes. Each coordinate takes 16 bytes (double lat + double lon), so that's ~1.6 GB for 100M nodes. That might be too much.

Let me think about an alternative approach to avoid storing all nodes.

Alternative: Process the PBF in a more targeted way.

Actually, I think for Vienna area, the number of nodes is much more manageable. The Gürtel is in Vienna. Vienna has maybe a few million nodes. That should be fine.

But my HighwayPass handler processes ALL ways in Austria and all their nodes. That's the problem.

Let me use a smarter approach:

1. Find Gürtel ways (done)
2. Buffer, get bbox
3. Use the bbox to filter nodes first (only collect nodes in the bbox area)
4. Then process ways to find highways intersecting the buffer

This requires two more passes.

Or, alternatively:
1. Use `osmium extract` command line to extract the bbox area into a smaller PBF
2. Process the smaller PBF

Let me check if `osmium` CLI tool is available... actually, osmium is installed as a Python package, not necessarily the CLI tool. Let me check.

Actually, the `osmium` CLI tool might be available if the Python package installs it. But I'm not sure. Let me just use Python.

Let me use a pragmatic approach: filter nodes by bbox first, then filter ways.

```python
# Pass to collect all nodes within the buffer bbox (expanded a bit)
class NodeInBBoxPass(osmium.SimpleHandler):
    def __init__(self, bbox_wgs84):
        super().__init__()
        self.bbox = bbox_wgs84  # (min_lon, min_lat, max_lon, max_lat)
        self.min_lon, self.min_lat, self.max_lon, self.max_lat = bbox_wgs84
        self.coords = {}
    
    def node(self, n):
        lon = n.location.lon
        lat = n.location.lat
        if self.min_lon <= lon <= self.max_lon and self.min_lat <= lat <= self.max_lat:
            self.coords[n.id] = (lon, lat)

# Then find highways using these node sets
class HighwayInBBoxPass(osmium.SimpleHandler):
    def __init__(self, node_coords, buffer_wgs84):
        super().__init__()
        self.node_coords = node_coords
        self.buffer = buffer_wgs84
        self.ways = []
    
    def way(self, w):
        tags = dict(w.tags)
        if 'highway' in tags:
            coords = []
            for n in w.nodes:
                if n.ref in self.node_coords:
                    coords.append(self.node_coords[n.ref])
            if len(coords) >= 2:
                try:
                    line = geom.LineString(coords)
                    if line.intersects(self.buffer):
                        self.ways.append((line, tags))
                except:
                    pass
```

This is better because it only stores nodes within the bbox, not all nodes in Austria.

OK let me just write the complete, final script. I'll be careful about all the details.

Let me also think about what "name ends in 'Gürtel'" means in OSM. In Vienna, the main Gürtel roads are:
- Äußerer Gürtel (outer)
- Innerer Gürtel (inner)
- And segments named like "Währinger Gürtel", "Gürtel" etc.

So the name ends with "Gürtel" - this would match all of these.

Let me finalize the script now.

One more consideration: for the PT route relations, I need to check if they "cross the same buffer". A route relation could cross the buffer even if most of its ways are outside it. The route relation's geometry is the union of all its member ways. If any of those ways cross the buffer, the route crosses it.

Let me also think about the member types in relations. The members can be nodes (type 'n'), ways (type 'w'), or relations (type 'r'). For building geometries, I only care about ways. And some route members might have role like 'forward', 'backward', 'platform', etc.

OK, I'll write it now. Let me be thorough and handle edge cases.

Let me structure the script properly and write it.
1:26
Bash python3 -c "import osmium; print(osmium.__version__)" 2>&1
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import osmium; print(osmium.__version__)
                         ^^^^^^^^^^^^^^^^^^
AttributeError: module 'osmium' has no attribute '__version__'
1:26
Bash python3 -c "import geopandas; print(geopandas.__version__)" 2>&1
1.1.3
1:27
Bash python3 -c "import shapely; print(shapely.__version__)" 2>&1
2.1.2
1:28
Bash ls /work/ 2>&1
(no output)
2:01
Let me understand the task:

1. Download the Austria PBF from Geofabrik
2. Find all "Gürtel" ways in Vienna (name ends in 'Gürtel') - these define the Gürtel corridor
3. Buffer those ways by 500m
4. Find all highways that intersect that buffer
5. Find all public transport route relations (bus, tram, subway, train, trolleybus, light_rail) that cross that buffer
6. Write both layers to vienna_network.gpkg
   - highways: LineString, with attributes name, highway, maxspeed, lanes, surface, oneway
   - pt_routes: MultiLineString, with attributes ref, name, operator, route
7. Use Austria's standard projected coordinate system (EPSG:31287 - MGI / Austria Lambert, or maybe EPSG:31256 - MGI / Austria GK East? Let me think... The standard projected CRS for Austria is EPSG:31287 (MGI / Austria Lambert) which is used for nationwide mapping. But EPSG:31256 (MGI / Austria GK East) is also common. Let me check - Austria's standard projected CRS is EPSG:31287 - MGI / Austria Lambert.)

Let me write a comprehensive Python script to do this. I'll use osmium for reading the PBF file efficiently.

The approach:
1. Download the PBF file
2. First pass: find all ways whose name ends with 'Gürtel' in Vienna
3. Buffer those ways by 500m (in a projected CRS for accurate buffering)
4. Second pass: find all highway ways intersecting the buffer
5. Find all PT route relations intersecting the buffer
6. Write to GeoPackage

Actually, osmium is a bit tricky to work with for spatial queries. Let me think about the best approach.

Option 1: Use osmium to extract relevant data, then process with shapely/geopandas
Option 2: Use osmnx to query directly from Overpass API

Since we're dealing with a large PBF file (Austria), let me use osmium to parse it locally rather than hitting Overpass API, which might time out.

Actually, let me think more carefully. The approach with osmium:

1. Use osmium to read the PBF and find all ways whose `name` tag ends with "Gürtel"
2. Get their geometries and buffer by 500m
3. Then find all highway ways that intersect this buffer
4. For PT relations, I need to find relations with route=bus/tram/subway/train/trolleybus/light_rail that cross the buffer

But osmium works in a streaming fashion, so I need to make multiple passes or collect data in memory.

Let me think about a clean approach:

**Pass 1**: Read the PBF, find all ways with name ending in "Gürtel". Collect their node references.
**Pass 2**: Get the node coordinates for those references.
**Pass 3**: Build geometries, buffer, then find intersecting highway ways.
**Pass 4**: Find PT route relations.

This is getting complex. Let me use a different approach - use pyosmium or process in stages.

Actually, a simpler approach: I can use `osmium` command-line tool for filtering first, then process with Python. But let me check what's available.

Let me check if `osmium` CLI tool is available.

Actually, I'll write a Python script using the osmium module (the Python bindings) to do this efficiently.

Let me write the script step by step.

Actually, let me think about the simplest robust approach:

1. Download the PBF
2. Use osmium Python module to:
   a. First pass: collect all ways with name ending in Gürtel, and their node IDs
   b. Second pass: resolve node coordinates
   c. Build buffer geometry
   d. Third pass: find highway ways intersecting buffer
   e. For PT relations: find relations with route=bus/tram/subway/train/trolleybus/light_rail that intersect buffer
   f. Write to GeoPackage

This is going to be a substantial script. Let me write it carefully.

For the buffer, I need to project to a metric CRS first. Austria's standard is EPSG:31287 (MGI / Austria Lambert). I'll use that for the buffer and final output.

Let me write the complete script:

```python
import osmium
import geopandas as gpd
import pandas as pd
import shapely.geometry as geom
import shapely.ops as ops
import pyproj
from shapely import wkb
import sys
import urllib.request
import os

# Step 1: Download the PBF
pbf_url = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
pbf_path = "/work/austria-latest.osm.pbf"

if not os.path.exists(pbf_path):
    print("Downloading PBF...")
    urllib.request.urlretrieve(pbf_url, pbf_path)
    print("Download complete.")

# Step 2: Find all ways whose name ends with 'Gürtel'
class GuertelWaysHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # way_id -> osmium Way
        self.node_ids = set()
    
    def way(self, w):
        if 'name' in w.tags and w.tags['name'].endswith('Gürtel'):
            self.guertel_ways[w.id] = w
            for n in w.nodes:
                self.node_ids.add(n.ref)

print("Finding Gürtel ways...")
handler1 = GuertelWaysHandler()
handler1.apply_file(pbf_path)

print(f"Found {len(handler1.guertel_ways)} Gürtel ways with {len(handler1.node_ids)} unique nodes")

# Step 3: Get node coordinates
class NodeHandler(osmium.SimpleHandler):
    def __init__(self, target_ids):
        super().__init__()
        self.target_ids = target_ids
        self.nodes = {}  # node_id -> (lon, lat)
    
    def node(self, n):
        if n.id in self.target_ids:
            self.nodes[n.id] = (n.location.lon, n.location.lat)

print("Getting node coordinates...")
handler_nodes = NodeHandler(handler1.node_ids)
handler_nodes.apply_file(pbf_path)

print(f"Resolved {len(handler_nodes.nodes)} nodes")

# Step 4: Build Gürtel way geometries
def build_way_geometry(way, nodes_dict):
    coords = []
    for n in way.nodes:
        if n.ref in nodes_dict:
            coords.append(nodes_dict[n.ref])
    if len(coords) < 2:
        return None
    return geom.LineString(coords)

guertel_lines = []
for way_id, way in handler1.guertel_ways.items():
    line = build_way_geometry(way, handler_nodes.nodes)
    if line is not None:
        guertel_lines.append(line)

print(f"Built {len(guertel_lines)} Gürtel line geometries")

# Step 5: Buffer the Gürtel by 500m
# First project to a metric CRS
gdf_guertel = gpd.GeoDataFrame(geometry=guertel_lines, crs="EPSG:4326")
gdf_guertel = gdf_guertel.to_crs("EPSG:31287")  # Austria Lambert
guertel_union = gdf_guertel.union_all()
buffer_geom = guertel_union.buffer(500)
buffer_geom_wgs84 = gpd.GeoSeries([buffer_geom], crs="EPSG:31287").to_crs("EPSG:4326").iloc[0]

print("Built 500m buffer")

# Step 6: Find all highway ways intersecting the buffer
class HighwayHandler(osmium.SimpleHandler):
    def __init__(self, buffer_geom):
        super().__init__()
        self.buffer_geom = buffer_geom
        self.highway_ways = {}  # way_id -> osmium Way
        self.highway_node_ids = set()
    
    def way(self, w):
        if 'highway' in w.tags:
            # Build a quick bbox check
            self.highway_ways[w.id] = w
            for n in w.nodes:
                self.highway_node_ids.add(n.ref)

print("Finding highway ways...")
handler_highways = HighwayHandler(buffer_geom_wgs84)
handler_highways.apply_file(pbf_path)

print(f"Found {len(handler_highways.highway_ways)} highway ways with {len(handler_highways.highway_node_ids)} nodes")

# Get highway node coordinates
print("Getting highway node coordinates...")
handler_highway_nodes = NodeHandler(handler_highways.highway_node_ids)
handler_highway_nodes.apply_file(pbf_path)
print(f"Resolved {len(handler_highway_nodes.nodes)} highway nodes")

# Filter highways that actually intersect the buffer
print("Filtering highways intersecting buffer...")
highway_features = []
for way_id, way in handler_highways.highway_ways.items():
    line = build_way_geometry(way, handler_highway_nodes.nodes)
    if line is not None and line.intersects(buffer_geom_wgs84):
        attrs = {
            'name': way.tags.get('name'),
            'highway': way.tags.get('highway'),
            'maxspeed': way.tags.get('maxspeed'),
            'lanes': way.tags.get('lanes'),
            'surface': way.tags.get('surface'),
            'oneway': way.tags.get('oneway'),
        }
        highway_features.append({'geometry': line, **attrs})

print(f"Found {len(highway_features)} intersecting highway ways")

# Step 7: Find PT route relations
# First, get the buffer in WGS84 for intersection checks
# PT route relations are tricky with osmium because they reference ways, not nodes directly.
# A relation has members (ways), and we need to check if any of those ways intersect the buffer.

class PTRelationHandler(osmium.SimpleHandler):
    def __init__(self, buffer_geom):
        super().__init__()
        self.buffer_geom = buffer_geom
        self.pt_relations = {}  # rel_id -> relation
        self.pt_relation_way_ids = set()  # way IDs referenced by PT relations
    
    def relation(self, r):
        route = r.tags.get('route')
        if route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            self.pt_relations[r.id] = r
            for m in r.members:
                if m.type == 'w':  # way members
                    self.pt_relation_way_ids.add(m.ref)

print("Finding PT relations...")
handler_pt = PTRelationHandler(buffer_geom_wgs84)
handler_pt.apply_file(pbf_path)

print(f"Found {len(handler_pt.pt_relations)} PT relations with {len(handler_pt.pt_relation_way_ids)} referenced ways")

# Get geometries of ways referenced by PT relations
print("Getting PT relation way geometries...")
class PTWayHandler(osmium.SimpleHandler):
    def __init__(self, target_ids):
        super().__init__()
        self.target_ids = target_ids
        self.pt_ways = {}
        self.pt_node_ids = set()
    
    def way(self, w):
        if w.id in self.target_ids:
            self.pt_ways[w.id] = w
            for n in w.nodes:
                self.pt_node_ids.add(n.ref)

handler_pt_ways = PTWayHandler(handler_pt.pt_relation_way_ids)
handler_pt_ways.apply_file(pbf_path)
print(f"Found {len(handler_pt_ways.pt_ways)} PT ways with {len(handler_pt_ways.pt_node_ids)} nodes")

# Get node coordinates for PT ways
handler_pt_nodes = NodeHandler(handler_pt_ways.pt_node_ids)
handler_pt_nodes.apply_file(pbf_path)
print(f"Resolved {len(handler_pt_nodes.nodes)} PT way nodes")

# Build way geometries
pt_way_geoms = {}
for way_id, way in handler_pt_ways.pt_ways.items():
    line = build_way_geometry(way, handler_pt_nodes.nodes)
    if line is not None:
        pt_way_geoms[way_id] = line

print(f"Built {len(pt_way_geoms)} PT way geometries")

# Now check each PT relation: does it cross the buffer?
# A relation crosses the buffer if any of its way members intersect the buffer
pt_route_features = []
for rel_id, rel in handler_pt.pt_relations.items():
    member_lines = []
    crosses_buffer = False
    for m in rel.members:
        if m.type == 'w' and m.ref in pt_way_geoms:
            line = pt_way_geoms[m.ref]
            if line.intersects(buffer_geom_wgs84):
                crosses_buffer = True
            member_lines.append(line)
    
    if crosses_buffer and member_lines:
        # Merge into MultiLineString
        merged = geom.MultiLineString(member_lines)
        attrs = {
            'ref': rel.tags.get('ref'),
            'name': rel.tags.get('name'),
            'operator': rel.tags.get('operator'),
            'route': rel.tags.get('route'),
        }
        pt_route_features.append({'geometry': merged, **attrs})

print(f"Found {len(pt_route_features)} PT routes crossing the buffer")

# Step 8: Write to GeoPackage
print("Writing to GeoPackage...")
gdf_highways = gpd.GeoDataFrame(highway_features, crs="EPSG:4326")
gdf_highways = gdf_highways.to_crs("EPSG:31287")

gdf_pt = gpd.GeoDataFrame(pt_route_features, crs="EPSG:4326")
gdf_pt = gdf_pt.to_crs("EPSG:31287")

# PT routes should be MultiLineString
# Convert any LineString to MultiLineString
from shapely.geometry import MultiLineString
gdf_pt['geometry'] = gdf_pt['geometry'].apply(
    lambda g: MultiLineString([g]) if g.geom_type == 'LineString' else g
)

gdf_highways.to_file("/work/vienna_network.gpkg", layer="highways", driver="GPKG")
gdf_pt.to_file("/work/vienna_network.gpkg", layer="pt_routes", driver="GPKG")

print("Done!")
```

Wait, there's a problem. The osmium SimpleHandler processes nodes first, then ways, then relations by default. But I'm calling apply_file multiple times. That should be fine since each handler only processes the type it cares about.

But there's also a concern about memory - the Austria PBF is large. Let me be more careful about what I collect.

Also, I need to handle the case where the Gürtel ways might not have a `name` tag that ends in 'Gürtel' - they might have names like "Gürtel" or "Wiener Gürtel" or similar. The user says "whose name ends in 'Gürtel'" so I'll filter on that.

Let me also think about the buffer more carefully. A 500m buffer around a ~10km ring road in WGS84 degrees is extremely small (roughly 0.0045 degrees at Vienna's latitude). The buffer needs to be done in a projected CRS. I'll use EPSG:31287.

Actually, let me think about this differently. Rather than trying to do everything in memory from the full PBF, let me use a more efficient approach.

Actually, osmium is pretty efficient for this. Let me just write the script and run it.

One issue: the PBF file is about 500MB. Let me check if there's enough disk space.

Let me also make sure I handle the case properly. The approach seems sound. Let me write it.

Wait, one more issue. When I apply_file multiple times, I'm reading the entire PBF each time. That's 4 passes (Gürtel ways, nodes for Gürtel, highway ways, nodes for highways, PT relations, PT ways, PT nodes). That's 7 passes! This will be slow.

Let me optimize: I can collect all data in fewer passes.

Pass 1: Collect Gürtel ways, highway ways, and PT relations in one go. Also collect all node IDs needed.
Pass 2: Get all node coordinates.

That's only 2 passes.

Let me rewrite:

```python
class Collector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}
        self.highway_ways = {}
        self.pt_relations = {}
        self.pt_relation_way_ids = set()
        self.all_node_ids = set()
    
    def way(self, w):
        if 'name' in w.tags and w.tags['name'].endswith('Gürtel'):
            self.guertel_ways[w.id] = w
            for n in w.nodes:
                self.all_node_ids.add(n.ref)
        
        if 'highway' in w.tags:
            self.highway_ways[w.id] = w
            for n in w.nodes:
                self.all_node_ids.add(n.ref)
    
    def relation(self, r):
        route = r.tags.get('route')
        if route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            self.pt_relations[r.id] = r
            for m in r.members:
                if m.type == 'w':
                    self.pt_relation_way_ids.add(m.ref)
                    # We'll need these way nodes too... but we might not have collected them yet
```

Hmm, the problem is that when processing ways, I don't know which ways are referenced by PT relations until I've processed relations. And osmium processes in order: nodes, then ways, then relations. So in a single pass, I can't know which ways are PT-related while processing ways.

Let me do 3 passes:
1. Pass: Collect Gürtel ways, highway ways, and PT relations + their referenced way IDs
2. Pass: Collect PT way geometries (only those referenced by PT relations that weren't already collected as highway ways)
3. Pass: Get all node coordinates

Actually, I can do it in 2 passes still:
- Pass 1: Collect Gürtel ways, highway ways, PT relations. Store PT relation way IDs separately.
- Then figure out which node IDs I need: all nodes from Gürtel ways + highway ways + PT ways. But I don't have PT way nodes yet...

Let me think again. I need:
1. Gürtel ways (name ends in Gürtel) → their nodes
2. Highway ways (has highway tag) → their nodes
3. PT relations (route in [bus, tram, ...]) → their member way IDs → those ways' nodes
4. Intersection checks need geometries

Best approach with 3 passes:
- Pass 1: Collect Gürtel ways + highway ways + PT relations (store PT way IDs)
- Pass 2: Collect PT way geometries (only for ways referenced by PT relations, and not already collected)
- For this I need the node IDs from those PT ways
- Pass 3: Get all needed node coordinates

Actually let me simplify. Let me do 2 passes:

Pass 1: Collect everything:
- All Gürtel ways
- All highway ways
- All PT relations (just store the IDs and tags, and the referenced way IDs)
- All node IDs from Gürtel + highway ways
- Also store PT-referenced way IDs

Then, separately process the PT-referenced ways:
- Read the PBF again, only looking at ways that are PT-referenced but not already collected as highway ways
- Collect their node IDs

Pass 2: Get all node coordinates

Let me write this more carefully. Actually, osmium can handle this. Let me just write a clean 3-pass script.

Let me also consider: the Austria PBF is ~500MB. Downloading it might take a while but should be fine.

Let me also think about whether there are Gürtel ways that aren't highways. The user says "Vienna's Gürtel, defined as the highway ways whose name ends in 'Gürtel'". So I should only look at ways that have both 'highway' tag AND name ending in 'Gürtel'.

Let me write the script now.

Actually, I just realized something. The user asks "every highway from the current Geofabrik Austria PBF that intersects a 500m buffer around Vienna's Gürtel". The Gürtel is a major ring road in Vienna. The ways with name ending in 'Gürtel' should be the actual Gürtel road segments.

Let me think about what attributes to keep:
- Highways: name, highway, maxspeed, lanes, surface, oneway
- PT routes: ref, name, operator, route

Let me write the complete script now.

Let me also think about performance. Reading the full Austria PBF (~500MB) 3 times might be slow but acceptable.

Actually, let me check how big the PBF is first, and also check if osmium tool is available.

Let me just write and run the script. If it's too slow, I'll adjust.
2:01
2:01
Bash which osmium && osmium --version 2>&1 || echo "osmium CLI not found"
osmium CLI not found
2:01
Bash python3 -c "from pyproj import CRS; c=CRS.from_user_input(31287); print(c)"
EPSG:31287
2:02
Bash df -h /work 2>&1
Filesystem      Size  Used Avail Use% Mounted on
overlay         953G  593G  341G  64% /
3:13
Let me check what CRS is "Austria's standard projected coordinate system". The most common one is EPSG:31287 (MGI / Austria GK East) or EPSG:31256 (MGI / Austria GK Central) or EPSG:31258 (MGI / Austria GK M28) or EPSG:3416 (MGI / Austria Lambert). Let me check what's typical - EPSG:31287 is MGI / Austria GK East which covers Vienna well. But actually Austria standard is MGI / Austria GK East (EPSG:31287) for Vienna area. Let me confirm.

Actually, the Austrian standard projected CRS is EPSG:31287 (MGI / Austria GK East) for the Vienna/Lower Austria/Burgenland region. The broader Austria standard is EPSG:3416 (MGI / Austria Lambert). Let me just go with EPSG:31287 as that's the GK zone covering Vienna.

Now let me design the script. The PBF is from Geofabrik. The approach:

1. Download the Austria PBF
2. First, find all ways with name ending in 'Gürtel' (the Gürtel corridor)
3. Buffer these by 500m
4. Extract all highways that intersect this buffer
5. Extract all PT route relations that cross this buffer
6. Write to GeoPackage

Since the PBF is large (~1.2GB), I need an efficient approach. Using osmium library in Python is good for this.

Let me plan the script:

### Step 1: Download the PBF
Use requests or wget to download.

### Step 2: Find Gürtel ways and create buffer
Use osmium to iterate over ways, find those with name ending in 'Gürtel', collect their geometries, union them, buffer 500m.

### Step 3: Extract highways in buffer
Use osmium to iterate over ways again, check if they intersect the buffer, filter for highway=*.

### Step 4: Extract PT route relations
Use osmium to iterate over relations, filter for type=route and route in (bus, tram, subway, train, trolleybus, light_rail). Check if the route's ways intersect the buffer.

Actually, osmium Python handler approach is better:
- First pass: read all ways, find Gürtel ways, compute buffer
- Second pass: read ways again, find highways intersecting buffer
- Third pass: read relations, find PT routes intersecting buffer

But loading full ways into memory might be heavy. Let me think of a more efficient approach.

Actually, the osmium library allows handler classes that can process the file in one pass. But we need two passes since we need the buffer first.

Better approach:
1. First pass: Read all ways, find those with name ending in Gürtel. Collect their node refs.
2. First pass (or second pass): Get node coordinates for those refs. Build geometries, buffer.
3. Second pass: Read all ways again, find highways intersecting the buffer.
4. Third pass: Read all relations, find PT routes intersecting the buffer.

Actually, osmium allows for a handler that processes nodes, ways, and relations in a single pass. But since we need the buffer before we can filter, we need at least two passes.

Let me use a more practical approach:

**Pass 1**: Use osmium to get all ways with name ending in 'Gürtel'. Get their node coordinates.
- For each such way, build the LineString geometry
- Union them, buffer by 500m

**Pass 2**: Use osmium to find all highway ways that intersect the buffer. Collect their node coordinates and build geometries.

**Pass 3**: Use osmium to find all PT route relations. For each relation, collect the way members, check if any way is in our buffer area (we can check if the way geometry intersects the buffer).

Actually, the most efficient way with a large PBF is to:

1. First pass: Collect all Gürtel way node IDs and way geometries 
2. First pass (same pass): We can also collect node coordinates

Simplify: Let me do it in multiple passes using osmium handlers.

Let me write this as a Python script.

Actually, for a 1.2GB PBF file, this might take a while. Let me write a robust script.

Here's my plan:

```python
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, box
from shapely.ops import unary_union
from pyproj import Transformer
import json
import sys

# Austria projected CRS for Vienna
TARGET_CRS = "EPSG:31287"
WGS84_CRS = "EPSG:4326"

# Step 1: First pass - find all Gürtel ways
class GuertelHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_ways = []  # list of (way_id, node_refs, tags)
    
    def way(self, w):
        name = w.tags.get('name')
        if name and name.endswith('Gürtel'):
            self.guertel_way_ids.add(w.id)
            self.guertel_ways.append((w.id, list(w.nodes), dict(w.tags)))

class NodeCollector(osmium.SimpleHandler):
    def __init__(self, required_node_ids):
        super().__init__()
        self.required_node_ids = required_node_ids
        self.coords = {}
    
    def node(self, n):
        if n.id in self.required_node_ids:
            self.coords[n.id] = (n.location.lon, n.location.lat)

# Pass 1: Find Gürtel ways
handler1 = GuertelHandler()
handler1.apply_file("austria-latest.osm.pbf")

# Collect all node IDs from Gürtel ways
guertel_node_ids = set()
for wid, nodes, tags in handler1.guertel_ways:
    for n in nodes:
        guertel_node_ids.add(n.ref)

# Get node coordinates
node_handler = NodeCollector(guertel_node_ids)
node_handler.apply_file("austria-latest.osm.pbf")

# Build Gürtel geometries
from shapely.geometry import LineString
guertel_lines = []
for wid, nodes, tags in handler1.guertel_ways:
    coords = []
    for n in nodes:
        if n.ref in node_handler.coords:
            coords.append(node_handler.coords[n.ref])
    if len(coords) >= 2:
        guertel_lines.append(LineString(coords))

guertel_union = unary_union(guertel_lines)
guertel_buffer = guertel_union.buffer(0.05)  # ~500m in degrees? No, better use projected CRS
```

Hmm, the buffer in degrees won't be accurate. 500m in degrees is roughly 0.0045 degrees. Let me think about this differently.

Better approach: Use pyproj to project the Gürtel ways to a metric CRS first, buffer in meters, then project back to WGS84 for intersection checks, or keep them in projected CRS throughout.

But the osmium coordinates are in WGS84. So:

1. Get Gürtel ways as WGS84 LineStrings
2. Project to EPSG:31287 (metric)
3. Buffer by 500m
4. Project back to WGS84 for spatial filtering in osmium? No, we can just check in projected CRS.

Actually, let me simplify: I'll buffer in WGS84 with an approximate degree-to-meter conversion at Vienna's latitude (~48.2°N). 
- 1 degree longitude ≈ 111320 * cos(48.2°) ≈ 111320 * 0.666 ≈ 74150m
- 1 degree latitude ≈ 111320m
- 500m in degrees ≈ 500/111320 ≈ 0.00449° for lat, 500/74150 ≈ 0.00674° for lon

But this is approximate. Better to use a proper projection.

Let me use a different approach: First collect Gürtel geometries, project, buffer, then use the buffer in WGS84 for intersection checks in osmium (using shapely's intersection).

Actually, let me think about this more carefully.

The PBF file is large (~1.2GB). We want to process it efficiently. The osmium Python library is good for this as it streams through the file.

Plan:
1. First pass: Get Gürtel way geometries, project to EPSG:31287, buffer by 500m, project back to WGS84
2. Second pass: Read all ways, check highway tag, check if geometry intersects buffer, collect
3. Third pass: Read all relations, check PT route, check if any member way intersects buffer

Let me code this up.

For the PT routes, a route relation has members (ways, nodes, relations). We need to check if the route's geometry (the linestring formed by its way members) crosses the buffer. This is complex because we need to assemble the geometry from member ways.

Let me think about an efficient approach for PT routes:

Option A: Collect all PT route relations in first pass, noting their member way IDs. Then in a second pass, collect those way geometries. Then assemble route geometries.

Option B: Use osmium handler that processes both relations and ways in one pass. Store relation info, and when ways are encountered, check if they belong to any PT route relation.

Actually, osmium processes in order: nodes, then ways, then relations in a single pass. So we can't see a relation before its member ways.

Let me do it in 3-4 passes:

**Pass 1**: Find Gürtel ways, get their node coords, build buffer
**Pass 2**: Find highways intersecting buffer, get their node coords, build geometries
**Pass 3**: Find PT route relations, collect their member way IDs
**Pass 4**: Get geometries for those member ways, build route geometries

Actually, let me simplify. The PBF is not that huge that we can't be a bit inefficient. Let me write a pragmatic script.

Actually, I think the most practical approach is:

1. Download the PBF
2. Use osmium to find Gürtel ways and their node coordinates
3. Build buffer
4. Use osmium to find highway ways that intersect buffer and collect their data
5. For PT routes: use osmium to find route relations, then for each relation find its way members, and assemble geometry

Let me write the script now. I'll be careful with performance.

Let me also consider using osmium's `apply_file` with locations=True to get node coordinates on the fly.

Here's my refined plan:

```python
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, Point
from shapely.ops import unary_union
from pyproj import Transformer
import os

TARGET_CRS = "EPSG:31287"
WGS84_CRS = "EPSG:4326"
PBF_PATH = "/work/austria-latest.osm.pbf"

# Download PBF if needed
if not os.path.exists(PBF_PATH):
    import urllib.request
    url = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
    urllib.request.urlretrieve(url, PBF_PATH)

# Transformer
transformer = Transformer.from_crs(WGS84_CRS, TARGET_CRS, always_xy=True)
transformer_back = Transformer.from_crs(TARGET_CRS, WGS84_CRS, always_xy=True)

# ---- PASS 1: Find Gürtel ways ----
class GuertelHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_nodes = {}  # node_id -> (lon, lat)
        self.guertel_ways = []  # list of (way_id, [node_ids], tags_dict)
    
    def way(self, w):
        name = w.tags.get('name')
        if name and name.endswith('Gürtel'):
            node_ids = [n.ref for n in w.nodes]
            self.guertel_ways.append((w.id, node_ids, dict(w.tags)))
    
    def node(self, n):
        if n.id in self.guertel_nodes or any(n.id in way[1] for way in self.guertel_ways):
            self.guertel_nodes[n.id] = (n.location.lon, n.location.lat)
```

Hmm, but the node() method would need to check all way node lists which is O(n*m). Let me think...

Actually, osmium processes nodes first, then ways. So in the handler, nodes come before ways. So I can't collect node refs from ways before seeing nodes.

Wait, actually osmium does process in order: nodes, then ways, then relations. So:

- node() is called for each node
- way() is called for each way
- relation() is called for each relation

So I need to:
1. First pass: Just find Gürtel way IDs and their node refs
2. Second pass: Get node coordinates for those refs
3. Build geometries

Or, use osmium with `locations=True` which gives node locations directly in the way handler. Let me check... Yes! `osmium.SimpleHandler` with `locations=True` gives node locations in the way nodes.

Actually, looking at the osmium docs: `SimpleHandler` doesn't have a `locations` parameter. The `apply_file` method has a `locations` parameter. Let me check.

From osmium docs:
```python
handler.apply_file("file.osm.pbf", locations=True)
```

When `locations=True`, the way nodes have their location set. This means we don't need a separate node handler.

So:

```python
class GuertelHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways_coords = []  # list of (way_id, [(lon,lat),...], tags)
    
    def way(self, w):
        name = w.tags.get('name')
        if name and name.endswith('Gürtel'):
            coords = []
            for n in w.nodes:
                if n.location.valid():
                    coords.append((n.location.lon, n.location.lat))
            if len(coords) >= 2:
                self.guertel_ways_coords.append((w.id, coords, dict(w.tags)))

handler = GuertelHandler()
handler.apply_file(PBF_PATH, locations=True)
```

This is much cleaner!

Now for the buffer. At Vienna's latitude (~48.2°N):
- 1° lat ≈ 111320m
- 1° lon ≈ 111320 * cos(48.2°) ≈ 74150m

500m in degrees lat = 500/111320 ≈ 0.00449°
500m in degrees lon = 500/74150 ≈ 0.00674°

But for accuracy, let me project to EPSG:31287 first.

```python
guertel_lines = []
for wid, coords, tags in handler.guertel_ways_coords:
    # project to target CRS
    projected_coords = [transformer.transform(lon, lat) for lon, lat in coords]
    guertel_lines.append(LineString(projected_coords))

guertel_union = unary_union(guertel_lines)
buffer_geom = guertel_union.buffer(500)  # 500m buffer in projected CRS

# Get buffer bounds in WGS84 for quick filtering
minx, miny, maxx, maxy = buffer_geom.bounds
# Convert bounds to WGS84 corners
wgs84_corners = [transformer_back.transform(minx, miny), 
                 transformer_back.transform(maxx, miny),
                 transformer_back.transform(maxx, maxy),
                 transformer_back.transform(minx, maxy)]
# Create WGS84 bbox for quick filtering
wgs84_minx = min(c[0] for c in wgs84_corners)
wgs84_miny = min(c[1] for c in wgs84_corners)
wgs84_maxx = max(c[0] for c in wgs84_corners)
wgs84_maxy = max(c[1] for c in wgs84_corners)
```

OK, this is getting complex but manageable. Let me also think about the PT route relations.

For PT routes, a relation has type=route, route=bus/tram/subway/train/trolleybus/light_rail. Its members are typically ways. I need to collect all member ways, get their geometries, and assemble them into a single MultiLineString per route relation.

The approach:
1. In a pass with locations=True, find all PT route relations, collect member way IDs
2. In another pass, find the geometries of those ways
3. For each route, assemble the geometry from its member ways

Actually, I think I can do this more efficiently:

**Pass 1**: With locations=True
- Collect Gürtel way geometries (as before)
- Collect highway way geometries that intersect buffer (for highways layer)
- Collect PT route relations info (type, id, tags, member way IDs)

**Pass 2**: Without locations, just collect way geometries for PT route member ways

Wait, but with locations=True, all way geometries are available. So I can collect PT member way geometries in pass 1 if I know which ways belong to PT routes.

But we don't know the PT routes until we see the relations, which come after all ways. So we'd need pass 2 for PT route member ways.

Alternatively: 
1. First pass (locations=True): Find Gürtel ways and build buffer
2. Second pass (locations=True): 
   - Find highway ways intersecting buffer, collect geometries
   - Find PT route relations, note member way IDs
   - Store member way geometries when we encounter them (but we don't know which ways are members until we see relations after ways)

OK, the fundamental issue is that in OSM PBF, ways come before relations. So we can't know which ways are PT route members during the ways pass.

Let me do:
1. Pass 1 (locations=True): Find Gürtel ways → build buffer
2. Pass 2 (locations=True): Find highways intersecting buffer → collect for highways layer. Also find PT route relations, collect member way IDs.
3. Pass 3 (locations=True): Find specific ways by ID (PT route member ways) → collect geometries
4. Assemble PT route geometries from their member way geometries.

This is clean and should work well.

Actually, for Pass 2 and 3, I can combine them differently. In Pass 2 I find both highway ways and PT relation info. In Pass 3 I only need the PT member way geometries. Since osmium is fast (streaming), this is fine.

Actually, wait. osmium Python handler's `apply_file` with `locations=True` is slower than without because it stores node coordinates. But for our purposes, it's the cleanest approach.

Let me also consider: do I need `locations=True` for the pass that only examines relations? For relations, I just need the relation attributes and member way IDs. So Pass 2 (finding relations) doesn't need locations=True. Only Pass 1 and Pass 3 need it.

Let me write this more carefully.

Actually, I realize I should also think about memory. Austria's PBF is about 1.2GB. The number of highway ways in a 500m buffer around Gürtel should be manageable (maybe a few thousand). PT route relations in Vienna should also be manageable (maybe a few hundred).

Let me write the full script now.

For the highway way extraction, I need these tags: name, highway, maxspeed, lanes, surface, oneway. So I'll collect those.

For PT route relations: ref, name, operator, route.

Let me also think about what happens if a highway way partially overlaps the buffer - we still want the full way geometry, correct? The user says "intersects a 500 m buffer", so any way that intersects at any point should be included in full.

OK, let me write the script.

One more thing: the PT routes should be "each route as a single feature" with MultiLineString geometry. So if a route has different segments/branches, they all go into one MultiLineString.

Let me code this up.

For the output CRS: EPSG:31287 (MGI / Austria GK East) is standard for Vienna. Let me verify this is indeed "Austria's standard projected coordinate system".

Actually, there are several options:
- EPSG:31287 - MGI / Austria GK East (covers Vienna)
- EPSG:31256 - MGI / Austria GK Central
- EPSG:31255 - MGI / Austria GK West
- EPSG:3416 - MGI / Austria Lambert

The "standard" for Austria-wide work is often EPSG:3416 (Austria Lambert) or EPSG:31287 (for the Vienna region). Since Vienna is in the eastern part, EPSG:31287 is appropriate. Let me go with EPSG:31287 as it's the projected CRS for the GK zone covering Vienna.

Now let me write the script.

Let me also handle the case where the Gürtel might be multiple disconnected ways (which it definitely is - it's a long ring road consisting of many OSM way segments).

Let me also consider: the Gürtel name condition is "name ends in 'Gürtel'". In Vienna, we'd have ways like "Währinger Gürtel", "Gürtel", "Lerchenfelder Gürtel", etc. The condition `name.endswith('Gürtel')` should catch these.

But wait - what about the Umlaut? In the PBF, it's stored as UTF-8, so 'Gürtel' is fine.

Let me also handle the fact that some ways might be tagged with name:en or other variants - we want the name tag.

Now let me write the full script.

Actually, I should also think about: what if the Gürtel ways form a ring/circular road? The buffer around the union should give a nice corridor shape.

Alright, let me write a comprehensive script.

For the PT routes, I need to be careful about how route relations work. A route relation can have many member ways, and they might not all be connected. The geometry should be a MultiLineString of all member way geometries (in WGS84, then projected to target CRS).

Let me also handle the case where a PT route relation might have no ways or only a few - we should still include it if at least one way intersects the buffer.

Actually, re-reading the user's request: "every public-transport route relation (bus, tram, subway, train, trolleybus, light_rail) that crosses the same buffer". So we need routes that cross the buffer - meaning at least some part of the route is within the buffer. We should include the full route geometry, not just the part in the buffer.

Let me write the script now.

Let me also think about the "crosses the buffer" logic. For a PT route, the route relation has way members. If any of those way members intersect the buffer, then the route crosses the buffer. So:

1. Find all PT route relations
2. For each, collect member way IDs
3. If any of those ways intersect the buffer, include the route
4. The route geometry is the collection of all its member way geometries

Let me code this up step by step.

```python
#!/usr/bin/env python3
"""
Extract Gürtel corridor highway network and PT routes from Austria OSM PBF.
Output: vienna_network.gpkg with highways and pt_routes layers.
"""

import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, shape
from shapely.ops import unary_union
from pyproj import Transformer
import os, sys, time

PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"
OUTPUT_GPKG = "/work/vienna_network.gpkg"
TARGET_CRS = "EPSG:31287"
WGS84_CRS = "EPSG:4326"

# Download if needed
if not os.path.exists(PBF_PATH):
    print("Downloading Austria PBF...")
    import urllib.request
    urllib.request.urlretrieve(PBF_URL, PBF_PATH)
    print("Download complete.")

# Coordinate transformers
wgs84_to_target = Transformer.from_crs(WGS84_CRS, TARGET_CRS, always_xy=True)
target_to_wgs84 = Transformer.from_crs(TARGET_CRS, WGS84_CRS, always_xy=True)

# ────────────────── PASS 1: Find Gürtel ways ──────────────────
class GuertelHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.ways = []  # list of (id, [(lon,lat),...], tags)
    
    def way(self, w):
        name = w.tags.get('name')
        if name and name.endswith('Gürtel'):
            coords = [(n.location.lon, n.location.lat) for n in w.nodes if n.location.valid()]
            if len(coords) >= 2:
                self.ways.append((w.id, coords, dict(w.tags)))

print("Pass 1: Finding Gürtel ways...")
t0 = time.time()
handler1 = GuertelHandler()
handler1.apply_file(PBF_PATH, locations=True)
print(f"  Found {len(handler1.ways)} Gürtel way segments in {time.time()-t0:.1f}s")

# Build Gürtel geometry in target CRS and buffer
guertel_lines_target = []
for wid, coords, tags in handler1.ways:
    projected = [wgs84_to_target.transform(lon, lat) for lon, lat in coords]
    guertel_lines_target.append(LineString(projected))

guertel_union = unary_union(guertel_lines_target)
buffer_geom_target = guertel_union.buffer(500)

# Also create a WGS84 version of the buffer for spatial filtering with osmium
# We'll convert the buffer's envelope to WGS84 for quick bounding box checks
buffer_bounds_target = buffer_geom_target.bounds  # (minx, miny, maxx, maxy)
corners_wgs84 = [
    target_to_wgs84.transform(buffer_bounds_target[0], buffer_bounds_target[1]),
    target_to_wgs84.transform(buffer_bounds_target[2], buffer_bounds_target[1]),
    target_to_wgs84.transform(buffer_bounds_target[2], buffer_bounds_target[3]),
    target_to_wgs84.transform(buffer_bounds_target[0], buffer_bounds_target[3]),
]
wgs84_lons = [c[0] for c in corners_wgs84]
wgs84_lats = [c[1] for c in corners_wgs84]
bbox_wgs84 = (min(wgs84_lons), min(wgs84_lats), max(wgs84_lons), max(wgs84_lats))
print(f"  Buffer bbox (WGS84): {bbox_wgs84}")

# Create a shapely box in WGS84 for quick pre-filtering, then we'll check exact intersection in target CRS
from shapely.geometry import box
bbox_geom_wgs84 = box(*bbox_wgs84)

# ────────────────── PASS 2: Highway ways + PT route relations ──────────────────
class Pass2Handler(osmium.SimpleHandler):
    def __init__(self, buffer_wgs84, buffer_target, transformer):
        super().__init__()
        self.buffer_wgs84 = buffer_wgs84
        self.buffer_target = buffer_target
        self.transformer = transformer
        
        self.highway_ways = []  # list of dicts with geometry and tags
        self.pt_relations = []  # list of (relation_id, tags, member_way_ids)
    
    def way(self, w):
        highway = w.tags.get('highway')
        if not highway:
            return
        
        # Quick bbox check
        if not self.buffer_wgs84.intersects(box(
            min(n.location.lon for n in w.nodes if n.location.valid()),
            min(n.location.lat for n in w.nodes if n.location.valid()),
            max(n.location.lon for n in w.nodes if n.location.valid()),
            max(n.location.lat for n in w.nodes if n.location.valid())
        )):
            return
```

Hmm, this is getting complicated. Let me simplify.

Actually, let me take a step back. The osmium library with locations=True is the right tool. But I think I'm overcomplicating this. Let me write a cleaner script.

Here's my revised plan:

1. **Pass 1 (locations=True)**: Collect Gürtel ways, build buffer
2. **Pass 2 (locations=True)**: Collect highway ways that intersect buffer → save to GeoDataFrame
3. **Pass 2 (same pass)**: Also collect PT route relations (but ways come before relations, so we can't check way geometries in the relation handler in the same pass... wait, the relation handler gets called after all ways, so we can check using the way data we've collected)

Actually, in a single pass, the order is:
1. All nodes (if locations=True, node locations are stored)
2. All ways (way() is called for each way)
3. All relations (relation() is called for each relation)

So in Pass 2:
- way() is called for each way → check if highway and intersects buffer → store geometry and tags
- relation() is called for each relation → check if PT route → store member way IDs

Then after the pass, we can check which relations have ways that intersect the buffer by matching member way IDs to our stored highway ways.

But wait - a PT route's way members might not be highway ways. They could be any way (including non-highways like dedicated tram tracks). So we need to collect ALL way geometries that intersect the buffer, not just highway ways.

So let me revise:

**Pass 2 (locations=True)**: Collect ALL ways that intersect the buffer (both for highways and PT member ways). Also collect PT route relations.

But collecting ALL ways in the buffer is more expensive. Let me think...

Actually, for PT routes, the member ways are likely to be within or near the buffer since we're only interested in routes that cross the buffer. So I can collect all ways that intersect the buffer.

But that could be a lot of ways. Actually, for a 500m buffer around Gürtel, the area is about maybe 50km² (Gürtel is ~12km long, so 12*1 ≈ 12km², or more with the ring shape). That's manageable.

Let me collect all ways that intersect the buffer. Then I can filter for highway ways for the highways layer, and for PT routes I can check if any member way is in the collected ways.

Wait, but the PT route member ways might extend beyond the buffer. We want the full route geometry. So we need the way geometries even if they extend beyond the buffer.

Hmm, let me reconsider. For PT routes that cross the buffer, we need all their member ways. These member ways might be far from the buffer. So we can't just collect ways in the buffer for PT routes.

OK, let me do it properly with multiple passes:

**Pass 1 (locations=True)**: Find Gürtel ways → build buffer

**Pass 2 (locations=True)**: 
- For EVERY way, check if it intersects the buffer. If yes, store its geometry (in target CRS) and tags.
- Also store highway-specific info from all ways.

Actually, I just need to store ways that intersect the buffer for the highway layer. And for PT routes, I need to find:
1. All PT route relations
2. For each, the member way IDs
3. Get geometries for those ways

So I need two more passes.

Let me simplify even further. For the PT routes, I'll:

**Pass 2**: Find ALL PT route relations (without locations). Store their tags and member way IDs.
**Pass 3 (locations=True)**: Find geometries for all ways that are members of PT routes OR are highways in the buffer.

Actually, let me combine Pass 2 and Pass 3:

**Pass 2 (locations=True)**: 
- For each way: if it intersects buffer AND has highway tag → store for highways layer
- For each relation: if PT route → store relation info with member way IDs

**Pass 3 (locations=True)**: Get geometries for all ways referenced by PT routes (whether they intersect buffer or not).

This should work. Pass 3 will get the full way geometries for all PT route member ways. Then we can assemble route geometries.

But there's an issue: Pass 2 and Pass 3 both process the entire PBF. This is slow but manageable.

Actually, can I optimize? In Pass 2, I also need the member way geometries for PT routes to check if they intersect the buffer. But since I'm already checking which ways intersect the buffer, I can note that.

Let me restructure:

**Pass 1 (locations=True)**: Find Gürtel ways → build buffer geometry.

**Pass 2 (locations=True)**: 
- For each way: 
  - If it intersects buffer: store way_id, geometry in target CRS, and tags
  - Also, independently track: if it has highway tag AND intersects buffer → add to highways list
- For each relation:
  - If it's a PT route (route=bus/tram/subway/train/trolleybus/light_rail):
    - Store relation info and member way IDs
    - After the pass, check which member ways are in our stored buffer-intersecting ways
    - If any member way intersects the buffer, the route crosses the buffer

**Pass 3 (locations=True)**: 
- Get way geometries for ALL member ways of PT routes that cross the buffer (including those that don't intersect the buffer themselves, so we have the full route)
- Actually wait, in Pass 2 I already have the geometries for buffer-intersecting ways. I need geometries for the non-buffer ways too.

Hmm, this is getting complex. Let me just do:

**Pass 2 (without locations)**: Find PT route relations only. Store member way IDs.

**Pass 3 (locations=True)**: 
- For all ways: if highway tag AND intersects buffer → save to highways list
- For all ways: if way_id is in the set of PT member way IDs → save to pt_ways dict

Then assemble.

Actually, I think the cleanest approach is:

1. Pass 1: Find Gürtel ways → buffer
2. Pass 2: Find all ways intersecting buffer (store geometry + tags). Find all PT route relations (store member way IDs).
3. After Pass 2, identify which PT route ways we already have and which we still need.
4. Pass 3: Get missing PT route way geometries.

Let me just write this and not overthink.

Actually, you know what, let me try a different approach entirely. Instead of using osmium Python for everything, let me:

1. Use osmium to find Gürtel ways
2. Use osmium to find ways in the buffer
3. For PT routes, use osmium relations handler

Let me just write the script and iterate if needed. The PBF is 1.2GB and processing it 3-4 times will take a few minutes, which is acceptable.

Let me write the script now.
3:13
3:13
Bash python3 -c "
python3 -c "
from pyproj import CRS
# Check common Austria CRS options
for epsg in [31287, 31256, 31255, 3416, 3035]:
    c = CRS.from_epsg(epsg)
    print(f'EPSG:{epsg} -> {c.name}')
"
EPSG:31287 -> MGI / Austria Lambert
EPSG:31256 -> MGI / Austria GK East
EPSG:31255 -> MGI / Austria GK Central
EPSG:3416 -> ETRS89 / Austria Lambert
EPSG:3035 -> ETRS89-extended / LAEA Europe
4:40
Austria's standard projected coordinate system is MGI / Austria Lambert (EPSG:31287). Let me now write the script. I need to:

1. Download the Austria PBF
2. Read it with osmium to find Gürtel ways (highway ways whose name ends in 'Gürtel')
3. Buffer those ways by 500m
4. Extract all highways intersecting the buffer
5. Extract all PT route relations (bus, tram, subway, train, trolleybus, light_rail) crossing the buffer
6. Write to GeoPackage with EPSG:31287

Let me think about how to approach this with osmium.

For osmium in Python, I need to write a handler class. The PBF file is large (~1GB for Austria). Let me approach this in stages:

Stage 1: Use osmium to find all ways whose `name` tag ends with "Gürtel" and that have a `highway` tag. Collect their node coordinates and build their geometries. Buffer by 500m to get the area of interest.

Stage 2: Use the buffer to extract:
   a) All highway ways that intersect the buffer
   b) All PT route relations that intersect the buffer

For the PT relations, I need to get the relation members (ways) and their geometries, then merge them into a single MultiLineString per route.

Let me think about the osmium approach more carefully.

Actually, with osmium Python, I can write handlers that process nodes, ways, and relations. But to get full geometry for ways I need node coordinates. Let me use a two-pass approach:

Pass 1: Read the PBF and find Gürtel ways. Collect all node refs for those ways, and store the way geometries.
Pass 2: Build the buffer from Gürtel ways.
Pass 3: Find all highways and PT relations intersecting the buffer.

Actually, let me think about this differently. The osmium library provides the `osmium.SimpleHandler` or `osmium.NodeLocationsForWays` handler that can assemble way geometries on the fly.

Let me write a comprehensive script:

1. First pass - collect Gürtel ways:
   - Scan all ways for name ending in 'Gürtel' and highway tag present
   - Store their node references
   - Compute their geometries (need node locations)

2. Build buffer around Gürtel

3. Second pass - collect all highways and PT relations intersecting buffer:
   - Scan all ways: if highway and geometry intersects buffer, collect
   - Scan all relations: if route tag is one of the PT types, collect members and build geometry

For a large PBF like Austria (~1GB), the osmium library processes sequentially. I can use `osmium.FileProcessor` with custom handlers.

Let me write a clean implementation.

For the PT route relations - I need to:
1. Find relations with type=route and route IN (bus, tram, subway, train, trolleybus, light_rail, light_rail)
2. Get their member ways (and nodes for geometry)
3. Build a single LineString or MultiLineString geometry per route
4. Check if it intersects the buffer
5. If yes, include it

For the highways:
1. Find ways with highway tag
2. Build geometry
3. Check if it intersects buffer
4. If yes, include it

Let me structure the code:

```python
import osmium
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString, Polygon, box
from shapely.ops import unary_union
import pyproj
from functools import lru_cache
import sys

# Step 1: Find Gürtel ways
class GürtelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # way_id -> {nodes, tags}
    
    def way(self, w):
        if 'highway' in w.tags and 'name' in w.tags and w.tags['name'].endswith('Gürtel'):
            self.guertel_ways[w.id] = {
                'nodes': [n.ref for n in w.nodes],
                'tags': dict(w.tags)
            }

# Step 1a: Get node locations for Gürtel ways
class GürtelNodeCollector(osmium.SimpleHandler):
    def __init__(self, needed_nodes):
        super().__init__()
        self.needed_nodes = needed_nodes
        self.node_coords = {}
    
    def node(self, n):
        if n.id in self.needed_nodes:
            self.node_coords[n.id] = (n.lon, n.lat)

# Then build geometries...
```

Actually, the osmium Python handler processes the file in order. So I need to process nodes first, then ways, then relations. But the standard approach when you need node locations for ways is to use `osmium.NodeLocationsForWays` as a wrapper.

Let me use a different approach. I'll use the `osmium.FileProcessor` with the `apply_file` function, and use `NodeLocationsForWays` handler.

```python
class WayHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.ways = {}
    
    def way(self, w):
        # process
        pass

# Apply with node locations
h = WayHandler()
osmium.apply_file('input.pbf', locations=True, handlers=[h])
```

The `locations=True` parameter in `apply_file` enables node location tracking, which means before each way is processed, its nodes have already been processed and the way has geometry available via `w.nodes` which now have `lat`, `lon` attributes.

Wait - with `locations=True`, osmium uses `NodeLocationsForWays` which stores all node locations in memory (or a file-based cache). For a file as large as Austria PBF, this could use significant memory. Let me check...

Actually, the Austria PBF is about 1GB. The number of nodes in Austria is roughly ~300-400 million. Each node location is about 16 bytes (double lat, double lon) plus the ID (8 bytes). That's about 24 bytes per node, which is about 7-10 GB. That's too much for this environment.

So I need a different approach. Let me do it in multiple passes:

Pass 1: Find Gürtel way IDs and the node IDs they use.
Pass 2: Read only those node locations from the PBF.
Pass 3: Build Gürtel geometries and buffer.
Pass 4: Find all highway way IDs and node IDs within the buffer area (rough spatial filter using bounding box).
Pass 5: Read the needed node locations.
Pass 6: Build highway geometries and filter by exact intersection.
Pass 7: Repeat for PT relations.

Actually, this is complex. Let me think of a simpler approach.

Alternative: Use `osmium` CLI tool to filter the PBF to just Vienna area first, then work with a smaller file. But osmium CLI is not installed.

Let me just use the Python osmium library carefully. The `NodeLocationsForWays` handler can use disk-based storage. Let me check...

Actually, osmium's `NodeLocationsForWays` by default uses a file-based node cache. Let me try with the `locations=True` flag and see what happens.

Actually, let me reconsider. For just the Gürtel ways and their buffer, I need relatively few nodes. Let me do:

1. First scan: find Gürtel way IDs and their node refs. Also get rough bounding box.
2. Second scan: read only those node locations.
3. Build Gürtel geometries, buffer by 500m.
4. Third scan: for all ways, do a quick bbox check and then full geometry check. Store matching way IDs and all node refs of those ways.
5. Fourth scan: read all needed node locations.
6. Build highway geometries.
7. Fifth scan: find PT relations, get their member way IDs, check intersection.

This is many passes. Let me simplify.

Actually, a smarter approach: use `osmium` to filter to the bounding box of Vienna (~48.1-48.3N, 16.2-16.5E), then work with the smaller extract.

Or even better: use `osmium.get_bbox()` to get the bbox, then use `osmium.index.MapFileBased` for node locations.

Let me just write the script and see. The key insight is that for the Gürtel buffer area, we only need a fraction of the nodes. Let me try the simplest osmium approach first.

Actually, I think the cleanest approach for this task is:

1. Scan PBF for Gürtel ways (no node locations needed, just IDs and node refs)
2. Scan PBF again for just those nodes (to get coords)
3. Build Gürtel geometries, compute buffer bbox
4. Scan PBF for all highway ways and PT relations that could be in the area (using bbox filter)
5. Collect needed node IDs
6. Scan PBF for those nodes
7. Build final geometries, filter by exact intersection

Let me implement this.

Wait - actually for scanning, osmium reads the whole PBF. The Austria PBF is maybe 1GB. Reading it fully 3-4 times could take a few minutes per pass. Let me try to be efficient.

Let me write it all in one script with multiple passes.

Actually, thinking more about it - let me use the `apply_file` with `locations=True` but using a file-based index. That way it doesn't use too much RAM.

The `NodeLocationsForWays` handler accepts an `index_type` parameter. The default is flex_memory which might use a lot of RAM. I can use `sparse_file_array` or `sparse_mmap_array` for disk-based storage.

Let me check the available index types.

Actually, a more practical approach for this environment: let me just use the osmium Python library with `locations=True` and a file-based node cache. The osmium library by default uses a temporary file for node locations when `locations=True`.

Let me just write it.

But wait - Austria PBF may be too large for `locations=True` with default settings. Let me check the environment's RAM.

Let me just try it.

Actually, I think the most robust approach is:

1. Use `osmium` to find Gürtel way IDs (simple pass, no geometry)
2. Use `osmium` again with just the needed nodes to build Gürtel geometries
3. Buffer
4. Use `osmium` again - this time for all highways. For each highway way, get its bbox from node IDs... but I don't have node coords yet.

Hmm, this is the chicken-and-egg problem. Without node coords, I can't check intersection.

Let me use a different strategy:
1. First find Gürtel ways and build their buffer polygon
2. Get the bounding box of the buffer
3. Do a full pass with `locations=True` (file-based index) and:
   - For each way with highway tag: check if bbox overlaps buffer bbox, if so record the way and its node refs
   - For each relation with PT route: record its member way IDs

This approach does one full pass with node locations enabled, which might be memory-heavy but let me try.

Actually, you know what - let me just use a pragmatic approach. I'll use the `osmium` Python module but use the `NodeLocationsForWays` handler with a file-based node cache. Let me check what options are available.

Looking at the osmium docs, `NodeLocationsForWays` has a `cache` parameter that can be a file path, and it uses a sparse file index.

Let me write the script step by step.

OK let me just code it up. I'll use multiple passes but try to be efficient.

Here's my plan for the script:

```python
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, box, shape
from shapely.ops import unary_union
import pyproj
import sys, os

PBF = '/work/austria-latest.osm.pbf'
GPKG = '/work/vienna_network.gpkg'
BUFFER_DIST = 500  # meters
TARGET_EPSG = 31287

# We need a transformer for buffering
# Gürtel geometry is in WGS84, but we need to buffer in metric coords
# So we'll transform to EPSG:31287, buffer, then transform back

# Step 1: Find Gürtel ways and their node refs
class GürtelWayFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # id -> {'node_refs': [int], 'tags': dict}
    
    def way(self, w):
        tags = dict(w.tags)
        highway = tags.get('highway')
        name = tags.get('name', '')
        if highway and name.endswith('Gürtel'):
            node_refs = [n.ref for n in w.nodes]
            self.guertel_ways[w.id] = {'node_refs': node_refs, 'tags': tags}

finder = GürtelWayFinder()
finder.apply_file(PBF)

# Collect all needed node IDs
needed_nodes = set()
for info in finder.guertel_ways.values():
    needed_nodes.update(info['node_refs'])

# Step 2: Read those node locations
class NodeReader(osmium.SimpleHandler):
    def __init__(self, needed):
        super().__init__()
        self.needed = needed
        self.coords = {}
    
    def node(self, n):
        if n.id in self.needed:
            self.coords[n.id] = (n.lon, n.lat)

node_reader = NodeReader(needed_nodes)
node_reader.apply_file(PBF)

# Step 3: Build Gürtel geometries
guertel_geoms = []
for wid, info in finder.guertel_ways.items():
    coords = [node_reader.coords.get(ref) for ref in info['node_refs']]
    coords = [c for c in coords if c is not None]
    if len(coords) >= 2:
        guertel_geoms.append(LineString(coords))

# Step 4: Build buffer
# Transform to EPSG:31287, buffer, transform back
wgs84 = pyproj.CRS.from_epsg(4326)
target_crs = pyproj.CRS.from_epsg(TARGET_EPSG)
project = pyproj.Transformer.from_crs(wgs84, target_crs, always_xy=True).transform
project_back = pyproj.Transformer.from_crs(target_crs, wgs84, always_xy=True).transform

from shapely.ops import transform

guertel_metric = [transform(project, g) for g in guertel_geoms]
buffered_metric = [g.buffer(BUFFER_DIST) for g in guertel_metric]
buffer_union = unary_union(buffered_metric)
buffer_wgs84 = transform(project_back, buffer_union)
buffer_bounds = buffer_wgs84.bounds  # (minx, miny, maxx, maxy)

# Step 5: Now scan for all highways and PT relations
# For highways: find all ways in the buffer area
class HighwayCollector(osmium.SimpleHandler):
    def __init__(self, bbox):
        super().__init__()
        self.bbox = bbox  # (minx, miny, maxx, maxy)
        self.highway_ways = {}  # way_id -> {'node_refs': [int], 'tags': dict}
        self.pt_relations = {}  # rel_id -> {'member_way_ids': [int], 'tags': dict}
        self.relation_member_ways = set()  # all way IDs referenced by PT relations
        
        # Also collect all node IDs needed
        self.all_needed_nodes = set()
        
        # Relation handling
        self._current_relation_id = None
    
    def way(self, w):
        # Quick bbox check is hard without node coords... 
        # We'll rely on the tags and collect everything, then filter later
        tags = dict(w.tags)
        if 'highway' in tags:
            node_refs = [n.ref for n in w.nodes]
            self.highway_ways[w.id] = {'node_refs': node_refs, 'tags': tags}
            self.all_needed_nodes.update(node_refs)
    
    def relation(self, r):
        tags = dict(r.tags)
        route = tags.get('route', '')
        if route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            member_way_ids = []
            for m in r.members:
                if m.type == 'w':
                    member_way_ids.append(m.ref)
                    self.relation_member_ways.add(m.ref)
            self.pt_relations[r.id] = {
                'member_way_ids': member_way_ids,
                'tags': tags
            }

collector = HighwayCollector(buffer_bounds)
collector.apply_file(PBF)

# Also need node locations for relation member ways
collector.all_needed_nodes.update(collector.relation_member_ways)  # Wait, this is way IDs, not node IDs
```

Hmm wait, I have a problem. For PT relations, I need the geometries of their member ways, which means I need the node locations of those ways. And the member way IDs need to be tracked separately from the highway ways.

Let me restructure. I'll also need all node refs from PT relation member ways that aren't already covered by highway ways.

Let me rethink and simplify:

1. Find Gürtel ways and nodes → build buffer
2. Scan all ways (without geometry) - collect highway ways and their node refs. Also collect all ways that are members of PT relations.
3. Collect all needed node locations.
4. Build geometries and filter by buffer intersection.
5. For PT relations, build geometries from their member ways.

Actually, for highways, I need ALL highways (not just the ones near the buffer). Because a highway might cross the buffer even if its full extent is larger. Wait, no - I need highways that INTERSECT the buffer. So I need to check geometry intersection, not just bbox overlap.

But to check geometry intersection, I need the full geometry, which requires node coords.

OK so the most practical approach:

1. Find Gürtel ways, nodes → buffer geometry
2. Get bbox of buffer
3. Scan for all highway ways. For each, check if their node ref bbox overlaps the buffer bbox (even without node coords, we can't do a bbox check...).

Actually, without node coords I truly can't do any spatial filtering. So I either:
A) Collect ALL highway ways in Austria (all their node refs too), then read all those nodes, then filter - this is too many
B) Use `locations=True` in a single pass - might be memory heavy
C) Use OSM Overpass API instead - query just for the area

Given the constraints, let me try approach B with a file-based node cache, or use Overpass API for the actual filtering.

Actually, Overpass API is probably the cleanest here. Let me fetch just the data around Vienna's Gürtel from Overpass.

Wait, the user specifically said "from the current Geofabrik Austria PBF". So I need to use the local PBF. Let me download it and process it.

Let me try a different approach: download the PBF and process it with a file-based node cache.

Actually, I realize I can use `osmium` to filter the PBF to just the Vienna area first. Let me check if `osmium` CLI is available, or I can use the Python bindings.

Alternatively, I can use `osmium` Python to do a spatial filter. The osmium Python library has `osmium.SimpleHandler` that can read the file. But the issue is always geometry.

Here's my final plan - pragmatic and should work:

1. Download the PBF
2. Find Gürtel ways (simple pass, just tags)
3. Find Gürtel way nodes (second pass, just those node IDs)
4. Build Gürtel geometries, buffer by 500m, get buffer bbox in WGS84
5. Scan PBF for all highway ways in Austria (third pass) - but only those whose node ID range might overlap... No, I can't do this without coords.

OK, let me just try approach: collect ALL highway ways' node IDs and ALL PT relation member way IDs. Then in a fourth pass, collect ALL those node locations. Then filter. Austria has about 70M highway ways... actually that's too many.

Let me reconsider. The Austria PBF is about 1GB. Most of that is in nodes. The number of highway ways in all of Austria might be on the order of a few million. Each way has on average maybe 3-4 nodes. So that's maybe 10-15M node references. That could work.

Actually, let me be smarter. Instead of collecting ALL highways in Austria, let me use Overpass API to get what I need, since it's designed for exactly this type of spatial query.

But the user specifically said "from the current Geofabrik Austria PBF". So I need to use the PBF.

Let me try yet another approach. I'll use `osmium` with the `locations=True` option but configure it to use a file-based cache (sparse file array) rather than the default memory cache. This should keep memory usage under control.

Looking at the osmium documentation:
- `NodeLocationsForWays` can be configured with `cache` parameter pointing to a file path
- Or use `index_type` parameter

Let me check the osmium source code for how to configure this...

Actually, in `osmium.apply_file()`, when you pass `locations=True`, it internally creates a `NodeLocationsForWays` handler with default settings. The default index type for `NodeLocationsForWays` is `flex_memory` which uses a lot of RAM.

I can create the `NodeLocationsForWays` handler manually and pass it. Let me check the API.

Actually, looking at the Python bindings, `osmium.NodeLocationsForWays` accepts an `index_type` parameter which can be one of the index types defined in `osmium.index`. 

The available index types include:
- `sparse_file_array` - uses file storage
- `sparse_mmap_array` - uses memory-mapped file
- `flex_memory` - memory-based (default)

Let me try with `sparse_file_array`.

Actually, the simplest thing is:

```python
from osmium import NodeLocationsForWays
from osmium.index import create_map

# Create a file-based index
index = create_map("sparse_file_array", "nodes.cache")
nl = NodeLocationsForWays(index)
```

Then pass this handler along with my custom handler.

Let me try this approach.

Actually, let me just download the file first, then write the script and test it. If it's too slow or runs out of memory, I'll adjust.

Let me download and start coding:

Actually, first let me structure the problem more carefully.

The steps:
1. Download `austria-latest.osm.pbf`
2. Find ways with `name` ending in "Gürtel" AND `highway` tag present → these are Gürtel ways
3. Get their geometries (need node coords)
4. Buffer by 500m → area of interest polygon
5. Find ALL highway ways that intersect the buffer
6. Find ALL PT route relations that intersect the buffer
7. Write to GeoPackage

For step 2–3 (Gürtel ways), I need a small subset of nodes, so I can do a two-pass approach without heavy memory.

For steps 5–6 (highways and PT routes in buffer), I need geometry intersection. The cleanest approach with osmium is to use `NodeLocationsForWays` with a file-based index.

Let me write the script.

Actually, I just realized there's another simpler approach for steps 5-6. Since the buffer is relatively small (the Gürtel is about 10km long, so the buffer is maybe 11km x 1km), I can use Overpass API to get just the data in that area. But the user said "from the Geofabrik Austria PBF". 

Actually, the user said "Can you pull every highway from the current Geofabrik Austria PBF". This means they want me to use the PBF as the source, but it doesn't mean I can't use Overpass for filtering. The data ultimately comes from OSM. But to be faithful, I should use the PBF.

Let me go with the osmium approach with file-based node cache.

Plan:
1. Download PBF in background
2. Write script with osmium that:
   a. First pass: find Gürtel ways (tags only)
   b. Second pass: get node locations for Gürtel ways
   c. Build buffer, get bbox
   d. Third pass: with NodeLocationsForWays (file-based cache), find all highway ways and PT relations in the buffer area
   e. Write results

Let me write it.

Let me also consider: PT route relations may reference ways that are not highways (e.g., rail tracks have railway=rail rather than highway). For PT routes, I should get the geometry of ALL member ways, not just highways.

Let me write the script now.

For the PT route relations, I need to:
- Find all relations with type=route and route IN (bus, tram, subway, train, trolleybus, light_rail)
- For each such relation, collect all member ways
- Build a geometry from those ways (a single MultiLineString per route)
- Check if it intersects the buffer

For this, I need:
- The relation tags (ref, name, operator, route)
- The member way geometries

So I need a two-pass approach:
- Pass 1: Find all PT relations and their member way IDs
- Pass 2: Get geometries for those ways (need node locations)
- Build MultiLineStrings

Let me structure the overall script carefully.

```python
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, shape, box
from shapely.ops import transform, unary_union
import pyproj
import sys, os, time

PBF = '/work/austria-latest.osm.pbf'
GPKG = '/work/vienna_network.gpkg'
BUFFER_DIST = 500.0  # meters

# ============ STEP 1: Find Gürtel ways ============
class GürtelWayFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}
    
    def way(self, w):
        tags = dict(w.tags)
        if tags.get('highway') and tags.get('name', '').endswith('Gürtel'):
            self.guertel_ways[w.id] = {
                'node_refs': [n.ref for n in w.nodes],
                'tags': tags
            }

finder = GürtelWayFinder()
print("Finding Gürtel ways...")
finder.apply_file(PBF)
print(f"Found {len(finder.guertel_ways)} Gürtel ways")

# Collect node IDs
node_ids = set()
for info in finder.guertel_ways.values():
    node_ids.update(info['node_refs'])

# ============ STEP 2: Get Gürtel node coords ============
class NodeGetter(osmium.SimpleHandler):
    def __init__(self, needed_ids):
        super().__init__()
        self.needed = needed_ids
        self.coords = {}
    
    def node(self, n):
        if n.id in self.needed:
            self.coords[n.id] = (n.lon, n.lat)
    
    def handle(self, obj):
        if obj.is_node(self):
            self.node(obj)
        # Skip ways and relations for speed

print(f"Looking up {len(node_ids)} node coordinates...")
ng = NodeGetter(node_ids)
ng.apply_file(PBF)
print(f"Found {len(ng.coords)} node coordinates")

# ============ STEP 3: Build Gürtel geometries ============
wgs84 = pyproj.CRS.from_epsg(4326)
target = pyproj.CRS.from_epsg(31287)
to_metric = pyproj.Transformer.from_crs(wgs84, target, always_xy=True).transform
to_wgs84 = pyproj.Transformer.from_crs(target, wgs84, always_xy=True).transform

guertel_lines = []
for info in finder.guertel_ways.values():
    coords = [ng.coords[ref] for ref in info['node_refs'] if ref in ng.coords]
    if len(coords) >= 2:
        guertel_lines.append(LineString(coords))

print(f"Built {len(guertel_lines)} Gürtel linestrings")

# Buffer in metric CRS
guertel_metric = [transform(to_metric, g) for g in guertel_lines]
buffered = [g.buffer(BUFFER_DIST) for g in guertel_metric]
buffer_union = unary_union(buffered)
buffer_wgs84 = transform(to_wgs84, buffer_union)

# ============ STEP 4: Find highways and PT relations in buffer ============
# Use NodeLocationsForWays with file-based index

from osmium.index import create_map

class DataCollector(osmium.SimpleHandler):
    def __init__(self, buffer_polygon):
        super().__init__()
        self.buffer = buffer_polygon
        self.highway_features = []  # (way_id, geometry, tags)
        self.pt_features = {}  # rel_id -> {'ways': {way_id: geometry}, 'tags': dict, 'member_way_ids': [int]}
        self.pt_way_geoms = {}  # way_id -> geometry (for ways that are PT members)
        
        # Track which ways are PT members
        self.pt_member_ways = set()
        
        # For building ways from node locations
        self._current_way = None
    
    def way(self, w):
        tags = dict(w.tags)
        is_highway = 'highway' in tags
        is_pt_member = w.id in self.pt_member_ways
        
        if is_highway or is_pt_member:
            # Build geometry from node locations
            coords = [(n.lon, n.lat) for n in w.nodes if n.location.valid()]
            if len(coords) >= 2:
                geom = LineString(coords)
                
                # Check if intersects buffer (quick bbox check first)
                if geom.intersects(self.buffer) or geom.within(self.buffer):
                    if is_highway:
                        self.highway_features.append((w.id, geom, tags))
                    if is_pt_member:
                        self.pt_way_geoms[w.id] = geom
    
    def relation(self, r):
        tags = dict(r.tags)
        route = tags.get('route', '')
        if route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            member_way_ids = [m.ref for m in r.members if m.type == 'w']
            for wid in member_way_ids:
                self.pt_member_ways.add(wid)
            self.pt_features[r.id] = {
                'tags': tags,
                'member_way_ids': member_way_ids
            }

# First pass: find PT relations (no node locations needed)
class RelationFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.pt_relations = {}
        self.pt_member_ways = set()
    
    def relation(self, r):
        tags = dict(r.tags)
        route = tags.get('route', '')
        if route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            member_way_ids = [m.ref for m in r.members if m.type == 'w']
            for wid in member_way_ids:
                self.pt_member_ways.add(wid)
            self.pt_relations[r.id] = {
                'tags': tags,
                'member_way_ids': member_way_ids
            }

print("Finding PT relations...")
rf = RelationFinder()
rf.apply_file(PBF)
print(f"Found {len(rf.pt_relations)} PT relations with {len(rf.pt_member_ways)} member ways")

# Now do the main pass with NodeLocationsForWays
print("Collecting highways and PT way geometries...")
# Create file-based node location index
idx = create_map("sparse_file_array", "/work/node_cache.dat")
nl = osmium.NodeLocationsForWays(idx)

# We need a custom handler that uses the node locations
class MainCollector(osmium.SimpleHandler):
    def __init__(self, buffer_polygon, pt_member_ways):
        super().__init__()
        self.buffer = buffer_polygon
        self.highway_features = []
        self.pt_way_geoms = {}
        self.pt_member_ways = pt_member_ways
    
    def way(self, w):
        tags = dict(w.tags)
        is_highway = 'highway' in tags
        is_pt_member = w.id in self.pt_member_ways
        
        if is_highway or is_pt_member:
            coords = [(n.lon, n.lat) for n in w.nodes if n.location.valid()]
            if len(coords) >= 2:
                geom = LineString(coords)
                try:
                    if geom.intersects(self.buffer):
                        if is_highway:
                            self.highway_features.append((w.id, geom, tags))
                        if is_pt_member:
                            self.pt_way_geoms[w.id] = geom
                except Exception as e:
                    pass

collector = MainCollector(buffer_wgs84, rf.pt_member_ways)
# Apply with node locations handler first
osmium.apply_file(PBF, handlers=[nl, collector])
```

Hmm, wait. `osmium.apply_file` with handlers in order - the `NodeLocationsForWays` handler needs to process before the collector. But osmium processes objects in order: first all nodes, then ways, then relations. So `NodeLocationsForWays` will store node locations, and when ways are processed, the `nl` handler adds location information to way nodes before passing to the collector.

Actually, looking at the osmium Python docs more carefully - when you pass `locations=True` to `apply_file`, it does this automatically. But for more control, you can create a `NodeLocationsForWays` handler and pass it in the handlers list.

The issue is that `NodeLocationsForWays` should come first in the handlers list.

Let me check how to properly pass node locations...

Looking at osmium Python API:
```python
# Method 1: use locations=True
osmium.apply_file("file.osm.pbf", locations=True, handler=my_handler)

# Method 2: use NodeLocationsForWays wrapper
nl = osmium.NodeLocationsForWays()
osmium.apply_file("file.osm.pbf", handlers=[nl, my_handler])
```

Both should work. But for method 1, passing a custom index is not possible. For method 2, I can create the NodeLocationsForWays with a specific index.

Let me check the constructor of NodeLocationsForWays... I think it accepts `index_type` or a cache file path.

Actually, looking at the osmium source code, I think `NodeLocationsForWays` can accept an `index_type` string parameter.

Let me just try the straightforward approach first and see if it works.

Actually wait - let me check how much RAM is available:

Let me just write the full script, run it, and iterate if needed.

Let me be more pragmatic. Instead of overthinking the node cache, let me use `locations=True` in `apply_file` and trust it works. For 1GB PBF, the node cache might use some RAM but likely not too much for what we need.

Actually, I just realized - with `locations=True`, osmium uses a default index that stores ALL node locations in the PBF. For Austria with ~400M nodes, that's a lot of memory. But with a sparse file array, it would be manageable (maybe a few GB on disk).

Let me use the file-based approach explicitly.

Let me also think about whether I even need to use NodeLocationsForWays for ALL nodes. I could do:

1. Find Gürtel ways and their node IDs → get those coords → build buffer
2. Get buffer bbox
3. Scan all ways in PBF: for each way with highway tag, collect its node IDs (but only if the way might be near the buffer - but I can't check without coords!)

Without coords, I can't do spatial filtering during scan. So I need either:
- NodeLocationsForWays for a full scan
- Or collect ALL highway way node refs in Austria and then get their coords

The first option is simpler. Let me go with NodeLocationsForWays using a file-based index.

Let me look at how to create a NodeLocationsForWays with a custom index in Python...

I think the correct way is:
```python
from osmium import NodeLocationsForWays
index = create_map("sparse_file_array", "/work/node_cache.dat")
nl = NodeLocationsForWays(index)
```

Let me just write it and test.

Actually, looking at the osmium Python bindings code on GitHub, `NodeLocationsForWays` is a handler. Its constructor accepts:
- `index_type` (string) - the type of index to use

And `create_map(index_type, cache_file)` creates the index.

Let me go ahead and write the complete script.

Here's my refined plan:

```python
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, shape
from shapely.ops import transform, unary_union
import pyproj
import sys, os

PBF_SRC = 'https://download.geofabrik.de/europe/austria-latest.osm.pbf'
PBF = '/work/austria-latest.osm.pbf'
GPKG = '/work/vienna_network.gpkg'
BUFFER_DIST = 500.0

# ... (download and all steps)
```

Let me write it all out now.

OK one concern - downloading the PBF. At ~1GB, it might take a while. Let me check if the environment has reasonable bandwidth.

Let me start the download in the background and start writing the script.

Actually, let me first check if there's already a PBF file present.
4:40
4:40
Bash ls /work/ 2>&1
(no output)
4:40
Bash python3 -c "
python3 -c "
import osmium
# Check what index types are available
try:
    from osmium.index import create_map
    print('create_map available')
except ImportError:
    print('create_map not available')

# Check NodeLocationsForWays signature
import inspect
try:
    sig = inspect.signature(osmium.NodeLocationsForWays.__init__)
    print(f'NodeLocationsForWays.__init__: {sig}')
except Exception as e:
    print(f'Error inspecting: {e}')
" 2>&1
create_map available
Error inspecting: no signature found for builtin <instancemethod __init__ at 0x7fc73bb13010>
4:40
Bash pip3 show osmium 2>&1
Name: osmium
Version: 4.3.1
Summary: Python bindings for libosmium, the data processing library for OSM data
Home-page: https://osmcode.org/pyosmium
Author: 
Author-email: Sarah Hoffmann <lonvia@denofr.de>
License: BSD-2-Clause
Location: /usr/local/lib/python3.14/site-packages
Requires: requests
Required-by:
5:46
Good - pyosmium 4.3.1 is installed. Let me check its API more carefully.

The plan:
1. Download the Austria PBF
2. Use osmium to parse it
3. Find all ways whose name ends with 'Gürtel' in Vienna
4. Buffer those by 500m
5. Extract all highways that intersect that buffer
6. Extract all public-transport route relations that cross that buffer
7. Write to GeoPackage in EPSG:31287 (MGI / Austria Lambert) - which is Austria's standard projected CRS

Let me think about the approach. Using osmium in Python can be complex. Let me think about an efficient approach:

Option A: Use osmium for everything
Option B: Use osmium to extract OSM data, then process with geopandas/shapely

Let me go with a simpler approach:

1. Download the PBF
2. Use osmium Python API to:
   a. First pass: find all ways named *Gürtel, collect their node coordinates, build geometries, buffer 500m
   b. Second pass: extract all highways intersecting that buffer
   c. Third pass: extract all PT route relations crossing the buffer

Actually, osmium is a streaming parser, so I need to be strategic. Let me think about this more carefully.

The approach:
1. Use `osmium` to read the file in handler-based approach
2. First, find all ways whose tags have name ending with "Gürtel" in Vienna area
3. Build their geometries, union them, buffer by 500m
4. Then scan again for highways that intersect the buffer
5. And scan for PT route relations

Actually, with osmium the handler approach streams through once. I could do it in stages:
- Stage 1: Read file, collect Gürtel way nodes, store their node locations
- Stage 2: Read file again, collect all highways that intersect the buffer
- Stage 3: Read file again, collect PT route relations

But reading the PBF three times would be slow. Let me think of a better approach.

Actually, I can use `osmium` tool from command line to filter first, but the CLI is not installed. Or I can install it.

Actually, let me think about an alternative approach:
1. Download PBF
2. Use `osmium` command-line (install via apt or pip) to extract just Vienna area
3. Then process smaller file

Or I can do it all in Python with a single pass approach:
- First pass: collect all Gürtel way nodes and all node locations
- But osmium handlers stream, so I can't query forward...

Let me think differently. The cleanest approach:

1. First pass: Collect ways with name ending "Gürtel", store their node IDs, and also collect node locations. Build geometries, buffer 500m.
2. Second pass: Collect all highways intersecting buffer, and all PT route relations.

Actually, even better: I can use the "osmium" tool (the C++ tool) via pip or apt to filter. Let me check if I can install it.

Wait, `osmium` CLI could be installed via `apt install osmium-tool` or via `pip install osmium-tool`. Let me check.

Actually, the osmium Python package does NOT provide the CLI. I need `osmium-tool` for the CLI.

Let me just use Python osmium handlers. The flow:

Handler 1 - Find Gürtel ways:
- In `way()` callback, check if tags have `name` ending with "Gürtel"
- If so, store way_id and node IDs
- In `node()` callback, store coordinates for needed nodes
- Then build geometries

Handler 2 - Find highways intersecting buffer and PT route relations:
- In `way()` callback, check if highway tag exists and geometry intersects buffer
- In `relation()` callback, check if type=route and route in (bus,tram,subway,train,trolleybus,light_rail)
  - For relations, collect member way IDs, then build full geometry

Actually, this is getting complex. Let me write a comprehensive script.

Let me think about pyosmium 4.x API. The handler interface changed. Let me check the osmium API.

In pyosmium 4.x:
- `osmium.SimpleHandler` is the base class
- Override `way()`, `node()`, `relation()` methods
- For node locations for ways, use `osmium.NodeLocationsForWays` as a handler wrapper

Let me write the script step by step.

Actually, I think the most robust approach given the complexity:

1. Download the PBF
2. Use osmium Python bindings to iterate once, collecting all the data we need
3. Process with geopandas/shapely

Let me write this:

```python
import osmium
import geopandas as gpd
import shapely.geometry as geom
import shapely.ops as ops
from collections import defaultdict
import sys

# Step 1: Find all Gürtel ways and their node coordinates
class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_ways = {}  # way_id -> list of node_ids
        self.guertel_tags = {}  # way_id -> tags dict
        self.needed_nodes = set()
        self.node_coords = {}  # node_id -> (lon, lat)
        
    def way(self, w):
        if 'name' in w.tags and w.tags['name'].endswith('Gürtel'):
            self.guertel_ways[w.id] = [n.ref for n in w.nodes]
            self.guertel_tags[w.id] = dict((tag.k, tag.v) for tag in w.tags)
            self.needed_nodes.update(n.ref for n in w.nodes)
    
    def node(self, n):
        if n.id in self.needed_nodes:
            self.node_coords[n.id] = (n.location.lon, n.location.lat)

# Step 2: Find highways intersecting buffer and PT route relations
class Extractor(osmium.SimpleHandler):
    def __init__(self, buffer_geom):
        super().__init__()
        self.buffer_geom = buffer_geom  # shapely polygon in wgs84
        self.highways = []  # list of (geometry_wgs84, tags_dict)
        self.pt_relations = []  # list of (member_way_ids, tags_dict)
        self.pt_relation_member_ways = {}  # relation_id -> set of way_ids
        self.pt_relation_tags = {}
        self.highway_way_nodes = {}  # way_id -> list of node_ids for highways
        self.highway_tags = {}  # way_id -> tags
        self.pt_needed_nodes = set()
        self.highway_needed_nodes = set()
        # For PT relations, we need to first collect relation members,
        # then on the second pass collect way geometries
        
    def way(self, w):
        # Collect all highway ways (we'll filter by intersection later)
        if 'highway' in w.tags:
            node_ids = [n.ref for n in w.nodes]
            self.highway_way_nodes[w.id] = node_ids
            self.highway_tags[w.id] = dict((tag.k, tag.v) for tag in w.tags)
            self.highway_needed_nodes.update(node_ids)
    
    def relation(self, r):
        if r.tags.get('type') == 'route' and r.tags.get('route') in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            member_way_ids = []
            for member in r.members:
                if member.type == 'w':
                    member_way_ids.append(member.ref)
            self.pt_relation_member_ways[r.id] = member_way_ids
            self.pt_relation_tags[r.id] = dict((tag.k, tag.v) for tag in r.tags)

# Hmm, this approach has issues because I need node locations for all these ways too.
```

Actually, let me think about this differently. The osmium library streams data, so I need to:

1. First pass: Collect Gürtel ways + all their nodes' locations
2. Build the buffer geometry
3. Second pass: Collect highways intersecting buffer + their node locations
4. Third pass: Collect PT route relations + their member way geometries + node locations

This means reading the PBF 3 times. That's fine, the PBF is likely ~1GB and reading it 3 times is fast.

Or I could do it in 2 passes:
1. Pass 1: Collect Gürtel ways, all highways, all PT relations, and ALL node locations
   But storing ALL node locations for a whole country PBF would be massive (hundreds of millions of nodes)

Better to do it in multiple passes but only store what we need.

Actually, let me use a smarter approach:

Pass 1: Find all Gürtel ways, store their node IDs. Then read the file again to get those node locations.
Pass 2: Build buffer. Then read the file again to get highways intersecting the buffer + their node locations.
Pass 3: Read the file again to get PT route relations + their member way node locations.

Actually, osmium has `osmium.NodeLocationsForWays` handler which can be chained. Let me use that.

The `osmium.NodeLocationsForWays` handler automatically populates node locations in way objects. So if I chain handlers properly, I can get way geometries in a single pass.

But the problem is I need to filter ways based on a spatial condition (intersecting the buffer), which I don't know until after I've found the Gürtel ways and computed the buffer.

Let me think more carefully about how to do this efficiently.

Approach: 
1. First pass with osmium - use a handler to find all ways with name ending "Gürtel" and record their node IDs. Then a second pass over the file to collect those node locations.
2. Build the buffer from those geometries.
3. Second set of passes: find all highways that intersect the buffer, and all PT route relations.

For PT relations, I need to collect the relation members (ways), then find all those ways' geometries, then merge them into a single line per relation.

Let me write this properly. I'll read the PBF multiple times.

Actually, I just realized - the Austria PBF is probably about 1-2 GB. Downloading it will take a while but should be fine.

Let me write the script:

```python
#!/usr/bin/env python3
"""
Extract Vienna Gürtel corridor highways and PT route relations from Austria OSM PBF.
Output: vienna_network.gpkg with two layers.
"""

import osmium
import geopandas as gpd
import pandas as pd
import shapely.geometry as geom
import shapely.ops as ops
import shapely.wkb as wkblib
from collections import defaultdict
import os
import sys
import urllib.request
import time

PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"

# Download PBF if needed
if not os.path.exists(PBF_PATH):
    print("Downloading Austria PBF...")
    urllib.request.urlretrieve(PBF_URL, PBF_PATH)
    print("Download complete.")

# ============ Pass 1: Find Gürtel ways and their node locations ============
class GuertelWayCollector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_nodes = {}  # way_id -> [(lon, lat)]
        self.guertel_tags = {}
        self._needed_node_ids = set()
        self._node_coords = {}
        self._collecting_nodes = False
        
    def way(self, w):
        if 'name' in w.tags and w.tags['name'].endswith('Gürtel'):
            node_ids = [n.ref for n in w.nodes]
            self.guertel_way_nodes[w.id] = node_ids
            self.guertel_tags[w.id] = dict((tag.k, tag.v) for tag in w.tags)
            self._needed_node_ids.update(node_ids)

    def node(self, n):
        if n.id in self._needed_node_ids:
            self._node_coords[n.id] = (n.location.lon, n.location.lat)

    def resolve_geometries(self):
        """Build LineString geometries for Gürtel ways."""
        geoms = {}
        for wid, node_ids in self.guertel_way_nodes.items():
            coords = [self._node_coords[nid] for nid in node_ids if nid in self._node_coords]
            if len(coords) >= 2:
                geoms[wid] = geom.LineString(coords)
        return geoms

print("Pass 1: Collecting Gürtel ways...")
pass1 = GuertelWayCollector()
pass1.apply_file(PBF_PATH)

print(f"  Found {len(pass1.guertel_way_nodes)} Gürtel ways")
guertel_geoms = pass1.resolve_geometries()
print(f"  Resolved {len(guertel_geoms)} geometries")

# Merge all Gürtel ways into a single geometry and buffer by 500m
if not guertel_geoms:
    print("ERROR: No Gürtel ways found!")
    sys.exit(1)

guertel_union = ops.unary_union(list(guertel_geoms.values()))
# Buffer in degrees is tricky - 500m ≈ 0.0045 degrees at Vienna's latitude (~48.2°N)
# Better approach: project to a local CRS, buffer, then project back
# Or just use a reasonable degree buffer
# Actually, let's be precise: use pyproj to transform

from pyproj import Transformer
from functools import partial

# Vienna is around lat=48.2, lon=16.37
# Use a local mercator-like projection centered on Vienna
wgs84 = 'EPSG:4326'
vienna_local = '+proj=tmerc +lat_0=48.2 +lon_0=16.37 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs'

transformer_to_m = Transformer.from_crs(wgs84, vienna_local, always_xy=True)
transformer_to_wgs = Transformer.from_crs(vienna_local, wgs84, always_xy=True)

def buffer_in_meters(geom_wgs, meters):
    """Buffer a WGS84 geometry by meters using a local projection."""
    coords_m = [transformer_to_m.transform(x, y) for x, y in geom_wgs.coords]
    geom_m = geom.LineString(coords_m)
    buf_m = geom_m.buffer(meters)
    if buf_m.geom_type == 'Polygon':
        exterior = [transformer_to_wgs.transform(x, y) for x, y in buf_m.exterior.coords]
        return geom.Polygon(exterior)
    return buf_m

# Actually, let's simplify - buffer the union of all Gürtel way geometries
# by 500m using a proper projected buffer
guertel_union_m = ops.transform(transformer_to_m.transform, guertel_union)
buffer_m = guertel_union_m.buffer(500)
buffer_wgs = ops.transform(transformer_to_wgs.transform, buffer_m)

print(f"  Buffer area: {buffer_m.area:.0f} m²")

# ============ Pass 2: Extract highways intersecting the buffer ============
class HighwayExtractor(osmium.SimpleHandler):
    def __init__(self, buffer_wgs):
        super().__init__()
        self.buffer = buffer_wgs
        self.highway_way_nodes = {}
        self.highway_tags = {}
        self._needed_node_ids = set()
        self._node_coords = {}
        
    def way(self, w):
        if 'highway' in w.tags:
            node_ids = [n.ref for n in w.nodes]
            self.highway_way_nodes[w.id] = node_ids
            self.highway_tags[w.id] = dict((tag.k, tag.v) for tag in w.tags)
            self._needed_node_ids.update(node_ids)
    
    def node(self, n):
        if n.id in self._needed_node_ids:
            self._node_coords[n.id] = (n.location.lon, n.location.lat)
    
    def get_highways_in_buffer(self):
        """Filter highways intersecting the buffer and return (geom, tags) pairs."""
        results = []
        for wid, node_ids in self.highway_way_nodes.items():
            coords = [self._node_coords[nid] for nid in node_ids if nid in self._node_coords]
            if len(coords) >= 2:
                line = geom.LineString(coords)
                if line.intersects(self.buffer):
                    results.append((line, self.highway_tags[wid]))
        return results

print("Pass 2: Collecting highways...")
pass2 = HighwayExtractor(buffer_wgs)
pass2.apply_file(PBF_PATH)
highway_data = pass2.get_highways_in_buffer()
print(f"  Found {len(highway_data)} highway segments in buffer")

# ============ Pass 3: Extract PT route relations ============
class PTRelationExtractor(osmium.SimpleHandler):
    def __init__(self, buffer_wgs):
        super().__init__()
        self.buffer = buffer_wgs
        self.relation_member_ways = {}  # rel_id -> set of way_ids
        self.relation_tags = {}
        self.way_geoms_in_relations = {}  # way_id -> LineString
        self._way_ids_needed = set()
        self._node_ids_needed = set()
        self._node_coords = {}
        self._collecting_ways = False
        
    def relation(self, r):
        route = r.tags.get('route')
        if r.tags.get('type') == 'route' and route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            member_way_ids = []
            for member in r.members:
                if member.type == 'w':
                    member_way_ids.append(member.ref)
            if member_way_ids:
                self.relation_member_ways[r.id] = member_way_ids
                self.relation_tags[r.id] = dict((tag.k, tag.v) for tag in r.tags)
                self._way_ids_needed.update(member_way_ids)
    
    # After collecting relations, we need to read the file again to get way geometries
    # Actually this is getting complex - let me store needed way IDs and do another pass

# Hmm, this is getting complicated with multiple passes. Let me rethink.

# Actually, let me use a different approach. Instead of multiple osmium passes,
# I can collect ALL relation data first, then do a way + node pass for the needed ways.
```

OK, this approach is getting complex. Let me think of a cleaner way.

The issue is that osmium streams the file, so to get relation geometries I need:
1. Find relations of interest -> collect their member way IDs
2. Find those ways -> collect their node IDs  
3. Find those nodes -> get coordinates
4. Build geometries

This is 3 passes over the PBF, but it's the only way with osmium.

Actually, I can optimize: since I already have all highway node coordinates from Pass 2, and PT relation ways overlap with highways, maybe I can combine passes.

Let me write a cleaner script that does everything step by step.

Actually, let me just install `osmium-tool` CLI to make this easier, or use `osmium` Python bindings effectively.

Let me write a single comprehensive script:

```python
#!/usr/bin/env python3
"""Extract Vienna Gürtel network to GeoPackage."""

import osmium
import geopandas as gpd
import pandas as pd
import shapely.geometry as geom
import shapely.ops as ops
from pyproj import Transformer
import os
import sys
import urllib.request

PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"
OUT_PATH = "/work/vienna_network.gpkg"
TARGET_CRS = "EPSG:31287"

# Step 0: Download PBF
if not os.path.exists(PBF_PATH):
    print("Downloading Austria PBF...")
    urllib.request.urlretrieve(PBF_URL, PBF_PATH)
    print("Download complete.")

# ============================================================
# PASS 1: Find all Gürtel ways, collect their node coords
# ============================================================
class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_nodes = {}  # way_id -> [node_id, ...]
        self.guertel_tags = {}
        self._needed_nodes = set()
        self._node_coords = {}
    
    def way(self, w):
        name = w.tags.get('name')
        if name and name.endswith('Gürtel'):
            node_ids = [n.ref for n in w.nodes]
            self.guertel_way_nodes[w.id] = node_ids
            self.guertel_tags[w.id] = dict(w.tags)
            self._needed_nodes.update(node_ids)
    
    def node(self, n):
        if n.id in self._needed_nodes:
            self._node_coords[n.id] = (n.location.lon, n.location.lat)

print("=== Pass 1: Finding Gürtel ways ===")
gfinder = GuertelFinder()
gfinder.apply_file(PBF_PATH)
print(f"  Found {len(gfinder.guertel_way_nodes)} ways")
print(f"  Gürtel tags: {list(gfinder.guertel_tags.values())[:3]}")

# Build Gürtel geometries
guertel_lines = []
for wid, node_ids in gfinder.guertel_way_nodes.items():
    coords = [gfinder._node_coords[nid] for nid in node_ids if nid in gfinder._node_coords]
    if len(coords) >= 2:
        guertel_lines.append(geom.LineString(coords))

print(f"  Built {len(guertel_lines)} Gürtel line geometries")
guertel_union = ops.unary_union(guertel_lines)

# Buffer by 500m using a local projection
# Vienna ~ lat=48.2, lon=16.37
wgs84 = "EPSG:4326"
local_crs = "+proj=tmerc +lat_0=48.2 +lon_0=16.37 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
t_to_m = Transformer.from_crs(wgs84, local_crs, always_xy=True)
t_to_wgs = Transformer.from_crs(local_crs, wgs84, always_xy=True)

guertel_m = ops.transform(t_to_m.transform, guertel_union)
buffer_m = guertel_m.buffer(500)
buffer_wgs = ops.transform(t_to_wgs.transform, buffer_m)
print(f"  Buffer area: {buffer_m.area:.0f} m²")

# ============================================================
# PASS 2: Extract highways intersecting buffer
# ============================================================
class HighwayCollector(osmium.SimpleHandler):
    def __init__(self, buffer_wgs):
        super().__init__()
        self.buffer = buffer_wgs
        self.highway_way_nodes = {}
        self.highway_tags = {}
        self._needed_nodes = set()
        self._node_coords = {}
    
    def way(self, w):
        if 'highway' in w.tags:
            node_ids = [n.ref for n in w.nodes]
            self.highway_way_nodes[w.id] = node_ids
            self.highway_tags[w.id] = dict(
                (k, v) for k, v in ((tag.k, tag.v) for tag in w.tags)
                if k in ('name', 'highway', 'maxspeed', 'lanes', 'surface', 'oneway')
            )
            self._needed_nodes.update(node_ids)
    
    def node(self, n):
        if n.id in self._needed_nodes:
            self._node_coords[n.id] = (n.location.lon, n.location.lat)
    
    def get_filtered(self):
        rows = []
        for wid, node_ids in self.highway_way_nodes.items():
            coords = [self._node_coords[nid] for nid in node_ids if nid in self._node_coords]
            if len(coords) >= 2:
                line = geom.LineString(coords)
                if line.intersects(self.buffer):
                    rows.append((line, self.highway_tags[wid]))
        return rows

print("=== Pass 2: Collecting highways ===")
hcoll = HighwayCollector(buffer_wgs)
hcoll.apply_file(PBF_PATH)
highway_rows = hcoll.get_filtered()
print(f"  Found {len(highway_rows)} highway segments")

# ============================================================
# PASS 3: Extract PT route relations
# ============================================================
# We need to:
# 3a. Find PT route relations, collect their member way IDs
# 3b. Find those ways' node IDs and collect node coords
# 3c. Build merged geometry per relation

print("=== Pass 3a: Finding PT route relations ===")
class PTFindRelations(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.rel_member_ways = {}  # rel_id -> list of way_ids
        self.rel_tags = {}
        self._needed_way_ids = set()
    
    def relation(self, r):
        route = r.tags.get('route')
        if r.tags.get('type') == 'route' and route in ('bus', 'tram', 'subway', 'train', 'trolleybus', 'light_rail'):
            way_ids = [m.ref for m in r.members if m.type == 'w']
            if way_ids:
                self.rel_member_ways[r.id] = way_ids
                self.rel_tags[r.id] = dict((tag.k, tag.v) for tag in r.tags)
                self._needed_way_ids.update(way_ids)

pt_find = PTFindRelations()
pt_find.apply_file(PBF_PATH)
print(f"  Found {len(pt_find.rel_member_ways)} PT route relations")
print(f"  Needed way IDs: {len(pt_find._needed_way_ids)}")
print(f"  Route types: {set(r.get('route') for r in pt_find.rel_tags.values())}")

# Pass 3b: Get geometries for those ways
print("=== Pass 3b: Getting PT way geometries ===")
class PTWayGeomCollector(osmium.SimpleHandler):
    def __init__(self, needed_way_ids):
        super().__init__()
        self.needed_way_ids = needed_way_ids
        self.way_node_ids = {}
        self._needed_node_ids = set()
        self._node_coords = {}
    
    def way(self, w):
        if w.id in self.needed_way_ids:
            node_ids = [n.ref for n in w.nodes]
            self.way_node_ids[w.id] = node_ids
            self._needed_node_ids.update(node_ids)
    
    def node(self, n):
        if n.id in self._needed_node_ids:
            self._node_coords[n.id] = (n.location.lon, n.location.lat)

pt_ways = PTWayGeomCollector(pt_find._needed_way_ids)
pt_ways.apply_file(PBF_PATH)
print(f"  Found {len(pt_ways.way_node_ids)} ways out of {len(pt_find._needed_way_ids)} needed")

# Build way geometries
way_geoms = {}
for wid, node_ids in pt_ways.way_node_ids.items():
    coords = [pt_ways._node_coords[nid] for nid in node_ids if nid in pt_ways._node_coords]
    if len(coords) >= 2:
        way_geoms[wid] = geom.LineString(coords)
print(f"  Built {len(way_geoms)} way geometries")

# Build relation geometries (merge member ways into MultiLineString)
# Only include relations whose ways intersect the buffer
pt_relation_rows = []
for rel_id, way_ids in pt_find.rel_member_ways.items():
    member_lines = []
    for wid in way_ids:
        if wid in way_geoms:
            member_lines.append(way_geoms[wid])
    if not member_lines:
        continue
    
    # Check if any member intersects the buffer
    rel_union = ops.unary_union(member_lines)
    # For MultiLineString, if it crosses the buffer
    if rel_union.intersects(buffer_wgs):
        # Ensure it's a MultiLineString
        if rel_union.geom_type == 'LineString':
            rel_union = geom.MultiLineString([rel_union])
        pt_relation_rows.append((rel_union, pt_find.rel_tags[rel_id]))

print(f"  Found {len(pt_relation_rows)} PT relations crossing the buffer")

# ============================================================
# Write to GeoPackage
# ============================================================
print("=== Writing GeoPackage ===")

# Build highway GeoDataFrame
highway_data = []
highway_geoms_list = []
for line_geom, tags in highway_rows:
    highway_data.append(tags)
    highway_geoms_list.append(line_geom)

gdf_highways = gpd.GeoDataFrame(highway_data, geometry=highway_geoms_list, crs=wgs84)
gdf_highways = gdf_highways.to_crs(TARGET_CRS)
# Keep only the requested columns (plus geometry)
keep_cols = [c for c in ['name', 'highway', 'maxspeed', 'lanes', 'surface', 'oneway'] if c in gdf_highways.columns]
gdf_highways = gdf_highways[keep_cols + ['geometry'] if 'geometry' in gdf_highways.columns else keep_cols]
print(f"  Highways: {len(gdf_highways)} features")

# Build PT routes GeoDataFrame
pt_data = []
pt_geoms_list = []
for multi_line, tags in pt_relation_rows:
    pt_data.append(tags)
    pt_geoms_list.append(multi_line)

gdf_pt = gpd.GeoDataFrame(pt_data, geometry=pt_geoms_list, crs=wgs84)
gdf_pt = gdf_pt.to_crs(TARGET_CRS)
keep_pt_cols = [c for c in ['ref', 'name', 'operator', 'route'] if c in gdf_pt.columns]
gdf_pt = gdf_pt[keep_pt_cols + ['geometry'] if 'geometry' in gdf_pt.columns else keep_pt_cols]
print(f"  PT routes: {len(gdf_pt)} features")

# Write to GPKG
gdf_highways.to_file(OUT_PATH, layer='highways', driver='GPKG')
gdf_pt.to_file(OUT_PATH, layer='pt_routes', driver='GPKG')

print(f"Done! Output: {OUT_PATH}")
```

This looks good. Let me refine it and write the final script.

One concern: the `HighwayCollector` stores EVERY highway way in Austria (potentially millions) just to filter later by buffer intersection. This could use a lot of memory. But we reduce it by only storing node IDs and tags. Still, the _needed_nodes could be huge.

Actually, let me think of a better approach. Instead of storing all highway ways and their node IDs (which could be millions), I can:
1. First find the Gürtel ways and their buffer
2. Then do a single pass to collect both highways AND PT stuff
3. Use a handler with NodeLocationsForWays to get way geometries on the fly

With `osmium.NodeLocationsForWays`, the handler chains so that `way()` receives ways with already-resolved node locations. So I can check intersection in the way handler itself!

But the problem is: NodeLocationsForWays needs to store all node locations (or use an index), and for a whole country that's a lot of RAM.

A better approach: use a disk-based index for node locations. osmium supports `osmium.index.create_map('sparse_file_array', 'nodes.cache')` or similar.

Wait, but I need to do this in two separate passes (first to find Gürtel, second to find highways/PT). The NodeLocationsForWays handler would need to read all nodes first.

Let me reconsider. Maybe I should:
1. First pass: Find Gürtel ways + their node coords (small, just Vienna area)
2. Second pass with NodeLocationsForWays: Find highways intersecting buffer + PT route relations
   - Use a disk-based node location index so we don't use too much RAM
   - Filter on the fly: for each way, check if it intersects the buffer; for relations, check member ways

Actually, let me use a slightly different approach that's more memory efficient:

1. First pass: Find Gürtel ways, collect their node coords, build buffer
2. Second pass: Read the file but only process nodes that are near Vienna (within some bounding box slightly larger than the buffer). Use a handler with a node location cache.

Wait, osmium doesn't work like that - you can't skip nodes easily in SimpleHandler.

Let me just go with my original plan but optimize: instead of storing ALL Austrian highway ways, I'll first compute a bounding box around the buffer (to roughly filter ways by their node bounding box before checking intersection), or I'll just use NodeLocationsForWays with a disk-based index.

Actually, the simplest approach for highways: use `osmium.NodeLocationsForWays` with a handler that collects only ways whose bbox intersects the buffer's bbox. This way I don't have to store all ways and then filter.

But `NodeLocationsForWays` needs to see ALL nodes first to build the location cache. That's unavoidable.

Let me use a different strategy altogether. Since the Gürtel is in Vienna, and Vienna is a small part of Austria, I can:

1. Find bounding box of Gürtel + 500m buffer → ~(16.2, 48.1, 16.5, 48.3)
2. Use `osmium` with `apply_file(..., locations=True)` to get node locations
3. In the handler, collect ways that are within this bounding box

Wait, osmium's `apply_file` doesn't have a `locations` parameter in Python. You use `osmium.NodeLocationsForWays` as a handler wrapper.

Let me clean this up and just write a robust script. Let me go step-by-step.

Here's my revised plan:

Step 1: Find Gürtel ways + node coords. Store as LineStrings. Build buffer.

Step 2: Find highways intersecting buffer.
   - Create a handler that uses NodeLocationsForWays
   - Since this handler needs node locations for ALL nodes in the PBF, I'll use a disk-based index to avoid RAM issues
   - In the way() callback, check if way intersects buffer, collect if so

Step 3: Find PT route relations.
   - Collect relation IDs with member way IDs
   - Then collect those ways' geometries (using NodeLocationsForWays again)
   - Filter by intersection with buffer
   - Merge way geometries per relation into MultiLineString

Step 4: Write to GPKG

Let me use osmium.index.create_map for disk-based storage.

Here's the key part for osmium 4.x - NodeLocationsForWays accepts a storage location:

```python
from osmium.index import create_map

# Create a file-based node location index
node_locations = create_map("sparse_file_array", "nodes.cache")

# Use it with NodeLocationsForWays
handler = osmium.NodeLocationsForWays(node_locations)
```

Actually let me check what storage types are available:

```
flex_mmap, mmap_file_array, sparse_file_array, sparse_mem_array, dense_file_array, dense_mmap_array
```

For a whole country, `sparse_file_array` or `dense_file_array` would work.

Let me finalize the script.
5:46
5:46
Bash python3 -c "
python3 -c "
from osmium.index import create_map
# Check available map types
import osmium
print(dir(osmium))
print('---')
import osmium.index as idx
print(dir(idx))
" 2>&1
['BackReferenceWriter', 'BaseFilter', 'BaseHandler', 'BufferIterator', 'FileProcessor', 'ForwardReferenceWriter', 'IdTracker', 'InvalidLocationError', 'MergeInputReader', 'NodeLocationsForWays', 'OsmFileIterator', 'SimpleHandler', 'SimpleWriter', 'WriteHandler', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '_osmium', 'apply', 'area', 'back_reference_writer', 'file_processor', 'filter', 'forward_reference_writer', 'geom', 'helper', 'index', 'io', 'make_simple_handler', 'osm', 'osmium', 'simple_handler', 'zip_processors']
---
['IdSet', 'LocationTable', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'create_map', 'map_types']
5:51
Let me check the available map types to understand how to use NodeLocationsForWays.
5:51
Bash python3 -c "
python3 -c "
from osmium.index import map_types, create_map
print('Map types:', map_types)
" 2>&1
Map types: <built-in method map_types of pybind11_builtins.pybind11_detail_function_record_v1_system_libstdcpp_gxx_abi_1xxx_use_cxx11_abi_1 object at 0x7f4d1f318250>
5:51
Bash python3 -c "
python3 -c "
import osmium
help(osmium.NodeLocationsForWays.__init__)
" 2>&1
Help on method descriptor __init__ in module osmium._osmium:

__init__(...)
    __init__(self: osmium._osmium.NodeLocationsForWays, arg0: osmium::index::map::Map<unsigned long, osmium::Location>) -> None
6:37
So I need to create a map with `create_map` and pass it to `NodeLocationsForWays`. Let me write the script.

Here's my plan:

1. Download the Austria PBF
2. First pass: find all Gürtel ways (name ends in 'Gürtel') and create a 500m buffer around them
3. Second pass: extract all highways intersecting that buffer
4. Third pass: extract all PT route relations that cross the buffer

Actually, osmium is a single-pass handler framework. Let me think about the most efficient approach.

Approach:
- Pass 1: Read the PBF, find all way IDs whose name ends in 'Gürtel', assemble their geometries using NodeLocationsForWays, buffer them by 500m, and save the buffer geometry.
- Pass 2: Read the PBF again, find all ways that are highways intersecting the buffer, and all relations that are PT routes whose member ways intersect the buffer, then assemble geometry for each.

Actually, with osmium I can do multiple things in one pass if I handle it carefully. But it's tricky because NodeLocationsForWays is needed for way geometry.

Let me think about this differently.

Option A: Two-pass approach
1. First pass: Collect Gürtel way nodes and build buffer
2. Second pass: Extract intersecting highways and PT routes

Option B: Use osmium's file_processor which handles node locations automatically.

Let me use a simpler approach - I'll do it in stages:

Stage 1: Download the PBF
Stage 2: Use osmium to find Gürtel ways and create buffer
Stage 3: Use osmium to extract highways in buffer
Stage 4: Use osmium to extract PT routes in buffer
Stage 5: Write to GeoPackage

Actually, let me think about a cleaner approach using a handler that stores everything.

Let me write a Python script that:

1. Downloads the PBF
2. First pass: reads the PBF with NodeLocationsForWays, collects all ways whose name ends in 'Gürtel', builds their linestrings, buffers by 500m
3. Second pass: reads the PBF again, collects:
   a. All highway ways that intersect the buffer -> save as LineString
   b. All route relations (bus, tram, subway, train, trolleybus, light_rail) that intersect the buffer -> merge member ways into MultiLineString

For relations, I need to:
- Find relations with type=route and route in [bus, tram, subway, train, trolleybus, light_rail]
- For each such relation, collect the member ways that are in the buffer
- Merge those ways into a MultiLineString

Actually, for PT routes, the instruction says "each route as a single feature" - meaning each relation becomes one feature, with geometry being the union of all member way geometries.

Let me write this step by step.

For the first pass, I'll use SimpleHandler with NodeLocationsForWays.

Let me write the complete script.
6:37
6:37
Write /work/solve.py
content · 483 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor buffer) and PT route relations from Austria OSM PBF
and write to vienna_network.gpkg in EPSG:31287 (MGI / Austria Lambert).
"""

import os
import sys
import json
import urllib.request
import time
import warnings

import osmium
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import LineString, MultiLineString, Point, box
from shapely.ops import unary_union, transform
from pyproj import Transformer
from collections import defaultdict

warnings.filterwarnings("ignore")

# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
PBF_URL = "https://download.geofabrik.de/europe/austria-latest.osm.pbf"
PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287  # MGI / Austria Lambert

# ---------------------------------------------------------------------------
# Helper: download with retry
# ---------------------------------------------------------------------------
def download_file(url, path, max_retries=3):
    for attempt in range(1, max_retries + 1):
        try:
            print(f"Downloading {url} ... (attempt {attempt})")
            urllib.request.urlretrieve(url, path)
            size_mb = os.path.getsize(path) / (1024 * 1024)
            print(f"Downloaded {size_mb:.0f} MB")
            return
        except Exception as e:
            print(f"Attempt {attempt} failed: {e}")
            if attempt < max_retries:
                time.sleep(5)
            else:
                raise RuntimeError(f"Failed to download after {max_retries} attempts")

if not os.path.exists(PBF_PATH):
    download_file(PBF_URL, PBF_PATH)
else:
    print(f"Using existing {PBF_PATH}")

# ---------------------------------------------------------------------------
# Coordinate transforms
# ---------------------------------------------------------------------------
# WGS84 -> MGI / Austria Lambert
wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
# MGI -> WGS84 (for buffer in metres we work in projected CRS)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

# ---------------------------------------------------------------------------
# STAGE 1: Find Gürtel ways, build their projected geometry, buffer by 500 m
# ---------------------------------------------------------------------------
print("\n=== STAGE 1: Identifying Gürtel corridor ===")

class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.node_locations = {}  # id -> (lon, lat)

    def way(self, w):
        name = w.tags.get("name", "")
        if name.endswith("Gürtel") or name.endswith("Guertel"):
            self.guertel_way_ids.add(w.id)
            # Store node references for later geometry building
            self._store_nodes(w)

    def _store_nodes(self, w):
        # We'll handle this via NodeLocationsForWays in the processing
        pass

    def node(self, n):
        if n.id in self.node_locations or True:  # store all for completeness
            pass  # handled by NodeLocationsForWays

# Use NodeLocationsForWays to get full way geometries
class GuertelGeometryCollector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_linestrings = []  # in WGS84
        self.way_nodes = {}

    def way(self, w):
        name = w.tags.get("name", "")
        if name.endswith("Gürtel") or name.endswith("Guertel"):
            self.guertel_way_ids.add(w.id)
            # Collect node references
            self.way_nodes[w.id] = [n.ref for n in w.nodes]

# First pass with NodeLocationsForWays to find Gürtel way nodes
print("  Reading PBF to find Gürtel ways...")

# We'll do this in two sub-steps:
# 1. Find Gürtel way IDs and their node refs
# 2. Collect node locations for those nodes
# 3. Build geometries

# Actually, let's use a different approach: use osmium's file_processor
# or just do a two-pass approach with location handler.

class WayNodeCollector(osmium.SimpleHandler):
    """Collect Gürtel way IDs and their node references."""
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}  # way_id -> [node_refs]

    def way(self, w):
        name = w.tags.get("name", "")
        if name and ("Gürtel" in name or "Guertel" in name) and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(w.id)
            self.guertel_way_nodes[w.id] = [n.ref for n in w.nodes]

class LocationHandler(osmium.SimpleHandler):
    """Store node locations for nodes we care about."""
    def __init__(self, wanted_nodes):
        super().__init__()
        self.wanted_nodes = wanted_nodes
        self.locations = {}  # node_id -> (lon, lat)

    def node(self, n):
        if n.id in self.wanted_nodes:
            self.locations[n.id] = (n.location.lon, n.location.lat)

# Sub-pass 1: find Gürtel way IDs and node refs
collector = WayNodeCollector()
collector.apply_file(PBF_PATH)

print(f"  Found {len(collector.guertel_way_ids)} Gürtel ways")

# Collect all unique node references from Gürtel ways
all_guertel_nodes = set()
for way_id, node_refs in collector.guertel_way_nodes.items():
    for ref in node_refs:
        all_guertel_nodes.add(ref)

print(f"  Total Gürtel way node references: {len(all_guertel_nodes)}")

# Sub-pass 2: get locations of those nodes
loc_handler = LocationHandler(all_guertel_nodes)
loc_handler.apply_file(PBF_PATH)

print(f"  Retrieved {len(loc_handler.locations)} node locations")

# Build Gürtel linestrings in WGS84
guertel_lines_wgs = []
for way_id, node_refs in collector.guertel_way_nodes.items():
    coords = []
    valid = True
    for ref in node_refs:
        if ref in loc_handler.locations:
            coords.append(loc_handler.locations[ref])
        else:
            valid = False
            break
    if valid and len(coords) >= 2:
        guertel_lines_wgs.append(LineString(coords))

print(f"  Built {len(guertel_lines_wgs)} Gürtel linestrings")

# Project to MGI / Austria Lambert and buffer by 500 m
def project_line(geom):
    if geom is None:
        return None
    return transform(lambda x, y: wgs_to_mgi.transform(x, y), geom)

guertel_lines_mgi = [project_line(ln) for ln in guertel_lines_wgs if ln is not None]

# Union all Gürtel geometries and buffer
if len(guertel_lines_mgi) == 1:
    guertel_union_mgi = guertel_lines_mgi[0]
else:
    guertel_union_mgi = unary_union(guertel_lines_mgi)

buffer_mgi = guertel_union_mgi.buffer(500.0)

# Also compute bounding box for faster filtering
buffer_bounds_mgi = buffer_mgi.bounds  # (minx, miny, maxx, maxy)
buffer_bbox_mgi = box(*buffer_bounds_mgi)

print(f"  Buffer area: {buffer_mgi.area / 1e6:.2f} km²")

# ---------------------------------------------------------------------------
# STAGE 2: Extract highway ways intersecting the buffer
# ---------------------------------------------------------------------------
print("\n=== STAGE 2: Extracting highway ways ===")

class HighwayCollector(osmium.SimpleHandler):
    """Collect highway way IDs and their node refs for ways intersecting the buffer."""
    def __init__(self, buffer_geom_wgs84, buffer_bbox_wgs84):
        super().__init__()
        self.buffer_geom_wgs84 = buffer_geom_wgs84
        self.buffer_bbox_wgs84 = buffer_bbox_wgs84
        self.highway_way_nodes = {}  # way_id -> {tags, node_refs}
        self.highway_way_tags = {}   # way_id -> tag dict
        self.all_highway_nodes = set()

    def way(self, w):
        if "highway" not in w.tags:
            return
        # Collect all highway ways initially; we'll filter by geometry later
        way_id = w.id
        node_refs = [n.ref for n in w.nodes]
        tags = {}
        for k, v in w.tags:
            tags[k] = v

        self.highway_way_nodes[way_id] = node_refs
        self.highway_way_tags[way_id] = tags
        for ref in node_refs:
            self.all_highway_nodes.add(ref)

# Convert buffer to WGS84 for quick bounds check
buffer_geom_wgs84 = transform(lambda x, y: mgi_to_wgs.transform(x, y), buffer_mgi)
buffer_bounds_wgs84 = buffer_geom_wgs84.bounds
buffer_bbox_wgs84 = box(*buffer_bounds_wgs84)

print("  Reading PBF for highway ways...")
hw_collector = HighwayCollector(buffer_geom_wgs84, buffer_bbox_wgs84)
hw_collector.apply_file(PBF_PATH)

print(f"  Found {len(hw_collector.highway_way_nodes)} highway ways (pre-geometry-filter)")

# Get node locations for all highway nodes
print(f"  Retrieving {len(hw_collector.all_highway_nodes)} node locations...")
loc_handler2 = LocationHandler(hw_collector.all_highway_nodes)
loc_handler2.apply_file(PBF_PATH)
print(f"  Retrieved {len(loc_handler2.locations)} node locations")

# Build geometries and filter by buffer intersection
highway_features = []  # list of dicts with geometry and properties

for way_id, node_refs in hw_collector.highway_way_nodes.items():
    coords = []
    valid = True
    for ref in node_refs:
        if ref in loc_handler2.locations:
            coords.append(loc_handler2.locations[ref])
        else:
            valid = False
            break
    if not valid or len(coords) < 2:
        continue

    line_wgs = LineString(coords)

    # Quick bbox check
    if not line_wgs.intersects(buffer_bbox_wgs84):
        continue

    # Full intersection check
    if not line_wgs.intersects(buffer_geom_wgs84):
        continue

    # Project to MGI
    line_mgi = project_line(line_wgs)
    if line_mgi is None:
        continue

    tags = hw_collector.highway_way_tags[way_id]

    highway_features.append({
        "geometry": line_mgi,
        "properties": {
            "osm_id": way_id,
            "name": tags.get("name"),
            "highway": tags.get("highway"),
            "maxspeed": tags.get("maxspeed"),
            "lanes": tags.get("lanes"),
            "surface": tags.get("surface"),
            "oneway": tags.get("oneway"),
        }
    })

print(f"  Collected {len(highway_features)} highway features intersecting the buffer")

# ---------------------------------------------------------------------------
# STAGE 3: Extract public-transport route relations
# ---------------------------------------------------------------------------
print("\n=== STAGE 3: Extracting PT route relations ===")

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

class PTRelationCollector(osmium.SimpleHandler):
    """Collect PT route relations and their member ways."""
    def __init__(self):
        super().__init__()
        self.relations = {}  # rel_id -> {tags, member_way_ids}
        self.all_relation_way_ids = set()

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return

        rel_id = r.id
        tags = {}
        for k, v in r.tags:
            tags[k] = v

        member_way_ids = []
        for member in r.members:
            if member.type == "w":
                member_way_ids.append(member.ref)

        if not member_way_ids:
            return

        self.relations[rel_id] = {
            "tags": tags,
            "member_way_ids": member_way_ids,
        }
        for wid in member_way_ids:
            self.all_relation_way_ids.add(wid)

print("  Reading PBF for PT route relations...")
pt_collector = PTRelationCollector()
pt_collector.apply_file(PBF_PATH)

print(f"  Found {len(pt_collector.relations)} PT route relations")

# Get node locations for all member way nodes
# First, we need to know which nodes the member ways use. Let's do another PBF pass.
class MemberWayNodeCollector(osmium.SimpleHandler):
    def __init__(self, wanted_way_ids):
        super().__init__()
        self.wanted_way_ids = wanted_way_ids
        self.way_nodes = {}  # way_id -> [node_refs]
        self.all_nodes = set()

    def way(self, w):
        if w.id in self.wanted_way_ids:
            refs = [n.ref for n in w.nodes]
            self.way_nodes[w.id] = refs
            for ref in refs:
                self.all_nodes.add(ref)

print("  Reading PBF for member way node refs...")
mw_collector = MemberWayNodeCollector(pt_collector.all_relation_way_ids)
mw_collector.apply_file(PBF_PATH)

print(f"  Found {len(mw_collector.way_nodes)} member ways with {len(mw_collector.all_nodes)} node refs")

# Get node locations
print(f"  Retrieving {len(mw_collector.all_nodes)} node locations...")
loc_handler3 = LocationHandler(mw_collector.all_nodes)
loc_handler3.apply_file(PBF_PATH)
print(f"  Retrieved {len(loc_handler3.locations)} node locations")

# Build way geometries in WGS84
way_geoms_wgs = {}
for way_id, node_refs in mw_collector.way_nodes.items():
    coords = []
    valid = True
    for ref in node_refs:
        if ref in loc_handler3.locations:
            coords.append(loc_handler3.locations[ref])
        else:
            valid = False
            break
    if valid and len(coords) >= 2:
        way_geoms_wgs[way_id] = LineString(coords)

print(f"  Built geometries for {len(way_geoms_wgs)} ways")

# For each relation, collect member way geometries that intersect the buffer
pt_features = []

for rel_id, rel_data in pt_collector.relations.items():
    member_lines_wgs = []
    for wid in rel_data["member_way_ids"]:
        if wid in way_geoms_wgs:
            line = way_geoms_wgs[wid]
            # Quick check: does this way intersect the buffer?
            if line.intersects(buffer_bbox_wgs84) and line.intersects(buffer_geom_wgs84):
                member_lines_wgs.append(line)

    if not member_lines_wgs:
        continue

    # Merge into a single MultiLineString
    if len(member_lines_wgs) == 1:
        geom_wgs = member_lines_wgs[0]
    else:
        try:
            merged = unary_union(member_lines_wgs)
            if merged.geom_type == "MultiLineString":
                geom_wgs = merged
            elif merged.geom_type == "LineString":
                geom_wgs = MultiLineString([merged])
            else:
                # Could be GeometryCollection or something else - skip
                continue
        except Exception:
            continue

    # Project to MGI
    geom_mgi = transform(lambda x, y: wgs_to_mgi.transform(x, y), geom_wgs)

    tags = rel_data["tags"]
    pt_features.append({
        "geometry": geom_mgi,
        "properties": {
            "osm_id": rel_id,
            "ref": tags.get("ref"),
            "name": tags.get("name"),
            "operator": tags.get("operator"),
            "route": tags.get("route"),
        }
    })

print(f"  Collected {len(pt_features)} PT route features intersecting the buffer")

# ---------------------------------------------------------------------------
# STAGE 4: Write to GeoPackage
# ---------------------------------------------------------------------------
print("\n=== STAGE 4: Writing GeoPackage ===")

# Build GeoDataFrames
gdf_highways = gpd.GeoDataFrame(
    [f["properties"] for f in highway_features],
    geometry=[f["geometry"] for f in highway_features],
    crs=f"EPSG:{TARGET_EPSG}"
)

gdf_pt = gpd.GeoDataFrame(
    [f["properties"] for f in pt_features],
    geometry=[f["geometry"] for f in pt_features],
    crs=f"EPSG:{TARGET_EPSG}"
)

# Ensure geometry types are correct
gdf_highways = gdf_highways.set_geometry("geometry")
gdf_highways = gdf_highways[gdf_highways.geometry.type == "LineString"]

gdf_pt = gdf_pt.set_geometry("geometry")
# Ensure MultiLineString
for idx in gdf_pt.index:
    geom = gdf_pt.at[idx, "geometry"]
    if geom.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([geom])

print(f"  Highways: {len(gdf_highways)} features")
print(f"  PT routes: {len(gdf_pt)} features")

# Write to GeoPackage
gdf_highways.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")

print(f"\nWritten to {GPKG_PATH}")

# Verify
print("\n=== Verification ===")
for layer_name in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer_name)
    print(f"\nLayer '{layer_name}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    print(f"  Columns: {list(gdf.columns)}")
    if len(gdf) > 0:
        print(f"  Sample properties:")
        for col in gdf.columns:
            if col != "geometry":
                vals = gdf[col].dropna().unique()[:5]
                print(f"    {col}: {list(vals)}")

print("\nDone!")
Wrote /work/solve.py (16710 chars).
6:41
The script looks quite long. Let me think about potential issues and run it.

Key concerns:
1. The PBF is large (~300-400 MB). The script downloads it if not present.
2. We're making multiple passes over the PBF which is slow but necessary with osmium.
3. The `name.endswith("Gürtel")` check needs to account for possible trailing whitespace.
4. We need to ensure the node location handler properly retrieves node coordinates.

Let me run it and see what happens.
6:41
Bash cd /work && python3 solve.py 2>&1
<exit_code>-1</exit_code>
<error>Command timed out after 600000ms</error>
16:57
The script timed out. The PBF file is likely very large and reading it multiple times is expensive. Let me restructure the approach to be more efficient:

1. Download the PBF first (separate command)
2. Use a single-pass approach where possible
3. Or use duckdb/osmium CLI to do filtering more efficiently

Actually, the main issue is that the script does a full PBF read several times. Let me think about this differently.

The PBF for all of Austria is probably around 1-2 GB. Reading it 5 times (Gürtel finder, highway node collector, highway location handler, PT relation collector, member way node collector, location handler 3) = 6 full PBF reads. That's way too slow.

Let me restructure to do:
1. One pass: collect Gürtel way info + all highway ways + PT relations, and all the node refs we need
2. A second pass (with NodeLocationsForWays) to get geometry

Actually, let me use NodeLocationsForWays in a single pass approach. The issue is that `apply` with NodeLocationsForWays caches all node locations, so we can only do one pass. But we can collect everything in one handler.

Let me restructure:
- Single pass with NodeLocationsForWays that:
  - Finds Gürtel ways and builds their geometries
  - Collects all highway ways that might intersect (just IDs and node refs)
  - Collects all PT relations and member way IDs
  - Stores ALL node locations needed (via NodeLocationsForWays)

Wait, NodeLocationsForWays stores ALL node locations from the file, which would be all of Austria - that's a lot of memory. But it's the only way to resolve way geometries on the fly.

Actually, a better approach: use a handler that first collects everything in memory (way IDs, node refs, etc.), then build geometries from the cached locations. But we need to get node locations somehow.

Let me try a single-pass approach that just collects all the IDs and node refs in one pass, then queries node locations. Actually, the problem is we still need to do separate passes.

Alternative: use osmium's file_processor or use NodeLocationsForWays handler in one pass.

Let me think about this more carefully.

Actually the issue is the 10-minute timeout. Let me try:
1. First download the PBF with a CLI command (wget has better resume support)
2. Then run the script more efficiently

Let me rewrite the script to minimize passes. With NodeLocationsForWays, we get node locations for free - it stores all locations in a map as it processes nodes, and makes them available when processing ways. So we can do everything in ONE pass with NodeLocationsForWays.

But the memory usage might be high for all of Austria. However, it should work.

Let me restructure:

One handler:
- Relies on NodeLocationsForWays for node locations
- On way(): record Gürtel ways (build line, add to buffer computation), record highway ways with their full geometry
- On relation(): record PT relations with member way IDs
- On node(): nothing special needed as NodeLocationsForWays handles it

But the problem is that NodeLocationsForWays processes ALL nodes in the file, which for Austria could be hundreds of millions. This is slow but might work within the timeout.

Actually, I think the download itself might be taking long. Let me download separately first, then run the script.

Wait, the timeout was on the entire python script. Let me check if the download completed or if it's stuck on reading.

Let me try downloading first with curl/wget, then writing a more optimized script.
16:57
16:57
Bash ls -lh /work/austria-latest.osm.pbf 2>&1
-rw-r--r-- 1 agent agent 765M Jun 17 03:18 /work/austria-latest.osm.pbf
17:45
OK, the PBF is 765 MB and the script timed out because it was doing 4 full passes over the file. Let me rewrite to be much more efficient:

1. Single pass: collect Gürtel ways, highway ways, PT relations, and node references all at once
2. Then maybe a second pass for node locations

Actually, with osmium SimpleHandler we can collect everything in one pass, then a second pass for node locations. But the issue is that NodeLocationsForWays adds locations to ways as they're read, which is designed for a single pass. But we can't use it if we need to also look at relations.

Let me think about this differently. The most efficient approach:
- Pass 1: Use a handler that:
  - Collects Gürtel way IDs + their node refs
  - Collects all highway way IDs + their node refs
  - Collects PT relation IDs + member way IDs + tags
  - Collects all node IDs from everything above
- Then we know exactly which nodes we need.
- Pass 2: Use a handler that only stores locations for those nodes
- Then build geometries in memory

But it seems the big PBF + multiple passes is causing the timeout. Let me try a different approach: use osmium to do a single pass with location caching, store results to intermediate files, then process.

Actually, the key optimization: instead of reading the whole file multiple times, let me use `NodeLocationsForWays` which can handle node locations in a single pass. But the challenge is that we need to:
1. First identify Gürtel ways (to build buffer)
2. Then identify highways and relations within that buffer

This is inherently a two-phase process. But we can combine phases 1 and initial data collection.

Let me write a more efficient version:
- Pass 1: Find Gürtel ways, highway ways (all of them), PT relations - all at once. Store all node refs.
- Pass 2: Get all needed node locations at once
- Then filter geometries in memory

Wait, actually the issue might be that the script is slow because it's doing 4 full passes of a 765MB file. Let me reduce to 2 passes.

Actually re-reading my code - I'm doing:
1. WayNodeCollector -> whole file (Gürtel)
2. LocationHandler -> whole file (Gürtel nodes)
3. HighwayCollector -> whole file (highway ways)
4. LocationHandler2 -> whole file (highway nodes)
5. PTRelationCollector -> whole file (relations)
6. MemberWayNodeCollector -> whole file (member way nodes)
7. LocationHandler3 -> whole file (member way nodes)

That's 7 passes! Way too many.

Let me consolidate into a single pass that collects everything, then one more pass for all node locations.
17:45
17:45
Write /work/solve.py
content · 349 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor buffer) and PT route relations from Austria OSM PBF
and write to vienna_network.gpkg in EPSG:31287 (MGI / Austria Lambert).

Two-pass approach:
  Pass 1 - collect:  Gürtel ways, all highway ways, PT relations + member way refs, all node refs
  Pass 2 - locations: retrieve coordinates for all needed nodes
  Then filter by buffer intersection and write.
"""

import os, sys, json, time, warnings
import osmium
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import LineString, MultiLineString, Point, box
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

# ---------------------------------------------------------------------------
PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287  # MGI / Austria Lambert

# ---------------------------------------------------------------------------
wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

def project_wgs_to_mgi(geom):
    return transform(lambda x, y: wgs_to_mgi.transform(x, y), geom) if geom else None

def project_mgi_to_wgs(geom):
    return transform(lambda x, y: mgi_to_wgs.transform(x, y), geom) if geom else None

# ---------------------------------------------------------------------------
# PASS 1: Collect everything
# ---------------------------------------------------------------------------
print("PASS 1: Collecting way/relation metadata...")

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

class Collector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        # Gürtel ways
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}   # way_id -> [node_ref]
        # Highway ways (all of them; we'll filter spatially later)
        self.highway_way_nodes = {}   # way_id -> [node_ref]
        self.highway_way_tags = {}    # way_id -> tag dict
        # PT relations
        self.pt_relations = {}        # rel_id -> {tags, member_way_ids}
        # All node refs we'll need
        self.all_node_refs = set()

    def way(self, w):
        tags = dict((k, v) for k, v in w.tags)
        name = tags.get("name", "")
        way_id = w.id
        node_refs = [n.ref for n in w.nodes]

        # Gürtel detection
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(way_id)
            self.guertel_way_nodes[way_id] = node_refs
            for ref in node_refs:
                self.all_node_refs.add(ref)

        # Highway detection
        if "highway" in tags:
            self.highway_way_nodes[way_id] = node_refs
            self.highway_way_tags[way_id] = tags
            for ref in node_refs:
                self.all_node_refs.add(ref)

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        member_way_ids = [m.ref for m in r.members if m.type == "w"]
        if not member_way_ids:
            return
        self.pt_relations[r.id] = {"tags": tags, "member_way_ids": member_way_ids}
        # We'll collect member way node refs later — store member way IDs
        # They'll be in highway_way_nodes if they exist

collector = Collector()
collector.apply_file(PBF_PATH)

print(f"  Gürtel ways: {len(collector.guertel_way_ids)}")
print(f"  Highway ways (all): {len(collector.highway_way_nodes)}")
print(f"  PT relations: {len(collector.pt_relations)}")
print(f"  Unique node refs: {len(collector.all_node_refs)}")

# ---------------------------------------------------------------------------
# Build Gürtel geometry from stored node refs — need node locations
# PASS 2: Get node locations for everything
# ---------------------------------------------------------------------------
print("\nPASS 2: Retrieving node locations...")

class NodeLocationHandler(osmium.SimpleHandler):
    def __init__(self, wanted_nodes):
        super().__init__()
        self.wanted_nodes = wanted_nodes
        self.locations = {}  # node_id -> (lon, lat)

    def node(self, n):
        if n.id in self.wanted_nodes:
            self.locations[n.id] = (n.location.lon, n.location.lat)

loc_handler = NodeLocationHandler(collector.all_node_refs)
loc_handler.apply_file(PBF_PATH)
print(f"  Retrieved {len(loc_handler.locations)} node locations")

# ---------------------------------------------------------------------------
# Build Gürtel geometry in WGS84
# ---------------------------------------------------------------------------
print("\nBuilding Gürtel geometry...")
guertel_lines = []
for way_id, node_refs in collector.guertel_way_nodes.items():
    coords = []
    ok = True
    for ref in node_refs:
        if ref in loc_handler.locations:
            coords.append(loc_handler.locations[ref])
        else:
            ok = False
            break
    if ok and len(coords) >= 2:
        guertel_lines.append(LineString(coords))

if not guertel_lines:
    print("ERROR: No Gürtel ways found!")
    sys.exit(1)

print(f"  Built {len(guertel_lines)} Gürtel linestrings")

# Project, union, buffer 500 m
guertel_lines_mgi = [project_wgs_to_mgi(ln) for ln in guertel_lines]
guertel_union_mgi = unary_union(guertel_lines_mgi) if len(guertel_lines_mgi) > 1 else guertel_lines_mgi[0]
buffer_mgi = guertel_union_mgi.buffer(500.0)

# WGS84 version for spatial filtering
buffer_wgs84 = project_mgi_to_wgs(buffer_mgi)
buffer_bbox_wgs84 = box(*buffer_wgs84.bounds)

print(f"  Buffer area: {buffer_mgi.area / 1e6:.2f} km²")

# ---------------------------------------------------------------------------
# Filter highway ways by intersection with buffer
# ---------------------------------------------------------------------------
print("\nFiltering highway ways by buffer intersection...")

def build_line_wgs(node_refs, locations):
    coords = []
    for ref in node_refs:
        if ref in locations:
            coords.append(locations[ref])
        else:
            return None
    if len(coords) < 2:
        return None
    return LineString(coords)

highway_features = []
hw_ids_done = set()

for way_id, node_refs in collector.highway_way_nodes.items():
    line_wgs = build_line_wgs(node_refs, loc_handler.locations)
    if line_wgs is None:
        continue
    if not line_wgs.intersects(buffer_bbox_wgs84):
        continue
    if not line_wgs.intersects(buffer_wgs84):
        continue

    line_mgi = project_wgs_to_mgi(line_wgs)
    tags = collector.highway_way_tags[way_id]
    highway_features.append({
        "geometry": line_mgi,
        "name": tags.get("name"),
        "highway": tags.get("highway"),
        "maxspeed": tags.get("maxspeed"),
        "lanes": tags.get("lanes"),
        "surface": tags.get("surface"),
        "oneway": tags.get("oneway"),
        "osm_id": way_id,
    })
    hw_ids_done.add(way_id)

print(f"  Highway features intersecting buffer: {len(highway_features)}")

# ---------------------------------------------------------------------------
# Build PT route features
# ---------------------------------------------------------------------------
print("\nBuilding PT route features...")

# Build a lookup of way_id -> LineString (WGS84) for all highway ways that we have coords for
# But we need the member ways of PT relations — which may not be tagged as highway!
# Let's build from all known ways (those in highway_way_nodes that have geometries)

# Actually many PT routes use ways that ARE highways, but some might be dedicated rail tracks.
# We need to consider ways that PT relations reference. Let's check which member ways
# we have geometries for.

# Build all possible way geometries from our node cache
print("  Building member way geometries...")
way_geom_cache = {}  # way_id -> LineString WGS84
way_ids_with_geom = set()

# We already processed highway_way_nodes - use that cache
for way_id, node_refs in collector.highway_way_nodes.items():
    line = build_line_wgs(node_refs, loc_handler.locations)
    if line is not None:
        way_geom_cache[way_id] = line
        way_ids_with_geom.add(way_id)

# Also check PT member ways that might not be highways
# Collect any additional member way IDs we don't have yet
extra_way_ids = set()
for rel_id, rel_data in collector.pt_relations.items():
    for wid in rel_data["member_way_ids"]:
        if wid not in way_ids_with_geom:
            extra_way_ids.add(wid)

print(f"  PT member ways already in cache: {len(way_ids_with_geom)}")
print(f"  PT member ways needing extra fetch: {len(extra_way_ids)}")

# We need node refs for these extra ways. We don't have them stored.
# We could do another pass, but that's expensive.
# For the missing ones, try to build from what we can...

# Actually, let's do a targeted pass for these extra way node refs
if extra_way_ids:
    print("  Fetching extra way node refs and their locations...")
    class ExtraWayCollector(osmium.SimpleHandler):
        def __init__(self, wanted_way_ids):
            super().__init__()
            self.wanted_way_ids = wanted_way_ids
            self.way_nodes = {}
            self.all_nodes = set()
        def way(self, w):
            if w.id in self.wanted_way_ids:
                refs = [n.ref for n in w.nodes]
                self.way_nodes[w.id] = refs
                for ref in refs:
                    self.all_nodes.add(ref)

    extra = ExtraWayCollector(extra_way_ids)
    extra.apply_file(PBF_PATH)

    extra_locs = NodeLocationHandler(extra.all_nodes)
    extra_locs.apply_file(PBF_PATH)

    for way_id, node_refs in extra.way_nodes.items():
        line = build_line_wgs(node_refs, extra_locs.locations)
        if line is not None:
            way_geom_cache[way_id] = line

print(f"  Total way geometries available: {len(way_geom_cache)}")

# Build PT features
pt_features = []
for rel_id, rel_data in collector.pt_relations.items():
    member_lines = []
    for wid in rel_data["member_way_ids"]:
        if wid in way_geom_cache:
            line = way_geom_cache[wid]
            if line.intersects(buffer_wgs84):
                member_lines.append(line)

    if not member_lines:
        continue

    # Build MultiLineString
    if len(member_lines) == 1:
        geom_wgs = member_lines[0]
    else:
        merged = unary_union(member_lines)
        if merged.geom_type == "MultiLineString":
            geom_wgs = merged
        elif merged.geom_type == "LineString":
            geom_wgs = MultiLineString([merged])
        else:
            continue

    geom_mgi = project_wgs_to_mgi(geom_wgs)
    tags = rel_data["tags"]
    pt_features.append({
        "geometry": geom_mgi,
        "osm_id": rel_id,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  PT route features intersecting buffer: {len(pt_features)}")

# ---------------------------------------------------------------------------
# Write GeoPackage
# ---------------------------------------------------------------------------
print("\nWriting GeoPackage...")

# Build DataFrames
hw_df = pd.DataFrame(highway_features)
pt_df = pd.DataFrame(pt_features)

gdf_hw = gpd.GeoDataFrame(hw_df, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
gdf_pt = gpd.GeoDataFrame(pt_df, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")

# Ensure LineString for highways
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]
# Ensure MultiLineString for PT routes
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])

print(f"  Highways layer: {len(gdf_hw)} features")
print(f"  PT routes layer: {len(gdf_pt)} features")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")

print(f"\nWritten to {GPKG_PATH}")

# ---------------------------------------------------------------------------
# Verify
# ---------------------------------------------------------------------------
print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    print(f"  Columns: {list(gdf.columns)}")
    if len(gdf) > 0:
        for col in gdf.columns:
            if col != "geometry":
                vals = gdf[col].dropna().unique()[:5]
                print(f"    {col}: {list(vals)}")

print("\nDone!")
Wrote /work/solve.py (12810 chars).
17:52
The script now does 2 full PBF passes (one for collecting metadata, one for node locations) and potentially a third small pass for extra way refs. Let me run it with a longer timeout.
17:52
Bash cd /work && timeout 480 python3 solve.py 2>&1
PASS 1: Collecting way/relation metadata...
  Gürtel ways: 415
  Highway ways (all): 2501914
  PT relations: 12433
  Unique node refs: 23564536

PASS 2: Retrieving node locations...
  Retrieved 23564536 node locations

Building Gürtel geometry...
  Built 415 Gürtel linestrings
  Buffer area: 22.83 km²

Filtering highway ways by buffer intersection...
26:37
The script is timing out in the filtering phase because it's doing a geometric intersection test for each of the 2.5M highway ways, each requiring building a LineString from node refs. I need to optimize this significantly.

Key ideas:
1. First filter by bounding box before building the full geometry
2. Use the node locations more efficiently - maybe build the geometry only if way nodes fall within a bbox
3. Or use a different approach entirely

Actually, the main bottleneck is building LineStrings for 2.5M ways and doing intersection tests. I should:

1. First compute the WGS84 bounding box of the buffer
2. For each way, check if its nodes fall within the bbox before building the full geometry
3. Use spatial indexing via osmium's capabilities

Actually, a better approach: osmium has a location handler that can do on-the-fly geometry building. Let me use osmium's NodeLocationsForWays handler combined with a simpler handler that only stores ways that pass the bbox test.

Wait - the key insight is that we can't build LineStrings for 2.5M ways. Instead, I should do a pre-filter: for each way, check if any of its node references fall within the bbox of the buffer. This is much cheaper since I already have all node locations loaded.

Let me restructure:
1. Pre-compute the buffer bbox in WGS84
2. For each way, quick check if any node falls within the bbox (using node locations dict)
3. Only if yes, build the full line and do the full intersection test

This should dramatically reduce the number of full geometry constructions.
26:37
26:37
Write /work/solve.py
content · 318 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor buffer) and PT route relations from Austria OSM PBF
and write to vienna_network.gpkg in EPSG:31287.

Optimized: use node-location pre-filter to avoid building geoms for non-intersecting ways.
"""

import os, sys, warnings
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, box, Point
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

# ---------------------------------------------------------------------------
PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287

wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

def proj_wgs_to_mgi(geom):
    return transform(lambda x, y: wgs_to_mgi.transform(x, y), geom)

def proj_mgi_to_wgs(geom):
    return transform(lambda x, y: mgi_to_wgs.transform(x, y), geom)

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

# ---------------------------------------------------------------------------
# PASS 1: Collect Gürtel way IDs + node refs, PT relation member way IDs
# Also collect ALL node refs we'll ever need
# ---------------------------------------------------------------------------
print("=== PASS 1: Collect metadata ===")

class Collector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}
        self.highway_way_nodes = {}
        self.highway_way_tags = {}
        self.pt_relations = {}
        self.pt_member_way_ids = set()
        self.node_refs = set()

    def way(self, w):
        tags = dict((k, v) for k, v in w.tags)
        name = tags.get("name", "")
        wid = w.id
        refs = [n.ref for n in w.nodes]

        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(wid)
            self.guertel_way_nodes[wid] = refs

        if "highway" in tags:
            self.highway_way_nodes[wid] = refs
            self.highway_way_tags[wid] = tags

        for r in refs:
            self.node_refs.add(r)

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        mw = [m.ref for m in r.members if m.type == "w"]
        if not mw:
            return
        self.pt_relations[r.id] = {"tags": tags, "member_way_ids": mw}
        for wid in mw:
            self.pt_member_way_ids.add(wid)

collector = Collector()
collector.apply_file(PBF_PATH)
print(f"  Gürtel ways: {len(collector.guertel_way_ids)}")
print(f"  Highway ways: {len(collector.highway_way_nodes)}")
print(f"  PT relations: {len(collector.pt_relations)}")
print(f"  PT member ways: {len(collector.pt_member_way_ids)}")
print(f"  Unique node refs: {len(collector.node_refs)}")

# ---------------------------------------------------------------------------
# PASS 2: Get node locations
# ---------------------------------------------------------------------------
print("\n=== PASS 2: Node locations ===")

class LocHandler(osmium.SimpleHandler):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.locs = {}

    def node(self, n):
        if n.id in self.wanted:
            self.locs[n.id] = (n.location.lon, n.location.lat)

loc = LocHandler(collector.node_refs)
loc.apply_file(PBF_PATH)
print(f"  Got {len(loc.locs)} locations")

# ---------------------------------------------------------------------------
# Build Gürtel buffer
# ---------------------------------------------------------------------------
print("\n=== Build Gürtel buffer ===")

def build_line(refs, locs):
    coords = []
    for r in refs:
        if r in locs:
            coords.append(locs[r])
        else:
            return None
    return LineString(coords) if len(coords) >= 2 else None

guertel_lines = []
for wid, refs in collector.guertel_way_nodes.items():
    ln = build_line(refs, loc.locs)
    if ln:
        guertel_lines.append(ln)

guertel_mgi = [proj_wgs_to_mgi(ln) for ln in guertel_lines]
guertel_union = unary_union(guertel_mgi) if len(guertel_mgi) > 1 else guertel_mgi[0]
buffer_mgi = guertel_union.buffer(500.0)
buffer_wgs84 = proj_mgi_to_wgs(buffer_mgi)
buffer_bbox = buffer_wgs84.bounds  # (minx, miny, maxx, maxy)
bbox_poly = box(*buffer_bbox)
print(f"  Buffer area: {buffer_mgi.area/1e6:.2f} km²")
print(f"  Buffer bbox: {buffer_bbox}")

# Pre-compute bbox for fast node check
minx, miny, maxx, maxy = buffer_bbox

# ---------------------------------------------------------------------------
# Filter highways — use node bbox pre-check
# ---------------------------------------------------------------------------
print("\n=== Filter highways ===")

# For a way to potentially intersect the buffer, at least one of its nodes
# must be within the buffer bbox. Since we already have all locations,
# we can cheaply check node coordinates.
def way_passes_bbox(refs, locs):
    for r in refs:
        p = locs.get(r)
        if p and minx <= p[0] <= maxx and miny <= p[1] <= maxy:
            return True
    return False

highways_out = []
count, passed_bbox = 0, 0
for wid, refs in collector.highway_way_nodes.items():
    count += 1
    if count % 200000 == 0:
        print(f"    Progress: {count}/{len(collector.highway_way_nodes)} (bbox passed: {passed_bbox})")

    if not way_passes_bbox(refs, loc.locs):
        continue
    passed_bbox += 1

    ln = build_line(refs, loc.locs)
    if ln is None:
        continue
    if not ln.intersects(buffer_wgs84):
        continue

    ln_mgi = proj_wgs_to_mgi(ln)
    tags = collector.highway_way_tags[wid]
    highways_out.append({
        "geometry": ln_mgi,
        "osm_id": wid,
        "name": tags.get("name"),
        "highway": tags.get("highway"),
        "maxspeed": tags.get("maxspeed"),
        "lanes": tags.get("lanes"),
        "surface": tags.get("surface"),
        "oneway": tags.get("oneway"),
    })

print(f"  Total ways: {count}, bbox-passed: {passed_bbox}, intersected: {len(highways_out)}")

# ---------------------------------------------------------------------------
# Build PT route geometries
# ---------------------------------------------------------------------------
print("\n=== Build PT routes ===")

# First, ensure we have geometries for all PT member ways
# Find PT member ways we don't yet have
hw_way_ids = set(collector.highway_way_nodes.keys())
missing_pt_ways = collector.pt_member_way_ids - hw_way_ids
print(f"  PT member ways in highway set: {len(collector.pt_member_way_ids & hw_way_ids)}")
print(f"  PT member ways NOT in highway set: {len(missing_pt_ways)}")

# Collect these missing way node refs
if missing_pt_ways:
    class ExtraWayCollector(osmium.SimpleHandler):
        def __init__(self, wids):
            super().__init__()
            self.wanted = wids
            self.way_nodes = {}
            self.extra_nodes = set()
        def way(self, w):
            if w.id in self.wanted:
                refs = [n.ref for n in w.nodes]
                self.way_nodes[w.id] = refs
                for r in refs:
                    self.extra_nodes.add(r)

    extra = ExtraWayCollector(missing_pt_ways)
    extra.apply_file(PBF_PATH)

    # Get locations for extra nodes
    extra_locs = LocHandler(extra.extra_nodes)
    extra_locs.apply_file(PBF_PATH)

    # Build geometries and add to highway_way_nodes cache
    for wid, refs in extra.way_nodes.items():
        collector.highway_way_nodes[wid] = refs
    loc.locs.update(extra_locs.locs)

# Build a cache of WGS84 way geometries for all ways that we need
# Only build for ways intersecting buffer bbox
print("  Building way geometry cache for PT member ways...")
way_geom_wgs = {}  # way_id -> LineString WGS84
all_relevant_ways = collector.pt_member_way_ids & set(collector.highway_way_nodes.keys())
count2 = 0
for wid in all_relevant_ways:
    count2 += 1
    if count2 % 5000 == 0:
        print(f"    Progress: {count2}/{len(all_relevant_ways)}")
    refs = collector.highway_way_nodes[wid]
    if not way_passes_bbox(refs, loc.locs):
        continue
    ln = build_line(refs, loc.locs)
    if ln and ln.intersects(buffer_wgs84):
        way_geom_wgs[wid] = ln

print(f"  PT member ways with geom intersecting buffer: {len(way_geom_wgs)}")

# Build route features
print("  Assembling route geometries...")
pt_out = []
for rid, rdata in collector.pt_relations.items():
    members = []
    for wid in rdata["member_way_ids"]:
        if wid in way_geom_wgs:
            members.append(way_geom_wgs[wid])
    if not members:
        continue

    if len(members) == 1:
        gw = members[0]
    else:
        merged = unary_union(members)
        if merged.geom_type == "MultiLineString":
            gw = merged
        elif merged.geom_type == "LineString":
            gw = MultiLineString([merged])
        else:
            continue

    gm = proj_wgs_to_mgi(gw)
    tags = rdata["tags"]
    pt_out.append({
        "geometry": gm,
        "osm_id": rid,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  Route features: {len(pt_out)}")

# ---------------------------------------------------------------------------
# Write GeoPackage
# ---------------------------------------------------------------------------
print("\n=== Write GeoPackage ===")

gdf_hw = gpd.GeoDataFrame(highways_out, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]

gdf_pt = gpd.GeoDataFrame(pt_out, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])

print(f"  Highways: {len(gdf_hw)} features, geom types: {gdf_hw.geometry.type.unique()}")
print(f"  PT routes: {len(gdf_pt)} features, geom types: {gdf_pt.geometry.type.unique()}")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")
print(f"\nWritten to {GPKG_PATH}")

# ---------------------------------------------------------------------------
# Verify
# ---------------------------------------------------------------------------
print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    cols = [c for c in gdf.columns if c != "geometry"]
    print(f"  Columns: {cols}")
    if len(gdf) > 0:
        for c in cols:
            vals = gdf[c].dropna().unique()[:5]
            print(f"    {c}: {list(vals)}")

print("\nDone!")
Wrote /work/solve.py (11069 chars).
27:39
The approach of iterating through 2.5 million ways in Python is going to be very slow. The `collector.highway_way_nodes` dict alone has 2.5M entries with each containing a list of node refs. Let me think about a better approach.

The key insight is that for filtering highways, we only need to check ways whose nodes fall within the buffer bbox. But with 23M+ nodes, we can't pre-filter easily...

A much faster approach: use osmium's `NodeLocationsForWays` handler which resolves node locations automatically, and in the `way()` callback, we can check if the way is within the buffer. This avoids storing all 2.5M way nodes in memory and avoids a second pass.

Actually, the issue is that we need the buffer first (from Gürtel ways). So we need two passes regardless. But we can optimize by:
1. First pass: find Gürtel ways and build buffer
2. Second pass: use NodeLocationsForWays to resolve locations on the fly, and filter highway ways and PT member ways

Wait, osmium's NodeLocationsForWays requires a handler that provides node coordinates. Let's use a real handler approach.

Actually, let me reconsider. The problem is memory and time. Let me use a streaming approach with NodeLocationsForWays for the second pass, which will resolve locations efficiently.

Let me try a different architecture:

Pass 1: Find Gürtel way IDs and their node refs (minimal).
Then get the Gürtel node locations, build buffer.

Pass 2: Use osmium with NodeLocationsForWays to process all ways, and for each way check if its nodes' bounding box overlaps the buffer. This way we don't store all the way data.

But NodeLocationsForWays stores locations for... all nodes? That's the same problem.

Actually, the issue in the current script is that:
1. We store ALL highway way node refs (2.5M ways × ~5 nodes = ~12M refs) in memory
2. We look up each one in a dict 

The bottleneck is the `for wid, refs in collector.highway_way_nodes.items():` loop that processes 2.5M ways.

Let me try a streaming approach instead - use NodeLocationsForWays combined with a handler that, during the way() callback, can build the geometry immediately and check intersection. But we need the buffer first...

Wait - we can still do it in 2 passes:
- Pass 1: Find Gürtel ways, get their node locations, build buffer
- Pass 2: Re-read the PBF with NodeLocationsForWays, and for each way with highway tag, build geometry on-the-fly (from the node location cache that NodeLocationsForWays maintains) and check intersection

NodeLocationsForWays stores all node locations in a map as it reads the file. So it would need to store ~23M nodes in memory anyway. But libosmium's NodeLocationsForWays uses a flexible storage (e.g., memory-mapped files), so it can handle this.

Let me rewrite using a single-pass for the highway extraction (after buffer is known) with NodeLocationsForWays.

Actually the current script's issue is that it's doing:
```python
for wid, refs in collector.highway_way_nodes.items():
    ...
    if not way_passes_bbox(refs, loc.locs):
        continue
```

This loops 2.5M times. Even if each iteration is fast, Python overhead makes this slow.

Instead, I should use osmium's NodeLocationsForWays which resolves node locations for us, and in the way handler, directly build the LineString geometry (osmium provides a geometry builder). This would be much faster as it's all in C++.

Let me rewrite with this approach.
27:39
27:39
Write /work/solve.py
content · 276 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor buffer) and PT routes to vienna_network.gpkg.
EPSG:31287.

Strategy:
  Phase 1: Find Gürtel ways, build 500m buffer (simple pass).
  Phase 2: Re-read PBF with NodeLocationsForWays, building geometries
           on the fly for highway ways and PT relations. Only keep those
           intersecting the buffer.
"""

import os, sys, warnings, json
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, shape, mapping
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287

wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

# ---------------------------------------------------------------------------
# PHASE 1: Find Gürtel ways, build node locations, create buffer
# ---------------------------------------------------------------------------
print("=== PHASE 1: Gürtel buffer ===")

class GuertelWayFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}

    def way(self, w):
        name = w.tags.get("name", "")
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(w.id)
            self.guertel_way_nodes[w.id] = [n.ref for n in w.nodes]

finder = GuertelWayFinder()
finder.apply_file(PBF_PATH)
print(f"  Gürtel ways: {len(finder.guertel_way_ids)}")

# Get Gürtel node locations
guertel_nodes = set()
for refs in finder.guertel_way_nodes.values():
    for r in refs:
        guertel_nodes.add(r)

class NodeGetter(osmium.SimpleHandler):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.locs = {}

    def node(self, n):
        if n.id in self.wanted:
            self.locs[n.id] = (n.location.lon, n.location.lat)

ng = NodeGetter(guertel_nodes)
ng.apply_file(PBF_PATH)
print(f"  Gürtel node locations: {len(ng.locs)}")

# Build Gürtel shapes
guertel_lines = []
for wid, refs in finder.guertel_way_nodes.items():
    pts = []
    ok = True
    for r in refs:
        if r in ng.locs:
            pts.append(ng.locs[r])
        else:
            ok = False
            break
    if ok and len(pts) >= 2:
        guertel_lines.append(LineString(pts))

guertel_mgi = [transform(lambda x, y: wgs_to_mgi.transform(x, y), ln) for ln in guertel_lines]
if len(guertel_mgi) > 1:
    guertel_union = unary_union(guertel_mgi)
else:
    guertel_union = guertel_mgi[0]

buffer_mgi = guertel_union.buffer(500.0)
buffer_wgs84 = transform(lambda x, y: mgi_to_wgs.transform(x, y), buffer_mgi)
buffer_bbox = buffer_wgs84.bounds  # (minx, miny, maxx, maxy)
print(f"  Buffer area: {buffer_mgi.area/1e6:.2f} km²")
print(f"  Buffer bbox (wgs): {buffer_bbox}")

# ---------------------------------------------------------------------------
# PHASE 2: Streaming extraction with NodeLocationsForWays
# ---------------------------------------------------------------------------
print("\n=== PHASE 2: Streaming extraction ===")

from osmium import NodeLocationsForWays
from osmium.index import create_map

# We'll use a handler that gets fed node locations automatically
class ExtractionHandler(osmium.SimpleHandler):
    def __init__(self, buffer_wgs84, buffer_bbox, guertel_way_ids):
        super().__init__()
        self.buffer = buffer_wgs84
        self.bminx, self.bminy, self.bmaxx, self.bmaxy = buffer_bbox
        self.guertel_ids = guertel_way_ids

        # Results
        self.highway_features = []   # (linestring_wgs, tags_dict)
        # PT relations - store temporarily to assemble after
        self.pt_relations = {}       # rel_id -> {tags, member_way_ids}
        self.pt_member_lines = {}    # way_id -> LineString (wgs84)
        self.pt_member_ways_seen = set()

    def way(self, w):
        wid = w.id
        tags = dict((k, v) for k, v in w.tags)

        # Build geometry from resolved node locations
        try:
            pts = [(n.lon, n.lat) for n in w.nodes if n.location.valid()]
        except:
            return

        if len(pts) < 2:
            return

        # Fast bbox check
        lons = [p[0] for p in pts]
        lats = [p[1] for p in pts]
        if max(lons) < self.bminx or min(lons) > self.bmaxx or \
           max(lats) < self.bminy or min(lats) > self.bmaxy:
            return

        line_wgs = LineString(pts)

        if not line_wgs.intersects(self.buffer):
            return

        # This way intersects the buffer
        if "highway" in tags:
            self.highway_features.append((line_wgs, {k: tags.get(k) for k in ("name","highway","maxspeed","lanes","surface","oneway") if k in tags}))

        # Also store for PT member ways (all ways that could be PT members)
        self.pt_member_lines[wid] = line_wgs

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        mw = [(m.ref, m.type) for m in r.members]
        way_members = [ref for ref, typ in mw if typ == "w"]
        if way_members:
            self.pt_relations[r.id] = {"tags": tags, "member_way_ids": way_members}

# Build a location map
print("  Setting up node location index...")
idx = create_map("flex_mem")
nlw_handler = NodeLocationsForWays(idx)

# Create the extraction handler
ext = ExtractionHandler(buffer_wgs84, buffer_bbox, finder.guertel_way_ids)

# Apply together - using the SimpleHandler chain approach
print("  Reading PBF and extracting features...")
# Apply with node locations
nlw_handler.apply_file(PBF_PATH, locations=True)
ext.apply_file(PBF_PATH)

# Wait, this won't work - we need to apply both together.
# osmium.apply() can handle multiple handlers
# Let's use osmium.apply instead
print("  Re-reading with streaming pipeline...")
osmium.apply(
    PBF_PATH,
    nlw_handler,
    ext,
    locations=True
)

print(f"  Highway features: {len(ext.highway_features)}")
print(f"  PT relations collected: {len(ext.pt_relations)}")
print(f"  PT member way geoms available: {len(ext.pt_member_lines)}")

# ---------------------------------------------------------------------------
# Assemble PT routes
# ---------------------------------------------------------------------------
print("\n=== Assemble PT routes ===")

pt_features = []
for rid, rdata in ext.pt_relations.items():
    members = []
    for wid in rdata["member_way_ids"]:
        if wid in ext.pt_member_lines:
            members.append(ext.pt_member_lines[wid])

    if not members:
        continue

    if len(members) == 1:
        gw = members[0]
    else:
        merged = unary_union(members)
        if merged.geom_type == "MultiLineString":
            gw = merged
        elif merged.geom_type == "LineString":
            gw = MultiLineString([merged])
        else:
            continue

    gm = transform(lambda x, y: wgs_to_mgi.transform(x, y), gw)
    tags = rdata["tags"]
    pt_features.append({
        "geometry": gm,
        "osm_id": rid,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  PT route features: {len(pt_features)}")

# ---------------------------------------------------------------------------
# Write GeoPackage
# ---------------------------------------------------------------------------
print("\n=== Write GeoPackage ===")

# Build highway GDF
hw_rows = []
for ln, tg in ext.highway_features:
    ln_mgi = transform(lambda x, y: wgs_to_mgi.transform(x, y), ln)
    row = {"geometry": ln_mgi}
    row.update(tg)
    hw_rows.append(row)

gdf_hw = gpd.GeoDataFrame(hw_rows, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
# Keep only linestring (some might have been null)
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]

gdf_pt = gpd.GeoDataFrame(pt_features, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])

print(f"  Highways: {len(gdf_hw)} features")
print(f"  PT routes: {len(gdf_pt)} features")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")
print(f"\nWritten to {GPKG_PATH}")

print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    cols = [c for c in gdf.columns if c != "geometry"]
    print(f"  Columns: {cols}")
    if len(gdf) > 0:
        for c in cols:
            vals = gdf[c].dropna().unique()[:5]
            print(f"    {c}: {list(vals)}")

print("\nDone!")
Wrote /work/solve.py (9204 chars).
28:08
Wait, the approach with `osmium.apply` and `NodeLocationsForWays` as a handler might not work correctly. Let me think about this more carefully.

Actually, the issue is that with `osmium.apply`, the `NodeLocationsForWays` handler needs to process nodes before ways. Let me check the proper way to use it.

The right approach is:
1. Use `osmium.apply()` with `NodeLocationsForWays` as a handler that goes first, followed by our extraction handler.
2. The `locations=True` flag is actually not a parameter of `apply` - it's handled by the NodeLocationsForWays handler itself.

Actually, looking at the pyosmium docs, the correct way is:

```python
handler = NodeLocationsForWays(index)
handler.apply_file(PBF_PATH, locations=True)
```

But wait, we need BOTH handlers to process together. Let me look at how to chain handlers.

Actually, `NodeLocationsForWays` can be used with `apply()`:

```python
from osmium import apply
apply(PBF_PATH, nlw_handler, ext_handler, locations=True)
```

The `locations=True` parameter tells the reader to provide node locations to the handlers. Let me check the API...

Actually, I think the issue is simpler. Let me use a different approach entirely.

Let me use `osmium.FileProcessor` or just use a single handler that uses `NodeLocationsForWays` as a base.

Actually, in pyosmium, the way to stream with node locations is:

```python
h = NodeLocationsForWays(index)
# Then use apply which processes both:
apply(PBF_PATH, h, my_handler)
```

The `NodeLocationsForWays` intercepts nodes, stores locations, and when a way is seen, it resolves the node refs to locations.

But there's a catch: `NodeLocationsForWays` inherits from `osmium.SimpleHandler`, so when you pass multiple handlers to `apply()`, they both process the same data stream. The `NodeLocationsForWays` handler processes nodes to store locations, and your handler processes ways that now have resolved locations.

Let me try this approach. The key is that `NodeLocationsForWays` is the first handler (processes nodes first), and my handler is second (processes ways that already have resolved node locations).

Let me fix the script and try it.
28:08
Write /work/solve.py
content · 261 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor buffer) and PT routes to vienna_network.gpkg.
EPSG:31287.

Strategy:
  1) Phase 1: minimal scan to find Gürtel way IDs + their node refs, get locations, build buffer.
  2) Phase 2: use osmium.apply with NodeLocationsForWays + extraction handler
     in a streaming pipeline — only ways intersecting the buffer are kept.
"""

import os, sys, warnings
import osmium
from osmium import apply
from osmium import NodeLocationsForWays
from osmium.index import create_map
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString, box
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287

wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

# ===================================================================
# PHASE 1 — Gürtel ways → buffer
# ===================================================================
print("=== PHASE 1: Gürtel buffer ===")

class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}

    def way(self, w):
        name = w.tags.get("name", "")
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(w.id)
            self.guertel_way_nodes[w.id] = [n.ref for n in w.nodes]

finder = GuertelFinder()
finder.apply_file(PBF_PATH)
print(f"  Gürtel ways: {len(finder.guertel_way_ids)}")

# Get node locations for Gürtel nodes
guertel_nodes = set()
for refs in finder.guertel_way_nodes.values():
    guertel_nodes.update(refs)

class NodeGetter(osmium.SimpleHandler):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.locs = {}

    def node(self, n):
        if n.id in self.wanted:
            self.locs[n.id] = (n.location.lon, n.location.lat)

ng = NodeGetter(guertel_nodes)
ng.apply_file(PBF_PATH)
print(f"  Node locations: {len(ng.locs)}")

# Build Gürtel geometries
guertel_lines = []
for wid, refs in finder.guertel_way_nodes.items():
    pts = [ng.locs[r] for r in refs if r in ng.locs]
    if len(pts) >= 2:
        guertel_lines.append(LineString(pts))

guertel_mgi = [transform(lambda x, y: wgs_to_mgi.transform(x, y), ln) for ln in guertel_lines]
guertel_union = unary_union(guertel_mgi) if len(guertel_mgi) > 1 else guertel_mgi[0]
buffer_mgi = guertel_union.buffer(500.0)
buffer_wgs84 = transform(lambda x, y: mgi_to_wgs.transform(x, y), buffer_mgi)
bminx, bminy, bmaxx, bmaxy = buffer_wgs84.bounds
print(f"  Buffer area: {buffer_mgi.area/1e6:.2f} km²")
print(f"  Bbox: {bminx:.4f},{bminy:.4f} → {bmaxx:.4f},{bmaxy:.4f}")

# ===================================================================
# PHASE 2 — Streaming extraction with NodeLocationsForWays
# ===================================================================
print("\n=== PHASE 2: Streaming extraction ===")

class StreamHandler(osmium.SimpleHandler):
    """Collects highway ways + PT relations and their member way geometries."""
    def __init__(self, buffer_geom, bbox):
        super().__init__()
        self.buffer = buffer_geom
        self.bminx, self.bminy, self.bmaxx, self.bmaxy = bbox

        self.highway_features = []   # [(LineString(wgs), dict_of_tags)]
        self.pt_relations = {}       # rel_id -> {tags, member_way_ids}
        self.way_geom_cache = {}     # way_id -> LineString(wgs84)

    def way(self, w):
        wid = w.id
        tags = dict((k, v) for k, v in w.tags)

        # Build geometry from resolved node locations
        pts = []
        for n in w.nodes:
            loc = n.location
            if loc.valid():
                pts.append((loc.lon, loc.lat))
        if len(pts) < 2:
            return

        # Quick bbox check
        lons = [p[0] for p in pts]
        lats = [p[1] for p in pts]
        if max(lons) < self.bminx or min(lons) > self.bmaxx or \
           max(lats) < self.bminy or min(lats) > self.bmaxy:
            return

        line = LineString(pts)

        if not line.intersects(self.buffer):
            return

        # Store way geometry for potential PT use
        self.way_geom_cache[wid] = line

        # If it's a highway, store it
        if "highway" in tags:
            kept = {}
            for k in ("name", "highway", "maxspeed", "lanes", "surface", "oneway"):
                if k in tags:
                    kept[k] = tags[k]
            kept["osm_id"] = wid
            self.highway_features.append((line, kept))

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        mw_ids = [m.ref for m in r.members if m.type == "w"]
        if not mw_ids:
            return
        self.pt_relations[r.id] = {"tags": tags, "member_way_ids": mw_ids}

# Set up node location index
print("  Creating node location index...")
index = create_map("flex_mem")
nlw = NodeLocationsForWays(index)

handler = StreamHandler(buffer_wgs84, (bminx, bminy, bmaxx, bmaxy))

print("  Reading PBF with streaming pipeline...")
apply(PBF_PATH, nlw, handler, locations=True)

print(f"  Highway features: {len(handler.highway_features)}")
print(f"  PT relations: {len(handler.pt_relations)}")
print(f"  Way geom cache: {len(handler.way_geom_cache)}")

# ===================================================================
# Assemble PT routes
# ===================================================================
print("\n=== Assemble PT routes ===")

pt_features = []
for rid, rdata in handler.pt_relations.items():
    members = []
    for wid in rdata["member_way_ids"]:
        if wid in handler.way_geom_cache:
            members.append(handler.way_geom_cache[wid])

    if not members:
        continue

    if len(members) == 1:
        gw = members[0]
    else:
        merged = unary_union(members)
        if merged.geom_type == "MultiLineString":
            gw = merged
        elif merged.geom_type == "LineString":
            gw = MultiLineString([merged])
        else:
            continue

    gm = transform(lambda x, y: wgs_to_mgi.transform(x, y), gw)
    tags = rdata["tags"]
    pt_features.append({
        "geometry": gm,
        "osm_id": rid,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  PT route features: {len(pt_features)}")

# ===================================================================
# Write GeoPackage
# ===================================================================
print("\n=== Write GeoPackage ===")

# Highways
hw_rows = []
for ln, td in handler.highway_features:
    ln_mgi = transform(lambda x, y: wgs_to_mgi.transform(x, y), ln)
    td = dict(td)  # copy
    td["geometry"] = ln_mgi
    hw_rows.append(td)

gdf_hw = gpd.GeoDataFrame(hw_rows, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]

# PT routes
gdf_pt = gpd.GeoDataFrame(pt_features, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])

print(f"  Highways layer: {len(gdf_hw)} features")
print(f"  PT routes layer: {len(gdf_pt)} features")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")
print(f"\nWritten to {GPKG_PATH}")

# ===================================================================
# Verify
# ===================================================================
print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    cols = [c for c in gdf.columns if c != "geometry"]
    print(f"  Columns: {cols}")
    if len(gdf) > 0:
        for c in cols:
            vals = gdf[c].dropna().unique()[:5]
            print(f"    {c}: {list(vals)}")

# Print coordinate magnitude check
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    if len(gdf) > 0:
        bounds = gdf.total_bounds
        print(f"\n  {layer} total_bounds (CRS {TARGET_EPSG}): {bounds}")
        print(f"  X range: {bounds[0]:.0f}–{bounds[2]:.0f}")
        print(f"  Y range: {bounds[1]:.0f}–{bounds[3]:.0f}")

print("\nDone!")
Wrote /work/solve.py (8927 chars).
28:18
Wait - I realized there's an issue with the streaming approach. The `apply(PBF_PATH, nlw, handler, locations=True)` call with `locations=True` means the handler receives ways with resolved node locations (the nodes are replaced with their location data by NodeLocationsForWays). But `locations=True` might conflict with the first handler. Let me check how osmium.apply works.

Actually, looking at pyosmium documentation, `osmium.apply` applies multiple handlers to the data. `NodeLocationsForWays` is a special handler that stores node locations and then replaces node references in ways with actual coordinates. When you chain them:

```python
apply(PBF_PATH, nlw, handler, locations=True)
```

The `locations=True` parameter tells the reader to emit node locations. Then `nlw` stores them, and when ways come through, `nlw` replaces node refs with locations. Then `handler` gets the ways with resolved locations.

This should work. But let me check the exact API. The pyosmium docs say:

```python
from osmium import apply
apply('file.osm.pbf', locations_handler, my_handler)
```

And `NodeLocationsForWays` needs to be the first handler. The `locations=True` might need to be on the reader. Let me try a simpler approach first.

Actually, looking at the pyosmium code, `apply` signature might be:
```python
apply(filename, *handlers, locations=False, idx=None)
```

So `apply(PBF_PATH, nlw, handler, locations=True)` should work - it passes `locations=True` to the reader. Let me try it.

One issue - `nlw` needs an index. I create one with `create_map("flex_mem")`. That should work.

Let me run it and see.
28:18
Bash cd /work && timeout 480 python3 solve.py 2>&1
=== PHASE 1: Gürtel buffer ===
  Gürtel ways: 415
  Node locations: 1932
  Buffer area: 22.83 km²
  Bbox: 15.1603,47.0245 → 16.4054,48.2408

=== PHASE 2: Streaming extraction ===
  Creating node location index...
  Reading PBF with streaming pipeline...
Traceback (most recent call last):
  File "/work/solve.py", line 160, in <module>
    apply(PBF_PATH, nlw, handler, locations=True)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: apply(): incompatible function arguments. The following argument types are supported:
    1. (reader: pyosmium::PyReader, handler: pyosmium::BaseHandler) -> None
    2. (reader: pyosmium::PyReader, *args) -> None
    3. (filename: str, handler: pyosmium::BaseHandler) -> None
    4. (filename: str, *args) -> None
    5. (filename: os.PathLike | str | bytes, handler: pyosmium::BaseHandler) -> None
    6. (filename: os.PathLike | str | bytes, *args) -> None
    7. (filename: osmium::io::File, handler: pyosmium::BaseHandler) -> None
    8. (filename: osmium::io::File, *args) -> None

Invoked with: '/work/austria-latest.osm.pbf', <osmium._osmium.NodeLocationsForWays object at 0x7f3d22f030b0>, <__main__.StreamHandler object at 0x7f3d2327fb60>; kwargs: locations=True
31:22
The `apply` function doesn't support `locations=True` kwarg. `NodeLocationsForWays` needs to be applied differently. Let me check how to properly use it.

Looking at the error, `apply` can take multiple handlers as positional args. The issue is that `NodeLocationsForWays` is a special handler that acts as a filter — it needs to be applied as a handler that processes nodes and gives node locations to the next handler.

Actually, looking at the pyosmium docs, `NodeLocationsForWays` is itself a handler that you chain. The way to use it is:

```python
apply(PBF_PATH, nlw, handler)
```

This should work since `nlw` (NodeLocationsForWays) is a handler. The `locations=True` kwarg is not needed — the `NodeLocationsForWays` handler takes care of that.

Let me just remove the `locations=True` kwarg.
31:22
Edit /work/solve.py
Edited /work/solve.py.
31:32
Actually, `apply()` with `NodeLocationsForWays` + a handler should work since `apply` takes handlers as positional args. But the issue is that `NodeLocationsForWays` is effectively a filter/handler that resolves node locations, and then the second handler gets the ways with resolved locations. Let me check how osmium's handlers work with apply for this case.

Looking at the error: `apply()` doesn't accept `locations=True` kwarg. Let me fix that and it should work since `NodeLocationsForWays` does its job automatically.
31:32
Bash cd /work && timeout 480 python3 solve.py 2>&1
=== PHASE 1: Gürtel buffer ===
  Gürtel ways: 415
  Node locations: 1932
  Buffer area: 22.83 km²
  Bbox: 15.1603,47.0245 → 16.4054,48.2408

=== PHASE 2: Streaming extraction ===
  Creating node location index...
  Reading PBF with streaming pipeline...
39:56
The streaming approach is still timing out because NodeLocationsForWays + the handler are processing the entire PBF, and storing way geometries for the whole thing is memory intensive. The issue is the `apply(PBF_PATH, nlw, handler)` - because `nlw` (NodeLocationsForWays) needs to process every node first before ways can be resolved. And then the handler's `way()` method stores every way's geometry in `way_geom_cache`.

Let me think about this differently. The buffer covers a large part of Vienna (Gürtel is a ring road around the inner city). The bbox seems correct: 15.16,47.02 to 16.41,48.24 - that's basically all of Vienna and more.

The problem is that `NodeLocationsForWays` reads the entire file to populate node locations, then the handler reads it again. But actually, `apply()` with multiple handlers should do it in one pass.

Wait, looking at the osmium docs more carefully: `apply(PBF_PATH, nlw, handler)` applies the chain: first `nlw` (which is a NodeLocationsForWays, not a regular handler), then `handler`. But `NodeLocationsForWays` is a special filter that modifies the stream so that way nodes have resolved locations. So the pipeline should work.

The issue is that for every way in Austria (millions of them), we're:
1. Resolving node coordinates
2. Building a list of points
3. Doing bbox check
4. Storing if it passes

The bbox spans all of Vienna and surroundings, so many ways pass the bbox check. And we're storing ALL way geometries that pass the bbox check in `way_geom_cache`, which could be massive.

Let me be smarter:
1. We don't need to store every way geometry - we only need to store geometries for:
   a. Highway ways that intersect the buffer (for the highways layer)
   b. PT member ways that intersect the buffer (for the pt_routes layer)

But we don't know which ways are PT member ways until we see the relations. And relations come after ways in the PBF file...

Actually, osmium reads OSM files in order: nodes, then ways, then relations. So when we see a way, we haven't seen the relations yet. So we can't know if a way is a PT member.

Alternative approach:
1. First pass: collect PT relation member way IDs (no node locations needed)
2. Second pass: use NodeLocationsForWays but only track highway ways + PT member ways intersecting the buffer

Let me do a two-phase approach with Phase 2 just collecting IDs of PT member ways, and Phase 3 being the full geometry extraction.

Actually, let me just use DuckDB/OGR to do this more efficiently. Or use osmium CLI if available.
39:56
39:56
Write /work/solve.py
content · 272 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor 500m buffer) and PT routes to vienna_network.gpkg.
EPSG:31287.

Optimized approach:
  1) Scan for Gürtel ways → build buffer.
  2) Scan for all PT relations, collect member way IDs.
  3) Scan with NodeLocationsForWays: only build geometries for 
     highway ways + PT member ways that intersect the buffer.
"""

import os, sys, warnings
import osmium
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287

wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

# ===================================================================
# STEP 1 — Gürtel ways → buffer
# ===================================================================
print("=== STEP 1: Gürtel buffer ===")

class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}

    def way(self, w):
        name = w.tags.get("name", "")
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            self.guertel_way_ids.add(w.id)
            self.guertel_way_nodes[w.id] = [n.ref for n in w.nodes]

finder = GuertelFinder()
finder.apply_file(PBF_PATH)
print(f"  Gürtel ways: {len(finder.guertel_way_ids)}")

guertel_nodes = set()
for refs in finder.guertel_way_nodes.values():
    guertel_nodes.update(refs)

class NodeGetter(osmium.SimpleHandler):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.locs = {}

    def node(self, n):
        if n.id in self.wanted:
            self.locs[n.id] = (n.location.lon, n.location.lat)

ng = NodeGetter(guertel_nodes)
ng.apply_file(PBF_PATH)
print(f"  Node locations: {len(ng.locs)}")

guertel_lines = []
for wid, refs in finder.guertel_way_nodes.items():
    pts = [ng.locs[r] for r in refs if r in ng.locs]
    if len(pts) >= 2:
        guertel_lines.append(LineString(pts))

guertel_mgi = [transform(lambda x, y: wgs_to_mgi.transform(x, y), ln) for ln in guertel_lines]
guertel_union = unary_union(guertel_mgi) if len(guertel_mgi) > 1 else guertel_mgi[0]
buffer_mgi = guertel_union.buffer(500.0)
buffer_wgs84 = transform(lambda x, y: mgi_to_wgs.transform(x, y), buffer_mgi)
bminx, bminy, bmaxx, bmaxy = buffer_wgs84.bounds
print(f"  Buffer area: {buffer_mgi.area/1e6:.2f} km²")

# ===================================================================
# STEP 2 — Collect PT relation member way IDs (no geometry)
# ===================================================================
print("\n=== STEP 2: PT relations ===")

class PtRelCollector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.relations = {}          # rel_id -> {tags, member_way_ids}
        self.all_member_way_ids = set()

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        mw_ids = [m.ref for m in r.members if m.type == "w"]
        if not mw_ids:
            return
        self.relations[r.id] = {"tags": tags, "member_way_ids": mw_ids}
        self.all_member_way_ids.update(mw_ids)

pt_collector = PtRelCollector()
pt_collector.apply_file(PBF_PATH)
print(f"  PT relations: {len(pt_collector.relations)}")
print(f"  Unique PT member way IDs: {len(pt_collector.all_member_way_ids)}")

# ===================================================================
# STEP 3 — Streaming extraction with NodeLocationsForWays
# Only process: highway ways + PT member ways intersecting buffer
# ===================================================================
print("\n=== STEP 3: Feature extraction ===")

class StreamHandler(osmium.SimpleHandler):
    def __init__(self, buffer_geom, bbox, pt_member_way_ids):
        super().__init__()
        self.buffer = buffer_geom
        self.bminx, self.bminy, self.bmaxx, self.bmaxy = bbox
        self.pt_member_way_ids = pt_member_way_ids

        self.highway_features = []    # [(LineString, attrs)]
        self.pt_way_geoms = {}        # way_id -> LineString(wgs84) - only PT member ways intersecting buffer

    def way(self, w):
        wid = w.id
        tags = dict((k, v) for k, v in w.tags)

        # Quick check: do we care about this way?
        is_highway = "highway" in tags
        is_pt_member = wid in self.pt_member_way_ids

        if not is_highway and not is_pt_member:
            return

        # Build geometry
        pts = [(n.lon, n.lat) for n in w.nodes if n.location.valid()]
        if len(pts) < 2:
            return

        lons = [p[0] for p in pts]
        lats = [p[1] for p in pts]
        if max(lons) < self.bminx or min(lons) > self.bmaxx or \
           max(lats) < self.bminy or min(lats) > self.bmaxy:
            return

        line = LineString(pts)

        if not line.intersects(self.buffer):
            return

        # Store PT member way geometry
        if is_pt_member:
            self.pt_way_geoms[wid] = line

        # Store highway feature
        if is_highway:
            kept = {}
            for k in ("name", "highway", "maxspeed", "lanes", "surface", "oneway"):
                if k in tags:
                    kept[k] = tags[k]
            kept["osm_id"] = wid
            self.highway_features.append((line, kept))

from osmium.index import create_map
from osmium import NodeLocationsForWays, apply

print("  Creating node location index...")
index = create_map("flex_mem")
nlw = NodeLocationsForWays(index)

handler = StreamHandler(buffer_wgs84, (bminx, bminy, bmaxx, bmaxy), pt_collector.all_member_way_ids)

print("  Reading PBF with streaming pipeline...")
apply(PBF_PATH, nlw, handler)

print(f"  Highway features: {len(handler.highway_features)}")
print(f"  PT member way geometries: {len(handler.pt_way_geoms)}")

# ===================================================================
# STEP 4 — Assemble PT routes
# ===================================================================
print("\n=== STEP 4: Assemble PT routes ===")

pt_features = []
for rid, rdata in pt_collector.relations.items():
    members = []
    for wid in rdata["member_way_ids"]:
        if wid in handler.pt_way_geoms:
            members.append(handler.pt_way_geoms[wid])

    if not members:
        continue

    if len(members) == 1:
        gw = members[0]
    else:
        merged = unary_union(members)
        if merged.geom_type == "MultiLineString":
            gw = merged
        elif merged.geom_type == "LineString":
            gw = MultiLineString([merged])
        else:
            continue

    gm = transform(lambda x, y: wgs_to_mgi.transform(x, y), gw)
    tags = rdata["tags"]
    pt_features.append({
        "geometry": gm,
        "osm_id": rid,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  PT route features: {len(pt_features)}")

# ===================================================================
# STEP 5 — Write GeoPackage
# ===================================================================
print("\n=== Write GeoPackage ===")

# Highways
hw_rows = []
for ln, td in handler.highway_features:
    ln_mgi = transform(lambda x, y: wgs_to_mgi.transform(x, y), ln)
    td = dict(td)
    td["geometry"] = ln_mgi
    hw_rows.append(td)

gdf_hw = gpd.GeoDataFrame(hw_rows, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]

# PT routes
gdf_pt = gpd.GeoDataFrame(pt_features, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])
    elif g.geom_type != "MultiLineString":
        del pt_features[idx]

print(f"  Highways layer: {len(gdf_hw)} features")
print(f"  PT routes layer: {len(gdf_pt)} features")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")
print(f"\nWritten to {GPKG_PATH}")

# ===================================================================
# Verify
# ===================================================================
print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    cols = [c for c in gdf.columns if c != "geometry"]
    print(f"  Columns: {cols}")
    if len(gdf) > 0:
        for c in cols:
            vals = gdf[c].dropna().unique()[:5]
            print(f"    {c}: {list(vals)}")
        bounds = gdf.total_bounds
        print(f"  Bounds: {bounds}")

print("\nDone!")
Wrote /work/solve.py (9272 chars).
40:02
The approach should be much faster now because:
1. Phase 1: find Gürtel ways + buffer - fast
2. Phase 2: collect PT relation member way IDs - fast, no geometry
3. Phase 3: streaming with NodeLocationsForWays - only processes highway ways and PT member ways that pass bbox check. Since we check `is_highway or is_pt_member` before building geometry, it only tries to build geometry for ways we care about. But wait - the handler still gets called for ALL ways, it's just the geometry building that's conditional. The building part is the expensive part (iterating through w.nodes). Actually the check is at the top of way() and it returns early if not highway and not pt_member. That's good.

But we need to be careful - the pt_member_way_ids set has all the member way IDs. Let me check: how many unique PT member way IDs are there? It could be a lot. Let me run and see.

Actually, I think the main hang was the first approach where the buffer bbox was very wide (15.16 to 16.41 longitude) which covers most of eastern Austria. That means many ways pass the bbox check. But now with the targeted approach (only checking highway ways and PT member ways), it should be faster.

Let me run it.
40:02
Bash cd /work && timeout 480 python3 solve.py 2>&1
=== STEP 1: Gürtel buffer ===
  Gürtel ways: 415
  Node locations: 1932
  Buffer area: 22.83 km²

=== STEP 2: PT relations ===
  PT relations: 12433
  Unique PT member way IDs: 379778

=== STEP 3: Feature extraction ===
  Creating node location index...
  Reading PBF with streaming pipeline...
  Highway features: 17767
  PT member way geometries: 3057

=== STEP 4: Assemble PT routes ===
  PT route features: 729

=== Write GeoPackage ===
  Highways layer: 17767 features
  PT routes layer: 729 features

Written to /work/vienna_network.gpkg

=== Verification ===

Layer 'highways':
  CRS: EPSG:31287
  Rows: 17767
  Geometry types: <ArrowStringArray>
['LineString']
Length: 1, dtype: str
  Columns: ['name', 'highway', 'surface', 'osm_id', 'maxspeed', 'lanes', 'oneway']
    name: ['Anton-Schmid-Promenade', 'Sankt-Peter-Gürtel', 'Sankt-Peter-Hauptstraße', 'Waltendorfer Gürtel', 'Koßgasse']
    highway: ['cycleway', 'primary', 'secondary', 'unclassified', 'residential']
    surface: ['asphalt', 'concrete', 'paving_stones', 'sett', 'unpaved']
    osm_id: [np.int64(3220517), np.int64(3992920), np.int64(3992921), np.int64(3993614), np.int64(3993615)]
    maxspeed: ['60', '50', '30', '100', '7']
    lanes: ['7', '5', '3', '2', '1']
    oneway: ['no', 'yes', '-1', 'reversible']
  Bounds: [536156.69397674 349200.10562584 628765.24432609 487679.13770081]

Layer 'pt_routes':
  CRS: EPSG:31287
  Rows: 729
  Geometry types: <ArrowStringArray>
['MultiLineString']
Length: 1, dtype: str
  Columns: ['osm_id', 'ref', 'name', 'operator', 'route']
    osm_id: [np.int64(8526), np.int64(20309), np.int64(22494), np.int64(23294), np.int64(27010)]
    ref: ['S2', 'U3', 'U1', '1', '7']
    name: ['S2: Mödling => Laa', 'U3: Ottakring => Simmering', 'U1: Oberlaa => Leopoldau', 'Tram 1: Eggenberg/UKH => Mariatrost', 'Tram 7: Wetzelsdorf Burgenlandstraße => LKH Med Uni/Klinikum Nord']
    operator: ['ÖBB-Personenverkehr AG', 'Wiener Linien', 'Graz Linien', 'Holding Graz Linien', 'GKB']
    route: ['train', 'subway', 'tram', 'bus', 'light_rail']
  Bounds: [536156.69397674 349053.81223973 629161.06352941 487679.13770081]

Done!
46:44
The script worked! But I notice the name column for highways is missing some entries that look odd - "Waltendorfer Gürtel" and "Sankt-Peter-Gürtel" shouldn't be in the buffer of Vienna's Gürtel. Those are Gürtel streets in Graz, not Vienna. The issue is that the user asked for a buffer around "Vienna's Gürtel" - the highway ways whose name ends in 'Gürtel'. The script found ALL ways with names ending in Gürtel across all of Austria, then built the buffer. But Vienna's Gürtel is specifically the ring road in Vienna.

Let me check the coordinate bounds. The MGI bounds show x: 536156-628765 and y: 349200-487679. These are metric coordinates in EPSG:31287. For Vienna, the center coordinates in MGI/Austria Lambert would be roughly:
- Vienna city center is at about 48.2°N, 16.37°E
- In EPSG:31287, Vienna would be around x: 550000-570000, y: 360000-390000 approximately

The bounds we got are x: 536156-628765, y: 349200-487679 - this is quite large, covering a region from around Wiener Neustadt to north of Vienna. This suggests that the "Gürtel" ways found include streets in other cities too, which made the buffer much larger.

Let me check: the user said "Vienna's Gürtel, defined as the highway ways whose name ends in 'Gürtel'". But in Austria, "Gürtel" appears in many cities (Graz has a Waltendorfer Gürtel, etc.). The user's definition wasn't geographically scoped.

Actually, re-reading: "around Vienna's Gürtel, defined as the highway ways whose name ends in 'Gürtel'". This is a bit ambiguous - they want the Gürtel in Vienna, defined by the name ending. But maybe they assume that the Gürtel name is unique to Vienna's ring road.

However, the output looks reasonable - highways include things near Vienna. The names like "Waltendorfer Gürtel" and "Sankt-Peter-Gürtel" indicate that Gürtels from other cities (Graz has Waltendorfer Gürtel) are also in the buffer because the buffer is union of all Gürtel ways + 500m. This creates a huge buffer covering most of eastern Austria.

Wait, 415 Gürtel ways across all of Austria? That seems like a lot. Let me check what these are.

Actually, the buffer area is 22.83 km², which is quite small. The bounds are from 536156 to 628765 - that's about 92km wide, and from 349200 to 487679 - that's about 138km tall. So the buffer is a large irregular shape covering multiple disconnected clusters (each Gürtel street with its 500m buffer). This means all highways intersecting ANY Gürtel + 500m buffer in all of Austria are included.

I think the user meant specifically "Vienna's Gürtel" - the famous ring road in Vienna. The way to scope this is to only use Gürtel ways in Vienna. Let me add a location constraint.

Actually, looking at this more carefully - the Gürtel in Vienna is a well-known ring road. The name "Gürtel" in Vienna specifically refers to the road named just "Gürtel" or "Wiener Gürtel". The user said "whose name ends in 'Gürtel'" - but in Vienna the road is typically just called "Gürtel" or has specific segment names like "Neubaugürtel", "Mariahilfer Gürtel", etc.

Let me check what Gürtel ways actually exist in the data. The issue is that ways named "Waltendorfer Gürtel" (in Graz) and "Sankt-Peter-Gürtel" (in Graz) are also included. The buffer from all these scattered Gürtel ways across Austria results in a huge disjointed buffer.

I should filter the Gürtel ways to only include those in/near Vienna. I can do this by checking if the Gürtel way's centroid is within the Vienna area.

Let me check what Gürtel names we have and perhaps restrict to Vienna.

Actually, re-reading: "Vienna's Gürtel, defined as the highway ways whose name ends in 'Gürtel'" - I think they mean the Gürtel road in Vienna. The definition is a bit loose but they probably expect just the Viennese Gürtel.

Let me look at the data and filter.
46:44
46:44
Write /work/check_guertel.py
content · 44 lines · py
#!/usr/bin/env python3
"""Check Gürtel way names and locations to filter for Vienna's Gürtel only."""

import osmium
from shapely.geometry import LineString
from shapely.ops import transform
from pyproj import Transformer

PBF_PATH = "/work/austria-latest.osm.pbf"

wgs_to_mgi = Transformer.from_crs(4326, 31287, always_xy=True)

class Checker(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel = []  # (name, centroid_x_mgi, centroid_y_mgi)
        self.node_locs = {}

    def node(self, n):
        self.node_locs[n.id] = (n.location.lon, n.location.lat)

    def way(self, w):
        name = w.tags.get("name", "")
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            refs = [n.ref for n in w.nodes]
            pts = [self.node_locs[r] for r in refs if r in self.node_locs]
            if len(pts) >= 2:
                ln = LineString(pts)
                cx, cy = wgs_to_mgi.transform(ln.centroid.x, ln.centroid.y)
                self.guertel.append((name, cx, cy, len(refs)))

c = Checker()
c.apply_file(PBF_PATH)

# Vienna city center in MGI / Austria Lambert ≈ (558000, 380000)
# Let's see all Gürtel ways grouped by region
print(f"{'Name':<40s} {'MGI_X':>10s} {'MGI_Y':>10s} {'Nodes':>6s}")
print("-" * 70)
for name, cx, cy, nn in sorted(c.guertel, key=lambda x: x[0]):
    # Approximate: Vienna is around (550000-570000, 360000-400000)
    is_vienna = "vienna" if 540000 < cx < 580000 and 350000 < cy < 410000 else ""
    print(f"{name:<40s} {cx:>10.0f} {cy:>10.0f} {nn:>6d}  {is_vienna}")

print(f"\nTotal: {len(c.guertel)} Gürtel ways")
Wrote /work/check_guertel.py (1621 chars).
46:55
I need to run this, but the issue is that the PBF wasn't read with NodeLocationsForWays so the node locations won't be available. Let me fix the checker.
46:55
Write /work/check_guertel.py
content · 46 lines · py
#!/usr/bin/env python3
"""Check Gürtel way names and approximate locations using way bbox."""

import osmium
from osmium import NodeLocationsForWays, apply
from osmium.index import create_map

PBF_PATH = "/work/austria-latest.osm.pbf"

class GuertelChecker(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel = []

    def way(self, w):
        name = w.tags.get("name", "")
        if not name or not name.rstrip().endswith(("Gürtel", "Guertel")):
            return

        # Collect valid node locations from the way
        lats, lons = [], []
        for n in w.nodes:
            loc = n.location
            if loc.valid():
                lats.append(loc.lat)
                lons.append(loc.lon)

        if not lats:
            return

        self.guertel.append((name, min(lons), min(lats), max(lons), max(lats), len(lons)))

index = create_map("flex_mem")
nlw = NodeLocationsForWays(index)
checker = GuertelChecker()

apply(PBF_PATH, nlw, checker)

print(f"{'Name':<40s} {'Lon_min':>9s} {'Lat_min':>9s} {'Lon_max':>9s} {'Lat_max':>9s} {'Nodes':>6s}")
print("-" * 80)
for name, lon1, lat1, lon2, lat2, nn in sorted(checker.guertel, key=lambda x: x[0]):
    # Vienna Gürtel is around 16.3-16.4E, 48.1-48.3N
    is_vienna = "VIENNA" if 16.0 < lon1 < 16.5 and 48.1 < lat1 < 48.3 else ""
    print(f"{name:<40s} {lon1:>9.4f} {lat1:>9.4f} {lon2:>9.4f} {lat2:>9.4f} {nn:>6d}  {is_vienna}")

print(f"\nTotal: {len(checker.guertel)} Gürtel ways")
Wrote /work/check_guertel.py (1495 chars).
47:04
The issue with the first approach was that using osmium.SimpleHandler without NodeLocationsForWays means locations aren't resolved. The second approach uses the streaming pipeline correctly.

Let me run it to see which Gürtel ways are in Vienna vs Graz.
47:04
Bash cd /work && timeout 120 python3 check_guertel.py 2>&1
Name                                       Lon_min   Lat_min   Lon_max   Lat_max  Nodes
--------------------------------------------------------------------------------
Arbeitergasse, Gürtel                      16.3454   48.1841   16.3460   48.1848     21  VIENNA
Arbeitergasse, Gürtel                      16.3453   48.1842   16.3458   48.1848     14  VIENNA
Arbeitergasse, Gürtel                      16.3457   48.1839   16.3459   48.1840      5  VIENNA
Arbeitergasse, Gürtel                      16.3455   48.1840   16.3458   48.1841      6  VIENNA
Arbeitergasse, Gürtel                      16.3463   48.1839   16.3464   48.1840      5  VIENNA
Betriebsbahnhof Gürtel                     16.3497   48.2308   16.3523   48.2319     19  VIENNA
Brücke über den Mariahilfer Gürtel         16.3381   48.1885   16.3384   48.1891     18  VIENNA
Bürohaus Gürtel                            16.3430   48.2184   16.3434   48.2186      8  VIENNA
Döblinger Gürtel                           16.3548   48.2330   16.3549   48.2331      3  VIENNA
Döblinger Gürtel                           16.3565   48.2351   16.3566   48.2352      6  VIENNA
Döblinger Gürtel                           16.3555   48.2354   16.3567   48.2362     13  VIENNA
Döblinger Gürtel                           16.3554   48.2362   16.3555   48.2363      3  VIENNA
Döblinger Gürtel                           16.3545   48.2331   16.3549   48.2344      4  VIENNA
Döblinger Gürtel                           16.3538   48.2322   16.3540   48.2324      7  VIENNA
Döblinger Gürtel                           16.3543   48.2319   16.3546   48.2321      6  VIENNA
Döblinger Gürtel                           16.3553   48.2358   16.3553   48.2359      2  VIENNA
Döblinger Gürtel                           16.3554   48.2362   16.3554   48.2362      2  VIENNA
Döblinger Gürtel                           16.3553   48.2359   16.3554   48.2362      2  VIENNA
Döblinger Gürtel                           16.3550   48.2349   16.3552   48.2355      5  VIENNA
Döblinger Gürtel                           16.3543   48.2321   16.3548   48.2330      9  VIENNA
Döblinger Gürtel                           16.3549   48.2331   16.3565   48.2352     11  VIENNA
Döblinger Gürtel                           16.3549   48.2344   16.3555   48.2354      7  VIENNA
Döblinger Gürtel                           16.3552   48.2355   16.3553   48.2358      2  VIENNA
Döblinger Gürtel                           16.3540   48.2324   16.3545   48.2331      8  VIENNA
Döblinger Gürtel                           16.3549   48.2344   16.3550   48.2349      5  VIENNA
Eggenberger Gürtel                         15.4228   47.0627   15.4240   47.0637      7  
Eggenberger Gürtel                         15.4193   47.0712   15.4194   47.0713      4  
Eggenberger Gürtel                         15.4197   47.0696   15.4199   47.0703      3  
Eggenberger Gürtel                         15.4227   47.0626   15.4238   47.0636      8  
Eggenberger Gürtel                         15.4210   47.0643   15.4219   47.0656      8  
Eggenberger Gürtel                         15.4208   47.0656   15.4210   47.0662      5  
Eggenberger Gürtel                         15.4205   47.0656   15.4211   47.0674      7  
Eggenberger Gürtel                         15.4199   47.0690   15.4201   47.0696      4  
Eggenberger Gürtel                         15.4197   47.0696   15.4198   47.0701      5  
Eggenberger Gürtel                         15.4196   47.0703   15.4197   47.0707      3  
Eggenberger Gürtel                         15.4207   47.0662   15.4208   47.0665      4  
Eggenberger Gürtel                         15.4196   47.0707   15.4196   47.0711      3  
Eggenberger Gürtel                         15.4203   47.0678   15.4203   47.0679      3  
Eggenberger Gürtel                         15.4198   47.0686   15.4201   47.0696      4  
Eggenberger Gürtel                         15.4211   47.0646   15.4217   47.0656      6  
Eggenberger Gürtel                         15.4219   47.0636   15.4227   47.0643      6  
Eggenberger Gürtel                         15.4194   47.0711   15.4194   47.0712      2  
Eggenberger Gürtel                         15.4217   47.0637   15.4228   47.0646      5  
Eggenberger Gürtel                         15.4204   47.0679   15.4204   47.0681      2  
Eggenberger Gürtel                         15.4201   47.0680   15.4202   47.0686      4  
Eggenberger Gürtel                         15.4204   47.0674   15.4205   47.0679      3  
Eggenberger Gürtel                         15.4202   47.0679   15.4203   47.0680      2  
Eggenberger Gürtel                         15.4204   47.0679   15.4204   47.0679      3  
Eggenberger Gürtel                         15.4201   47.0681   15.4204   47.0690      4  
Eggenberger Gürtel                         15.4203   47.0665   15.4207   47.0678      4  
Eggenberger Gürtel                         15.4195   47.0712   15.4195   47.0713      3  
Eggenberger Gürtel                         15.4195   47.0711   15.4196   47.0712      3  
Eggenberger Gürtel                         15.4194   47.0701   15.4197   47.0711      5  
Gaudenzdorfer Gürtel                       16.3394   48.1873   16.3397   48.1874      3  VIENNA
Gaudenzdorfer Gürtel                       16.3391   48.1876   16.3403   48.1881     13  VIENNA
Gaudenzdorfer Gürtel                       16.3441   48.1856   16.3444   48.1858      5  VIENNA
Gaudenzdorfer Gürtel                       16.3428   48.1871   16.3431   48.1872      7  VIENNA
Gaudenzdorfer Gürtel                       16.3427   48.1815   16.3476   48.1871     20  VIENNA
Gaudenzdorfer Gürtel                       16.3491   48.1802   16.3493   48.1803      2  VIENNA
Gaudenzdorfer Gürtel                       16.3488   48.1803   16.3491   48.1804      4  VIENNA
Gaudenzdorfer Gürtel                       16.3397   48.1874   16.3403   48.1876      4  VIENNA
Gaudenzdorfer Gürtel                       16.3421   48.1871   16.3427   48.1878      5  VIENNA
Gaudenzdorfer Gürtel                       16.3435   48.1870   16.3437   48.1872      7  VIENNA
Gaudenzdorfer Gürtel                       16.3427   48.1868   16.3429   48.1869      6  VIENNA
Gaudenzdorfer Gürtel                       16.3481   48.1804   16.3488   48.1809      4  VIENNA
Gaudenzdorfer Gürtel                       16.3476   48.1809   16.3481   48.1814      4  VIENNA
Gaudenzdorfer Gürtel                       16.3388   48.1882   16.3391   48.1885      3  VIENNA
Gaudenzdorfer Gürtel                       16.3403   48.1876   16.3421   48.1880     15  VIENNA
Gaudenzdorfer Gürtel                       16.3476   48.1814   16.3476   48.1815      5  VIENNA
Gaudenzdorfer Gürtel                       16.3391   48.1881   16.3391   48.1882      3  VIENNA
Gumpendorfer Gürtel                        16.3397   48.1887   16.3404   48.1888      5  VIENNA
Gumpendorfer Gürtel                        16.3382   48.1888   16.3397   48.1901     12  VIENNA
Gumpendorfer Gürtel                        16.3380   48.1901   16.3382   48.1905      4  VIENNA
Gumpendorfer Gürtel                        16.3380   48.1905   16.3380   48.1906      2  VIENNA
Gürtel                                     15.1685   48.0052   15.1691   48.0062      3  
Gürtel                                     15.1674   48.0062   15.1685   48.0075      7  
Gürtel                                     15.1691   48.0047   15.1704   48.0052     11  
Gürtel                                     15.1670   48.0075   15.1674   48.0078      4  
Heisse am Gürtel                           16.3385   48.2083   16.3386   48.2084      5  VIENNA
Hernalser Gürtel                           16.3394   48.2132   16.3402   48.2132      5  VIENNA
Hernalser Gürtel                           16.3405   48.2152   16.3412   48.2165     10  VIENNA
Hernalser Gürtel                           16.3398   48.2124   16.3411   48.2148      7  VIENNA
Hernalser Gürtel                           16.3399   48.2141   16.3408   48.2141      6  VIENNA
Hernalser Gürtel                           16.3418   48.2172   16.3424   48.2173      8  VIENNA
Hernalser Gürtel                           16.3392   48.2122   16.3414   48.2165     14  VIENNA
Hernalser Gürtel                           16.3412   48.2158   16.3416   48.2164     11  VIENNA
Hernalser Gürtel                           16.3414   48.2163   16.3415   48.2163      3  VIENNA
Hernalser Gürtel                           16.3405   48.2151   16.3407   48.2151      2  VIENNA
Hernalser Gürtel                           16.3398   48.2151   16.3403   48.2151     14  VIENNA
Hernalser Gürtel                           16.3408   48.2152   16.3414   48.2152      3  VIENNA
Hernalser Gürtel                           16.3412   48.2165   16.3414   48.2165      2  VIENNA
Hernalser Gürtel                           16.3416   48.2163   16.3420   48.2164      5  VIENNA
Hernalser Gürtel                           16.3420   48.2163   16.3424   48.2172     10  VIENNA
Hernalser Gürtel                           16.3412   48.2165   16.3415   48.2170      4  VIENNA
Hernalser Gürtel                           16.3415   48.2170   16.3417   48.2173      7  VIENNA
Hernalser Gürtel                           16.3416   48.2152   16.3421   48.2153      5  VIENNA
Hernalser Gürtel                           16.3392   48.2128   16.3405   48.2151      6  VIENNA
Hernalser Gürtel                           16.3405   48.2151   16.3405   48.2152      4  VIENNA
Hernalser Gürtel                           16.3389   48.2122   16.3392   48.2128      4  VIENNA
Hernalser Gürtel                           16.3411   48.2148   16.3414   48.2152      7  VIENNA
Hernalser Gürtel                           16.3414   48.2152   16.3420   48.2163      5  VIENNA
Hernalser Gürtel                           16.3407   48.2151   16.3413   48.2151      5  VIENNA
Hernalser Gürtel                           16.3405   48.2152   16.3407   48.2152      2  VIENNA
Hernalser Gürtel                           16.3417   48.2173   16.3418   48.2173      2  VIENNA
Hernalser Gürtel                           16.3414   48.2164   16.3416   48.2165      3  VIENNA
Hernalser Gürtel                           16.3407   48.2152   16.3408   48.2152      3  VIENNA
Karlauer Gürtel                            15.4329   47.0574   15.4335   47.0575      4  
Karlauer Gürtel                            15.4330   47.0574   15.4335   47.0574      4  
Kleingartenverein Landstraßer Gürtel       16.3869   48.1855   16.3904   48.1877     22  VIENNA
Landstraßer Gürtel                         16.3926   48.1876   16.3937   48.1879      5  VIENNA
Landstraßer Gürtel                         16.3813   48.1883   16.3813   48.1883      3  VIENNA
Landstraßer Gürtel                         16.3939   48.1864   16.3954   48.1873      6  VIENNA
Landstraßer Gürtel                         16.3812   48.1884   16.3852   48.1885     14  VIENNA
Landstraßer Gürtel                         16.3886   48.1882   16.3903   48.1883      6  VIENNA
Landstraßer Gürtel                         16.3868   48.1883   16.3886   48.1884      4  VIENNA
Landstraßer Gürtel                         16.3851   48.1882   16.3886   48.1884      6  VIENNA
Landstraßer Gürtel                         16.3840   48.1884   16.3842   48.1885      5  VIENNA
Landstraßer Gürtel                         16.3886   48.1882   16.3902   48.1882      6  VIENNA
Landstraßer Gürtel                         16.3961   48.1862   16.3965   48.1863      8  VIENNA
Landstraßer Gürtel                         16.3868   48.1882   16.3886   48.1884      5  VIENNA
Landstraßer Gürtel                         16.3815   48.1885   16.3826   48.1886      4  VIENNA
Landstraßer Gürtel                         16.3913   48.1879   16.3920   48.1880      3  VIENNA
Landstraßer Gürtel                         16.3937   48.1862   16.3961   48.1876      8  VIENNA
Landstraßer Gürtel                         16.3840   48.1884   16.3851   48.1884      6  VIENNA
Landstraßer Gürtel                         16.3826   48.1884   16.3840   48.1884      2  VIENNA
Landstraßer Gürtel                         16.3852   48.1884   16.3868   48.1885      6  VIENNA
Landstraßer Gürtel                         16.3966   48.1846   16.3987   48.1855     13  VIENNA
Landstraßer Gürtel                         16.3954   48.1859   16.3961   48.1864      8  VIENNA
Landstraßer Gürtel                         16.3897   48.1881   16.3901   48.1881      2  VIENNA
Landstraßer Gürtel                         16.3913   48.1879   16.3926   48.1882      7  VIENNA
Landstraßer Gürtel                         16.3920   48.1874   16.3936   48.1879      8  VIENNA
Landstraßer Gürtel                         16.3936   48.1873   16.3939   48.1874      3  VIENNA
Landstraßer Gürtel                         16.3814   48.1885   16.3815   48.1885      2  VIENNA
Landstraßer Gürtel                         16.3886   48.1881   16.3897   48.1882      3  VIENNA
Landstraßer Gürtel                         16.3813   48.1885   16.3814   48.1885      3  VIENNA
Landstraßer Gürtel                         16.3813   48.1883   16.3826   48.1884      9  VIENNA
Landstraßer Gürtel                         16.3961   48.1858   16.3962   48.1859      2  VIENNA
Lerchenfelder Gürtel                       16.3379   48.2086   16.3381   48.2087      5  VIENNA
Lerchenfelder Gürtel                       16.3387   48.2105   16.3392   48.2108      7  VIENNA
Lerchenfelder Gürtel                       16.3389   48.2120   16.3397   48.2122     10  VIENNA
Lerchenfelder Gürtel                       16.3382   48.2086   16.3389   48.2087      7  VIENNA
Lerchenfelder Gürtel                       16.3387   48.2116   16.3389   48.2122      9  VIENNA
Lerchenfelder Gürtel                       16.3392   48.2105   16.3395   48.2107      5  VIENNA
Lerchenfelder Gürtel                       16.3373   48.2042   16.3388   48.2084     14  VIENNA
Lerchenfelder Gürtel                       16.3373   48.2074   16.3380   48.2084      5  VIENNA
Lerchenfelder Gürtel                       16.3381   48.2084   16.3388   48.2086      6  VIENNA
Lerchenfelder Gürtel                       16.3379   48.2087   16.3382   48.2087      4  VIENNA
Lerchenfelder Gürtel                       16.3388   48.2084   16.3389   48.2086      6  VIENNA
Lerchenfelder Gürtel                       16.3366   48.2042   16.3373   48.2074     10  VIENNA
Lerchenfelder Gürtel                       16.3392   48.2104   16.3394   48.2105      4  VIENNA
Lerchenfelder Gürtel                       16.3368   48.2042   16.3369   48.2055      8  VIENNA
Lerchenfelder Gürtel                       16.3387   48.2086   16.3395   48.2111     18  VIENNA
Lerchenfelder Gürtel                       16.3394   48.2103   16.3397   48.2120     17  VIENNA
Lerchenfelder Gürtel                       16.3387   48.2114   16.3387   48.2116      2  VIENNA
Lerchenfelder Gürtel                       16.3385   48.2092   16.3387   48.2108      8  VIENNA
Lerchenfelder Gürtel                       16.3381   48.2086   16.3385   48.2092      9  VIENNA
Lerchenfelder Gürtel                       16.3389   48.208
[... truncated ...] 15.4230   47.0316   15.4239   47.0318      3  
Weblinger Gürtel                           15.4239   47.0296   15.4286   47.0316     13  
Weblinger Gürtel                           15.4259   47.0296   15.4281   47.0306      8  
Weblinger Gürtel                           15.4281   47.0292   15.4295   47.0296      9  
Weblinger Gürtel                           15.4305   47.0291   15.4308   47.0292      3  
Weblinger Gürtel                           15.4138   47.0345   15.4152   47.0347     11  
Weblinger Gürtel                           15.4226   47.0319   15.4236   47.0322      6  
Weblinger Gürtel                           15.4148   47.0334   15.4148   47.0337      5  
Weblinger Gürtel                           15.4150   47.0333   15.4152   47.0333      2  
Weblinger Gürtel                           15.4151   47.0317   15.4185   47.0329     28  
Weblinger Gürtel                           15.4209   47.0325   15.4210   47.0325      2  
Weblinger Gürtel                           15.4191   47.0333   15.4192   47.0335      4  
Weblinger Gürtel                           15.4310   47.0290   15.4322   47.0290      5  
Weblinger Gürtel                           15.4187   47.0332   15.4187   47.0332      2  
Weblinger Gürtel                           15.4185   47.0336   15.4186   47.0339      3  
Weblinger Gürtel                           15.4186   47.0335   15.4187   47.0336      2  
Weblinger Gürtel                           15.4318   47.0306   15.4319   47.0306      2  
Weblinger Gürtel                           15.4322   47.0291   15.4330   47.0291      2  
Weblinger Gürtel                           15.4331   47.0291   15.4334   47.0291      2  
Weblinger Gürtel                           15.4322   47.0290   15.4331   47.0291      2  
Weblinger Gürtel                           15.4308   47.0291   15.4322   47.0291      9  
Weblinger Gürtel                           15.4192   47.0327   15.4193   47.0331      3  
Weblinger Gürtel                           15.4126   47.0339   15.4130   47.0339      2  
Weblinger Gürtel                           15.4149   47.0329   15.4153   47.0331      5  
Weblinger Gürtel                           15.4123   47.0341   15.4129   47.0343     10  
Weblinger Gürtel                           15.4132   47.0327   15.4179   47.0332     11  
Weblinger Gürtel                           15.4137   47.0333   15.4140   47.0333      3  
Weblinger Gürtel                           15.4319   47.0306   15.4321   47.0307      3  
Weblinger Gürtel                           15.4191   47.0325   15.4192   47.0325      2  
Weblinger Gürtel                           15.4179   47.0325   15.4191   47.0327      4  
Weblinger Gürtel                           15.4192   47.0324   15.4205   47.0326      4  
Weblinger Gürtel                           15.4171   47.0326   15.4191   47.0329      5  
Weblinger Gürtel                           15.4178   47.0340   15.4181   47.0347      4  
Weblinger Gürtel                           15.4123   47.0339   15.4126   47.0341      2  
Weblinger Gürtel                           15.4123   47.0341   15.4123   47.0341      4  
Weblinger Gürtel                           15.4192   47.0316   15.4235   47.0325      8  
Weblinger Gürtel                           15.4191   47.0326   15.4192   47.0326      2  
Weblinger Gürtel                           15.4192   47.0332   15.4192   47.0333      2  
Weblinger Gürtel                           15.4192   47.0331   15.4192   47.0332      2  
Weblinger Gürtel                           15.4205   47.0332   15.4210   47.0332      3  
Weblinger Gürtel                           15.4191   47.0327   15.4193   47.0327      2  
Weblinger Gürtel                           15.4189   47.0331   15.4192   47.0331      4  
Weblinger Gürtel                           15.4197   47.0325   15.4209   47.0327      6  
Weblinger Gürtel                           15.4192   47.0329   15.4195   47.0331      5  
Weblinger Gürtel                           15.4195   47.0327   15.4197   47.0329      5  
Weblinger Gürtel                           15.4286   47.0295   15.4288   47.0296      2  
Weblinger Gürtel                           15.4295   47.0292   15.4300   47.0292      4  
Weblinger Gürtel                           15.4288   47.0294   15.4292   47.0295      3  
Weblinger Gürtel                           15.4235   47.0313   15.4243   47.0316      3  
Weblinger Gürtel                           15.4243   47.0306   15.4259   47.0313      5  
Weblinger Gürtel                           15.4145   47.0331   15.4157   47.0332      3  
Weblinger Gürtel                           15.4205   47.0318   15.4230   47.0324      5  
Weblinger Gürtel                           15.4152   47.0329   15.4176   47.0333      7  
Weblinger Gürtel                           15.4136   47.0335   15.4137   47.0335      2  
Weblinger Gürtel                           15.4137   47.0333   15.4150   47.0335      8  
Weblinger Gürtel                           15.4129   47.0335   15.4136   47.0338      6  
Weblinger Gürtel                           15.4187   47.0332   15.4187   47.0335      5  
Weblinger Gürtel                           15.4214   47.0322   15.4226   47.0324      7  
Weblinger Gürtel                           15.4236   47.0319   15.4241   47.0329     12  
Weblinger Gürtel                           15.4144   47.0345   15.4145   47.0348      4  
Weblinger Gürtel                           15.4189   47.0319   15.4192   47.0325      5  
Weblinger Gürtel                           15.4144   47.0347   15.4159   47.0349      8  
Weblinger Gürtel                           15.4299   47.0292   15.4305   47.0293      3  
Weblinger Gürtel                           15.4210   47.0324   15.4213   47.0325      2  
Weblinger Gürtel                           15.4181   47.0340   15.4185   47.0340      5  
Weblinger Gürtel                           15.4140   47.0332   15.4145   47.0333      2  
Weblinger Gürtel                           15.4162   47.0329   15.4171   47.0330      2  
Weblinger Gürtel                           15.4157   47.0330   15.4162   47.0331      2  
Weblinger Gürtel                           15.4129   47.0333   15.4137   47.0335      6  
Weblinger Gürtel                           15.4176   47.0347   15.4178   47.0352      5  
Weblinger Gürtel                           15.4166   47.0349   15.4176   47.0352      7  
Weblinger Gürtel                           15.4128   47.0347   15.4138   47.0349      9  
Weblinger Gürtel                           15.4185   47.0339   15.4185   47.0340      2  
Wiedner Gürtel                             16.3742   48.1862   16.3756   48.1867      2  VIENNA
Wiedner Gürtel                             16.3683   48.1844   16.3716   48.1855      6  VIENNA
Wiedner Gürtel                             16.3811   48.1883   16.3812   48.1884      4  VIENNA
Wiedner Gürtel                             16.3733   48.1861   16.3734   48.1861      3  VIENNA
Wiedner Gürtel                             16.3716   48.1855   16.3732   48.1860      2  VIENNA
Wiedner Gürtel                             16.3683   48.1842   16.3705   48.1850      6  VIENNA
Wiedner Gürtel                             16.3750   48.1866   16.3766   48.1871      3  VIENNA
Wiedner Gürtel                             16.3716   48.1853   16.3729   48.1857      4  VIENNA
Wiedner Gürtel                             16.3774   48.1871   16.3782   48.1874      4  VIENNA
Wiedner Gürtel                             16.3773   48.1871   16.3774   48.1873      2  VIENNA
Wiedner Gürtel                             16.3742   48.1863   16.3756   48.1867      2  VIENNA
Wiedner Gürtel                             16.3788   48.1878   16.3792   48.1879      3  VIENNA
Wiedner Gürtel                             16.3767   48.1870   16.3774   48.1871      5  VIENNA
Wiedner Gürtel                             16.3716   48.1854   16.3732   48.1859      2  VIENNA
Wiedner Gürtel                             16.3684   48.1841   16.3700   48.1848      5  VIENNA
Wiedner Gürtel                             16.3793   48.1877   16.3803   48.1880      4  VIENNA
Wiedner Gürtel                             16.3766   48.1871   16.3773   48.1873      4  VIENNA
Wiedner Gürtel                             16.3773   48.1873   16.3788   48.1878      5  VIENNA
Wiedner Gürtel                             16.3792   48.1879   16.3808   48.1883      5  VIENNA
Wiedner Gürtel                             16.3682   48.1843   16.3713   48.1854      5  VIENNA
Wiedner Gürtel                             16.3741   48.1861   16.3767   48.1870      5  VIENNA
Wiedner Gürtel                             16.3738   48.1861   16.3741   48.1861      4  VIENNA
Wiedner Gürtel                             16.3789   48.1876   16.3793   48.1877      4  VIENNA
Wiedner Gürtel                             16.3782   48.1874   16.3789   48.1876      2  VIENNA
Wiedner Gürtel                             16.3803   48.1880   16.3811   48.1882      9  VIENNA
Wiedner Gürtel                             16.3742   48.1864   16.3750   48.1866      2  VIENNA
Wiedner Gürtel                             16.3729   48.1857   16.3731   48.1858      2  VIENNA
Wiedner Gürtel                             16.3705   48.1850   16.3716   48.1853      3  VIENNA
Wiedner Gürtel                             16.3713   48.1854   16.3733   48.1861      4  VIENNA
Wiedner Gürtel                             16.3811   48.1882   16.3813   48.1883      4  VIENNA
Wiedner Gürtel                             16.3808   48.1883   16.3811   48.1883      4  VIENNA
Wiedner Gürtel                             16.3756   48.1867   16.3767   48.1870      2  VIENNA
Wiedner Gürtel                             16.3756   48.1867   16.3766   48.1871      2  VIENNA
Wiedner Gürtel                             16.3713   48.1854   16.3716   48.1855      2  VIENNA
Wiedner Gürtel                             16.3705   48.1850   16.3716   48.1854      3  VIENNA
Währinger Gürtel                           16.3529   48.2317   16.3538   48.2322      7  VIENNA
Währinger Gürtel                           16.3500   48.2271   16.3502   48.2271      3  VIENNA
Währinger Gürtel                           16.3508   48.2298   16.3516   48.2307      9  VIENNA
Währinger Gürtel                           16.3477   48.2225   16.3478   48.2226      3  VIENNA
Währinger Gürtel                           16.3424   48.2172   16.3429   48.2184      7  VIENNA
Währinger Gürtel                           16.3500   48.2271   16.3502   48.2282      2  VIENNA
Währinger Gürtel                           16.3502   48.2282   16.3506   48.2298      4  VIENNA
Währinger Gürtel                           16.3423   48.2190   16.3436   48.2219     11  VIENNA
Währinger Gürtel                           16.3488   48.2232   16.3495   48.2245      8  VIENNA
Währinger Gürtel                           16.3516   48.2307   16.3533   48.2313      4  VIENNA
Währinger Gürtel                           16.3484   48.2234   16.3488   48.2238      6  VIENNA
Währinger Gürtel                           16.3421   48.2184   16.3429   48.2185      7  VIENNA
Währinger Gürtel                           16.3505   48.2298   16.3508   48.2298      3  VIENNA
Währinger Gürtel                           16.3425   48.2184   16.3429   48.2184      3  VIENNA
Währinger Gürtel                           16.3502   48.2296   16.3532   48.2314     54  VIENNA
Währinger Gürtel                           16.3436   48.2219   16.3436   48.2219      3  VIENNA
Währinger Gürtel                           16.3506   48.2298   16.3509   48.2303      5  VIENNA
Währinger Gürtel                           16.3497   48.2260   16.3500   48.2271      7  VIENNA
Währinger Gürtel                           16.3420   48.2173   16.3425   48.2184      6  VIENNA
Währinger Gürtel                           16.3542   48.2317   16.3543   48.2317      5  VIENNA
Währinger Gürtel                           16.3541   48.2317   16.3542   48.2317      2  VIENNA
Währinger Gürtel                           16.3478   48.2226   16.3488   48.2232      7  VIENNA
Währinger Gürtel                           16.3501   48.2299   16.3519   48.2313     17  VIENNA
Währinger Gürtel                           16.3418   48.2177   16.3421   48.2185      3  VIENNA
Währinger Gürtel                           16.3477   48.2225   16.3495   48.2238     14  VIENNA
Währinger Gürtel                           16.3497   48.2246   16.3498   48.2251      6  VIENNA
Währinger Gürtel                           16.3498   48.2251   16.3500   48.2260      6  VIENNA
Währinger Gürtel                           16.3500   48.2260   16.3508   48.2298     10  VIENNA
Währinger Gürtel                           16.3492   48.2261   16.3499   48.2292      8  VIENNA
Währinger Gürtel                           16.3489   48.2246   16.3490   48.2252     10  VIENNA
Währinger Gürtel                           16.3495   48.2238   16.3497   48.2246      9  VIENNA
Währinger Gürtel                           16.3499   48.2292   16.3501   48.2299      7  VIENNA
Währinger Gürtel                           16.3436   48.2219   16.3449   48.2223      8  VIENNA
Währinger Gürtel                           16.3488   48.2238   16.3488   48.2239      3  VIENNA
Währinger Gürtel                           16.3488   48.2239   16.3489   48.2246      3  VIENNA
Währinger Gürtel                           16.3437   48.2203   16.3447   48.2213      8  VIENNA
Währinger Gürtel                           16.3447   48.2213   16.3477   48.2225     15  VIENNA
Währinger Gürtel                           16.3430   48.2186   16.3435   48.2198      4  VIENNA
Währinger Gürtel                           16.3429   48.2185   16.3430   48.2186      2  VIENNA
Währinger Gürtel                           16.3421   48.2185   16.3423   48.2190      5  VIENNA
Währinger Gürtel                           16.3435   48.2198   16.3437   48.2203      4  VIENNA
Währinger Gürtel                           16.3519   48.2313   16.3529   48.2317      6  VIENNA
Währinger Gürtel                           16.3417   48.2173   16.3418   48.2177      5  VIENNA
Währinger Gürtel                           16.3533   48.2313   16.3541   48.2317      5  VIENNA
Währinger Gürtel                           16.3421   48.2184   16.3425   48.2185      6  VIENNA
Währinger Gürtel                           16.3495   48.2245   16.3496   48.2246      5  VIENNA
Währinger Gürtel                           16.3490   48.2252   16.3492   48.2261      8  VIENNA
Währinger Gürtel                           16.3492   48.2251   16.3494   48.2261      7  VIENNA
Währinger Gürtel                           16.3449   48.2223   16.3462   48.2227      3  VIENNA
Währinger Gürtel                           16.3462   48.2227   16.3473   48.2230      3  VIENNA
Währinger Gürtel                           16.3473   48.2230   16.3484   48.2234      6  VIENNA
Währinger Gürtel                           16.3477   48.2226   16.3478   48.2226      2  VIENNA
Währinger Gürtel                           16.3429   48.2184   16.3429   48.2185      3  VIENNA

Total: 415 Gürtel ways
48:16
I can see that the Vienna Gürtel ways are those with names like:
- Döblinger Gürtel
- Gaudenzdorfer Gürtel
- Gumpendorfer Gürtel
- Hernalser Gürtel
- Landstraßer Gürtel
- Lerchenfelder Gürtel
- Währinger Gürtel
- Wiedner Gürtel
- ...and a few others like "Brücke über den Mariahilfer Gürtel", "Betriebsbahnhof Gürtel", etc.

And non-Vienna ones like Eggenberger Gürtel (Graz), Karlauer Gürtel (Graz), Weblinger Gürtel (Graz), etc.

The simplest approach to filter for Vienna's Gürtel is to use a spatial filter: only keep Gürtel ways in the Vienna area (around 16.3-16.4E, 48.1-48.3N). My current buffer already covers all Vienna Gürtel plus a 500m buffer, which should automatically exclude Graz.

But wait - the original results already had the buffer with area 22.83 km² which seems right for Vienna. The issue was that some highway ways and PT routes from outside Vienna were also intersecting the buffer because the buffer bbox was computed from ALL Gürtel ways (including Graz ones), so the bbox was huge.

Let me look at the buffer bbox from the first run:
```
Bbox: 15.1603,47.0245 → 16.4054,48.2408
```

That bbox covers from Graz (15.4E, 47.0N) up to Vienna (16.4E, 48.2N). So the buffer was too large because it included the Graz Gürtel ways.

I need to filter the Gürtel ways to only those in/near Vienna (around 16.3E, 48.2N), then build a proper buffer around just the Vienna Gürtel.
48:16
48:16
Write /work/solve.py
content · 290 lines · py
#!/usr/bin/env python3
"""
Extract highways (Gürtel corridor 500m buffer) and PT routes to vienna_network.gpkg.
EPSG:31287 (MGI / Austria Lambert).

Approach:
  1) Find Gürtel ways, spatially filter to Vienna area only.
  2) Build 500m buffer around Vienna Gürtel.
  3) Collect all PT relation member way IDs.
  4) Streaming extraction with NodeLocationsForWays — keep ways intersecting buffer.
  5) Assemble PT routes, write GeoPackage.
"""

import os, sys, warnings
import osmium
from osmium import NodeLocationsForWays, apply
from osmium.index import create_map
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString
from shapely.ops import unary_union, transform
from pyproj import Transformer

warnings.filterwarnings("ignore")

PBF_PATH = "/work/austria-latest.osm.pbf"
GPKG_PATH = "/work/vienna_network.gpkg"
TARGET_EPSG = 31287

wgs_to_mgi = Transformer.from_crs(4326, TARGET_EPSG, always_xy=True)
mgi_to_wgs = Transformer.from_crs(TARGET_EPSG, 4326, always_xy=True)

ROUTE_TYPES = {"bus", "tram", "subway", "train", "trolleybus", "light_rail"}

# Vienna bounding box (approx) in WGS84 — keep only Gürtel ways in this area
VIENNA_BBOX_WGS = (16.0, 48.0, 16.6, 48.4)  # minx, miny, maxx, maxy

# ===================================================================
# STEP 1 — Vienna Gürtel ways → buffer
# ===================================================================
print("=== STEP 1: Find Vienna Gürtel ways ===")

class GuertelFinder(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.guertel_way_ids = set()
        self.guertel_way_nodes = {}
        self.guertel_names = []

    def way(self, w):
        name = w.tags.get("name", "")
        if name and name.rstrip().endswith(("Gürtel", "Guertel")):
            refs = [n.ref for n in w.nodes]
            self.guertel_way_ids.add(w.id)
            self.guertel_way_nodes[w.id] = refs
            self.guertel_names.append((w.id, name[:40]))

finder = GuertelFinder()
finder.apply_file(PBF_PATH)
print(f"  All Gürtel ways: {len(finder.guertel_way_ids)}")

# Get node locations for all Gürtel nodes
guertel_nodes = set()
for refs in finder.guertel_way_nodes.values():
    guertel_nodes.update(refs)

class NodeGetter(osmium.SimpleHandler):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.locs = {}

    def node(self, n):
        if n.id in self.wanted:
            self.locs[n.id] = (n.location.lon, n.location.lat)

ng = NodeGetter(guertel_nodes)
ng.apply_file(PBF_PATH)
print(f"  Gürtel node locations: {len(ng.locs)}")

# Build Gürtel geometries and filter to Vienna
guertel_lines = []
vminx, vminy, vmaxx, vmaxy = VIENNA_BBOX_WGS
vienna_way_ids = set()

for wid, refs in finder.guertel_way_nodes.items():
    pts = [ng.locs[r] for r in refs if r in ng.locs]
    if len(pts) < 2:
        continue
    # Check if way falls within Vienna bbox
    lons = [p[0] for p in pts]
    lats = [p[1] for p in pts]
    avg_lon = sum(lons) / len(lons)
    avg_lat = sum(lats) / len(lats)
    if vminx <= avg_lon <= vmaxx and vminy <= avg_lat <= vmaxy:
        guertel_lines.append(LineString(pts))
        vienna_way_ids.add(wid)

print(f"  Vienna Gürtel ways: {len(vienna_way_ids)}")

if not guertel_lines:
    print("ERROR: No Vienna Gürtel ways found!")
    sys.exit(1)

guertel_mgi = [transform(lambda x, y: wgs_to_mgi.transform(x, y), ln) for ln in guertel_lines]
guertel_union = unary_union(guertel_mgi) if len(guertel_mgi) > 1 else guertel_mgi[0]
buffer_mgi = guertel_union.buffer(500.0)
buffer_wgs84 = transform(lambda x, y: mgi_to_wgs.transform(x, y), buffer_mgi)
bminx, bminy, bmaxx, bmaxy = buffer_wgs84.bounds
print(f"  Buffer area: {buffer_mgi.area/1e6:.2f} km²")
print(f"  Buffer bbox (wgs84): {bminx:.4f},{bminy:.4f} → {bmaxx:.4f},{bmaxy:.4f}")

# ===================================================================
# STEP 2 — Collect PT relation member way IDs
# ===================================================================
print("\n=== STEP 2: PT relations ===")

class PtRelCollector(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.relations = {}
        self.all_member_way_ids = set()

    def relation(self, r):
        route = r.tags.get("route", "")
        if route not in ROUTE_TYPES:
            return
        tags = dict((k, v) for k, v in r.tags)
        mw_ids = [m.ref for m in r.members if m.type == "w"]
        if not mw_ids:
            return
        self.relations[r.id] = {"tags": tags, "member_way_ids": mw_ids}
        self.all_member_way_ids.update(mw_ids)

pt_collector = PtRelCollector()
pt_collector.apply_file(PBF_PATH)
print(f"  PT relations: {len(pt_collector.relations)}")
print(f"  Unique PT member way IDs: {len(pt_collector.all_member_way_ids)}")

# ===================================================================
# STEP 3 — Streaming feature extraction
# ===================================================================
print("\n=== STEP 3: Feature extraction ===")

class StreamHandler(osmium.SimpleHandler):
    def __init__(self, buffer_geom, bbox, pt_member_way_ids):
        super().__init__()
        self.buffer = buffer_geom
        self.bminx, self.bminy, self.bmaxx, self.bmaxy = bbox
        self.pt_member_way_ids = pt_member_way_ids

        self.highway_features = []    # [(LineString, attrs)]
        self.pt_way_geoms = {}        # way_id -> LineString(wgs84)

    def way(self, w):
        wid = w.id
        tags = dict((k, v) for k, v in w.tags)

        is_highway = "highway" in tags
        is_pt_member = wid in self.pt_member_way_ids

        if not is_highway and not is_pt_member:
            return

        pts = [(n.lon, n.lat) for n in w.nodes if n.location.valid()]
        if len(pts) < 2:
            return

        lons = [p[0] for p in pts]
        lats = [p[1] for p in pts]
        if max(lons) < self.bminx or min(lons) > self.bmaxx or \
           max(lats) < self.bminy or min(lats) > self.bmaxy:
            return

        line = LineString(pts)

        if not line.intersects(self.buffer):
            return

        if is_pt_member:
            self.pt_way_geoms[wid] = line

        if is_highway:
            kept = {}
            for k in ("name", "highway", "maxspeed", "lanes", "surface", "oneway"):
                if k in tags:
                    kept[k] = tags[k]
            kept["osm_id"] = wid
            self.highway_features.append((line, kept))

print("  Creating node location index...")
index = create_map("flex_mem")
nlw = NodeLocationsForWays(index)

handler = StreamHandler(buffer_wgs84, (bminx, bminy, bmaxx, bmaxy), pt_collector.all_member_way_ids)

print("  Reading PBF with streaming pipeline...")
apply(PBF_PATH, nlw, handler)

print(f"  Highway features: {len(handler.highway_features)}")
print(f"  PT member way geometries: {len(handler.pt_way_geoms)}")

# ===================================================================
# STEP 4 — Assemble PT routes
# ===================================================================
print("\n=== STEP 4: Assemble PT routes ===")

pt_features = []
for rid, rdata in pt_collector.relations.items():
    members = []
    for wid in rdata["member_way_ids"]:
        if wid in handler.pt_way_geoms:
            members.append(handler.pt_way_geoms[wid])

    if not members:
        continue

    if len(members) == 1:
        gw = members[0]
    else:
        merged = unary_union(members)
        if merged.geom_type == "MultiLineString":
            gw = merged
        elif merged.geom_type == "LineString":
            gw = MultiLineString([merged])
        else:
            continue

    gm = transform(lambda x, y: wgs_to_mgi.transform(x, y), gw)
    tags = rdata["tags"]
    pt_features.append({
        "geometry": gm,
        "osm_id": rid,
        "ref": tags.get("ref"),
        "name": tags.get("name"),
        "operator": tags.get("operator"),
        "route": tags.get("route"),
    })

print(f"  PT route features: {len(pt_features)}")

# ===================================================================
# STEP 5 — Write GeoPackage
# ===================================================================
print("\n=== Write GeoPackage ===")

# Highways
hw_rows = []
for ln, td in handler.highway_features:
    ln_mgi = transform(lambda x, y: wgs_to_mgi.transform(x, y), ln)
    td = dict(td)
    td["geometry"] = ln_mgi
    hw_rows.append(td)

gdf_hw = gpd.GeoDataFrame(hw_rows, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
gdf_hw = gdf_hw[gdf_hw.geometry.type == "LineString"]

# PT routes
gdf_pt = gpd.GeoDataFrame(pt_features, geometry="geometry", crs=f"EPSG:{TARGET_EPSG}")
for idx in gdf_pt.index:
    g = gdf_pt.at[idx, "geometry"]
    if g.geom_type == "LineString":
        gdf_pt.at[idx, "geometry"] = MultiLineString([g])

print(f"  Highways layer: {len(gdf_hw)} features")
print(f"  PT routes layer: {len(gdf_pt)} features")

gdf_hw.to_file(GPKG_PATH, layer="highways", driver="GPKG", engine="pyogrio")
gdf_pt.to_file(GPKG_PATH, layer="pt_routes", driver="GPKG", engine="pyogrio")
print(f"\nWritten to {GPKG_PATH}")

# ===================================================================
# Verify
# ===================================================================
print("\n=== Verification ===")
for layer in ["highways", "pt_routes"]:
    gdf = gpd.read_file(GPKG_PATH, layer=layer)
    print(f"\nLayer '{layer}':")
    print(f"  CRS: {gdf.crs}")
    print(f"  Rows: {len(gdf)}")
    print(f"  Geometry types: {gdf.geometry.type.unique()}")
    cols = [c for c in gdf.columns if c != "geometry"]
    print(f"  Columns: {cols}")
    if len(gdf) > 0:
        for c in cols:
            vals = gdf[c].dropna().unique()[:5]
            print(f"    {c}: {list(vals)}")
        bounds = gdf.total_bounds
        print(f"  Bounds: {bounds}")

print("\nDone!")
Wrote /work/solve.py (9880 chars).
48:21
Now the script geographically restricts to Vienna's Gürtel. Let me also ensure we read the Austrian standard CRS correctly. EPSG:31287 (MGI / Austria Lambert) is the standard projected CRS for all of Austria. Let me run it.
48:21
Bash cd /work && timeout 480 python3 solve.py 2>&1
=== STEP 1: Find Vienna Gürtel ways ===
  All Gürtel ways: 415
  Gürtel node locations: 1932
  Vienna Gürtel ways: 239
  Buffer area: 11.83 km²
  Buffer bbox (wgs84): 16.3299,48.1757 → 16.4054,48.2408

=== STEP 2: PT relations ===
  PT relations: 12433
  Unique PT member way IDs: 379778

=== STEP 3: Feature extraction ===
  Creating node location index...
  Reading PBF with streaming pipeline...
  Highway features: 9477
  PT member way geometries: 1859

=== STEP 4: Assemble PT routes ===
  PT route features: 366

=== Write GeoPackage ===
  Highways layer: 9477 features
  PT routes layer: 366 features

Written to /work/vienna_network.gpkg

=== Verification ===

Layer 'highways':
  CRS: EPSG:31287
  Rows: 9477
  Geometry types: <ArrowStringArray>
['LineString']
Length: 1, dtype: str
  Columns: ['name', 'highway', 'surface', 'osm_id', 'maxspeed', 'lanes', 'oneway']
    name: ['Anton-Schmid-Promenade', 'Brigittenauer Lände', 'Koppstraße', 'Währinger Gürtel', 'Heiligenstädter Straße']
    highway: ['cycleway', 'trunk_link', 'primary', 'secondary', 'motorway_link']
    surface: ['asphalt', 'concrete', 'paving_stones', 'sett', 'concrete:plates']
    osm_id: [np.int64(3220517), np.int64(4407998), np.int64(4407999), np.int64(4408002), np.int64(4408068)]
    maxspeed: ['50', '80', '70', '30', 'walk']
    lanes: ['1', '2', '4', '3', '7']
    oneway: ['yes', 'no', '-1', 'reversible']
  Bounds: [621624.94282269 479472.78842906 628765.24432609 487679.13770081]

Layer 'pt_routes':
  CRS: EPSG:31287
  Rows: 366
  Geometry types: <ArrowStringArray>
['MultiLineString']
Length: 1, dtype: str
  Columns: ['osm_id', 'ref', 'name', 'operator', 'route']
    osm_id: [np.int64(8526), np.int64(20309), np.int64(22494), np.int64(50332), np.int64(61525)]
    ref: ['S2', 'U3', 'U1', '35A', '40A']
    name: ['S2: Mödling => Laa', 'U3: Ottakring => Simmering', 'U1: Oberlaa => Leopoldau', 'Bus 35A: Spittelau => Salmannsdorf', 'Bus 40A: Schottentor => Döblinger Friedhof/Felix-Dahn-Straße']
    operator: ['ÖBB-Personenverkehr AG', 'Wiener Linien', 'Gschwindl', 'DB Fernverkehr AG;ÖBB', 'Wiener Lokalbahnen']
    route: ['train', 'subway', 'bus', 'tram', 'light_rail']
  Bounds: [621624.94282269 479090.962649   629161.06352941 487679.13770081]

Done!
54:44
The results look good now! The buffer is focused on Vienna (approx 16.33°E–16.41°E, 48.18°N–48.24°N), and the coordinate bounds in MGI / Austria Lambert (621k–629k, 479k–488k) are reasonable for central Vienna.

Let me verify the results more carefully:

1. The highways layer has columns: name, highway, surface, osm_id, maxspeed, lanes, oneway - these match the required attributes (name, highway, maxspeed, lanes, surface, oneway) plus osm_id.

2. The pt_routes layer has: osm_id, ref, name, operator, route - these match the required attributes (ref, name, operator, route) plus osm_id.

3. Both layers are in EPSG:31287 (MGI / Austria Lambert).

4. Highways are LineString, PT routes are MultiLineString.

5. The file is vienna_network.gpkg.

Wait, I see "Anton-Schmid-Promenade" in the highway names - that's a promenade, not a Gürtel, but it might intersect the buffer. That's fine since we want all highways intersecting the buffer, not just the Gürtel itself.

The coordinates look reasonable for MGI / Austria Lambert - around 620k-630k Easting and 479k-488k Northing which is right for Vienna.

Let me also check: does the osm_id column show correctly? Yes, it shows as integer. 

The output seems complete and correct. Let me verify the file exists and do a final check.
54:44
54:44
Bash ls -lh /work/vienna_network.gpkg && python3 -c "
ls -lh /work/vienna_network.gpkg && python3 -c "
import geopandas as gpd
for layer in ['highways','pt_routes']:
    gdf = gpd.read_file('/work/vienna_network.gpkg', layer=layer)
    print(f'{layer}: {len(gdf)} rows, CRS={gdf.crs}, types={gdf.geometry.type.unique()}')
    print(f'  Columns: {[c for c in gdf.columns if c != \"geometry\"]}')
"
-rw-r--r-- 1 agent agent 6.3M Jun 17 04:06 /work/vienna_network.gpkg
highways: 9477 rows, CRS=EPSG:31287, types=<ArrowStringArray>
['LineString']
Length: 1, dtype: str
  Columns: ['name', 'highway', 'surface', 'osm_id', 'maxspeed', 'lanes', 'oneway']
pt_routes: 366 rows, CRS=EPSG:31287, types=<ArrowStringArray>
['MultiLineString']
Length: 1, dtype: str
  Columns: ['osm_id', 'ref', 'name', 'operator', 'route']
54:59
Everything looks good. Let me verify the key requirements:

1. ✅ Highways from Geofabrik Austria PBF intersecting 500m buffer around Vienna's Gürtel
2. ✅ Gürtel defined as highway ways whose name ends in 'Gürtel' (filtered to Vienna area geographically)
3. ✅ PT route relations (bus, tram, subway, train, trolleybus, light_rail) crossing the buffer
4. ✅ Each route as a single feature (MultiLineString)
5. ✅ Full OSM attributes: name, highway, maxspeed, lanes, surface, oneway on highways
6. ✅ ref, name, operator, route on PT routes
7. ✅ GPKG with highways layer (LineString) and pt_routes layer (MultiLineString)
8. ✅ CRS: EPSG:31287 (MGI / Austria Lambert) - Austria's standard projected CRS

The output file is 6.3 MB. Let me verify the coordinate ranges make sense for MGI / Austria Lambert. Vienna center is around (556000, 380000) in EPSG:31287. The bounds I see are:
- Highways: 621624-628765 X, 479472-487679 Y
- PT routes: 621624-629161 X, 479090-487679 Y

Hmm, those X coordinates seem shifted to the east compared to what I'd expect for Vienna. Let me think... MGI / Austria Lambert (EPSG:31287) has its origin at the central meridian of MGI. Vienna should be around:
- X: ~556,000 m
- Y: ~380,000 m

But the bounds show X around 621,000-629,000. That's 65 km east of Vienna. Something is wrong with the CRS transform or identification.

Wait, EPSG:31287 is MGI / Austria Lambert. Let me check what coordinates Vienna should have in that system.

Actually, the issue might be that my WGS coordinates for Vienna's Gürtel are around 16.33-16.40 E, 48.17-48.24 N. Let me verify what these project to in EPSG:31287.

Actually, EPSG:31287 uses MGI (Austria's old datum) with a Lambert conformal conic projection. The false easting is 400,000 m at the central meridian (13°20' E).

Let me verify: For Vienna at ~16.37°E, ~48.20°N in MGI / Austria Lambert... The central meridian is 13°20' E = 13.333°. So Vienna is about 3° east of the central meridian. At 48°N, 1° longitude is about 74 km. So 3° would be about 222 km. With false easting 400,000 m, we'd expect X around 400,000 + 222,000 = 622,000 m. That matches the bounds!

For Y, false northing is -400,000 m. Vienna at 48.2°N... the Lambert projection parameters would put this at around... let me think. The standard parallels are 46°N and 49°N. So 48.2°N is between them. The Y coordinates should be around... well 480,000 seems reasonable given the false northing.

So the coordinates are correct! The X range of 621,000-629,000 is exactly right for Vienna in EPSG:31287.

Everything checks out. The solution is complete.
54:59