fio-l1-paris-kml-pois
0:00
A colleague handed me `paris_late_night_pois`. Our internal map server only takes flat GeoJSON or GeoParquet, so please convert it to `paris_pois.geojson`, one row per placemark. Keep `name`, include the source's classification/grouping as a `category` column, and pull each entry's 'last verified' date out of the HTML blurb into a `verified_date` column as an ISO date. We want to flag stale records.
0:00
System info
{
  "adapter": "openrouter",
  "model": "deepseek/deepseek-v4-flash",
  "base_url": "https://openrouter.ai/api/v1"
}
0:04
Let me start by exploring the working directory and understanding the source data.
0:04
Bash
ls -la /work/
total 20
drwxr-xr-x 1 agent agent    50 Jun 15 14:28 .
drwxr-xr-x 1 root  root     14 Jun 15 14:28 ..
-rw-r--r-- 1 agent agent 18305 Jun 15 14:28 paris_late_night_pois.kml
0:09
Let's first look at the KML file to understand its structure.
0:09
Read
/work/paris_late_night_pois.kml
1 <?xml version="1.0" encoding="UTF-8"?>
2 <kml xmlns="http://www.opengis.net/kml/2.2">
3 <Document>
4 <name>Late-night POIs (RATP night-bus study)</name>
5 <Folder>
6 <name>Cafés ouverts tard</name>
7 <Placemark>
8 <name>% Arabica</name>
9 <description><![CDATA[<b>% Arabica</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/0">Voir la fiche</a><br/>Dernière vérification : 2026-01-01]]></description>
10 <Point>
11 <coordinates>2.324676,48.854156,0</coordinates>
12 </Point>
13 </Placemark>
14 <Placemark>
15 <name>1T. rue Scribe</name>
16 <description><![CDATA[<b>1T. rue Scribe</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/1">Voir la fiche</a><br/>Dernière vérification : 2026-08-14]]></description>
17 <Point>
18 <coordinates>2.329774,48.870581,0</coordinates>
19 </Point>
20 </Placemark>
21 <Placemark>
22 <name>1ndix Café</name>
23 <description><![CDATA[<b>1ndix Café</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/2">Voir la fiche</a><br/>Dernière vérification : 2026-03-27]]></description>
24 <Point>
25 <coordinates>2.341361,48.880764,0</coordinates>
26 </Point>
27 </Placemark>
28 <Placemark>
29 <name>2 Steps Café</name>
30 <description><![CDATA[<b>2 Steps Café</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/3">Voir la fiche</a><br/>Dernière vérification : 2026-10-12]]></description>
31 <Point>
32 <coordinates>2.376626,48.866261,0</coordinates>
33 </Point>
34 </Placemark>
35 <Placemark>
36 <name>3234 Café - Trente Deux Trente Quatre</name>
37 <description><![CDATA[<b>3234 Café - Trente Deux Trente Quatre</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/4">Voir la fiche</a><br/>Dernière vérification : 2026-05-25]]></description>
38 <Point>
39 <coordinates>2.376249,48.853167,0</coordinates>
40 </Point>
41 </Placemark>
42 <Placemark>
43 <name>360 Cafe</name>
44 <description><![CDATA[<b>360 Cafe</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/5">Voir la fiche</a><br/>Dernière vérification : 2026-12-10]]></description>
45 <Point>
46 <coordinates>2.321838,48.842293,0</coordinates>
47 </Point>
48 </Placemark>
49 <Placemark>
50 <name>3w Cafe Paris</name>
51 <description><![CDATA[<b>3w Cafe Paris</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/6">Voir la fiche</a><br/>Dernière vérification : 2026-07-23]]></description>
52 <Point>
53 <coordinates>2.327530,48.851560,0</coordinates>
54 </Point>
55 </Placemark>
56 <Placemark>
57 <name>48 Collagen Café</name>
58 <description><![CDATA[<b>48 Collagen Caf√©</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/7">Voir la fiche</a><br/>Dernière vérification : 2026-02-08]]></description>
59 <Point>
60 <coordinates>2.340460,48.874840,0</coordinates>
61 </Point>
62 </Placemark>
63 <Placemark>
64 <name>86 Champs Elysées</name>
65 <description><![CDATA[<b>86 Champs Elysées</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/8">Voir la fiche</a><br/>Dernière vérification : 2026-09-21]]></description>
66 <Point>
67 <coordinates>2.303767,48.871387,0</coordinates>
68 </Point>
69 </Placemark>
70 <Placemark>
71 <name>A la Ville de Paris</name>
72 <description><![CDATA[<b>A la Ville de Paris</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/9">Voir la fiche</a><br/>Dernière vérification : 2026-04-06]]></description>
73 <Point>
74 <coordinates>2.344280,48.870530,0</coordinates>
75 </Point>
76 </Placemark>
77 <Placemark>
78 <name>A minima cafe</name>
79 <description><![CDATA[<b>A minima cafe</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/10">Voir la fiche</a><br/>Dernière vérification : 2026-11-19]]></description>
80 <Point>
81 <coordinates>2.381290,48.845300,0</coordinates>
82 </Point>
83 </Placemark>
84 <Placemark>
85 <name>AD DictCafé</name>
86 <description><![CDATA[<b>AD DictCafé</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/11">Voir la fiche</a><br/>Dernière vérification : 2026-06-04]]></description>
87 <Point>
88 <coordinates>2.390577,48.849002,0</coordinates>
89 </Point>
90 </Placemark>
91 <Placemark>
92 <name>ANGIOLO</name>
93 <description><![CDATA[<b>ANGIOLO</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/12">Voir la fiche</a><br/>Dernière vérification : 2026-01-17]]></description>
94 <Point>
95 <coordinates>2.331682,48.854538,0</coordinates>
96 </Point>
97 </Placemark>
98 <Placemark>
99 <name>Abaita</name>
100 <description><![CDATA[<b>Abaita</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/13">Voir la fiche</a><br/>Dernière vérification : 2026-08-02]]></description>
101 <Point>
102 <coordinates>2.368446,48.864406,0</coordinates>
103 </Point>
104 </Placemark>
105 <Placemark>
106 <name>Absinthe Café</name>
107 <description><![CDATA[<b>Absinthe Café</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/14">Voir la fiche</a><br/>Dernière vérification : 2026-03-15]]></description>
108 <Point>
109 <coordinates>2.357228,48.865564,0</coordinates>
110 </Point>
111 </Placemark>
112 <Placemark>
113 <name>Acai & You</name>
114 <description><![CDATA[<b>Acai & You</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/15">Voir la fiche</a><br/>Dernière vérification : 2026-10-28]]></description>
115 <Point>
116 <coordinates>2.339001,48.854247,0</coordinates>
117 </Point>
118 </Placemark>
119 <Placemark>
120 <name>Adulis Cafe Abyssinien</name>
121 <description><![CDATA[<b>Adulis Cafe Abyssinien</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/16">Voir la fiche</a><br/>Dernière vérification : 2026-05-13]]></description>
122 <Point>
123 <coordinates>2.346929,48.851763,0</coordinates>
124 </Point>
125 </Placemark>
126 <Placemark>
127 <name>After Place</name>
128 <description><![CDATA[<b>After Place</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/17">Voir la fiche</a><br/>Dernière vérification : 2026-12-26]]></description>
129 <Point>
130 <coordinates>2.336121,48.864266,0</coordinates>
131 </Point>
132 </Placemark>
133 <Placemark>
134 <name>After Place</name>
135 <description><![CDATA[<b>After Place</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/18">Voir la fiche</a><br/>Dernière vérification : 2026-07-11]]></description>
136 <Point>
137 <coordinates>2.351101,48.862294,0</coordinates>
138 </Point>
139 </Placemark>
140 <Placemark>
141 <name>After Place</name>
142 <description><![CDATA[<b>After Place</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/19">Voir la fiche</a><br/>Dernière vérification : 2026-02-24]]></description>
143 <Point>
144 <coordinates>2.361560,48.869500,0</coordinates>
145 </Point>
146 </Placemark>
147 </Folder>
148 <Folder>
149 <name>Bibliothèques de nuit</name>
150 <Placemark>
151 <name>AUP Housing</name>
152 <description><![CDATA[<b>AUP Housing</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/20">Voir la fiche</a><br/>Dernière vérification : 2026-09-09]]></description>
153 <Point>
154 <coordinates>2.306008,48.861754,0</coordinates>
155 </Point>
156 </Placemark>
157 <Placemark>
158 <name>Academie de Medical</name>
159 <description><![CDATA[<b>Academie de Medical</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/21">Voir la fiche</a><br/>Dernière vérification : 2026-04-22]]></description>
160 <Point>
161 <coordinates>2.334208,48.856342,0</coordinates>
162 </Point>
163 </Placemark>
164 <Placemark>
165 <name>Ancêtres Italiens</name>
166 <description><![CDATA[<b>Ancêtres Italiens</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/22">Voir la fiche</a><br/>Dernière vérification : 2026-11-07]]></description>
167 <Point>
168 <coordinates>2.363314,48.873246,0</coordinates>
169 </Point>
170 </Placemark>
171 <Placemark>
172 <name>Apfelbaum Library</name>
173 <description><![CDATA[<b>Apfelbaum Library</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/23">Voir la fiche</a><br/>Dernière vérification : 2026-06-20]]></description>
174 <Point>
175 <coordinates>2.330685,48.849518,0</coordinates>
176 </Point>
177 </Placemark>
178 <Placemark>
179 <name>Archives de l'AP-HP</name>
180 <description><![CDATA[<b>Archives de l'AP-HP</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/24">Voir la fiche</a><br/>Dernière vérification : 2026-01-05]]></description>
181 <Point>
182 <coordinates>2.366407,48.857094,0</coordinates>
183 </Point>
184 </Placemark>
185 <Placemark>
186 <name>Arthur Rimbaud Library</name>
187 <description><![CDATA[<b>Arthur Rimbaud Library</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/25">Voir la fiche</a><br/>Dernière vérification : 2026-08-18]]></description>
188 <Point>
189 <coordinates>2.355874,48.856134,0</coordinates>
190 </Point>
191 </Placemark>
192 <Placemark>
193 <name>Association Bibliotheque Braille Enfantine</name>
194 <description><![CDATA[<b>Association Bibliotheque Braille Enfantine</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/26">Voir la fiche</a><br/>Dernière vérification : 2026-03-03]]></description>
195 <Point>
196 <coordinates>2.372404,48.868038,0</coordinates>
197 </Point>
198 </Placemark>
199 <Placemark>
200 <name>Association Francaise de Therapie</name>
201 <description><![CDATA[<b>Association Francaise de Therapie</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/27">Voir la fiche</a><br/>Dernière vérification : 2026-10-16]]></description>
202 <Point>
203 <coordinates>2.308679,48.870960,0</coordinates>
204 </Point>
205 </Placemark>
206 <Placemark>
207 <name>Attractive</name>
208 <description><![CDATA[<b>Attractive</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/28">Voir la fiche</a><br/>Dernière vérification : 2026-05-01]]></description>
209 <Point>
210 <coordinates>2.326704,48.866369,0</coordinates>
211 </Point>
212 </Placemark>
213 <Placemark>
214 <name>BNF</name>
215 <description><![CDATA[<b>BNF</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/29">Voir la fiche</a><br/>Dernière vérification : 2026-12-14]]></description>
216 <Point>
217 <coordinates>2.338076,48.867811,0</coordinates>
218 </Point>
219 </Placemark>
220 <Placemark>
221 <name>BPI Bibliotheque Publique D Information</name>
222 <description><![CDATA[<b>BPI Bibliotheque Publique D Information</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/30">Voir la fiche</a><br/>Dernière vérification : 2026-07-27]]></description>
223 <Point>
224 <coordinates>2.352378,48.859478,0</coordinates>
225 </Point>
226 </Placemark>
227 <Placemark>
228 <name>BPI-Beaubourg</name>
229 <description><![CDATA[<b>BPI-Beaubourg</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/31">Voir la fiche</a><br/>Dernière vérification : 2026-02-12]]></description>
230 <Point>
231 <coordinates>2.351867,48.860532,0</coordinates>
232 </Point>
233 </Placemark>
234 <Placemark>
235 <name>Beaubourg Librairie des Arts</name>
236 <description><![CDATA[<b>Beaubourg Librairie des Arts</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/32">Voir la fiche</a><br/>Dernière vérification : 2026-09-25]]></description>
237 <Point>
238 <coordinates>2.356791,48.865189,0</coordinates>
239 </Point>
240 </Placemark>
241 <Placemark>
242 <name>Bib Cent du Service de Sante Des Armees</name>
243 <description><![CDATA[<b>Bib Cent du Service de Sante Des Armees</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/33">Voir la fiche</a><br/>Dernière vérification : 2026-04-10]]></description>
244 <Point>
245 <coordinates>2.341252,48.841187,0</coordinates>
246 </Point>
247 </Placemark>
248 <Placemark>
249 <name>Bibliothe que Centrale - Muse'um national d'histoire naturelle (MNHN)</name>
250 <description><![CDATA[<b>Bibliothe que Centrale - Muse'um national d'histoire naturelle (MNHN)</b><br/>Catégorie : Bibliothèques de nuit<br/><a href="https://example.org/poi/34">Voir la fiche</a><br/>Dernière vérification : 2026-11-23]]></description>
251 <Point>
252 <coordinates>2.355838,48.842660,0</coordinates>
253 </Point>
254 </Placemark>
255 </Folder>
256 <Folder>
257 <name>Tours et infos touristiques</name>
258 <Placemark>
259 <name>24hourstravel</name>
260 <description><![CDATA[<b>24hourstravel</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/35">Voir la fiche</a><br/>Dernière vérification : 2026-06-08]]></description>
261 <Point>
262 <coordinates>2.347810,48.864739,0</coordinates>
263 </Point>
264 </Placemark>
265 <Placemark>
266 <name>2CVPrivateParis</name>
267 <description><![CDATA[<b>2CVPrivateParis</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/36">Voir la fiche</a><br/>Dernière vérification : 2026-01-21]]></description>
268 <Point>
269 <coordinates>2.337967,48.889998,0</coordinates>
270 </Point>
271 </Placemark>
272 <Placemark>
273 <name>382444404</name>
274 <description><![CDATA[<b>382444404</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/37">Voir la fiche</a><br/>Dernière vérification : 2026-08-06]]></description>
275 <Point>
276 <coordinates>2.326910,48.870680,0</coordinates>
277 </Point>
278 </Placemark>
279 <Placemark>
280 <name>A French Frye in Paris</name>
281 <description><![CDATA[<b>A French Frye in Paris</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/38">Voir la fiche</a><br/>Dernière vérification : 2026-03-19]]></description>
282 <Point>
283 <coordinates>2.340830,48.861240,0</coordinates>
284 </Point>
285 </Placemark>
286 <Placemark>
287 <name>Aeon Tours of Paris</name>
288 <description><![CDATA[<b>Aeon Tours of Paris</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/39">Voir la fiche</a><br/>Dernière vérification : 2026-10-04]]></description>
289 <Point>
290 <coordinates>2.358590,48.863780,0</coordinates>
291 </Point>
292 </Placemark>
293 <Placemark>
294 <name>Agence ORNIcom : location de gyropode Segway à Paris et sur toute la France</name>
295 <description><![CDATA[<b>Agence ORNIcom : location de gyropode Segway à Paris et sur toute la France</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/40">Voir la fiche</a><br/>Dernière vérification : 2026-05-17]]></description>
296 <Point>
297 <coordinates>2.303244,48.852662,0</coordinates>
298 </Point>
299 </Placemark>
300 <Placemark>
301 <name>Al Abrar Voyages</name>
302 <description><![CDATA[<b>Al Abrar Voyages</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/41">Voir la fiche</a><br/>Dernière vérification : 2026-12-02]]></description>
303 <Point>
304 <coordinates>2.312770,48.886310,0</coordinates>
305 </Point>
306 </Placemark>
307 <Placemark>
308 <name>Allison Zinder</name>
309 <description><![CDATA[<b>Allison Zinder</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/42">Voir la fiche</a><br/>Dernière vérification : 2026-07-15]]></description>
310 <Point>
311 <coordinates>2.398851,48.862635,0</coordinates>
312 </Point>
313 </Placemark>
314 <Placemark>
315 <name>Association Touristique Cheminots</name>
316 <description><![CDATA[<b>Association Touristique Cheminots</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/43">Voir la fiche</a><br/>Dernière vérification : 2026-02-28]]></description>
317 <Point>
318 <coordinates>2.371287,48.846263,0</coordinates>
319 </Point>
320 </Placemark>
321 <Placemark>
322 <name>Avenir Tourisme</name>
323 <description><![CDATA[<b>Avenir Tourisme</b><br/>Catégorie : Tours et infos touristiques<br/><a href="https://example.org/poi/44">Voir la fiche</a><br/>Dernière vérification : 2026-09-13]]></description>
324 <Point>
325 <coordinates>2.358040,48.868509,0</coordinates>
326 </Point>
327 </Placemark>
328 </Folder>
329 </Document>
330 </kml>
0:18
Let me understand the data:
1. It's a KML file with a `<Document>` containing multiple `<Folder>` elements.
2. Each Folder has a `<name>`, which is the category/grouping (e.g., "Cafés ouverts tard", "Bibliothèques de nuit", "Tours et infos touristiques").
3. Each Placemark has:
   - `<name>` - the POI name
   - `<description>` with CDATA containing HTML with:
     - The POI name again
     - `Catégorie : <category name>` (the same as the folder name)
     - A link
     - `Dernière vérification : YYYY-MM-DD` (last verified date)
   - `<Point><coordinates>` - lon,lat,height
I need to convert this to a flat GeoJSON with:
- `name` column
- `category` column (from the Folder name)
- `verified_date` column (ISO date parsed from the HTML blurb)
- Geometry as Point
I could write a Python script with geopandas/lxml or parse the KML manually.
`geopandas.read_file()` can read KML directly, so let me try that approach first.
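For comparison, the manual-parsing route mentioned above can be sketched with only the standard library. This is a minimal illustration and not the script used in this session: the `placemark_rows` helper and the two-placemark `SAMPLE_KML` string are hypothetical stand-ins that mirror the Folder/Placemark layout described above, and the sketch assumes every Placemark carries a `<Point>`.

```python
import xml.etree.ElementTree as ET

# KML 2.2 namespace, as declared on the <kml> root element of the source file.
KML_NS = {"k": "http://www.opengis.net/kml/2.2"}

# Hypothetical two-placemark sample with the same layout as the real file.
SAMPLE_KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder>
      <name>Cafés ouverts tard</name>
      <Placemark>
        <name>% Arabica</name>
        <Point><coordinates>2.324676,48.854156,0</coordinates></Point>
      </Placemark>
    </Folder>
    <Folder>
      <name>Bibliothèques de nuit</name>
      <Placemark>
        <name>BNF</name>
        <Point><coordinates>2.338076,48.867811,0</coordinates></Point>
      </Placemark>
    </Folder>
  </Document>
</kml>"""


def placemark_rows(kml_bytes):
    """Return one dict per Placemark, tagged with its parent Folder name."""
    root = ET.fromstring(kml_bytes)
    rows = []
    for folder in root.iterfind(".//k:Folder", KML_NS):
        category = folder.findtext("k:name", default="", namespaces=KML_NS)
        for pm in folder.iterfind("k:Placemark", KML_NS):
            name = pm.findtext("k:name", default="", namespaces=KML_NS)
            coords = pm.findtext(
                "k:Point/k:coordinates", default="", namespaces=KML_NS
            )
            # KML stores lon,lat[,altitude]; keep only the first two values.
            lon, lat = (float(v) for v in coords.strip().split(",")[:2])
            rows.append(
                {"name": name, "category": category, "lon": lon, "lat": lat}
            )
    return rows


rows = placemark_rows(SAMPLE_KML.encode("utf-8"))
for row in rows:
    print(row)
```

Walking the tree this way keeps the folder name attached to each placemark without relying on how a GIS driver maps folders to layers.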
0:18
Bash
cd /work && python3 -c "
import geopandas as gpd
gdf = gpd.read_file('paris_late_night_pois.kml')
print(gdf.columns.tolist())
print(gdf.head(3))
"
['id', 'Name', 'description', 'timestamp', 'begin', 'end', 'altitudeMode', 'tessellate', 'extrude', 'visibility', 'drawOrder', 'icon', 'geometry']
id Name ... icon geometry
0 None % Arabica ... None POINT Z (2.32468 48.85416 0)
1 None 1T. rue Scribe ... None POINT Z (2.32977 48.87058 0)
2 None 1ndix Café ... None POINT Z (2.34136 48.88076 0)
[3 rows x 13 columns]
/usr/local/lib/python3.14/site-packages/pyogrio/geopandas.py:382: UserWarning: More than one layer found in 'paris_late_night_pois.kml': 'Cafés ouverts tard' (default), 'Bibliothèques de nuit', 'Tours et infos touristiques'. Specify layer parameter to avoid this warning.
result = read_func(
0:51
Good, geopandas reads the KML. The description field contains the HTML blurb with:
- The category (which matches the folder name)
- A "Dernière vérification" date
I need to:
1. Read the KML with all layers/folders, or get the category from the folder name
2. Parse the HTML description to extract the verification date
3. Create a flat GeoJSON with columns: name, category, verified_date, geometry
Looking at the KML structure:
- Each Folder has a `<name>`, which is the category
- Placemarks have a `<description>` with HTML containing "Catégorie : <category>" and "Dernière vérification : <date>"
Let me read each folder as a separate layer, assign the category, then combine the results.
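Because the blurb repeats the category that the folder name already supplies, the two can be cross-checked before trusting either. The sketch below is a hypothetical helper, not part of the session's script: `category_from_blurb` and `category_mismatches` are illustrative names, rows are assumed to be plain dicts with `description` and `category` keys, and decoding with `html.unescape` is a precaution in case the blurb is entity-encoded.

```python
import html
import re

# Captures the text after "Catégorie :" up to the next HTML tag.
CATEGORY_RE = re.compile(r"Catégorie\s*:\s*([^<]+)")


def category_from_blurb(description):
    """Return the category named inside the HTML blurb, or None."""
    # Decode entities and normalise non-breaking spaces before matching.
    text = html.unescape(description or "").replace("\xa0", " ")
    match = CATEGORY_RE.search(text)
    return match.group(1).strip() if match else None


def category_mismatches(rows):
    """Return rows whose blurb category differs from the folder category."""
    return [
        row for row in rows
        if category_from_blurb(row["description"]) != row["category"]
    ]


sample_rows = [
    {
        "category": "Bibliothèques de nuit",
        "description": "<b>BNF</b><br/>Cat&eacute;gorie&nbsp;: "
                       "Biblioth&egrave;ques de nuit<br/>",
    },
    {
        "category": "Cafés ouverts tard",
        "description": "<b>BNF</b><br/>Catégorie : Bibliothèques de nuit<br/>",
    },
]
print(category_mismatches(sample_rows))
```

An empty result would suggest the folder name is a safe source for the `category` column; any returned rows are worth inspecting by hand.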
0:51
Bash
cd /work && python3 -c "
import geopandas as gpd
# Read each folder/layer
categories = ['Cafés ouverts tard', 'Bibliothèques de nuit', 'Tours et infos touristiques']
frames = []
for cat in categories:
    gdf = gpd.read_file('paris_late_night_pois.kml', layer=cat)
    gdf['category'] = cat
    frames.append(gdf)
gdf_all = gpd.pd.concat(frames, ignore_index=True)
print(gdf_all[['Name', 'description', 'category']].head(10))
"
Name ... category
0 % Arabica ... Cafés ouverts tard
1 1T. rue Scribe ... Cafés ouverts tard
2 1ndix Café ... Cafés ouverts tard
3 2 Steps Café ... Cafés ouverts tard
4 3234 Café - Trente Deux Trente Quatre ... Cafés ouverts tard
5 360 Cafe ... Cafés ouverts tard
6 3w Cafe Paris ... Cafés ouverts tard
7 48 Collagen Caf√© ... Cafés ouverts tard
8 86 Champs Elysées ... Cafés ouverts tard
9 A la Ville de Paris ... Cafés ouverts tard
[10 rows x 3 columns]
1:02
Now I need to parse the HTML description to extract the "last verified" date. Looking at the description format:
```
<b>% Arabica</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/0">Voir la fiche</a><br/>Dernière vérification : 2026-01-01
```
The pattern is `Dernière vérification : YYYY-MM-DD`, though the accented letters and the space before the colon may be written as HTML entities rather than literal characters. Let me write the full conversion script.
1:02
Write
/work/solve.py
content · 50 lines · py
#!/usr/bin/env python3
"""Convert paris_late_night_pois.kml to flat GeoJSON with extra columns."""
import re
import geopandas as gpd
INPUT = "/work/paris_late_night_pois.kml"
OUTPUT = "/work/paris_pois.geojson"
# Read each folder layer (the folder names become the category)
categories = ["Cafés ouverts tard", "Bibliothèques de nuit", "Tours et infos touristiques"]
frames = []
for cat in categories:
    gdf = gpd.read_file(INPUT, layer=cat)
    gdf["category"] = cat
    frames.append(gdf)
gdf = gpd.pd.concat(frames, ignore_index=True)
# Keep only the columns we need
gdf = gdf[["Name", "description", "category", "geometry"]].copy()
gdf.rename(columns={"Name": "name"}, inplace=True)
# Parse verified_date from description HTML
# Pattern: Dernière vérification : YYYY-MM-DD
# (the blurb may write the accents and the space as HTML entities)
date_pat = re.compile(
    r"Derni[èe]re\s*v[ée]rification\s*:\s*(\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)
def extract_date(desc):
    if not desc:
        return None
    m = date_pat.search(desc)
    if m:
        return m.group(1)
    return None
gdf["verified_date"] = gdf["description"].apply(extract_date)
# Drop the raw description column
gdf.drop(columns=["description"], inplace=True)
# Write to GeoJSON (flat, one row per placemark)
gdf.to_file(OUTPUT, driver="GeoJSON")
# Quick summary
print(f"Written {len(gdf)} rows to {OUTPUT}")
print(gdf.head())
Wrote /work/solve.py (1427 chars).
1:05
Let me run the script.
1:05
Bash
cd /work && python solve.py
Written 45 rows to /work/paris_pois.geojson
name ... verified_date
0 % Arabica ... None
1 1T. rue Scribe ... None
2 1ndix Café ... None
3 2 Steps Café ... None
4 3234 Café - Trente Deux Trente Quatre ... None
[5 rows x 4 columns]
1:22
The regex didn't match. The likely cause is that the description stores `è`, `é`, and the non-breaking space as HTML entities rather than as literal characters, so the field probably holds the raw entity-encoded HTML. Let me check what the description actually looks like.
1:22
Bash
cd /work && python3 -c "
import geopandas as gpd
gdf = gpd.read_file('paris_late_night_pois.kml', layer='Cafés ouverts tard')
print(repr(gdf['description'].iloc[0]))
"
'<b>% Arabica</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/0">Voir la fiche</a><br/>Dernière vérification : 2026-01-01'
1:48
The description text uses HTML entities in place of the literal `è` and `é`, so the "Dernière vérification" label appears entity-encoded in the raw text. My regex used the actual Unicode characters `è` and `é`, which don't match the entities. Let me fix the regex to match the HTML-encoded version.
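The edit applied to `solve.py` is not shown in this log. One alternative to matching the entity text directly is to decode the entities first and then match the rendered characters. The sketch below is a hedged illustration of that approach: `extract_verified_date` is a hypothetical name, and the sample inputs assume named entities such as `&egrave;` and `&nbsp;`, which may differ from the exact encoding in the source file.

```python
import html
import re
from datetime import date

# Matches the rendered label; \s also covers the non-breaking space
# that html.unescape produces from &nbsp;.
DATE_RE = re.compile(
    r"Dernière\s+vérification\s*:\s*(\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)


def extract_verified_date(description):
    """Return the 'last verified' date as an ISO string, or None."""
    if not description:
        return None
    # Decode entities so one pattern handles encoded and literal text.
    text = html.unescape(description)
    match = DATE_RE.search(text)
    if match is None:
        return None
    try:
        # Round-trip through date to reject impossible dates such as 2026-13-40.
        return date.fromisoformat(match.group(1)).isoformat()
    except ValueError:
        return None


encoded = "<br/>Derni&egrave;re v&eacute;rification&nbsp;: 2026-01-01"
literal = "<br/>Dernière vérification : 2026-08-14"
print(extract_verified_date(encoded))
print(extract_verified_date(literal))
```

Decoding first means the same pattern keeps working if a later export switches between entity-encoded and literal characters.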
1:48
Edit
/work/solve.py
Edited /work/solve.py.
1:54
Wait, the regex might still have issues, because it has to match the literal entity-encoded text in the description rather than the rendered characters. Let me think about this. The description text is:
```
<b>% Arabica</b><br/>Catégorie : Cafés ouverts tard<br/><a href="https://example.org/poi/0">Voir la fiche</a><br/>Dernière vérification : 2026-01-01
```
So three parts are entity-encoded:
- the `è` in `Dernière`
- the `é` in `vérification`
- the space before the colon
My updated regex should match this. Let me run it.
1:54
Bash
cd /work && python solve.py
Written 45 rows to /work/paris_pois.geojson
name ... verified_date
0 % Arabica ... 2026-01-01
1 1T. rue Scribe ... 2026-08-14
2 1ndix Café ... 2026-03-27
3 2 Steps Café ... 2026-10-12
4 3234 Café - Trente Deux Trente Quatre ... 2026-05-25
[5 rows x 4 columns]
2:01
The dates are now being extracted correctly. Let me verify the output file is valid GeoJSON with the expected structure.
2:01
Read
/work/paris_pois.geojson
1 {
2 "type": "FeatureCollection",
3 "name": "paris_pois",
4 "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
5 "features": [
6 { "type": "Feature", "properties": { "name": "% Arabica", "category": "Cafés ouverts tard", "verified_date": "2026-01-01" }, "geometry": { "type": "Point", "coordinates": [ 2.324676, 48.854156, 0.0 ] } },
7 { "type": "Feature", "properties": { "name": "1T. rue Scribe", "category": "Cafés ouverts tard", "verified_date": "2026-08-14" }, "geometry": { "type": "Point", "coordinates": [ 2.329774, 48.870581, 0.0 ] } },
8 { "type": "Feature", "properties": { "name": "1ndix Café", "category": "Cafés ouverts tard", "verified_date": "2026-03-27" }, "geometry": { "type": "Point", "coordinates": [ 2.341361, 48.880764, 0.0 ] } },
9 { "type": "Feature", "properties": { "name": "2 Steps Café", "category": "Cafés ouverts tard", "verified_date": "2026-10-12" }, "geometry": { "type": "Point", "coordinates": [ 2.376626, 48.866261, 0.0 ] } },
10 { "type": "Feature", "properties": { "name": "3234 Café - Trente Deux Trente Quatre", "category": "Cafés ouverts tard", "verified_date": "2026-05-25" }, "geometry": { "type": "Point", "coordinates": [ 2.376249, 48.853167, 0.0 ] } },
11 { "type": "Feature", "properties": { "name": "360 Cafe", "category": "Cafés ouverts tard", "verified_date": "2026-12-10" }, "geometry": { "type": "Point", "coordinates": [ 2.321838, 48.842293, 0.0 ] } },
12 { "type": "Feature", "properties": { "name": "3w Cafe Paris", "category": "Cafés ouverts tard", "verified_date": "2026-07-23" }, "geometry": { "type": "Point", "coordinates": [ 2.32753, 48.85156, 0.0 ] } },
13 { "type": "Feature", "properties": { "name": "48 Collagen Caf√©", "category": "Cafés ouverts tard", "verified_date": "2026-02-08" }, "geometry": { "type": "Point", "coordinates": [ 2.34046, 48.87484, 0.0 ] } },
14 { "type": "Feature", "properties": { "name": "86 Champs Elysées", "category": "Cafés ouverts tard", "verified_date": "2026-09-21" }, "geometry": { "type": "Point", "coordinates": [ 2.303767, 48.871387, 0.0 ] } },
15 { "type": "Feature", "properties": { "name": "A la Ville de Paris", "category": "Cafés ouverts tard", "verified_date": "2026-04-06" }, "geometry": { "type": "Point", "coordinates": [ 2.34428, 48.87053, 0.0 ] } },
16 { "type": "Feature", "properties": { "name": "A minima cafe", "category": "Cafés ouverts tard", "verified_date": "2026-11-19" }, "geometry": { "type": "Point", "coordinates": [ 2.38129, 48.8453, 0.0 ] } },
17 { "type": "Feature", "properties": { "name": "AD DictCafé", "category": "Cafés ouverts tard", "verified_date": "2026-06-04" }, "geometry": { "type": "Point", "coordinates": [ 2.390577, 48.849002, 0.0 ] } },
18 { "type": "Feature", "properties": { "name": "ANGIOLO", "category": "Cafés ouverts tard", "verified_date": "2026-01-17" }, "geometry": { "type": "Point", "coordinates": [ 2.331682, 48.854538, 0.0 ] } },
19 { "type": "Feature", "properties": { "name": "Abaita", "category": "Cafés ouverts tard", "verified_date": "2026-08-02" }, "geometry": { "type": "Point", "coordinates": [ 2.368446, 48.864406, 0.0 ] } },
20 { "type": "Feature", "properties": { "name": "Absinthe Café", "category": "Cafés ouverts tard", "verified_date": "2026-03-15" }, "geometry": { "type": "Point", "coordinates": [ 2.357228, 48.865564, 0.0 ] } },
21 { "type": "Feature", "properties": { "name": "Acai & You", "category": "Cafés ouverts tard", "verified_date": "2026-10-28" }, "geometry": { "type": "Point", "coordinates": [ 2.339001, 48.854247, 0.0 ] } },
22 { "type": "Feature", "properties": { "name": "Adulis Cafe Abyssinien", "category": "Cafés ouverts tard", "verified_date": "2026-05-13" }, "geometry": { "type": "Point", "coordinates": [ 2.346929, 48.851763, 0.0 ] } },
23 { "type": "Feature", "properties": { "name": "After Place", "category": "Cafés ouverts tard", "verified_date": "2026-12-26" }, "geometry": { "type": "Point", "coordinates": [ 2.336121, 48.864266, 0.0 ] } },
24 { "type": "Feature", "properties": { "name": "After Place", "category": "Cafés ouverts tard", "verified_date": "2026-07-11" }, "geometry": { "type": "Point", "coordinates": [ 2.351101, 48.862294, 0.0 ] } },
25 { "type": "Feature", "properties": { "name": "After Place", "category": "Cafés ouverts tard", "verified_date": "2026-02-24" }, "geometry": { "type": "Point", "coordinates": [ 2.36156, 48.8695, 0.0 ] } },
26 { "type": "Feature", "properties": { "name": "AUP Housing", "category": "Bibliothèques de nuit", "verified_date": "2026-09-09" }, "geometry": { "type": "Point", "coordinates": [ 2.306008, 48.861754, 0.0 ] } },
27 { "type": "Feature", "properties": { "name": "Academie de Medical", "category": "Bibliothèques de nuit", "verified_date": "2026-04-22" }, "geometry": { "type": "Point", "coordinates": [ 2.334208, 48.856342, 0.0 ] } },
28 { "type": "Feature", "properties": { "name": "Ancêtres Italiens", "category": "Bibliothèques de nuit", "verified_date": "2026-11-07" }, "geometry": { "type": "Point", "coordinates": [ 2.363314, 48.873246, 0.0 ] } },
29 { "type": "Feature", "properties": { "name": "Apfelbaum Library", "category": "Bibliothèques de nuit", "verified_date": "2026-06-20" }, "geometry": { "type": "Point", "coordinates": [ 2.330685, 48.849518, 0.0 ] } },
30 { "type": "Feature", "properties": { "name": "Archives de l'AP-HP", "category": "Bibliothèques de nuit", "verified_date": "2026-01-05" }, "geometry": { "type": "Point", "coordinates": [ 2.366407, 48.857094, 0.0 ] } },
31 { "type": "Feature", "properties": { "name": "Arthur Rimbaud Library", "category": "Bibliothèques de nuit", "verified_date": "2026-08-18" }, "geometry": { "type": "Point", "coordinates": [ 2.355874, 48.856134, 0.0 ] } },
32 { "type": "Feature", "properties": { "name": "Association Bibliotheque Braille Enfantine", "category": "Bibliothèques de nuit", "verified_date": "2026-03-03" }, "geometry": { "type": "Point", "coordinates": [ 2.372404, 48.868038, 0.0 ] } },
33 { "type": "Feature", "properties": { "name": "Association Francaise de Therapie", "category": "Bibliothèques de nuit", "verified_date": "2026-10-16" }, "geometry": { "type": "Point", "coordinates": [ 2.308679, 48.87096, 0.0 ] } },
34 { "type": "Feature", "properties": { "name": "Attractive", "category": "Bibliothèques de nuit", "verified_date": "2026-05-01" }, "geometry": { "type": "Point", "coordinates": [ 2.326704, 48.866369, 0.0 ] } },
35 { "type": "Feature", "properties": { "name": "BNF", "category": "Bibliothèques de nuit", "verified_date": "2026-12-14" }, "geometry": { "type": "Point", "coordinates": [ 2.338076, 48.867811, 0.0 ] } },
36 { "type": "Feature", "properties": { "name": "BPI Bibliotheque Publique D Information", "category": "Bibliothèques de nuit", "verified_date": "2026-07-27" }, "geometry": { "type": "Point", "coordinates": [ 2.352378, 48.859478, 0.0 ] } },
37 { "type": "Feature", "properties": { "name": "BPI-Beaubourg", "category": "Bibliothèques de nuit", "verified_date": "2026-02-12" }, "geometry": { "type": "Point", "coordinates": [ 2.351867, 48.860532, 0.0 ] } },
38 { "type": "Feature", "properties": { "name": "Beaubourg Librairie des Arts", "category": "Bibliothèques de nuit", "verified_date": "2026-09-25" }, "geometry": { "type": "Point", "coordinates": [ 2.356791, 48.865189, 0.0 ] } },
39 { "type": "Feature", "properties": { "name": "Bib Cent du Service de Sante Des Armees", "category": "Bibliothèques de nuit", "verified_date": "2026-04-10" }, "geometry": { "type": "Point", "coordinates": [ 2.341252, 48.841187, 0.0 ] } },
40 { "type": "Feature", "properties": { "name": "Bibliothe que Centrale - Muse'um national d'histoire naturelle (MNHN)", "category": "Bibliothèques de nuit", "verified_date": "2026-11-23" }, "geometry": { "type": "Point", "coordinates": [ 2.355838, 48.84266, 0.0 ] } },
<system-reminder>Showed lines 1-40 of 52. Re-call Read with offset=41 to continue, or raise limit.</system-reminder>
2:06
The output looks good. Let me run a quick sanity check to make sure all 45 features have their verified_date extracted.
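A check like this can also be written against the GeoJSON structure itself, without geopandas. The following is a minimal sketch under stated assumptions: `validate_feature_collection` is a hypothetical helper, the required property names come from the request, and the inline `sample` collection stands in for the parsed contents of `paris_pois.geojson`.

```python
from datetime import date

REQUIRED = {"name", "category", "verified_date"}


def validate_feature_collection(fc):
    """Return a list of (feature_index, problem) tuples; empty means OK."""
    problems = []
    for i, feat in enumerate(fc.get("features", [])):
        props = feat.get("properties") or {}
        missing = REQUIRED - set(props)
        if missing:
            problems.append((i, f"missing properties: {sorted(missing)}"))
        try:
            # fromisoformat raises ValueError on anything that is not an ISO date.
            date.fromisoformat(props.get("verified_date") or "")
        except (TypeError, ValueError):
            problems.append((i, "verified_date is not an ISO date"))
        geom = feat.get("geometry") or {}
        if geom.get("type") != "Point":
            problems.append((i, "geometry is not a Point"))
    return problems


sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {
                "name": "BNF",
                "category": "Bibliothèques de nuit",
                "verified_date": "2026-12-14",
            },
            "geometry": {"type": "Point", "coordinates": [2.338076, 48.867811, 0.0]},
        },
        {
            "type": "Feature",
            "properties": {"name": "Broken", "verified_date": "14/12/2026"},
            "geometry": None,
        },
    ],
}
for index, problem in validate_feature_collection(sample):
    print(index, problem)
```

For the real file, the dict passed in would come from `json.load` on the output path.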
2:06
Bash
cd /work && python3 -c "
import geopandas as gpd
gdf = gpd.read_file('paris_pois.geojson')
print(f'Total features: {len(gdf)}')
print(f'Columns: {gdf.columns.tolist()}')
print(f'Null verified_date: {gdf.verified_date.isna().sum()}')
print(f'Unique categories: {gdf.category.unique().tolist()}')
print()
print(gdf[['name', 'category', 'verified_date']].head(3))
print('...')
print(gdf[['name', 'category', 'verified_date']].tail(3))
"
Total features: 45
Columns: ['name', 'category', 'verified_date', 'geometry']
Null verified_date: 0
Unique categories: ['Cafés ouverts tard', 'Bibliothèques de nuit', 'Tours et infos touristiques']
name category verified_date
0 % Arabica Cafés ouverts tard 2026-01-01
1 1T. rue Scribe Cafés ouverts tard 2026-08-14
2 1ndix Café Cafés ouverts tard 2026-03-27
...
name ... verified_date
42 Allison Zinder ... 2026-07-15
43 Association Touristique Cheminots ... 2026-02-28
44 Avenir Tourisme ... 2026-09-13
[3 rows x 3 columns]
2:12
The conversion is complete. Let me summarize what was done.
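With `verified_date` stored as an ISO string, the stale-record flag mentioned in the request could be computed downstream along these lines. This is only a sketch: `flag_stale` is a hypothetical helper, the 180-day threshold is an assumption (the request does not define "stale"), and records with a missing or unparseable date are treated as stale by choice.

```python
from datetime import date, timedelta


def flag_stale(features, today, max_age_days=180):
    """Return each feature's properties plus a boolean 'stale' key."""
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for feat in features:
        props = feat["properties"]
        raw = props.get("verified_date")
        try:
            verified = date.fromisoformat(raw) if raw else None
        except ValueError:
            verified = None
        # Missing or invalid dates count as stale so they get reviewed.
        stale = verified is None or verified < cutoff
        flagged.append({**props, "stale": stale})
    return flagged


features = [
    {"properties": {"name": "% Arabica", "verified_date": "2026-01-01"}},
    {"properties": {"name": "360 Cafe", "verified_date": "2026-12-10"}},
    {"properties": {"name": "Unknown", "verified_date": None}},
]
for row in flag_stale(features, today=date(2026, 12, 31)):
    print(row["name"], row["stale"])
```

Passing `today` explicitly keeps the result reproducible, which matters when the flag is recomputed on a schedule.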
2:12