Hi there! Similar to the free parking lots scraper, this code collects data of enormous importance!
In the spirit of e-governance, a couple of German cities decided to force every citizen to acquaint themselves with this internet thing and require them to make appointments with, e.g., the residents' registration office through a web interface. For these offices, there is no just-go-there-and-wait-till-you're-called anymore. In fact, if you go there without the web appointment, the service staff usually barks and blusters in the well-known German public office tone.
Anyways! Now we have this internet thing going and it's not so hard to track the development of the office schedules through time. From that data we can infer:
- the busyness of the offices over time (e.g. barking at foreigners)
- the desire of citizens to plan ahead (e.g. not taking the first free slot but one in 4 weeks)
- the most popular time of day for appointments
- the most popular time of day for making appointments
- the cancellation rate
And all of that for a couple of different cities for comparison. As stated earlier: data of enormous importance!
Data scraping started around 2021-07-09 and the data is exported weekly to the office-schedule-data repository.
To collect data yourself:
# install
git clone https://github.com/defgsus/office-schedule-scraper
cd office-schedule-scraper
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt
# make a snapshot
python scraper.py snapshot -w <X> -p <Y>
# where <X> is the number of weeks to look ahead (defaults to 4)
# where <Y> is the number of parallel processes to run
This produces a lot of JSON files in the snapshot directory.
Using export.py, the data will be exported to compressed bunches of CSV files as described in office-schedule-data.
Here's a list of websites that are scraped (compiled via python scraper.py list):
The scraped interfaces which are used by most websites:
- tevis: https://www.kommunix.de/produkte/tevis/
- netappoint: http://www.edv-kahlert.de/produkte/Netcallup/netalarmpro1.htm
- etermin: https://www.etermin.net/
- tempus: http://berner-telecom.de/
At each snapshot, all available dates are recorded for each listed office department. The tevis system shows the available dates for the next N full weeks, where N is set to 6 in my recording job. The etermin and netappoint systems are asked for the next N * 7 days.
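The two look-ahead styles can be sketched like this. It's only an illustration, not the scraper's actual code: the function names are made up, and it assumes that a "full week" runs Monday to Sunday and that the window starts with the current week.

```python
from __future__ import annotations

from datetime import date, timedelta


def full_weeks_window(today: date, weeks: int) -> tuple[date, date]:
    """First and last day of `weeks` full calendar weeks (tevis style).

    Assumption: weeks run Monday..Sunday and the current week counts.
    """
    monday = today - timedelta(days=today.weekday())
    return monday, monday + timedelta(days=weeks * 7 - 1)


def days_window(today: date, weeks: int) -> tuple[date, date]:
    """First and last day of the next `weeks * 7` days (etermin/netappoint style)."""
    return today, today + timedelta(days=weeks * 7 - 1)


if __name__ == "__main__":
    today = date(2021, 8, 10)  # a Tuesday
    print(full_weeks_window(today, 6))
    print(days_window(today, 6))
```

So for the same N, the two kinds of systems cover slightly different date ranges, which is worth keeping in mind when comparing them.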
Here's an example for one day from the website of Bonn:
Führerscheinwesen | Kfz-Zulassungswesen | Meldewesen | |
---|---|---|---|
2021-08-10T07:45:00 | X | ||
2021-08-10T07:50:00 | X | ||
2021-08-10T07:55:00 | X | X | |
2021-08-10T08:00:00 | X | X | |
2021-08-10T08:05:00 | X | X | |
2021-08-10T08:10:00 | X | X | |
2021-08-10T08:15:00 | X | X | |
2021-08-10T08:20:00 | X | X | |
2021-08-10T08:25:00 | X | X | |
2021-08-10T08:30:00 | X | X | |
2021-08-10T08:35:00 | X | X | |
2021-08-10T08:40:00 | X | X | |
2021-08-10T08:45:00 | X | X | |
2021-08-10T08:50:00 | X | X | |
2021-08-10T08:55:00 | X | X | |
2021-08-10T09:00:00 | X | ||
2021-08-10T09:05:00 | X | ||
2021-08-10T09:10:00 | X | ||
2021-08-10T09:15:00 | X | X | |
2021-08-10T09:20:00 | X | X | |
2021-08-10T09:25:00 | X | X | |
2021-08-10T09:30:00 | X | X | |
2021-08-10T09:35:00 | X | X | |
2021-08-10T09:40:00 | X | X | |
2021-08-10T09:45:00 | X | X | |
2021-08-10T09:50:00 | X | X | |
2021-08-10T09:55:00 | X | X | |
2021-08-10T10:00:00 | X | ||
2021-08-10T10:05:00 | X | ||
2021-08-10T10:10:00 | X | ||
2021-08-10T10:15:00 | X | ||
2021-08-10T10:20:00 | X | X | |
2021-08-10T10:25:00 | X | X | |
2021-08-10T10:30:00 | X | ||
2021-08-10T10:35:00 | X | ||
2021-08-10T10:40:00 | X | ||
2021-08-10T10:45:00 | X | ||
2021-08-10T10:50:00 | X | X | |
2021-08-10T10:55:00 | X | X | |
2021-08-10T11:00:00 | X | ||
2021-08-10T11:05:00 | X | ||
2021-08-10T11:10:00 | X | X | |
2021-08-10T11:15:00 | X | X | |
2021-08-10T11:20:00 | X | X | |
2021-08-10T11:25:00 | X | X | |
2021-08-10T11:30:00 | X | X | |
2021-08-10T11:35:00 | X | X | |
2021-08-10T11:40:00 | X | X | |
2021-08-10T11:45:00 | X | X | |
2021-08-10T11:50:00 | X | X | |
2021-08-10T11:55:00 | X | X | |
2021-08-10T12:00:00 | X | ||
2021-08-10T12:05:00 | X | ||
2021-08-10T12:10:00 | X | ||
2021-08-10T12:15:00 | X | X | |
2021-08-10T12:20:00 | X | X | |
2021-08-10T12:25:00 | X | ||
2021-08-10T12:30:00 | X | ||
2021-08-10T12:35:00 | X | ||
2021-08-10T12:40:00 | X |
Obviously they offer 5-minute slots and, yes, German offices might close quite early. By looking at more data one can see that Meldewesen and Führerscheinwesen exchange availability between weeks.
Not all websites/systems reveal the actual business hours for each day, so a single snapshot is not necessarily enough to calculate the correct number of appointments for each day.
However, when comparing two successive snapshots, it's possible to count new appointments or cancellations quite robustly and then attach a timestamp of when the appointments were made, clicked or activated or however this is called.
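A minimal sketch of that comparison, assuming each snapshot of one office is reduced to a set of free slots given as ISO 8601 strings. The function name and the data layout are hypothetical and not taken from export.py.

```python
from __future__ import annotations


def diff_snapshots(previous: set[str], current: set[str], current_time: str):
    """Compare the free slots of two successive snapshots of one office.

    Slots and `current_time` are ISO 8601 strings such as
    "2021-08-10T07:45:00"; in this format they sort correctly as strings.
    Returns (appointments, cancellations) as sorted lists.
    """
    # A slot that was free and is gone now was probably booked,
    # unless it simply slipped into the past.
    appointments = sorted(s for s in previous - current if s > current_time)

    # A slot that is newly free was probably cancelled. Slots beyond the
    # previous look-ahead horizon are ignored (assumption: those are just
    # newly published dates, not cancellations).
    horizon = max(previous, default="")
    cancellations = sorted(s for s in current - previous if s <= horizon)
    return appointments, cancellations


if __name__ == "__main__":
    prev = {"2021-08-10T07:45:00", "2021-08-10T07:50:00", "2021-08-10T08:00:00"}
    cur = {"2021-08-10T07:50:00", "2021-08-10T07:55:00",
           "2021-08-10T08:00:00", "2021-08-11T08:00:00"}
    print(diff_snapshots(prev, cur, "2021-08-09T12:00:00"))
```

The time of the second snapshot can then serve as the approximate timestamp of when the appointment was made or cancelled.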
Still, there seem to be some erratic updates which mess up the calculation and some offices seem to only update the availability every couple of days. In fact, this project turned out to demand a lot of time and work, especially in finding and fixing all my mistakes in the beginning. After the first 10 weeks of data collection, some of the mistakes have been spotted.
Below is a week of extracted data for all netappoint/tevis interfaces, resampled to a 1-hour interval. Offices with strange peaks were excluded.
The blue line is the number of appointments and the red line is the number of cancellations.
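Resampling to a 1-hour interval just means counting events per hour. Here's a small standard-library sketch of the idea; the function name is made up and the actual plots may well have been produced differently (e.g. with pandas).

```python
from collections import Counter
from datetime import datetime


def count_per_hour(timestamps):
    """Count events per 1-hour bucket.

    `timestamps` are ISO 8601 strings; the returned dict maps the start of
    each hour (a datetime) to the number of events within that hour.
    """
    buckets = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts)
        # Truncate to the full hour to get the bucket key.
        buckets[dt.replace(minute=0, second=0, microsecond=0)] += 1
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    events = ["2021-08-10T07:45:00", "2021-08-10T07:55:00", "2021-08-10T08:05:00"]
    for hour, count in count_per_hour(events).items():
        print(hour.isoformat(), count)
```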
One custom appointment interface of special interest is the Impf-Terminvergabe Thüringen, the website to make covid vaccination appointments in Thuringia. They only offer one timeslot per day for at most 3 different days. And their offered dates change all the time. Out of pure interest, a snapshot is recorded every minute!
In the graphic below, whenever a new snapshot offers a timeslot that is after the previously offered timeslot, the previous timeslot is counted as one appointment.
These are probably not real appointments but rather a measure of website activity as the total number of appointments made would be over 190,000 in a period of 10 days. That does not really fit the perceived reality.
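The counting rule can be sketched as follows. This is a simplified illustration that looks at the timeslot offered for a single day across successive snapshots; the function name and input format are assumptions.

```python
def count_advancing_slots(offered):
    """Count how often the offered timeslot moved forward between snapshots.

    `offered` is the chronological sequence of the timeslot offered for one
    day, one entry per snapshot: an ISO 8601 string, or None if nothing was
    offered in that snapshot.
    """
    count = 0
    previous = None
    for slot in offered:
        if slot is None:
            continue  # nothing offered, keep the last known slot
        if previous is not None and slot > previous:
            count += 1  # the previous slot is assumed to have been taken
        previous = slot
    return count


if __name__ == "__main__":
    series = [
        "2021-08-10T09:00:00",
        "2021-08-10T09:00:00",  # unchanged
        "2021-08-10T09:05:00",  # moved forward: +1
        None,
        "2021-08-10T09:10:00",  # moved forward: +1
        "2021-08-10T09:05:00",  # moved backward: ignored
    ]
    print(count_advancing_slots(series))
```

Every forward jump is counted, whether or not someone actually booked, which is why the result is better read as website activity than as real appointments.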
- "QMatic"
- They seem to have an older (or newer?) version of netappoint belonging to netcallup.de/qmatic
- https://www.lra-aoe.de/qmaticwebbooking/#/ some other qmatic page
- https://www.kreis-alzey-worms.eu/verwaltung/zulassungsstelle/buchung/terminbuchung.php
  Quite simple XHR interface
- https://sean.outsystemsenterprise.com/TicketSystemOnlineTermine/
- timeacle.com
  - https://timeacle.com/business/index/id/374 (Braunschweig)
  - https://timeacle.com/business/index/id/3134/booking/appointment/row_id/undefined/ (Oldenburg)
  - https://timeacle.com/business/index/id/2329/booking/appointment/row_id/3238
  - https://timeacle.com/business/index/id/2339/booking/appointment/row_id/3278
- https://testtermin.de/ - corona tests across Germany
- https://www.rhein-erft-kreis.de/artikel/termine-online-reservieren
- Another generic appointment service by www.cleverq.de (Business Intelligent Cloud GmbH)
  - https://cqm.cleverq.de/public/appointments/lk_cloppenpurg_kfz_cloppenburg/index.html?lang=de
  - https://cqm.cleverq.de/public/appointments/zulassung_alzenau/index.html?lang=de
  - https://cqm.cleverq.de/public/appointments/Zulassung-Segeberg/index.html?lang=de
  - https://cqm.cleverq.de/public/appointments/norderstedt/index.html?lang=de
- https://www.buergerserviceportal.de/
  They require registration.
- Smart Customer eXperience https://smart-cjm.com
  - https://rsk.saas.smartcjm.com/m/strassenverkehrsamt/extern/calendar/?uid=8a08422a-9d05-48e4-bb31-4ee51c4cd68a
  - https://termin.ostallgaeu.de/m/lra-oal/extern/calendar/?uid=259bec4f-6d28-46e3-a008-94818c84fe32
  - https://lk-biberach.saas.smartcjm.com/m/Zulassung/extern/calendar/?uid=0413567e-ad7b-46f8-abc0-d1756c39109c
  - https://termine.landkreis-karlsruhe.de/m/Zulassung/extern/calendar/?uid=81ebbc74-3681-4900-84e7-457ade4662ec
  - https://termin.kreis-oh.de/m/kreis-ostholstein/extern/calendar/?uid=e236b01b-460e-4c76-88db-7e083557c438&wsid=c262c86a-9973-4760-806a-bc9c75755014&lang=de
  - https://termine.lkgi.de/m/zulassungsstelle/extern/calendar/?uid=46c3c125-ee61-4949-97f4-132979349815
  - https://termine.lkgi.de/m/Zulassungstelle-Gruenberg/extern/calendar/?uid=36f4d860-a5e2-4f73-ad7c-6ee3619d3ff9
  - https://termine.lra-es.de/m/strassenverkehrsamt/extern/calendar/?uid=396fc9d8-b0e0-4138-8ab3-82bad96cdb3e
  - a couple linked on https://www.karlsruhe.de/b4/buergerdienste/terminvereinbarung.de
  - https://thor.ostalbkreis.de/m/oakstrassenverkehr/extern/calendar/?uid=caa33f31-2148-4149-986b-183dda71bdc3
  - https://termin.kreis-oh.de/m/kreis-ostholstein/extern/calendar/?uid=e236b01b-460e-4c76-88db-7e083557c438
  - https://termine.landkreis-guenzburg.de/m/lbb/extern/calendar/?uid=2224a191-d0b2-4432-aeae-ba69b10d03ba
  - https://emergency.saas.smartcjm.com/m/Stadtverwaltung-Langen/extern/calendar/?uid=c3bf3b96-7847-497f-a2f4-6d72a992890f&lang=de
  - https://emergency.saas.smartcjm.com/m/Stadtverwaltung-Langen/extern/calendar/?uid=cf6c0a32-6d3f-448b-a5fa-656aaa525715&lang=de
  - https://termine.bochum.de/m/buergerbuero/extern/calendar/?uid=eab3c2ce-bd9c-4c81-8dab-663aebdc0ce3
  - https://termine.bochum.de/m/standesamt/extern/calendar/?uid=8e909dba-a24a-4d55-b06b-a9dd0fed1cc6
  - https://termine.bochum.de/m/abuero/extern/calendar/?uid=c5829d01-0a37-4ed5-868b-16b2788265d6
  - https://termine.bochum.de/m/abuero/extern/calendar/?uid=404bfc07-3614-46fc-9f3a-801fb1fd8954
  - https://termine.kreislippe.de/m/kreis-lippe/extern/calendar/?uid=b19ec13c-2d76-4188-a3e2-d63d6d2567c6
  - https://lk-suedliche-weinstrasse.saas.smartcjm.com/m/sva/extern/calendar/?uid=824ff5f9-7e19-40ef-ba08-c3837cb05d79
  - https://lk-suedliche-weinstrasse.saas.smartcjm.com/m/sva/extern/calendar/?uid=527c2cfb-35f9-492c-9676-9d2e7db3ce1d
  - https://stadt-hildesheim.saas.smartcjm.com/m/stadt-hildesheim/extern/calendar/?uid=8b290124-473e-447c-b7a7-6bd74b7c58e5
- https://www.rhein-neckar-kreis.de/start/service/terminvereinbarung.html
  Yet another system. Seems to need email and phone before showing dates.
- Yet another system. By Terminland GmbH.
- "Internetgeschaeftsvorfaelle"
- https://dtms.wiesbaden.de/DTMSTerminWeb/
  Yet another system, based on dotnet oO
- Yet another system
- https://kfzonline.ekom21.de/kfzonline.public/start.html?oe=00.00.06.438000
  Couldn't really find any dates there
- https://www.wormser-baeder.de/sportbaeder/veranstaltungen/eintritt/monatsuebersicht.php
- https://serviceportal.schleswig-holstein.de/Verwaltungsportal/Service/Entry/IKFZ
- https://serviceportal.hamburg.de/HamburgGateway/FVP/FV/Bezirke/DigiTermin/?sid=313
- https://service.berlin.de/terminvereinbarung/
  Started scraper berlin.py but they protect themselves with throttling and captchas. They really do not want a bot crawling their page.
- termed.de is fully APIfied and there exists a prototype of a scraper, but it yields such an enormous amount of data (more than 3000 calendars per snapshot, leading to gigabytes per week) that it is not included in the data export. Also, it's mostly private medical practices, which do not really fit the public office category.
To use it, explicitly include it:
python scraper.py snapshot -i termed