University-based extended paper reproduction and analysis
- University course Machine Learning Systems for Data Science
- Project timeframe from December 2022 until January 2023
- Reproduction of Paper Analysis
- Alberto Trashaj
- Manuel Rech
- Sebastian Benno Veuskens
All data were collected from the Johns Hopkins University website, following the procedure described in the paper. The raw data is available here.
- Absolute cases from the first day with cases recorded
- Absolute cases from the 22nd of January until the 4th of April 2020
- Absolute cases per 1 million population from the first day with cases recorded
- Absolute cases per 1 million population from the 22nd of January until the 4th of April 2020
- Absolute cases per 1 million population per area from the first day with cases recorded
- Absolute cases per 1 million population per area from the 22nd of January until the 4th of April 2020
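The dataset variants above differ only in how the case counts are scaled and where each series starts. A minimal sketch of these transformations follows. The function names, the toy numbers, and the assumption that area is measured in km² are illustrative and not taken from the project code.

```python
from typing import List


def per_million(cases: List[int], population: int) -> List[float]:
    """Scale case counts to cases per 1 million inhabitants."""
    return [c * 1_000_000 / population for c in cases]


def per_million_per_area(cases: List[int], population: int, area_km2: float) -> List[float]:
    """Additionally divide by the country's area (assumed here to be in km^2)."""
    return [v / area_km2 for v in per_million(cases, population)]


def from_first_case(cases: List[int]) -> List[int]:
    """Drop leading zero-case days so the series starts on the first day with cases."""
    for day, count in enumerate(cases):
        if count > 0:
            return cases[day:]
    return []


# Hypothetical toy country: five days of case counts, made-up population and area.
cases = [0, 0, 2, 5, 9]
fixed_window = per_million(cases, population=2_000_000)
aligned = per_million(from_first_case(cases), population=2_000_000)
density_adjusted = per_million_per_area(cases, population=2_000_000, area_km2=50.0)
```

Here `fixed_window` keeps the calendar-aligned series, while `aligned` starts every country at its own first day with recorded cases, which makes curves comparable regardless of when the outbreak began.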
For each dataset, 30 countries are selected according to one of two selection criteria:
- The countries with the highest number of cases on the 4th of April 2020
- The countries where cases occurred first
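The two selection criteria can be sketched as simple rankings over the per-country time series. The helper names and the toy data below are hypothetical. Ties are resolved by input order here, which is an assumption, since the tie-breaking rule is not stated.

```python
from typing import Dict, List


def top_by_final_cases(series: Dict[str, List[int]], n: int = 30) -> List[str]:
    """Countries with the highest case count on the last day of the window."""
    return sorted(series, key=lambda country: series[country][-1], reverse=True)[:n]


def earliest_first_case(series: Dict[str, List[int]], n: int = 30) -> List[str]:
    """Countries whose first day with recorded cases comes earliest."""
    def first_day(country: str) -> int:
        counts = series[country]
        # Countries without any cases sort last.
        return next((day for day, c in enumerate(counts) if c > 0), len(counts))

    return sorted(series, key=first_day)[:n]


# Made-up series for four placeholder countries (one value per day).
toy_series = {
    "A": [0, 1, 3, 10],
    "B": [2, 4, 6, 8],
    "C": [0, 0, 0, 50],
    "D": [0, 0, 1, 2],
}
highest = top_by_final_cases(toy_series, n=2)
earliest = earliest_first_case(toy_series, n=2)
```

With the real data, `n` would stay at its default of 30 and the last element of each series would correspond to the 4th of April 2020.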
Based on these datasets and selected countries, an agglomerative clustering algorithm is applied.
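To illustrate the idea, here is a minimal pure-Python sketch of agglomerative clustering on case curves. It assumes Euclidean distance between equally long series and average linkage. Neither choice is confirmed by this description, so treat both as placeholders for the settings used in the paper. The function name and toy curves are hypothetical.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Sequence


def agglomerative(points: Dict[str, Sequence[float]], n_clusters: int) -> List[List[str]]:
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    # Start with every country in its own cluster.
    clusters: List[List[str]] = [[name] for name in points]

    def average_linkage(a: List[str], b: List[str]) -> float:
        # Mean pairwise Euclidean distance between members of a and b.
        total = sum(dist(points[x], points[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up curves: A and B grow slowly, C and D grow quickly.
toy_curves = {
    "A": [1, 2, 4],
    "B": [1, 2, 5],
    "C": [10, 20, 40],
    "D": [11, 21, 39],
}
toy_clusters = agglomerative(toy_curves, n_clusters=2)
```

This naive version recomputes all linkage distances in every round, which is acceptable for 30 countries. For larger inputs, a library implementation would be the more practical choice.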
The clustering results are then plotted on a world map, so that each selected country's cluster membership is visible in the context of all countries.
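Before plotting, the cluster assignment has to be extended to every country on the map, including those that were not selected. One possible way to prepare such a lookup is sketched below. The function name and placeholder country names are hypothetical, and the plotting step itself is left to whichever mapping library is used.

```python
from typing import Dict, List, Optional


def membership_for_map(
    clusters: List[List[str]], all_countries: List[str]
) -> Dict[str, Optional[int]]:
    """Map every country to its cluster index, or None if it was not clustered."""
    label = {
        country: index
        for index, members in enumerate(clusters)
        for country in members
    }
    return {country: label.get(country) for country in all_countries}


# Placeholder clustering result and a map that also contains an unselected country "E".
example_clusters = [["A", "B"], ["C", "D"]]
membership = membership_for_map(example_clusters, ["A", "B", "C", "D", "E"])
```

Countries mapped to `None` can then be drawn in a neutral color, so that the clustered countries stand out against the rest of the world.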
The referenced paper is available here. The description of the university course Machine Learning Systems for Data Science can be found here.