This is the Basic Challenge for Data Engineers.
Lider.cl wants to create a shiny new section for videogames. To bring custom videogame information to our clients, our Analytics team needs a new report each day with information on several videogames. This information will feed many ML models and Data Science work to give our customers the best experience and help them decide which videogame to buy.
For this Challenge, we want you to build a Job that delivers this Data to the Analytics team, with a few requirements:
- The Job must be ETL code written in Java, Scala, or Python.
- We need the Data Model for the problem.
- And we want a Deployment for this code.
The job must receive the datasets and produce the following:
- The top 10 best games for each console/company.
- The worst 10 games for each console/company.
- The top 10 best games for all consoles.
- The worst 10 games for all consoles.
The data is in the data/ folder at the repository root. The report can be exposed in any way you want, but remember this is an ETL Job.
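
The four rankings above all reduce to "sort by score, take N", either per console or over the whole dataset. A minimal sketch in Python, assuming each row has already been parsed into a dict: the field names `name`, `console`, and `metascore` are hypothetical and must be adapted to the actual columns in data/. Ranking per company would additionally need a console-to-company mapping, which is not shown here.

```python
from collections import defaultdict


def top_and_bottom(records, n=10, score_key="metascore"):
    """Return (best, worst): the n highest and n lowest scored records."""
    best = sorted(records, key=lambda r: r[score_key], reverse=True)[:n]
    worst = sorted(records, key=lambda r: r[score_key])[:n]
    return best, worst


def rank_by_console(records, n=10, console_key="console", score_key="metascore"):
    """Group records by console, then rank inside each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[console_key]].append(record)
    return {
        console: top_and_bottom(rows, n, score_key)
        for console, rows in groups.items()
    }
```

Calling `top_and_bottom(rows)` gives the rankings for all consoles, and `rank_by_console(rows)` gives one (best, worst) pair per console. How ties are broken is left to the sort order here; a real job should define that explicitly.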
The Data Model must be in 3NF. Save the model in the DataModel folder in both formats (data model format & JPG/PNG).
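
As an illustration of what 3NF could mean here, the sketch below creates one possible schema in an in-memory SQLite database using Python's standard `sqlite3` module. Every table and column name is an assumption made for illustration, not taken from the dataset. The idea is that a company is a fact about a console, not about a game, so it gets its own table instead of being repeated on every score row.

```python
import sqlite3

# Hypothetical 3NF layout: all table and column names are assumptions.
SCHEMA = """
CREATE TABLE company (
    company_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE console (
    console_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    company_id INTEGER NOT NULL REFERENCES company(company_id)
);
CREATE TABLE game (
    game_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
-- One row per (game, console) pair; the score depends on the whole key.
CREATE TABLE game_score (
    game_id    INTEGER NOT NULL REFERENCES game(game_id),
    console_id INTEGER NOT NULL REFERENCES console(console_id),
    metascore  INTEGER,
    PRIMARY KEY (game_id, console_id)
);
"""


def create_schema(conn):
    """Create the illustrative tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
```

SQLite is used only because it ships with Python; the same DDL idea carries over to whichever database or modeling tool you choose.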
Use any tool, but please tell us which tool you chose and why.
We want you to show us how to deploy your job and run it in any environment, so please document the deployment steps very clearly.
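
One way to keep a job runnable in any environment is to pass paths in from outside instead of hard-coding them. The sketch below uses Python's standard `argparse` with environment-variable fallbacks. The variable names `ETL_INPUT_DIR` and `ETL_OUTPUT_DIR` and the default `output` directory are hypothetical; only the `data` default comes from this repository's layout.

```python
import argparse
import os


def parse_args(argv=None):
    """Read job settings from CLI flags, falling back to environment variables."""
    parser = argparse.ArgumentParser(description="Videogame report ETL job")
    parser.add_argument(
        "--input-dir",
        default=os.environ.get("ETL_INPUT_DIR", "data"),
        help="Folder containing the source datasets",
    )
    parser.add_argument(
        "--output-dir",
        default=os.environ.get("ETL_OUTPUT_DIR", "output"),
        help="Folder where the report is written",
    )
    return parser.parse_args(argv)
```

With this in place the same code can run locally, in a container, or under a scheduler, with only the flags or environment changing.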
- You can create a new README for anything you want to tell us. Please don't name it README.md.
- We want to see if you know how to code in a professional way, so use Software Engineering best practices!
- This is an ETL Job, so show us all you know about good practices for building ETLs.
- Save all the changes in your personal GitHub account using a Fork of this repository, and send us the link so we can clone and review it.
"This challenge is your cover letter: the choices you make, and the ones you don't, matter, and you will be asked about them in the next interview."
We use the data from TopGames provided by Metascore.