vbarbosavaz/DRIO4302C

IMDb_app

  • Course : DRIO-4302C Data Engineering
  • February 3, 2019
  • Students : Vincent Barbosa Vaz, William Cardoso
  • Teachers : Daniel Courivaud, Raphaël Courivaud

DISCLAIMER

This project is for informational and educational purposes only. Do not use it for business purposes.

Tasks

  • IMDb scraping with Scrapy
  • Flask application
  • MongoDB database
  • Docker
    • docker-compose.yml
    • Dockerfile
    • Run MongoDB inside Docker
    • Run Elasticsearch inside Docker

The project

Crawl and scrape IMDb for series data with Scrapy.

Save the data into a MongoDB database.
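
One way the storage step could look is an upsert keyed on the IMDb title identifier, so that re-running the crawler updates existing records instead of duplicating them. This is a minimal sketch, not the project's actual code: the function name `save_series`, the `imdb_id` field, and the database and collection names in the usage comment are assumptions made for illustration. It relies only on `update_one(..., upsert=True)`, which PyMongo collections provide.

```python
def save_series(collection, item):
    """Insert or update one scraped series in a MongoDB collection.

    `collection` is expected to behave like a pymongo Collection.
    `item` is a dict of scraped fields; the `imdb_id` key is a
    hypothetical field name chosen for this sketch.
    """
    if not item.get("imdb_id"):
        raise ValueError("item needs a non-empty 'imdb_id'")
    # Copy so the caller's dict is left untouched.
    doc = dict(item)
    # Match on the identifier; create the document if it does not exist yet.
    return collection.update_one(
        {"imdb_id": doc["imdb_id"]},
        {"$set": doc},
        upsert=True,
    )


# Usage (assumes pymongo is installed and MongoDB listens on localhost):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   save_series(client["imdb"]["series"],
#               {"imdb_id": "tt0000001", "title": "Example"})
```

Passing the collection in as an argument keeps the function independent of connection details, which would differ between the Docker and local setups.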

Create a Flask web app to display the data.

The user searches for series through Elasticsearch and likes the ones they love, and the app then matches the best series to watch.
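
The README does not say how the matching is computed. As an illustration only, the sketch below ranks unseen series by their average genre overlap (Jaccard similarity) with the liked ones. The function names, the `genres` field, and the choice of genre overlap as the similarity measure are all assumptions, not the project's method.

```python
def jaccard(a, b):
    """Jaccard similarity of two iterables, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(liked_ids, catalog, top_n=5):
    """Rank series the user has not liked by genre overlap with liked ones.

    `catalog` maps a series id to a dict with a 'genres' list
    (a hypothetical record layout chosen for this sketch).
    """
    liked_ids = set(liked_ids)
    liked = [catalog[i] for i in liked_ids if i in catalog]
    if not liked:
        return []
    scores = {}
    for series_id, series in catalog.items():
        if series_id in liked_ids:
            continue
        # Average similarity to every liked series.
        total = sum(jaccard(series["genres"], l["genres"]) for l in liked)
        scores[series_id] = total / len(liked)
    # Highest score first; ties are broken by id to keep the order stable.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_n]
```

With a catalog loaded from the database, `recommend({"some_id"}, catalog, top_n=3)` would return the ids of the three closest series under this measure.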

Home Page

[Screenshot: home page]

Title selection

[Screenshot: title selection]

Run the project

Clone it :

git clone https://github.com/v-barbosavaz/DRIO4302C

From Docker

cd DRIO4302C
docker-compose up -d
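
Because `docker-compose up -d` returns before the services finish starting, a small reachability check can help confirm that the containers are up. The sketch below uses only the Python standard library. The ports are the usual defaults for MongoDB (27017) and Elasticsearch (9200); whether this project's docker-compose.yml publishes them on the host is an assumption, as are the function names.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Default service ports; the project's compose file may map them differently.
SERVICES = {"mongodb": 27017, "elasticsearch": 9200}


def check_services(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


# Usage:
#   for name, up in check_services().items():
#       print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only shows that something is listening. It does not prove that the database or search engine is ready to answer queries.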

Locally

cd DRIO4302C
pipenv shell
pipenv run python run.py