Jupyter notebook demonstrating ETL: data scraped from the web, transformed, and inserted into an MS SQL database hosted on an Azure cloud instance.

kanchika-kapoor/data_preprocessing

Colab-Url

Code for B9AI108 (B9AI108_2223_TMD1S) CA 2

Name: Kanchika Sudhirkumar Kapoor
Email_Id: 10621287@mydbs.ie
GitHub URL: https://github.com/kanchika-kapoor/data_preprocessing

Documentation and code implementation link:

Note:

This repository contains the same methods used in the Colab notebook, organized in a structured format.

Data Source:

  • The data is scraped from YCharts.
  • The site exposes a single URL that returns data according to the filters passed to it, and the results are paginated.
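
Reading a paginated, filtered endpoint like the one described above could be sketched as follows. This is a minimal illustration and not the repository's actual code: the `page` query parameter, the `rows` key in the JSON response, and the helper names `requests_get_json` and `iter_rows` are all assumptions, since the source does not show the real request format.

```python
from typing import Any, Callable, Dict, Iterator

# A fetcher takes a URL and query parameters and returns the decoded JSON body.
JsonFetcher = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def requests_get_json(url: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Fetch one page with the Requests library and decode its JSON body."""
    import requests  # imported here so the paging loop can be exercised offline

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def iter_rows(
    url: str,
    filters: Dict[str, Any],
    get_json: JsonFetcher = requests_get_json,
    max_pages: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield rows page by page until the server returns an empty page.

    Assumption: the page number is sent as a "page" query parameter and
    each response carries its records under a "rows" key.
    """
    page = 1
    while page <= max_pages:  # hard cap so a misbehaving server cannot loop forever
        payload = get_json(url, {**filters, "page": page})
        rows = payload.get("rows", [])
        if not rows:
            break
        yield from rows
        page += 1
```

Passing the fetcher as an argument keeps the paging logic independent of the network, so it can be checked with a stub before pointing it at the real site.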

Data Pipeline:

  • Data Fetching:

    • The Python Requests library is used to fetch the data from the URL.
  • Data Transformation:

    • The Pandas library is used to build a DataFrame from the URL's JSON response.
  • Data Storage:

    • Pypyodbc is used to connect to the database hosted on an Azure virtual machine.
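
The transformation and storage stages listed above might look roughly like the sketch below. It is written under stated assumptions rather than taken from the repository: the function names, the table and column names, and the de-duplication step are hypothetical, and the insert helper accepts any DB-API connection that uses `?` placeholders (as pypyodbc does).

```python
from typing import Any, Dict, List, Sequence, Tuple


def rows_to_records(
    rows: List[Dict[str, Any]], columns: Sequence[str]
) -> List[Tuple[Any, ...]]:
    """Build a DataFrame from JSON rows and return tuples ready for insertion."""
    import pandas as pd  # imported here so the SQL helpers work without pandas

    frame = pd.DataFrame(rows, columns=list(columns)).drop_duplicates()
    # Cast to object and swap missing values for None so the driver writes NULL.
    frame = frame.astype(object).where(frame.notna(), None)
    return list(frame.itertuples(index=False, name=None))


def build_insert_sql(table: str, columns: Sequence[str]) -> str:
    """Create a parameterized INSERT statement.

    The table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join(columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


def insert_records(
    connection: Any,
    table: str,
    columns: Sequence[str],
    records: Sequence[Tuple[Any, ...]],
) -> int:
    """Insert all records in one batch and commit; returns the number sent."""
    cursor = connection.cursor()
    cursor.executemany(build_insert_sql(table, columns), list(records))
    connection.commit()
    return len(records)
```

With pypyodbc, the connection would typically come from `pypyodbc.connect(...)` with an ODBC connection string for the SQL Server instance on the virtual machine; the driver name, host, database, and credentials depend on the deployment and are not given in the source.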

Output:

code implementation on vm
