A web scraping project that includes data cleaning and analysis to determine the most cited published books on Google Scholar.

clintbel/case-study-web-scraping-google-scholar-books

 
 

Case Study - Web Scraping - Published Books (Google Scholar)

Author: Clint Barnard-El
Email: barnard.clint@yahoo.com
LinkedIn: https://www.linkedin.com/in/clintbarnardel/
Dataset: Google Scholar Web Scraping

Introduction

This repository presents the results of an analysis of the most cited published books on African-American history. Black History Month is an annual observance in the United States, held during February, that commemorates the events and celebrates the contributions of people from the African diaspora.

This analysis aims to demonstrate skills in web scraping, data cleaning, exploratory data analysis (EDA), and data visualization.

Applications Used

  • RStudio
  • Octoparse
  • Google Chrome

Language Used

  • R

Skills Used

Datasets

The dataset can be found on Kaggle. The web scraping, completed on February 24, 2024, yielded over 500 results, including articles, books, and PDFs.
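Because the scraped results mix articles, books, and PDFs, a cleaning step is needed to isolate books before ranking them by citations. The following is a minimal sketch of that idea in base R, not the code used in this repository. The data frame and its column names (`title`, `type`, `cited_by`) are hypothetical placeholders and may not match the actual Kaggle dataset.

```r
# Hypothetical sample of scraped results. The column names (title, type,
# cited_by) and all values are illustrative assumptions, not real data.
results <- data.frame(
  title = c("Example Book A", "Example Article B", "Example Book C", "Example PDF D"),
  type = c("BOOK", "ARTICLE", "BOOK", "PDF"),
  cited_by = c("Cited by 120", "Cited by 45", "Cited by 980", "Cited by 12"),
  stringsAsFactors = FALSE
)

# Keep only the rows tagged as books.
books <- results[results$type == "BOOK", ]

# Extract the numeric citation count from text such as "Cited by 120".
books$citations <- as.integer(gsub("[^0-9]", "", books$cited_by))

# Rank books from most to least cited.
ranked <- books[order(-books$citations), c("title", "citations")]
print(ranked)
```

With the real dataset, the same three steps (filter, parse, sort) would apply once the placeholder column names are replaced with the actual ones.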
