This repository holds the historical boxscore and play-by-play data for the PHF that was scraped with the fastRhockey package.

benhowell71/fastRhockey-data

fastRhockey Data Repository

This repository holds historical boxscore and play-by-play data for the Premier Hockey Federation (PHF, formerly known as NWHL), which was compiled with the fastRhockey package from GitHub.

You can find fastRhockey here: BenHowell71/fastRhockey

The scraper was created to increase access to play-by-play and boxscore data for the PHF. Limited access to that data has historically been one of the bigger barriers to entry in women’s hockey analytics.


Installation

You can install the released version of fastRhockey from GitHub with:

# Install pacman first if it is not already available, then use it to install fastRhockey
if (!requireNamespace('pacman', quietly = TRUE)){
  install.packages('pacman')
}
pacman::p_load_current_gh("BenHowell71/fastRhockey", dependencies = TRUE, update = TRUE)

If you would prefer the devtools installation:

# Install devtools first if it is not already available
if (!requireNamespace('devtools', quietly = TRUE)){
  install.packages('devtools')
}
# Then install fastRhockey from GitHub
devtools::install_github(repo = "BenHowell71/fastRhockey")

Once fastRhockey is installed, check out the brief walkthrough of its functions on the package's GitHub page. This repo contains boxscore and play-by-play data from 2016-2021 only.
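To confirm that the installation worked and to see which functions are available before reading the walkthrough, something like the following may help. It is a minimal sketch that relies only on base R; `list_pkg_exports` is a hypothetical helper name, not part of fastRhockey.

```r
# Hypothetical helper: return the exported function names of a package,
# or an empty character vector if the package is not installed.
list_pkg_exports <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " is not installed")
    return(character(0))
  }
  sort(getNamespaceExports(pkg))
}

# Prints the exported names if fastRhockey is installed,
# otherwise prints a message and an empty vector.
print(list_pkg_exports("fastRhockey"))
```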


Data

This repo contains three main CSVs of data, each of which is outlined in a little more detail below.

  • phf_meta_data.csv: this CSV contains all the data that you’d want on an individual game in one row, including home/away teams, arena information, game IDs, league IDs, and more.
  • boxscore.csv: this CSV contains all the boxscore information from the PHF for the games in phf_meta_data.csv. It includes game ID, scoring by period, shots by period, power play numbers, and more, all broken down by each team involved in a game.
  • play_by_play.csv: this CSV contains all the play-by-play data from the PHF. It includes information on events, how many skaters were on the ice, penalties, shots, etc. This data is essentially complete for the more recent PHF seasons, but it is spottier for the early seasons of the league, where usually only goals and penalties are recorded.
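Because boxscore.csv has one row per team per game and phf_meta_data.csv has one row per game, the two can be combined on the game ID. The sketch below uses small toy data frames in place of the real files. Every column name in it (`game_id`, `home_team`, `away_team`, `team`, `total_shots`) is an assumption made for illustration, so check `names()` on the real data and pass the actual ID column through `id_col`.

```r
# Toy stand-ins for two of the CSVs. All column names here are hypothetical;
# inspect names(your_data) to find the real ones.
meta <- data.frame(
  game_id   = c(101, 102),
  home_team = c("Team A", "Team C"),
  away_team = c("Team B", "Team D")
)
boxscore <- data.frame(
  game_id     = c(101, 101, 102, 102),
  team        = c("Team A", "Team B", "Team C", "Team D"),
  total_shots = c(30, 25, 28, 33)
)

# Attach game-level context to each team row of the boxscore.
# all.x = TRUE keeps boxscore rows even if a game is missing from meta.
attach_game_info <- function(boxscore, meta, id_col = "game_id") {
  stopifnot(id_col %in% names(boxscore), id_col %in% names(meta))
  merge(boxscore, meta, by = id_col, all.x = TRUE)
}

joined <- attach_game_info(boxscore, meta)
print(joined)
```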

The best way to get familiar with this data is to use it! You can either download directly from this repo or use fastRhockey to scrape the data yourself.
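If you download the files, one way to load them is sketched below. It assumes the three CSVs sit together in a single local folder; `read_phf_csvs` and `data_dir` are hypothetical names chosen for this example, and only base R is used.

```r
# Hypothetical helper: read the three CSVs from a local copy of this repo.
# `data_dir` is whatever folder you saved the files to.
read_phf_csvs <- function(data_dir) {
  files <- c(
    meta         = "phf_meta_data.csv",
    boxscore     = "boxscore.csv",
    play_by_play = "play_by_play.csv"
  )
  paths <- file.path(data_dir, files)

  # Fail early with a clear message if any file is missing
  missing_files <- files[!file.exists(paths)]
  if (length(missing_files) > 0) {
    stop("Missing file(s) in ", data_dir, ": ",
         paste(missing_files, collapse = ", "))
  }

  # Return a named list of data frames, one per file
  stats::setNames(
    lapply(paths, utils::read.csv, stringsAsFactors = FALSE),
    names(files)
  )
}

# Example usage (uncomment and adjust the path):
# phf <- read_phf_csvs("path/to/fastRhockey-data")
# str(phf$meta)
```

Returning a named list keeps the three tables together while still letting you inspect each one on its own, for example with `names(phf$boxscore)` to see which columns are actually present.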


Follow SportsDataverse on Twitter and star this repo

Citations

To cite the fastRhockey R package in publications, use:

BibTeX Citation

@misc{howell_fastRhockey_2021,
  author = {Ben Howell},
  title = {fastRhockey: The SportsDataverse's R Package for Women's Hockey Data.},
  url = {https://benhowell71.github.io/fastRhockey/},
  year = {2021}
}
