Scrabby

Scrabby is a Ruby-based tool designed to scrape data from the Harry Potter Wiki. This project is an essential component powering the Potter DB, providing an up-to-date data repository of all characters, potions and spells from the Harry Potter universe.

How it works

Scrabby performs the following spellbinding tasks:

Data scraping: Scrabby scrapes the Harry Potter Wiki using Nokogiri, extracting data about characters, potions and spells.
Data transformation: Scrabby transforms the scraped data into structured CSV, making it ready for immediate use.
Data storage: Scrabby stores the transformed data in individual CSV files within data/, allowing easy access for analysis or integration with other projects.
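
The transformation and storage steps can be sketched roughly as follows. This is a minimal illustration, not Scrabby's actual code: the helper `write_records`, the record fields (`name`, `effect`) and the file name are hypothetical, and hard-coded sample hashes stand in for the data that the Nokogiri scraping step would produce.

```ruby
require 'csv'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: writes an array of hashes to a CSV file.
# The keys of the first record become the header row.
def write_records(records, path)
  return if records.empty?

  FileUtils.mkdir_p(File.dirname(path))
  keys = records.first.keys
  CSV.open(path, 'w') do |csv|
    csv << keys.map(&:to_s)                          # header row
    records.each { |record| csv << record.values_at(*keys) }
  end
  path
end

# Sample records standing in for scraped data (assumed shape).
SAMPLE_SPELLS = [
  { name: 'Lumos', effect: 'Creates light at the wand tip' },
  { name: 'Alohomora', effect: 'Unlocks doors' }
].freeze

# Write to a temporary directory so the real data/ folder is untouched.
Dir.mktmpdir do |dir|
  path = write_records(SAMPLE_SPELLS, File.join(dir, 'data', 'sample_spells.csv'))
  puts File.read(path)
end
```

Using the CSV library rather than joining strings by hand means values containing commas or quotes are escaped correctly.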

Setup

Normally you don't need to set anything up. The data is scraped and updated automatically once a week by GitHub Actions. However, if you'd like to take the reins and run the scrapers manually, follow these simple steps:

1. Clone / Fork the repository

git clone git@github.com:danielschuster-muc/scrabby.git && cd scrabby

2. Install Ruby

Ensure you have Ruby 3.1.2 installed on your system.

rbenv install 3.1.2

3. Install dependencies

bundle install

4. Run scrapers

Execute the following commands to manually trigger the scrapers for characters, potions, and spells:

bundle exec rake scrabby:characters
bundle exec rake scrabby:potions
bundle exec rake scrabby:spells

5. Output

The freshly scraped data will be saved to data/*.csv, ready for you to use.
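
As one way to check the output, the sketch below counts the data rows in each CSV file of a directory. It assumes every file begins with a header row; the helper name `row_counts` is hypothetical, and the actual file and column names depend on the scrapers.

```ruby
require 'csv'

# Hypothetical helper: maps each CSV file name in `dir` to its number
# of data rows (assumes the first line of each file is a header).
def row_counts(dir)
  Dir.glob(File.join(dir, '*.csv')).sort.to_h do |path|
    [File.basename(path), CSV.read(path, headers: true).size]
  end
end

# Prints nothing if the directory is missing or contains no CSV files.
row_counts('data').each do |file, count|
  puts "#{file}: #{count} rows"
end
```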

License

This project is licensed under the terms of the MIT license. See the LICENSE file.

Data is scraped from the Harry Potter Wiki and therefore licensed under CC-BY-SA unless otherwise stated. For specific details, please refer to the URLs linking to the corresponding wiki pages in the data files.

