OSRScrape

Or Osiris Crêpe

These are the basic Python scripts I use(d) to scrape data from the OSRS wiki: http://oldschoolrunescape.wikia.com/wiki/Old_School_RuneScape_Wiki

Main Python libraries used: requests, bs4, json
I used some openpyxl when I began and decided to play with it, but I have since switched to json.
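
As a rough illustration of how these libraries fit together, here is a minimal sketch. It is not the code in this repo: the function names `fetch_html`, `extract_links`, and `save_json`, and the idea of collecting every link on a page, are assumptions made for the example.

```python
import json

import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_links(html):
    """Return the text and target of every anchor that has an href.

    Hypothetical helper: the real scripts may pull out different fields.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"text": a.get_text(strip=True), "href": a["href"]}
        for a in soup.find_all("a", href=True)
    ]


def save_json(records, path):
    """Write the scraped records to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)
```

A typical call chain would be `save_json(extract_links(fetch_html(url)), "links.json")`, where `url` is a wiki page and `links.json` is a placeholder output name.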

The ultimate goal is to create a semantic net of concepts in the OSRS game world. A lot of information can be gained by simply scraping the already fairly structured content of the wiki.

However, I would like to branch into extracting relationships and semantics from the wiki articles using NLP. I have a very, very rough idea of how to start, but by no means do I understand the state of the art. I've started learning the basics with the nltk Python library, but if anyone has pointers, ideas, or contributions, please let me know.

I have added the latest json files. They are in a fairly structured format, but they are not yet in JSON-LD. As can be seen in armory_v#.json, the natural language article text has been scraped and stitched. An important note is that links have been replaced with tokens (format: 'rsrc####') to assist with cross referencing. This relies on the availability and consistency of rsrc_token_dict.json.
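
To make the token idea concrete, here is a small sketch of how links in article HTML could be swapped for 'rsrc####' tokens. It is only an illustration under stated assumptions: the `LinkTokenizer` class, the `tokenize_links` function, the four-digit zero-padded numbering, and the href-to-token layout of the dictionary are all made up for the example and may not match the real rsrc_token_dict.json. It uses only the standard library so it runs without extra installs.

```python
from html.parser import HTMLParser


class LinkTokenizer(HTMLParser):
    """Collect page text, replacing each <a href=...> link with a token."""

    def __init__(self, token_dict):
        super().__init__()
        self.token_dict = token_dict  # assumed layout: href -> 'rsrc####'
        self.parts = []
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") if tag == "a" else None
        if href:
            # Reuse the token for a known link, otherwise mint the next one.
            next_token = f"rsrc{len(self.token_dict):04d}"
            self.parts.append(self.token_dict.setdefault(href, next_token))
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        # Text inside a tokenized link is dropped; the token stands in for it.
        if not self._in_link:
            self.parts.append(data)


def tokenize_links(html, token_dict):
    """Return the article text with links replaced by tokens.

    token_dict is updated in place so tokens stay consistent across pages.
    """
    parser = LinkTokenizer(token_dict)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Made-up sample input, not taken from the scraped files.
tokens = {}
sample = ('<p>The <a href="/wiki/Abyssal_whip">abyssal whip</a> is dropped by '
          '<a href="/wiki/Abyssal_demon">abyssal demons</a>.</p>')
print(tokenize_links(sample, tokens))
```

Because the same dictionary is passed to every call, a link that appears on several pages always gets the same token, which is what makes cross referencing possible.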

Files of Interest

org.errware.silentscribe.jar :
Single-class library for scraping information directly from RS in conjunction with the DreamBot framework. It uses the singleton design pattern with multiple threads for world-state observation.

npcs_v1.json, Bestiary_v1.json, Items_v1.json, armory_v2.json :
Currently the most structured versions of the data scraped from the OSRS Wikia wiki.

myScrapingLib.py :
Python module abstracting routines that are verbose or that I've used quite often.

ErrWare/OSRScrape
