Automatically extract information from http://minecraft.gamepedia.com/ #8
The current recipes script in the bin/ folder doesn't produce data in the new recipe format.
So, to do that extraction, I don't want to use HTML anymore; I want to work from the wikitext. There are a few ways to get that wikitext:
The problem with the dump is that even if they agree to export it, I don't know how regularly they will do so (since the wiki content changes regularly).
Using the API is indeed possible (example). Instead of using it manually, let's use https://github.com/macbre/nodemw
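For reference, here is a minimal sketch of what "using the API manually" could look like: it only builds the MediaWiki action-API URL that returns a page's raw wikitext. The function name `buildWikitextUrl` and the `/api.php` path on this host are assumptions made for illustration; nodemw would hide this step behind its own client.

```javascript
// Sketch only: build a MediaWiki action-API URL for the raw wikitext of a page.
// Assumption: the wiki exposes its API at /api.php on the given host.
function buildWikitextUrl(title, host = 'minecraft.gamepedia.com') {
  const params = new URLSearchParams({
    action: 'query',      // standard MediaWiki query module
    prop: 'revisions',    // ask for revision data...
    rvprop: 'content',    // ...including the wikitext content
    format: 'json',
    titles: title
  });
  return `http://${host}/api.php?${params.toString()}`;
}

console.log(buildWikitextUrl('Gravel'));
```

The JSON response would then have to be unwrapped to reach the wikitext, which is the part a client library takes care of.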
There's also a wikitext parser written in node.js (https://github.com/spencermountain/wtf_wikipedia), but it doesn't work on the Minecraft wiki: tested on Blocks, it can't find the table, and on Gravel, it can't read the infobox.
… sections, parse table, parse infobox
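To illustrate the infobox part, here is a rough sketch of reading a template's named parameters out of wikitext. It is not the code used in this repository; the function name `parseInfobox`, the template name `Block` and the field names in the usage below are all hypothetical.

```javascript
// Sketch only: extract "key = value" parameters from the first {{templateName ...}} block.
// Nested {{...}} and [[...]] are kept intact inside values.
// Limitation: a template whose name merely starts with templateName would also match.
function parseInfobox(wikitext, templateName) {
  const start = wikitext.indexOf('{{' + templateName);
  if (start === -1) return null;
  const parts = [];
  let depth = 0;
  let current = '';
  let i = start;
  while (i < wikitext.length) {
    const two = wikitext.slice(i, i + 2);
    if (two === '{{' || two === '[[') {
      depth++;
      if (depth > 1) current += two; // keep nested openers in the value
      i += 2;
    } else if (two === '}}' || two === ']]') {
      depth--;
      if (depth === 0) { parts.push(current); break; } // end of the infobox
      current += two;
      i += 2;
    } else if (wikitext[i] === '|' && depth === 1) {
      parts.push(current); // top-level pipe: next parameter starts
      current = '';
      i++;
    } else {
      current += wikitext[i];
      i++;
    }
  }
  const fields = {};
  for (const part of parts.slice(1)) { // parts[0] is the template name
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skip positional parameters
    fields[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return fields;
}

// Hypothetical input, not copied from the wiki
const sample = '{{Block\n| hardness = 0.6\n| tool = {{ItemLink|Shovel}}\n}}';
console.log(parseInfobox(sample, 'Block'));
```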
This page is important: http://minecraft.gamepedia.com/Data_values. The current name values in blocks.json and items.json correspond to nothing. Wouldn't it be better to replace them with the "nameid", for example swordDiamond -> diamond_sword (or even minecraft:diamond_sword)?
http://minecraft.gamepedia.com/Data_values/Block_IDs and http://minecraft.gamepedia.com/Data_values/Item_IDs should be used for the lists of blocks and items (they even say whether these blocks and items can have metadata). The parsing would be similar to https://github.com/PrismarineJS/minecraft-data/blob/master/bin/wiki_extractor/entities_extractor.js. More data can then be found on the page of each block/item.
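As an illustration of the table-parsing step, here is a small sketch that turns a plain wikitext table into row objects. It assumes the standard MediaWiki table syntax (`{|`, `!`, `|-`, `|`, `|}`); whether the ID pages actually use a plain table or a template is not checked here, and the function name `parseWikiTable` and the column names in the usage are hypothetical.

```javascript
// Sketch only: parse a simple wikitext table into an array of objects keyed by header.
// Not handled: cell attributes (| style="..." | value), rowspan/colspan, nested tables.
function parseWikiTable(wikitext) {
  const lines = wikitext.split('\n').map(l => l.trim());
  let headers = [];
  const rows = [];
  let cells = null; // cells of the row being read, or null when outside a table
  const flush = () => {
    if (cells && cells.length > 0) {
      const row = {};
      cells.forEach((c, i) => { row[headers[i] || String(i)] = c; });
      rows.push(row);
    }
    cells = [];
  };
  for (const line of lines) {
    if (line.startsWith('{|')) { cells = []; }                  // table start
    else if (cells === null) { continue; }                      // text outside any table
    else if (line.startsWith('|}')) { flush(); cells = null; }  // table end
    else if (line.startsWith('|-')) { flush(); }                // row separator
    else if (line.startsWith('|+')) { continue; }               // caption
    else if (line.startsWith('!')) {
      headers = headers.concat(line.slice(1).split('!!').map(s => s.trim()));
    } else if (line.startsWith('|')) {
      cells = cells.concat(line.slice(1).split('||').map(s => s.trim()));
    }
  }
  return rows;
}

// Hypothetical table, not copied from the wiki
const table = '{| class="wikitable"\n! Dec !! Name\n|-\n| 1 || Stone\n|-\n| 2 || Grass\n|}';
console.log(parseWikiTable(table));
```

All cell values come back as strings, so numeric IDs would still need converting before being written to blocks.json or items.json.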
Items extraction is done. Now trying blocks extraction:
material goes along with materials.json. The problem is that it seems to have been written manually and doesn't correspond to anything specific in the wiki. The most closely related thing is http://minecraft.gamepedia.com/Breaking#Best_tools but I don't really know whether it's possible to write materials.json from it.
material: done. materials.json will probably stay manual. Only harvestTools is missing.
blocks.json done!
Total progress:
Shapeless means the recipe has multiple shapes. So recipes with only one item, or with the same item 9 times, are shaped recipes (see http://minecraft.gamepedia.com/Module_talk:Crafting#Shapeless_recipes_marked_as_shaped_recipes and http://minecraft.gamepedia.com/Template_talk:Crafting#remove_shapeless_indicator_when_unambiguous).
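The rule above can be sketched as a small check. This is only an illustration of the two cases named in the comment, not the extractor's actual code; the function name `isTrulyShapeless` and the representation of a recipe as a flat list of item names are assumptions.

```javascript
// Sketch only: decide whether a recipe should stay shapeless.
// ingredients: flat array of item names in the recipe; empty slots may be null.
function isTrulyShapeless(ingredients) {
  const items = ingredients.filter(Boolean); // drop empty slots
  if (items.length <= 1) return false;       // a single item: treated as shaped
  const allSame = items.every(item => item === items[0]);
  if (items.length === 9 && allSame) return false; // 9 times the same item: shaped
  return true;                               // anything else keeps the shapeless flag
}

console.log(isTrulyShapeless(['flint', 'iron_ingot']));
console.log(isTrulyShapeless(Array(9).fill('iron_ingot')));
```

Only the two cases stated in the comment are handled; other ambiguous layouts are deliberately left alone.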
Recipes done.
Only biomes missing.
For biomes: see PrismarineJS/mineflayer#197
So, the current biome values cannot really be extracted automatically: I added a line in the wiki about how to extract them semi-automatically.
All the .json files now have an extraction procedure! Closing.
There are many ways to extract the data that should go into minecraft-data, as discussed in PrismarineJS/mineflayer#229.
In this issue I'll focus on progress on extracting information from the wiki: http://minecraft.gamepedia.com/