I will try to. For now I have settled on another wiki parser that worked with my file. Would you mind telling me what the difference is between the file I have and the one you are suggesting?
Thank you
I have downloaded the English wiki dump enwiki-20160305-pages-articles-multistream.xml.bz2 and installed WikiExtractor in a Debian VM.
When I run the extractor, I get 0 articles in return and no errors:
WikiExtractor.py -b 250K -o extracted enwiki-20160305-pages-articles-multistream.xml.bz2
INFO: Loaded 0 templates in 0.0s
INFO: Starting page extraction from enwiki-20160305-pages-articles-multistream.xml.bz2.
INFO: Using 1 extract processes.
INFO: Finished 1-process extraction of 0 articles in 0.1s (0.0 art/s)
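A quick sanity check that may help narrow this down is to count `<page>` tags in the dump independently of the extractor. The sketch below assumes Python 3, whose `bz2` module reads multi-stream archives. The helper name `count_pages` and the `max_lines` cutoff are hypothetical and are not part of WikiExtractor.

```python
import bz2


def count_pages(path, max_lines=None):
    """Count lines containing '<page>' in a bz2-compressed XML dump.

    Hypothetical diagnostic helper, not part of WikiExtractor.
    If max_lines is given, only that many lines are scanned.
    """
    pages = 0
    # bz2.open in Python 3 decompresses every stream in a multi-stream file.
    with bz2.open(path, "rt", encoding="utf-8") as dump:
        for i, line in enumerate(dump):
            if max_lines is not None and i >= max_lines:
                break
            if "<page>" in line:
                pages += 1
    return pages


if __name__ == "__main__":
    # File name taken from the report above; scan only the first 100000 lines.
    dump_path = "enwiki-20160305-pages-articles-multistream.xml.bz2"
    print(count_pages(dump_path, max_lines=100000))
```

If this prints a non-zero count, the dump itself contains pages and the problem is more likely in how the extractor reads the file than in the download.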