PeARS for Wikipedia search #3
minimalparts started this conversation in Ideas
Hi all,
following previous discussions with some of our contributors, we would like to start integrating one of our sister projects into PeARS, namely Wikipedia processing with WikiNLP: https://github.com/possible-worlds-research/wikinlp.
The WikiNLP package lets you automatically download and preprocess Wikipedia dumps in any language. You can also extract specific categories and even specific sections of pages using the tool. The resulting corpus can then be fed into PeARS via the command line to provide bespoke search over some Wikipedia content.
The relevant CLI function is here: https://github.com/PeARSearch/PeARS-federated/blob/54029122fc6d00587c9a546aacca26ffaecc6d85/app/cli/controllers.py#L177.
Usage is as follows (run from the root folder of your installation):
(See function documentation for more info, or ask here if anything is unclear.)
We are preparing a demo instance to show a concrete example of this usage. See you there soon!