usage: genq.py [-h] [-file FILE] [-output OUTPUT] [-start_page START_PAGE]
               [-number_of_pages NUMBER_OF_PAGES] [--all] [--verbose]
               [--dir DIR]

optional arguments:
  -h, --help            show this help message and exit
  -file FILE, -f FILE   input file location
  -output OUTPUT, -o OUTPUT
                        output file location
  -start_page START_PAGE, -s START_PAGE
                        page to start reading from
  -number_of_pages NUMBER_OF_PAGES, -n NUMBER_OF_PAGES
                        number of pages to read
  --all                 process all the pages from the start_page
  --verbose, -v         verbose
  --dir DIR, -d DIR     input directory
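
The options above map onto Python's `argparse` in a straightforward way. The following is a minimal sketch of a parser that accepts the same flags, not the actual `genq.py` source. The helper name `build_parser`, the `type=int` on the page options, and the `book.pdf` input are assumptions made for illustration; the help text does not state argument types or defaults.

```python
import argparse


def build_parser():
    """Build a parser that accepts the options listed in the help text."""
    parser = argparse.ArgumentParser(prog="genq.py")
    parser.add_argument("-file", "-f", help="input file location")
    parser.add_argument("-output", "-o", help="output file location")
    # type=int is an assumption; the help text does not state the types.
    parser.add_argument("-start_page", "-s", type=int,
                        help="page to start reading from")
    parser.add_argument("-number_of_pages", "-n", type=int,
                        help="number of pages to read")
    parser.add_argument("--all", action="store_true",
                        help="process all the pages from the start_page")
    parser.add_argument("--verbose", "-v", action="store_true", help="verbose")
    parser.add_argument("--dir", "-d", help="input directory")
    return parser


# Hypothetical invocation: read 5 pages of book.pdf starting at page 3.
args = build_parser().parse_args(["-f", "book.pdf", "-s", "3", "-n", "5", "-v"])
print(args.file, args.start_page, args.number_of_pages, args.verbose)
```

Because the first option string of each argument has no `--` form for `-file`, `-output`, `-start_page` and `-number_of_pages`, `argparse` derives the attribute names (`args.file`, `args.start_page`, and so on) from the single-dash names.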
START
1. Accept a pdf or text file.
2. Clean the text and remove special characters.
3. Load the knowledge base (.pkl), which contains a set of unique generated questions.
4. Break the text into sentences and run POS tagging.
5. Loop through all sentences and check whether the sentence contains a NOUN/PRP, i.e. one of ['NN', 'NNS', 'PRP', 'NNP', 'NNPS', 'PRP$']. If it does, go to step 6; otherwise continue the loop in step 5.
6. Start from the index of the noun found and check whether a VERB/PRP follows.
7. Determine the tense of the VERB/PRP and check whether the noun is he/she/it/they.
8. Form a question based on the above rules.
9. Lemmatize the verb and change the tense of the question to future.
10. Generalize the question by removing personal references and possessive pronouns (her, his, mine).
11. Verify that the question has not been generated previously and add it to the knowledge base. If sentences remain to be processed, go to step 5; otherwise go to step 12.
12. Save the metadata, update the knowledge base, and export the questions to csv from the metadata.
END
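
The noun/verb check (steps 5 and 6) and the duplicate check against the knowledge base (step 11) can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: sentences are assumed to be already tagged as `(word, tag)` pairs (for example from TextBlob's `.tags`), "following" is read as "immediately following", only verb tags are checked in step 6, and all function names are hypothetical.

```python
import pickle
from pathlib import Path

# Noun and pronoun tags listed in step 5.
NOUN_TAGS = {"NN", "NNS", "PRP", "NNP", "NNPS", "PRP$"}


def find_noun_verb(tagged):
    """Return (noun_index, verb_index) for the first noun or pronoun that is
    immediately followed by a verb, or None if the sentence has no such pair.

    Assumption: every Penn Treebank verb tag starts with "VB".
    """
    for i, (_, tag) in enumerate(tagged[:-1]):
        if tag in NOUN_TAGS and tagged[i + 1][1].startswith("VB"):
            return i, i + 1
    return None


def load_knowledge_base(path):
    """Load the pickled set of known questions, or start with an empty set."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    return set()


def add_if_new(question, knowledge_base):
    """Add the question if it is unseen; return True when it was added."""
    if question in knowledge_base:
        return False
    knowledge_base.add(question)
    return True


def save_knowledge_base(knowledge_base, path):
    """Write the set of questions back to the pickle file."""
    with Path(path).open("wb") as fh:
        pickle.dump(knowledge_base, fh)


# Hand-tagged example sentence: "She reads books"
tagged = [("She", "PRP"), ("reads", "VBZ"), ("books", "NNS")]
print(find_noun_verb(tagged))  # (0, 1)
```

Keeping the knowledge base as a Python `set` makes the "not generated previously" check a constant-time membership test, which matches the description of the `.pkl` file as a set of unique questions.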
Alphabetical list of part-of-speech tags used in the Penn Treebank Project
Automatic Factual Question Generation from Text
TextBlob: Simplified Text Processing
Automatic Question Generation from Paragraph
K2Q: Generating Natural Language Questions from Keywords with User Refinements
Infusing NLU into Automatic Question Generation
Literature Review of Automatic Question Generation Systems
Neural Question Generation from Text: A Preliminary Study
Learning to Ask: Neural Question Generation for Reading Comprehension [Apr 2017]