TruePIE: Discovering Reliable Patterns in Pattern-Based Information Extraction
- pattern-extraction. Pattern generation can be found at:
- word embedding: word2vec
Input format
- /input/pattern.txt
Example: $LOCATION leader $PERSON 0 2 united_states trump 1001
- /input/word_embedding.pickle or /input/word_embedding.txt
word_embedding.pickle: a pickled dictionary that maps each word (key) to its vector (value)
word_embedding.txt: the text-format output of the word2vec tool
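
The input files described above could be read along the following lines. This is a minimal sketch, not the code used by TruePIE. The helper names `parse_pattern_line` and `load_embedding` are hypothetical. It assumes that each line of pattern.txt ends with five whitespace-separated fields (as in the example line) and that everything before them is the pattern, and that word_embedding.txt follows the usual word2vec text layout: a header line with the vocabulary size and dimension, then one word and its values per line. Vectors are kept as plain lists of floats so the sketch needs only the standard library.

```python
import pickle
import tempfile
from pathlib import Path


def parse_pattern_line(line):
    """Split one pattern.txt line into (pattern, trailing_fields)."""
    tokens = line.split()
    if len(tokens) < 6:
        raise ValueError(f"expected a pattern plus 5 fields: {line!r}")
    # Assumption: the last five tokens are fields, the rest is the pattern.
    return " ".join(tokens[:-5]), tokens[-5:]


def load_embedding(path):
    """Return {word: vector} from a .pickle dict or a word2vec text file."""
    path = Path(path)
    if path.suffix == ".pickle":
        with path.open("rb") as handle:
            return pickle.load(handle)  # only unpickle files you trust
    vectors = {}
    with path.open(encoding="utf-8") as handle:
        handle.readline()  # assumed header line: "<vocab_size> <dimension>"
        for line in handle:
            parts = line.split()
            if len(parts) > 1:  # skip blank lines
                vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


pattern, fields = parse_pattern_line(
    "$LOCATION leader $PERSON 0 2 united_states trump 1001"
)

# Round-trip a tiny, made-up embedding through both supported formats.
with tempfile.TemporaryDirectory() as tmp:
    txt_path = Path(tmp) / "word_embedding.txt"
    txt_path.write_text(
        "2 3\nunited_states 0.1 0.2 0.3\ntrump 0.4 0.5 0.6\n", encoding="utf-8"
    )
    embedding = load_embedding(txt_path)

    pickle_path = Path(tmp) / "word_embedding.pickle"
    with pickle_path.open("wb") as handle:
        pickle.dump(embedding, handle)
    reloaded = load_embedding(pickle_path)
```

If the real pattern.txt is tab-separated or carries a different number of fields, only `parse_pattern_line` needs to change. Wrapping each vector in `numpy.asarray` is a one-line change if arrays are preferred.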
The intermediate results are also provided. The final results can be found in /output/result/
Model Parameters
Can be changed in
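Package requirements: numpy, json, csv, _pickle, sklearn (json, csv, and _pickle ship with the Python standard library; sklearn is installed as scikit-learn)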
@inproceedings{li2018truepie,
  title={Truepie: Discovering reliable patterns in pattern-based information extraction},
  author={Li, Qi and Jiang, Meng and Zhang, Xikun and Qu, Meng and Hanratty, Timothy P and Gao, Jing and Han, Jiawei},
  booktitle={Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
  pages={1675--1684},
  year={2018},
  organization={ACM}
}