piotrmp/bothunter

BotHunter

A Twitter bot detection solution for the Bots & Gender Profiling shared task organised at the PAN workshop at the CLEF 2019 conference.

The tool was developed within the HOMADOS project at the Institute of Computer Science, Polish Academy of Sciences. For the description of the approach and evaluation results, see the publication:

Przybyła, P.: Detecting Bot Accounts on Twitter by Measuring Message Predictability - Notebook for PAN at CLEF. In: Cappellato, L., Ferro, N., Losada, D. E., and Müller, H. (eds.) CLEF 2019 Working Notes. Springer.

To reproduce the results described in the paper, run one of the following commands (set <lang> to en for English or es for Spanish):

  • for the LASSO version:

java -jar ./BotHunter-1.6-jar-with-dependencies.jar GLM <input>/<lang> <output> <workdir> ./R ./modelR <lang>

  • for the BERT version:

java -jar ./BotHunter-1.6-jar-with-dependencies.jar BERTpart <input>/<lang> <output> <workdir> <bert-source> <bert-model>/<lang> <venv> ./R ./modelR <lang>

  • for the LASSO+BERT combination:

java -jar ./BotHunter-1.6-jar-with-dependencies.jar GLMBERTpart <input>/<lang> <output> <workdir> <bert-source> <bert-model>/<lang> <venv> ./R ./modelR <lang>
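Because each command handles one language at a time, covering both English and Spanish means running it twice. The following is a minimal sketch for the LASSO variant only. The values of INPUT, OUTPUT and WORKDIR are hypothetical stand-ins for <input>, <output> and <workdir>, and the helper name glm_command is introduced here for illustration. This README does not say whether both languages can share the same <output> and <workdir>, so adjust those values if they cannot. The script only prints the commands (a dry run) unless the marked line is uncommented.

```shell
#!/bin/sh
# Hypothetical locations: replace with your own <input>, <output> and <workdir>.
JAR=./BotHunter-1.6-jar-with-dependencies.jar
INPUT=./input
OUTPUT=./output
WORKDIR=./work

# Print the LASSO (GLM) command line for a single language code (en or es).
glm_command() {
    lang=$1
    echo "java -jar $JAR GLM $INPUT/$lang $OUTPUT $WORKDIR ./R ./modelR $lang"
}

# Dry run: show the command for each language.
for lang in en es; do
    cmd=$(glm_command "$lang")
    echo "$cmd"
    # Uncomment the next line to actually execute the command:
    # $cmd
done
```

The same pattern applies to the BERTpart and GLMBERTpart variants once the additional <bert-source>, <bert-model>/<lang> and <venv> arguments are added in the order shown in the commands above.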

This requires the following elements to be present:

  • a JAR file of BotHunter 1.6 compiled from the source available here (BotHunter-1.6-jar-with-dependencies.jar),
  • R source files available here (R),
  • R model files to be downloaded separately (modelR).

Additionally, the BERT-powered variants require:

  • BERT source code, including run_classifier.py (<bert-source>),
  • BERT models tuned for the task to be downloaded separately (<bert-model>),
  • a virtual environment configured with Python and TensorFlow to run BERT (<venv>).
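Before starting a run, it can help to confirm that the elements listed above are in place. The sketch below is one possible pre-flight check, not part of BotHunter itself. The helper check_paths is a hypothetical name, and the BERT-related paths in the commented line are placeholders for <bert-source>, <bert-model> and <venv>.

```shell
#!/bin/sh
# Print every argument that does not exist on disk; return 1 if any is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return $missing
}

# Required by all variants (paths as used in the commands above).
check_paths ./BotHunter-1.6-jar-with-dependencies.jar ./R ./modelR \
    || echo "Fix the paths listed above before running BotHunter."

# Additionally required by the BERT variants. These three locations are
# hypothetical; substitute your own <bert-source>, <bert-model>/<lang>, <venv>.
# check_paths ./bert/run_classifier.py ./bert-model/en ./venv
```

The check only tests that the paths exist. It does not verify that the JAR was built from the right source version or that the virtual environment actually contains TensorFlow.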

To train the approach on a different dataset, use the procedures for preparing data (BotHunter->Main.java) and building ML models (R->TIRA_build.R), which are also available here.
