covid-risk-factors

Comparing different topic models to see whether COVID-19 risk factors can be identified as one of the latent topics.

Build the Docker image:

$ docker build -t covid_risk_factors .

Run the Docker image:

$ docker run -it --rm -v `pwd`:/usr/src/myapp -w /usr/src/myapp covid_risk_factors

Note: Before running, download the data from https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge and place it in the directory covid-risk-factors/data/cord-19/

More information on how to run each model can be found under its directory:

Mallet baseline: mallet-baseline/
MetaLDA with institutions: metaLDA/institutions/
MetaLDA with epochs: metaLDA/epochs/

SCHOLAR

The code in scholar/ is taken from https://github.com/dallascard/scholar.

Python requirements

python3
pytorch 0.4
numpy
scipy
pandas
gensim

Scholar requires the data to be in JSON lines format. The script data/collect_data.py transforms the downloaded dataset (which should be in the project's root directory) into a JSON lines file called data/baseline/combined_data.json.
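As a rough illustration of the JSON lines layout, the sketch below writes one JSON object per line and reads the file back. The field names `id` and `text`, the helper names, and all values are assumptions made for this example; only `disease_epoch` and `top_authors_institution` come from the commands in this README, and the actual output of data/collect_data.py may differ.

```python
import json


def write_jsonlines(records, path):
    """Write each record as a single JSON object on its own line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonlines(path):
    """Read a JSON lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical records: the values and the "id"/"text" keys are placeholders.
records = [
    {
        "id": "doc-1",
        "text": "Placeholder abstract text for the first paper.",
        "disease_epoch": "placeholder_epoch",
        "top_authors_institution": "placeholder_institution",
    },
    {
        "id": "doc-2",
        "text": "Placeholder abstract text for the second paper.",
        "disease_epoch": "placeholder_epoch",
        "top_authors_institution": "placeholder_institution",
    },
]
```

Calling `write_jsonlines(records, "sample.json")` would produce a file with two lines, each of which parses on its own with `json.loads`.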

Scholar then requires some preprocessing of the JSON lines file:

$ python scholar/preprocess_data.py data/baseline/combined_data.json data/scholar/processed/ --vocab-size 2000 --label disease_epoch,top_authors_institution

Then run the scholar model itself (this example runs the model with both metadata attributes as covariates):

$ python run_scholar.py data/scholar/processed/ -k 30 -o results/output_smallV_30_both --topic-covars disease_epoch,top_authors_institution
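When trying several vocabulary sizes, topic counts, or covariate combinations, it can help to assemble the preprocessing and training commands programmatically. The sketch below only builds and prints the two command lines; the helper name `build_scholar_commands` and its parameters are hypothetical, while the script paths and flags are copied from the commands in this README.

```python
import shlex


def build_scholar_commands(data_json, processed_dir, output_dir, covars,
                           vocab_size=2000, k=30):
    """Return (preprocess, train) argument lists for the two scholar steps."""
    covar_arg = ",".join(covars)
    preprocess = [
        "python", "scholar/preprocess_data.py", data_json, processed_dir,
        "--vocab-size", str(vocab_size),
        "--label", covar_arg,
    ]
    train = [
        "python", "run_scholar.py", processed_dir,
        "-k", str(k),
        "-o", output_dir,
        "--topic-covars", covar_arg,
    ]
    return preprocess, train


preprocess_cmd, train_cmd = build_scholar_commands(
    "data/baseline/combined_data.json",
    "data/scholar/processed/",
    "results/output_smallV_30_both",
    ["disease_epoch", "top_authors_institution"],
)

# Print the commands so they can be checked before anything is executed.
print(shlex.join(preprocess_cmd))
print(shlex.join(train_cmd))
```

To execute the steps rather than print them, each list could be passed to `subprocess.run(cmd, check=True)` from the directory in which the commands are normally run.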
