Concreteness

A PyTorch implementation of the paper "Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets".

It uses a ResNet50 along with Spotify's Annoy library to compute the visual concreteness scores of words from MIRFLICKR.
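As a rough illustration of what a visual concreteness score measures, here is a minimal sketch in plain Python. It is not the code in this repository: it assumes image features are already available as small numeric vectors, and it swaps in a brute-force nearest-neighbour search where the project uses Annoy. The function names (`knn_indices`, `concreteness_scores`) and the toy data are hypothetical. The score for a word is taken to be the average number of an image's k nearest neighbours that share the word, divided by the count expected if neighbours were picked at random.

```python
from collections import defaultdict


def knn_indices(features, k):
    """Brute-force k nearest neighbours (squared Euclidean), excluding the item itself."""
    neighbours = []
    for i, fi in enumerate(features):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(fi, fj)), j)
            for j, fj in enumerate(features)
            if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def concreteness_scores(features, tags, k):
    """Score each word by how tightly its images cluster in feature space.

    features: list of equal-length numeric vectors, one per image (assumed input).
    tags: list of word lists, one per image.
    """
    n = len(features)
    neighbours = knn_indices(features, k)

    # Map each word to the images it appears with.
    docs_by_word = defaultdict(list)
    for i, words in enumerate(tags):
        for word in set(words):
            docs_by_word[word].append(i)

    scores = {}
    for word, docs in docs_by_word.items():
        if len(docs) < 2:
            continue  # a single image has no same-word neighbour to find
        doc_set = set(docs)
        # Average number of neighbours that carry the same word.
        mean_shared = sum(
            sum(1 for j in neighbours[i] if j in doc_set) for i in docs
        ) / len(docs)
        # Expected count if the k neighbours were drawn at random.
        expected = k * (len(docs) - 1) / (n - 1)
        scores[word] = mean_shared / expected
    return scores


# Toy data: two tight clusters; "photo" is spread across both.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
tags = [["cat", "photo"], ["cat"], ["cat", "photo"],
        ["dog"], ["dog", "photo"], ["dog"]]
scores = concreteness_scores(features, tags, k=2)
```

On this toy data, "cat" and "dog" each score 2.5 because their images sit together, while "photo" scores below 1 because its images are scattered across both clusters.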

Requirements

To install the basic requirements, run this:

pip install -r requirements.txt

If you'd like to use a Jupyter Notebook to interact with the concreteness scores after computing them, you'll also need:

pip install -r requirements-notebook.txt

So far, the code has only been tested with Python 3.6 and Python 3.7.

For the MSCOCO dataset, you'll also need to download the English model for spaCy:

python -m spacy download en

Usage

Downloading the dataset

Before running, you'll need to download the MIRFLICKR dataset. You can do that with:

cd data
./get_mirflickr.sh

It's 120GB, so it may take a while.
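Given the size of the download, it can be worth checking free disk space first. The helper below is a small sketch that uses only the standard library; the function name `has_free_space` and the `data` path in the usage comment are illustrative assumptions, not part of this repository.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1000 ** 3


# Example: check the current directory against the 120GB quoted above.
# Replace "." with "data" (or wherever you plan to download) as needed.
enough = has_free_space(".", 120)
```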

Similarly, you can get the MSCOCO dataset with:

cd data
./get_mscoco.sh

Shell usage

Once your download is finished, you can compute the concreteness scores with:

python main.py -d <mirflickr_directory> -c <cache_directory> -v

Replace <mirflickr_directory> with the path where the MIRFLICKR dataset was downloaded, and <cache_directory> with a directory of your choice to use for caching.

For the MSCOCO dataset, run:

python main.py -d <mscoco_directory> -c <cache_directory> -v -t mscoco

Jupyter Notebook

If you prefer, you can also run the provided Jupyter Notebook:

jupyter notebook concreteness.ipynb

TODO

Improve Jupyter Notebook formatting

Thanks to

@jmhessel for helpful pointers and a great paper.

Citation:

@inproceedings{hessel2018concreteness,
               title={Quantifying the visual concreteness of words and topics in multimodal datasets},
               author={Hessel, Jack and Mimno, David and Lee, Lillian},
               booktitle={NAACL},
               year={2018}
}
