Code used by the team IIIDYT for the WASSA 2018 Implicit Emotion Shared Task


This code was developed during the WASSA 2018 Implicit Emotion Shared Task by the team IIIDYT. You can read our paper here.

You can also read more details about the shared task on the official IEST website, and on the competition website.

If you find this code useful please consider citing our paper:

  author       = {Balazs, Jorge A. and 
                  Marrese-Taylor, Edison and
                  Matsuo, Yutaka},
  title        = {{IIIDYT at IEST 2018: Implicit Emotion Classification
                   with Deep Contextualized Word Representations}},
  booktitle    = {Proceedings of the 9th Workshop on Computational
                  Approaches to Subjectivity, Sentiment and Social
                  Media Analysis},
  year         = {2018},
  address      = {Brussels, Belgium},
  month        = {November},
  organization = {Association for Computational Linguistics}

Recommended Installation

  1. Clone this repo.

    git clone
    cd implicit_emotion

  2. Create a conda environment

    If you don't have conda installed, we recommend using miniconda.

    You can then easily create and activate a new conda environment with Python 3.6 by executing:

    conda create -n iest python=3.6
    source activate iest

    You can replace iest with any environment name you like.

  3. Run


    This will install PyTorch, a few dependencies for our code, and AllenNLP (ELMo) with all of its dependencies. See the AllenNLP documentation for other ways to install AllenNLP. Note that, for reproducibility, the script installs the same ELMo version we used for development: ac2e0b9b6.

    By default, AllenNLP will be cloned in this repo. If you want to install it somewhere else please modify the install script scripts/, and change the ALLENNLP_PATH variable in src/ accordingly.

    The installation script installs PyTorch 0.4.0 with CUDA 8.0 by default. Please make sure that you have compatible GPU drivers, or change the installation script so that it installs the correct version of CUDA. You can run nvidia-smi to see the version of your driver, and check its compatibility with CUDA in this chart.

  4. (Optional) Install Java for obtaining POS tags

    We used a forked version of ark-tweet-nlp for obtaining POS tags without using its built-in tokenization feature. This repo already comes with the compiled jar (ark-tweet-nlp-0.3.2.jar) in utils/ark-tweet-nlp.

    If you want to use this feature you need Java. You can easily install it within your conda environment with

    conda install -c cyclus java-jdk

    You can also change the pre-trained POS tagging model by modifying the PRETRAINED_MODEL_NAME variable in utils/ with one of the models provided in utils/ark-tweet-nlp.


  1. To get the data you need credentials provided by the organizers of the shared task. Please contact them at the email addresses listed on the official shared task website to get the credentials for downloading the data.

    Alternatively, you can download the tweets by their IDs, which are published on the official website and do not require credentials. However, the organizers have not published the code they used for replacing username mentions, newlines, URLs, and trigger words, so you might not end up with the same dataset that was used during the shared task.

  2. Once you have your USERNAME and PASSWORD, get the data by running the following command, and typing your password when prompted:

    scripts/ USERNAME

    This script will download the following:

    • train / dev / test splits (~23 MB unzipped) into data/
    • pre-trained ELMo weights (~360 MB) into data/word_embeddings

    If you want to save the data in a different directory, you can do so as long as you modify the paths in the scripts/ and scripts/ preprocessing scripts, and in src/

  3. Run the preprocessing script

  4. (Optional) If you installed Java and want to obtain the POS tags, execute:


To test if you installed everything correctly run python --help. This command should display the options with which you can run the code, or an error if something failed during the installation process.


To train a best-performing model, run:

python --write_mode=BOTH --save_model

This will run for 10 epochs and will save the best checkpoint according to validation accuracy.
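The checkpoint selection described above can be sketched as follows. This is a minimal illustration, not the repository's training loop: train_one_epoch, evaluate, and save_checkpoint are hypothetical callables standing in for the real training, validation, and serialization steps.

```python
def train_with_best_checkpoint(num_epochs, train_one_epoch, evaluate, save_checkpoint):
    """Train for num_epochs and save a checkpoint whenever validation accuracy improves.

    All three callables are hypothetical placeholders that receive the
    1-based epoch number; evaluate must return the validation accuracy.
    """
    best_accuracy = float("-inf")
    best_epoch = None
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)
        accuracy = evaluate(epoch)
        # Only overwrite the stored checkpoint on a strict improvement.
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_epoch = epoch
            save_checkpoint(epoch)
    return best_epoch, best_accuracy
```

With this scheme the file on disk always corresponds to the best epoch seen so far, so interrupting a run early still leaves a usable checkpoint.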

Checkpoints and other output files are saved in data/results/, in a directory named after the hash of the current run. See the Experiment Results Directory Structure section for more details.

The hash depends on the hyperparameters that impact performance and on the current commit hash. For example, changing learning_rate, lstm_hidden_size, or dropout would produce a different hash, whereas changing write_mode, save_model, or similar options would not.
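To illustrate the idea, here is a minimal sketch of how such a run hash could be derived. It assumes SHA-1 over a sorted JSON dump; the NON_HASHED_KEYS set and the run_hash function are hypothetical names, and the repository may compute its hash differently.

```python
import hashlib
import json

# Assumption: options that do not affect model performance are excluded.
NON_HASHED_KEYS = {"write_mode", "save_model"}


def run_hash(hyperparams, commit):
    """Return a 40-character hex digest identifying a run configuration."""
    relevant = {k: v for k, v in hyperparams.items() if k not in NON_HASHED_KEYS}
    # sort_keys makes the digest independent of dictionary ordering.
    payload = json.dumps({"hyperparams": relevant, "commit": commit}, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

Two runs that differ only in excluded options map to the same directory, while any change to a hashed hyperparameter or to the commit yields a new one.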


To test a trained model, run:

python --model_hash=<partial_model_hash> --test

Replace <partial_model_hash> with the hash of the model you wish to test, which corresponds to the name of its directory in data/results/.

A classification report will be printed on screen, and files containing the predicted labels and probabilities will be created in data/results/<hash> (see the Experiment Results Directory Structure section for details).
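Since --model_hash accepts a partial hash, the lookup presumably works by prefix matching against the directory names in data/results/. The sketch below shows one way to do that; match_partial_hash and resolve_model_dir are hypothetical helpers, and the handling of ambiguous prefixes is an assumption rather than the repository's documented behavior.

```python
from pathlib import Path


def match_partial_hash(partial_hash, names):
    """Return the single name that starts with partial_hash, or raise."""
    matches = sorted(name for name in names if name.startswith(partial_hash))
    if not matches:
        raise FileNotFoundError("no run matches hash prefix {!r}".format(partial_hash))
    if len(matches) > 1:
        raise ValueError("hash prefix {!r} is ambiguous: {}".format(partial_hash, matches))
    return matches[0]


def resolve_model_dir(partial_hash, results_dir="data/results"):
    """Map a partial hash to the corresponding run directory on disk."""
    root = Path(results_dir)
    names = [path.name for path in root.iterdir() if path.is_dir()]
    return root / match_partial_hash(partial_hash, names)
```

For example, resolve_model_dir("dc7008") would return the run directory whose name begins with dc7008, provided exactly one such directory exists.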

Experiment Results Directory Structure

After the validation phase of the first epoch you should have the following structure:

├── architecture.txt
├── best_dev_predictions.txt
├── best_dev_probabilities.csv
├── best_model.pth
├── best_model_state_dict.pth
├── events.out.tfevents.1535523327
└── hyperparams.json

  • architecture.txt contains the architecture as represented by PyTorch. For example:

        (char_embeddings): Embedding(1818, 50)
        (word_encoding_layer): WordEncodingLayer(method=elmo)
        (word_dropout): Dropout(p=0.5)
        (sent_encoding_layer): SentenceEncodingLayer(
          (sent_encoding_layer): BLSTMEncoder(
            (enc_lstm): LSTM(1024, 2048, dropout=0.2, bidirectional=True)
          )
        )
        (sent_dropout): Dropout(p=0.2)
        (pooling_layer): PoolingLayer(
          (pooling_layer): MaxPoolingLayer()
        )
        (dense_layer): Sequential(
          (0): Linear(in_features=4096, out_features=512, bias=True)
          (1): ReLU()
          (2): Dropout(p=0.5)
          (3): Linear(in_features=512, out_features=6, bias=True)
        )

  • best_dev_predictions.txt contains 9591 rows with a single column containing the predicted label for the best epoch for the dev (trial) examples. This is what its head looks like:

  • best_dev_probabilities.csv contains 9591 comma-separated rows, with 6 columns giving the probability of the example belonging to each of the 6 emotion classes. This is what its head looks like:


    This is the header of the file (not included in the file itself):

  • best_model.pth: The whole model, serialized by running torch.save(model, PATH).

  • best_model_state_dict.pth: The model weights, serialized by running torch.save(model.state_dict(), PATH).

    For more on PyTorch serialization, see: Serialization Semantics.

  • events.out.tfevents.1535523327: TensorBoard file generated by tensorboardX.

  • hyperparams.json: Hyperparameters with which the model was trained, and some extra information, such as the date the model was run and its hash. For example:

    {
      "epochs": 10,
      "batch_size": 64,
      "optim": "adam",
      "seed": 45,
      "max_lr": 0.001,
      "datetime": "2018-08-29 17:36:09",
      "commit": "76e0af0150fc35f9be6cd993dc35b2dc7a4bb87d",
      "hash": "dc700889fa1bbae360bbd7afa68cd9d02c154d62"
    }
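Because every run stores its settings in hyperparams.json, two runs can be compared by diffing those files. The following is a minimal sketch, assuming the file is a flat JSON object like the example above; load_hyperparams and diff_hyperparams are hypothetical helpers, and treating datetime, commit, and hash as bookkeeping keys is a choice made for this example.

```python
import json
from pathlib import Path

# Assumption: these keys describe the run itself rather than the model settings.
BOOKKEEPING_KEYS = {"datetime", "commit", "hash"}


def load_hyperparams(run_dir):
    """Read hyperparams.json from a run directory such as data/results/<hash>."""
    with open(Path(run_dir) / "hyperparams.json", encoding="utf-8") as handle:
        return json.load(handle)


def diff_hyperparams(first, second, ignored=BOOKKEEPING_KEYS):
    """Return {key: (value_in_first, value_in_second)} for settings that differ."""
    differences = {}
    for key in sorted(set(first) | set(second)):
        if key in ignored:
            continue
        left, right = first.get(key), second.get(key)
        if left != right:
            differences[key] = (left, right)
    return differences
```

A key that is missing from one of the two runs shows up with None on that side.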

After testing a model, two new files will be created in data/results/<hash>:

  • test_predictions.txt: equivalent to best_dev_predictions.txt, but obtained from evaluating the trained model on the test dataset.

  • test_probabilities.csv: equivalent to best_dev_probabilities.csv, but obtained from evaluating the trained model on the test dataset.
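The probability files can be turned back into label predictions by taking the most probable column in each row. The sketch below uses only the standard library; read_probabilities and predicted_labels are hypothetical helpers, and the column order in ASSUMED_LABELS is an assumption that should be checked against the actual header used by the code before relying on the output.

```python
import csv

# Assumption: the real column order may differ from this one.
ASSUMED_LABELS = ["anger", "disgust", "fear", "joy", "sad", "surprise"]


def read_probabilities(lines):
    """Parse header-less, comma-separated probability rows into lists of floats."""
    return [[float(value) for value in row] for row in csv.reader(lines) if row]


def predicted_labels(rows, labels=ASSUMED_LABELS):
    """Return the label with the highest probability for each row."""
    predictions = []
    for row in rows:
        if len(row) != len(labels):
            raise ValueError("expected {} columns, got {}".format(len(labels), len(row)))
        best_index = max(range(len(row)), key=row.__getitem__)
        predictions.append(labels[best_index])
    return predictions
```

Since read_probabilities accepts any iterable of lines, an open file handle for data/results/<hash>/test_probabilities.csv can be passed to it directly.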


