Multi-Label Emotion Classification in Urdu

ML models to decode compound emotions conveyed in Urdu text

Welcome to my repo! This is one of my humble attempts at working with common ML models, motivated by a pressing, present-day problem.

Purpose

The anonymity offered by today's social media makes it easy to spread hate speech and threats. These often target individuals and communities, and worsen the experience of other users. With over 230 million Urdu speakers generating massive amounts of content daily, manual moderation falls short. Automated emotion analysis is therefore highly relevant.

Project Description

This project uses ML algorithms to automate emotion analysis in the Urdu language. It takes a piece of Urdu text and identifies the combination of emotions it may convey. Because a single text can carry several emotions at once, this is a multi-label task. Each identified emotion falls under one of Ekman's six basic emotions (anger, disgust, fear, happiness, sadness, surprise) or neutrality.
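To make the multi-label idea concrete, here is a minimal sketch that pairs TF-IDF features with a classifier chain. It is not the code from the notebooks: the use of scikit-learn's `ClassifierChain` with `LogisticRegression`, the order of the label columns, and the eight toy sentences are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

# Assumed label order: Ekman's six basic emotions plus neutral.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

# Made-up toy sentences, NOT taken from the repository's dataset.
texts = [
    "میں بہت خوش ہوں",
    "مجھے بہت غصہ آ رہا ہے",
    "مجھے ڈر لگ رہا ہے",
    "میں اداس ہوں",
    "یہ حیران کن ہے",
    "یہ گھناؤنا ہے",
    "آج پیر ہے",
    "میں خوش اور حیران ہوں",
]

# One row per sentence, one column per emotion; 1 marks a present emotion.
# The last row carries two labels at once, which is what makes this multi-label.
Y = np.array([
    [0, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1, 0],
])

# Turn each sentence into a TF-IDF weighted bag of words.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# A classifier chain trains one binary classifier per label; each classifier
# also sees the labels that come earlier in the chain as extra features.
chain = ClassifierChain(LogisticRegression(), random_state=0)
chain.fit(X, Y)

# Predict a 0/1 vector for a new sentence and map it back to emotion names.
predictions = chain.predict(vectorizer.transform(["میں خوش ہوں"]))
decoded = [name for name, flag in zip(EMOTIONS, predictions[0]) if flag == 1]
print(decoded)
```

With only eight training sentences the predicted labels are not meaningful. The point is the shape of the problem: every text maps to a vector of seven independent 0/1 flags rather than to a single class.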

Repo structure

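There are 5 Jupyter notebooks (written to run on Google Colaboratory), one per model combination, each containing the code to train and test that combination: CLASSIFIERCHAINS_TF-IDF.ipynb, LSTM_FastText.ipynb, MLKNN_CountVectorizar.ipynb, MLKNN_TF-IDF.ipynb and SimpleRNN_Word2Vec.ipynb. I've also uploaded the training and testing data I used during development (training_data.csv and testing_data.csv).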

Training Data
- Has 7800 tweets in the Urdu language
- Contains 8 columns of data. Each Urdu text is accompanied by corresponding emotion-labels (1's signify the presence of a particular emotion)

Testing Data
- Has 1950 Urdu sentences for testing
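
As a rough illustration of the layout described above, the sketch below reads rows of this shape and turns the 0/1 flags into a list of emotion names. The column names (`text` plus seven emotion columns) are hypothetical, the reading of "8 columns" as one text column plus seven label columns is an assumption, and the two sample rows are made up. Check the header of training_data.csv for the real names.

```python
import csv
import io

# Hypothetical column names; the real header of training_data.csv may differ.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

# Two made-up rows standing in for the real file.
SAMPLE_CSV = """text,anger,disgust,fear,happiness,sadness,surprise,neutral
میں خوش اور حیران ہوں,0,0,0,1,0,1,0
آج پیر ہے,0,0,0,0,0,0,1
"""


def read_labelled_rows(handle):
    """Return (text, [emotion names]) pairs from an open CSV handle."""
    rows = []
    for record in csv.DictReader(handle):
        # A "1" in an emotion column means that emotion is present.
        emotions = [name for name in EMOTIONS if record[name] == "1"]
        rows.append((record["text"], emotions))
    return rows


rows = read_labelled_rows(io.StringIO(SAMPLE_CSV))
for text, emotions in rows:
    print(text, "->", emotions)
```

To read the actual file instead of the sample string, pass `open("training_data.csv", encoding="utf-8", newline="")` to `read_labelled_rows`, after adjusting `EMOTIONS` and the `text` key to match the real header.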

Setup Instructions

1. Go to Google Colab and create a new notebook.
2. Clone the repository. In a new code cell, run: !git clone https://github.com/dejah22/Multi-Label-Emotion-Classification-in-Urdu.git
3. Change to the directory of the cloned repository with %cd (a plain !cd does not persist beyond its own line in Colab) and open the desired .ipynb file.

Tips
1. Install any missing dependencies or required libraries using !pip install followed by the package name
2. Save your changes back to GitHub

Project Recognition and Acknowledgments :)

I would first like to thank Avanthika K and Dr. Bharathi B for working on this project with me. Kudos guys!

Upon completion, we submitted our work to Task A - EmoThreat: Emotions and Threat Detection in Urdu, FIRE 2022. I sincerely thank the task organisers for letting us use their dataset, as well as for supporting our work. The working notes of this project have also been published as a paper at the FIRE 2022 conference.
