TwiLoc - Location Prediction Using Tweets

TwiLoc investigates the feasibility of geographically locating Twitter users based solely on tweet content. It uses deep learning techniques to learn dialect differences across geographic regions and predicts a user's location from the text of their tweets, without any other external information. The approach is intended to augment existing systems that locate users.

Prerequisites

Requires Python 3.x.

Here is the list of libraries required for this project:

GloVe is used for obtaining vector representations for words.
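As a rough illustration of how GloVe vectors can be consumed, the sketch below parses the plain-text GloVe format (one word per line, followed by its space-separated vector components) and averages the vectors of the words in a tweet. The function names `load_glove` and `average_vector`, the two-dimensional toy vectors, and the averaging step are assumptions made for this example; the notebooks in this project may prepare their inputs differently.

```python
def load_glove(lines):
    """Parse GloVe text format: each line is a word followed by its vector components."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def average_vector(tweet, vectors, dim):
    """Average the vectors of the known words in a tweet (illustrative only)."""
    known = [vectors[t] for t in tweet.lower().split() if t in vectors]
    if not known:
        return [0.0] * dim  # no known words: fall back to a zero vector
    return [sum(column) / len(known) for column in zip(*known)]


# Toy two-dimensional "embeddings" standing in for a real GloVe file.
toy_vectors = load_glove(["y'all 1.0 0.0", "wicked 0.0 1.0"])
```

With a real GloVe file, the same loader could be fed an open file handle, for example `load_glove(open(path, encoding="utf-8"))`, where `path` points to the downloaded vectors.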

Dataset

GeoText - Geo-tagged Microblog Corpus is the primary dataset for TwiLoc. All the results and hyperparameter tunings are based on this dataset.
Accuracy may be improved further by training on larger datasets such as UTGeo2011.

Reverse geocoding can be done using services provided by MapQuest.
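A minimal sketch of the reverse geocoding step follows. The endpoint URL, the `key` and `location` query parameters, and the `results`/`locations`/`adminArea3` response fields are assumptions based on MapQuest's Geocoding API and should be checked against the current MapQuest documentation. The helper names and the API key placeholder are hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

# Assumed MapQuest reverse geocoding endpoint; verify against the MapQuest docs.
MAPQUEST_REVERSE_URL = "https://www.mapquestapi.com/geocoding/v1/reverse"


def build_reverse_url(lat, lon, api_key):
    """Build the request URL for one latitude/longitude pair."""
    query = urlencode({"key": api_key, "location": f"{lat},{lon}"})
    return f"{MAPQUEST_REVERSE_URL}?{query}"


def extract_state(response):
    """Pull the state-level field out of a decoded JSON response, if present.

    The field layout used here is an assumption about the response format.
    """
    try:
        return response["results"][0]["locations"][0].get("adminArea3")
    except (KeyError, IndexError):
        return None


# Hand-written stand-in for a decoded response, not real API output.
sample_response = {"results": [{"locations": [{"adminArea3": "NY"}]}]}
url = build_reverse_url(40.7, -74.0, "YOUR_API_KEY")
```

The URL can then be fetched with any HTTP client, and the decoded JSON passed to `extract_state` to turn coordinates into a region label.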

Pre-trained models

Model	Accuracy (%)
CNN	57.43
GRU	56.35
LSTM	55.54
MLP	50.59

Note: Please read the report (Report.pdf) for more detailed information about the experimental results.

References

Eisenstein J., O'Connor B., Smith N. A., Xing E. P. 2010. A Latent Variable Model for Geographic Lexical Variation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Liu J., Inkpen D. 2015. Estimating User Location in Social Media with Stacked Denoising Autoencoders. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing.
Yin W., Kann K., Yu M., Schütze H. 2017. Comparative Study of CNN and RNN for Natural Language Processing. arXiv:1702.01923.

