A Recommender Engine for article headlines using word embeddings.

The project develops an application that suggests to readers articles similar to those they have already read. It applies embedding algorithms to headlines to build a numerical representation of each one, which makes it possible to compute the similarity between headlines and retrieve the most similar ones.

For simplicity, we restricted the data to headlines from 2018.

Steps of the project

We built the function "general_process", saved in the preprocessing.py file, to prepare the text data. Its output is the processed_data CSV file, which contains the headlines after preprocessing.
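As a rough idea of what such a preprocessing step can look like, here is a minimal sketch. It is not the code of "general_process": the function name clean_headline, the stop-word list, and the cleaning rules (lowercasing, keeping letters only, dropping stop words) are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch; the project's real preprocessing may use a different list).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is"}


def clean_headline(text):
    """Lowercase a headline, keep letters only, and drop stop words."""
    text = text.lower()
    # Replace every character that is not a lowercase letter or whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


headlines = ["Stocks Rally on Strong Earnings!", "The Best Movies of 2018"]
processed = [clean_headline(h) for h in headlines]
print(processed)
```

In a real pipeline the cleaned headlines would then be written to a CSV file, as the step above describes.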

Three algorithms are used to build a numerical representation of each headline:

NMF and LDA factorization: We create a sparse matrix in which each row represents a headline and each column represents a word of the entire vocabulary. Each algorithm then reduces this matrix to a compact vector for each headline.
word2vec: A deep learning approach that represents a headline by the average of the word2vec vectors of the words composing it.

These algorithms are used by the function "recommender_engine", developed in the recommender.py file.
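The averaged-embedding idea can be sketched as follows. This is an illustration only, not the code of "recommender_engine": the toy 3-dimensional vectors are invented (the project relies on a pre-trained Word2Vec model), and the helper names headline_vector, cosine, and most_similar are hypothetical.

```python
import math

# Toy embeddings invented for this sketch; a real run would look the
# words up in a pre-trained Word2Vec model instead.
TOY_VECTORS = {
    "stocks": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "rally": [0.7, 0.3, 0.0],
    "movie": [0.0, 0.9, 0.2],
    "film": [0.1, 0.8, 0.3],
    "awards": [0.1, 0.7, 0.4],
}


def headline_vector(headline, vectors):
    """Average the vectors of the words of a headline that are known."""
    known = [vectors[w] for w in headline.split() if w in vectors]
    if not known:
        return None  # no known word, so no representation
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, candidates, vectors, top_n=2):
    """Return the top_n candidate headlines closest to the query."""
    q = headline_vector(query, vectors)
    if q is None:
        return []
    scored = []
    for cand in candidates:
        c = headline_vector(cand, vectors)
        if c is not None:
            scored.append((cosine(q, c), cand))
    scored.sort(reverse=True)  # highest similarity first
    return [cand for _, cand in scored[:top_n]]


candidates = ["market rally", "film awards", "movie awards"]
print(most_similar("stocks rally", candidates, TOY_VECTORS, top_n=1))
```

The same ranking step applies to NMF or LDA representations: only the way each headline vector is produced changes.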

To execute the app

Clone the repository from the command line using the link: https://github.com/akhsassoualid/Headline_Recommender.

git clone https://github.com/akhsassoualid/Headline_Recommender.git

Install the requirements:

pip install -r requirements.txt

Run the application saved in the app.py file:

streamlit run app.py

Illustrate the application

A simple illustration of the App :

Deployment on Docker

To build the app image, execute in the command line:

docker build -t app .

To run the container:

docker run -p 8501:8501 app

Special Thanks:

To the Google team of researchers for the trained Word2Vec model.
To the Streamlit team for their open-source Python library for building applications.
To vikashrajluhaniwal for his tutorial on recommendation systems.
To my friends Rachid and Salih for their help.
