This repository has been archived by the owner on Feb 26, 2024. It is now read-only.

mendrika261/S4-ANALYSE-email-spam-spelling

Overview 🔮

An email system with spam detection, spell checking, and AES encryption.

Demo recording: Screen.Recording.2024-02-26.at.22.59.07.online-video-cutter.com.mp4

Functionality 🛠️

  • Automatic spam detection
  • Automatic model refresh after (10) new emails, or manual refresh after marking an email as spam
  • Spell checking while writing an email
  • AES encryption
  • Send, receive, delete, mark as spam
  • Status: received ... (read, unread)

How it works 🧠

  • The spam detection is trained only on data generated by prompting the OpenAI API (GPT-3.5 Turbo). The pipeline:
    • Convert the email to UTF-8
    • Parse any HTML with BeautifulSoup
    • Remove links, mentions, hashtags, email addresses, numbers, punctuation, and stopwords
    • Stem the words with PorterStemmer
    • Build a Term Frequency-Inverse Document Frequency (TF-IDF) matrix
    • Train the candidate models (limited to the first 500 features) and keep LinearSVC, which performed best
    • Save the model to a file with joblib
  • The spell checker uses the Levenshtein distance to find the most similar word in a (lite) French dictionary
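
The text-cleaning steps of the spam pipeline can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the function name `clean_email`, the regular expressions, and the tiny `STOPWORDS` set are all assumptions made for the example (the project itself relies on BeautifulSoup, NLTK stopwords, and PorterStemmer). HTML parsing and stemming are left out here.

```python
import re
import string

# Hypothetical, tiny stopword list for illustration only;
# the project downloads the NLTK stopwords corpus instead.
STOPWORDS = {"the", "a", "an", "to", "and", "or", "at", "is", "of", "for", "you", "your"}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")   # links
EMAIL_RE = re.compile(r"\S+@\S+")                # email addresses
TAG_RE = re.compile(r"[@#]\w+")                  # mentions and hashtags
NUMBER_RE = re.compile(r"\d+")                   # numbers
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_email(raw: bytes) -> list[str]:
    """Return the lowercase tokens of an email after removing noise."""
    # Decode as UTF-8, dropping bytes that cannot be decoded.
    text = raw.decode("utf-8", errors="ignore").lower()
    text = URL_RE.sub(" ", text)
    # Remove email addresses before mentions, otherwise "@domain" would
    # be stripped first and leave the local part behind.
    text = EMAIL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = NUMBER_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return [token for token in text.split() if token not in STOPWORDS]


sample = b"WIN 1000 dollars now!!! Visit http://spam.example.com or mail me@spam.example.com #free @you"
print(clean_email(sample))
# ['win', 'dollars', 'now', 'visit', 'mail']
```

In the pipeline described above, the resulting tokens would then be stemmed and turned into the TF-IDF matrix used to train the classifier.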
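
To illustrate the spell-checking idea, here is a small sketch of a Levenshtein-based suggestion. It assumes a plain list of words as the dictionary; the names `levenshtein`, `suggest`, and `FRENCH_WORDS` are hypothetical, and the project's own implementation and word list may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # Keep the shorter string in the inner loop so rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word: str, dictionary: list[str]) -> str:
    """Return the dictionary entry closest to word (first one wins on ties)."""
    return min(dictionary, key=lambda candidate: levenshtein(word, candidate))


# Hypothetical mini dictionary standing in for the French (lite) word list.
FRENCH_WORDS = ["bonjour", "merci", "message", "courrier"]

print(levenshtein("kitten", "sitting"))  # 3
print(suggest("bonjur", FRENCH_WORDS))   # bonjour
```

Scanning the whole dictionary for every word is the simplest approach; with a large word list, limiting the candidates (for example by word length) would likely be needed to keep checks fast while typing.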

How to use ℹ️

  • Create a virtual environment
python3 -m venv venv
  • Activate the virtual environment
source venv/bin/activate
  • Install the requirements
pip install -r requirements.txt
  • Download the necessary data
python3 -m nltk.downloader stopwords
python3 -m nltk.downloader punkt
python3 -m nltk.downloader wordnet
python3 -m nltk.downloader omw
  • Run the program
python3 manage.py runserver