Filtering Spam E-mails using a Naive Bayes Classifier

Introduction

Naive Bayes classifiers work by correlating the use of tokens (typically words, but sometimes other features) with spam and non-spam e-mails, and then using Bayes' theorem to calculate the probability that an e-mail is spam.
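The token-correlation step can be sketched as counting how often each word appears in each class. The snippet below is a minimal Python illustration written independently of the repository's MATLAB code. The helper names `tokenize` and `count_tokens` and the toy e-mails are hypothetical and are not taken from the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_tokens(emails):
    """Count how often each token occurs across a list of e-mail bodies."""
    counts = Counter()
    for body in emails:
        counts.update(tokenize(body))
    return counts


# Hypothetical toy data, not taken from the project.
spam_emails = ["Win free money now", "Free prize, claim now"]
ham_emails = ["Meeting moved to Monday", "Are you free for lunch on Monday"]

spam_counts = count_tokens(spam_emails)
ham_counts = count_tokens(ham_emails)

print(spam_counts["free"], ham_counts["free"])
```

A word such as "free" that occurs more often in the spam examples than in the non-spam examples pushes an e-mail containing it toward the spam class.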

Math

Bayes' theorem states:

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$
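
As a worked illustration with made-up numbers (not taken from the project's data), suppose 40% of e-mails are spam, and the word "free" appears in 50% of spam e-mails and 5% of non-spam e-mails. Taking A as "the e-mail is spam" and B as "the e-mail contains the word free", and expanding P(B) over both classes, the theorem gives:

$$P(\text{spam} \mid \text{free}) = \frac{0.4 \times 0.5}{0.4 \times 0.5 + 0.6 \times 0.05} = \frac{0.20}{0.23} \approx 0.87$$

Under these assumed figures, seeing the single word "free" raises the probability of spam from 0.40 to about 0.87.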

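The naive Bayes approach assumes that the words in an e-mail are conditionally independent given its class. The posterior for each class is therefore proportional to the class prior multiplied by the likelihood of each word. For spam:

$$P(\text{spam} \mid \text{word}_1, \text{word}_2, \ldots) \propto P(\text{spam})\,P(\text{word}_1 \mid \text{spam})\,P(\text{word}_2 \mid \text{spam}) \cdots$$

and for non-spam (ham):

$$P(\text{ham} \mid \text{word}_1, \text{word}_2, \ldots) \propto P(\text{ham})\,P(\text{word}_1 \mid \text{ham})\,P(\text{word}_2 \mid \text{ham}) \cdots$$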

The e-mail is assigned to whichever class, spam or ham, yields the larger value. Choosing a suitable prior and applying effective preprocessing and lemmatization help improve the accuracy of naive Bayes spam filtering.
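
The comparison between the spam and ham scores can be sketched as follows. This is a minimal Python illustration, not the repository's MATLAB implementation. It makes two assumptions that the text above does not state: likelihoods use add-one (Laplace) smoothing so that unseen words do not zero out the product, and the products are computed as sums of logarithms to avoid numerical underflow. The function names and the toy training data are hypothetical.

```python
import math
from collections import Counter


def train(docs_by_class):
    """Build per-class statistics from a mapping of label -> list of token lists."""
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    vocab = {tok for docs in docs_by_class.values() for doc in docs for tok in doc}
    model = {}
    for label, docs in docs_by_class.items():
        counts = Counter(tok for doc in docs for tok in doc)
        model[label] = {
            "log_prior": math.log(len(docs) / total_docs),  # log P(class)
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model, vocab


def log_score(tokens, params, vocab):
    """Return log P(class) + sum of log P(word | class) for the given tokens."""
    score = params["log_prior"]
    for tok in tokens:
        if tok not in vocab:
            continue  # assumption: words never seen in training are ignored
        # Assumption: add-one (Laplace) smoothing of the word likelihood.
        likelihood = (params["counts"][tok] + 1) / (params["total"] + len(vocab))
        score += math.log(likelihood)
    return score


def classify(tokens, model, vocab):
    """Pick the class with the larger (log) score."""
    return max(model, key=lambda label: log_score(tokens, model[label], vocab))


# Hypothetical toy training data, not taken from the project.
training = {
    "spam": [["win", "free", "money", "now"], ["free", "prize", "claim", "now"]],
    "ham": [["meeting", "moved", "to", "monday"],
            ["are", "you", "free", "for", "lunch", "on", "monday"]],
}
model, vocab = train(training)

print(classify(["free", "money"], model, vocab))
print(classify(["lunch", "on", "monday"], model, vocab))
```

Because the logarithm is monotonic, comparing the log scores selects the same class as comparing the original products.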

Contributors

Akshit Sudheer Kumar (Repository Maintainer)
Aby Stalin
Akhbar Sha
Shrish Nandakumar

Deployment Details

This project uses MATLAB as the development platform.

The repository contains the following files:

README.md
bayesClassifier.m
count1.m
getLikelihood.m
nonSpamConfirm.m
process2.m
processing.m
spamConfirm.m
testing.m
