Commit

naive bayes blog v1
nveshaan committed Oct 19, 2024
1 parent fd754b5 commit fa791f9
Showing 13 changed files with 368 additions and 17 deletions.
71 changes: 70 additions & 1 deletion content/blog/naive-bayes/index.md
@@ -7,4 +7,73 @@ tags: ["machine learning"]

## Overview

Bayes' Theorem is a fundamental concept in mathematics. But, did you ever know how it is used in real-life?
Bayes' Theorem is a fundamental concept in Probability Theory. It is widely used in fields such as statistics, machine learning, and data science, especially in the context of probabilistic inference and decision-making. In this post, I will explain how Bayes' Theorem is used in Machine Learning by considering a simple example. But before that, let's understand some terminology.

## Terminology

### Bayes' Theorem

Bayes' Theorem describes how to update the probability of a hypothesis based on new evidence. It provides a way to calculate the **posterior probability** of an event by combining the **prior probability** with the **likelihood**.

Mathematically, Bayes' Theorem is defined as:

{{< katex >}}

$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$
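To make the formula concrete, here is a minimal Python sketch. The function name and the example numbers are illustrative assumptions, not taken from any real test:

```python
def posterior(prior_a, likelihood_b_given_a, prob_b):
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / prob_b

# Hypothetical example: a condition with 1% prevalence, a test that is
# 90% sensitive, and a positive result observed 10.8% of the time overall.
print(posterior(prior_a=0.01, likelihood_b_given_a=0.9, prob_b=0.108))
```

Note how a positive result raises the probability from the 1% prior to roughly 8%: the evidence term \\( P(B) \\) in the denominator accounts for false positives as well.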

### Likelihood vs. Probability

Probability measures how likely a particular event is to occur, given a certain model or assumption. In Bayes' Theorem, \\( P(B) \\) represents the total probability of observing event \\(B\\).

Likelihood measures how likely the observed data is given a particular hypothesis or model. It's similar to probability with a subtle difference. Probability is about predicting outcomes, while likelihood is about fitting a model to data. In Bayes' Theorem, \\( P(B|A)\\) is the likelihood of observing data \\(B\\) given the hypothesis \\(A\\).

### Prior Probability

The prior probability represents your initial belief about the probability of a hypothesis before you have any new data or evidence. In Bayes' Theorem, \\( P(A) \\) is the prior probability of the hypothesis \\(A\\).

### Posterior Probability

The posterior probability is the updated probability of a hypothesis after observing new evidence. It combines the prior belief with the likelihood of the observed data. In Bayes' Theorem, \\( P(A|B) \\) is the posterior probability of the hypothesis \\(A\\) given the evidence \\(B\\).

## Implementation

One of the use cases of the Naive Bayes classifier is filtering spam mails out of your inbox. Let's see how it is implemented.

For a classifier, we need to define three parameters:
- Prior probability
- Likelihood
- Posterior probability

We do this by analysing the data we have. The dataset contains the words that appear in each mail along with their counts, and the number of mails labelled as spam.

Let's say we receive 100 mails: 20 spam mails and 80 normal mails.

Now, the prior probability is the probability of a mail being spam. It is calculated by dividing the number of spam mails by the total number of mails. So,

{{< katex >}}

$$P(spam) = \frac{20}{100} = 0.2$$

Let's assume that the normal mails we received contain words like "dear", "friend", and "hi", while the spam mails contain words like "money", "free", and "sign".

The likelihood is the probability of a word appearing in a mail of a given class. It is estimated from the data, for example by dividing the number of times a word appears in spam mails by the total number of words in spam mails. For the sake of simplicity, let's assume that the likelihood of the word "friend" is low and the likelihood of the word "money" is high.

{{< katex >}}

$$P("friend" | spam) = 0.02$$
$$P("money" | spam) = 0.8$$
$$P("hi" | spam) = ...$$

And so on for all the words.

The posterior probability is the probability of a mail being spam given the words it contains. Under the "naive" assumption that words occur independently, it is proportional to the prior probability multiplied by the likelihood of each word in the mail.

When we receive a new mail, we count the frequency of each word in the mail.

For example, if we receive a mail with the words "spam", "money", and "money", the posterior probability of the mail being spam is:

{{< katex >}}

$$P(spam | "spam", "money", "money") \propto P(spam) \cdot P("spam" | spam) \cdot P("money" | spam)^2$$
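Putting the pieces together, here is a minimal sketch of the filter described above. Apart from P("friend" | spam) and P("money" | spam), which come from the example, every likelihood value below is a made-up assumption:

```python
# Per-word likelihoods for each class. P("friend"|spam)=0.02 and
# P("money"|spam)=0.8 are from the example; the rest are illustrative.
likelihood_spam = {"friend": 0.02, "money": 0.8, "spam": 0.5, "hi": 0.05}
likelihood_normal = {"friend": 0.4, "money": 0.05, "spam": 0.01, "hi": 0.3}
prior_spam, prior_normal = 0.2, 0.8  # from 20 spam / 80 normal mails

def score(words, likelihood, prior):
    """Unnormalized posterior: prior times the product of per-word
    likelihoods (the naive independence assumption)."""
    p = prior
    for w in words:
        p *= likelihood[w]
    return p

mail = ["spam", "money", "money"]
s = score(mail, likelihood_spam, prior_spam)
n = score(mail, likelihood_normal, prior_normal)
print("spam" if s > n else "normal")
# Normalizing recovers the posterior: P(spam | mail) = s / (s + n)
```

Since both scores share the same denominator \\( P(B) \\), comparing the unnormalized scores is enough to classify the mail.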

## Conclusion
5 changes: 4 additions & 1 deletion content/projects/muscle-wave-classifier/index.md
@@ -19,4 +19,7 @@ The project is divided into three main parts:

## Data Acquisition

The dataset for this project has been collected from 13 subjects, using an EMG sensor from UpsideDownLabs.
The dataset is recorded from 13 subjects. Each subject has 1000 samples, totalling 13,000. The labels are the states of their dominant fist (open or closed). Both labels have an equal number of examples. Below is the distribution of the dataset.

![data-distribution](./data_distribution.png)

2 changes: 1 addition & 1 deletion public/blog/index.html
@@ -570,7 +570,7 @@ <h1 class="mt-5 text-4xl font-extrabold text-neutral-900 dark:text-neutral">Blog
<div class="flex flex-row flex-wrap items-center">


<time datetime="2024-10-12 00:00:00 &#43;0000 UTC">12 October 2024</time><span class="px-2 text-primary-500">&middot;</span><span>21 words</span><span class="px-2 text-primary-500">&middot;</span><span title="Reading time">1 min</span>
<time datetime="2024-10-12 00:00:00 &#43;0000 UTC">12 October 2024</time><span class="px-2 text-primary-500">&middot;</span><span>581 words</span><span class="px-2 text-primary-500">&middot;</span><span title="Reading time">3 mins</span>



2 changes: 1 addition & 1 deletion public/blog/index.xml
@@ -14,7 +14,7 @@
<pubDate>Sat, 12 Oct 2024 00:00:00 +0000</pubDate>

<guid>http://localhost:1313/blog/naive-bayes/</guid>
<description>Overview #Bayes&amp;rsquo; Theorem is a fundamental concept in mathematics.</description>
<description>Overview #Bayes&amp;rsquo; Theorem is a fundamental concept in Probability Theory.</description>

</item>

