Naive Bayes for Multilingual Text Classification

Overview

This project implements a Naive Bayes classifier for text classification across multiple languages. It explores how Naive Bayes, a probabilistic machine learning algorithm, can be applied to multilingual text data in natural language processing (NLP) tasks.

Features

- Naive Bayes Implementation: Custom implementation of the Naive Bayes algorithm for text classification.
- Multilingual Support: Handles text data from multiple languages.
- Model Evaluation: Evaluates model performance using accuracy, precision, recall, and F1-score.
- Dataset Support: Supports various datasets for multilingual text classification.
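
To make the first two features concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not the code shipped in this repository: the class name `NaiveBayesClassifier`, the helper `char_ngrams`, and the choice of character trigrams as features are all assumptions made for illustration. Character n-grams are used here because they need no language-specific tokenizer.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (hypothetical helper)."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesClassifier:
    """Illustrative multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        # Total number of n-gram occurrences seen in each class.
        self.totals = {c: sum(self.feature_counts[c].values())
                       for c in self.class_counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # n-grams never seen in training are ignored for every class.
        grams = [g for g in char_ngrams(text) if g in self.vocab]
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / n_docs)
            denom = self.totals[label] + self.alpha * vocab_size
            for gram in grams:
                score += math.log(
                    (self.feature_counts[label][gram] + self.alpha) / denom
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_texts = [
    "the cat sat on the mat",
    "this is a good day",
    "el gato está en la casa",
    "hoy es un buen día",
]
train_labels = ["en", "en", "es", "es"]
model = NaiveBayesClassifier().fit(train_texts, train_labels)
print(model.predict("the dog is on the mat"))
print(model.predict("el gato es bueno"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and smoothing keeps an n-gram that is absent from one class from forcing that class's probability to zero.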

Usage

Prepare your dataset:
- Ensure that your dataset is in a suitable format (e.g., CSV or JSON) and contains labeled text data from multiple languages.
Train the model:
- Run the training script to train the Naive Bayes classifier on your dataset.
```
python train.py --dataset path_to_your_dataset
```
Evaluate the model:
- Evaluate the trained model on a test set and view the results.
```
python evaluate.py --dataset path_to_your_test_dataset
```
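
The metrics reported at the evaluation step can be sketched as follows. This is an assumption about how such scores might be computed, not a description of what `evaluate.py` does: the function name `evaluate` is hypothetical, and macro averaging (an unweighted mean over classes) is one of several reasonable choices.

```python
def evaluate(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1 (illustrative)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Define each score as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


scores = evaluate(["en", "en", "es", "es"], ["en", "es", "es", "es"])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Macro averaging gives every language or class equal weight, which can matter when some languages have far fewer examples than others.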

Datasets

Some example datasets you can use:

- Multilingual Sentiment Analysis Dataset
- Language Identification Dataset
- Spam/Ham Classification Dataset (in multiple languages)
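
A labeled dataset of this kind, stored as CSV or JSON, could be loaded with a small helper like the one below. The function name `load_dataset` and the field names `text` and `label` are assumptions; the README does not specify a schema, so adjust them to match your files. The JSON branch assumes the file holds a list of objects.

```python
import csv
import json
from pathlib import Path


def load_dataset(path, text_field="text", label_field="label"):
    """Read (texts, labels) from a CSV or JSON file (hypothetical schema)."""
    path = Path(path)
    if path.suffix.lower() == ".json":
        # Assumes a JSON array of objects, one per example.
        with path.open(encoding="utf-8") as f:
            rows = json.load(f)
    else:
        # Assumes a CSV file with a header row.
        with path.open(encoding="utf-8", newline="") as f:
            rows = list(csv.DictReader(f))
    texts = [row[text_field] for row in rows]
    labels = [row[label_field] for row in rows]
    return texts, labels
```

Reading files as UTF-8 explicitly avoids platform-dependent default encodings, which matters for text in non-Latin scripts.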

koushik16/Naive-Bayes-on-Multi-Language-Text
