Classification with Mistral 7B vs CamemBERT

from [https://github.com/BirdiD/TextClassifier]

Classification with Mistral 7B vs CamemBERT

In this repository, we will perform two classification tasks and compare them.

The first classification will use Mistral 7B Instruct large language model and in the second one, we will finetune Camembert base model on sequence classification. We have 90 french sentences that belongs to one of the following classes:

Intention de recherche d'informations
Intention d'action
Intention familière

Getting started

Create a venv Create a python virtual environment and install the required dependancies

python -m venv myenv

Activate the virtual environment

source myenv/bin/activate

Mistral 7B

Go to the Mistral folder and run the following commands

pip install -r requirements.txt

After installation, add your sentence to run.sh file and run the script sh run.sh.

You can also directly run the inference in terminal with code below

python classifier.py --model_name='mistralai/Mistral-7B-Instruct-v0.2' \
                     --sentence="Ferme automatiquement les portes à l'heure prévue."

Running the above script will print the predicted category along with some explanation why the category has been chosen in a json format

CamemBERT

Here we finetuned a camembert bae model on a small dataset (90 records).

To train the model, run the following script. Make sure you modifiy the values for your use case:

python train.py --model_name_or_path="camembert/camembert-base" \
                --data_folder_path="Data/Classeur1_catg_phrases.xlsx" \
                --output_dir="output" \
                --hub_model_id="DioulaD/classificateur-intention_camembert" \
                --max_steps="100" \
                --logging_steps="10" \
                --save_steps="20"

You can also directly run sh run_training.sh in your terminal.

Once training, we can run inference as follows:

from transformers import pipeline

pipe = pipeline("text-classification", model="DioulaD/classificateur-intention_camembert")
pipe("Ouvre la porte et fais vite stp")

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
CamemBert		CamemBert
Mistral		Mistral
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Classification with Mistral 7B vs CamemBERT

Getting started

About

Releases

Packages

Contributors 2

Languages

SemouleSombre/TextClassifier

Folders and files

Latest commit

History

Repository files navigation

Classification with Mistral 7B vs CamemBERT

Getting started

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages