This repository contains the RAG (retrieval-augmented generation) approach used in the Dacon Hansol Deco Challenge 2024. See https://dacon.io/competitions/official/236216/overview/description for more information about the competition.
conda create -n hansolrag-env python=3.9
conda activate hansolrag-env
pip install git+https://github.com/bibekyess/dacon-hansol-deco-challenge.git
git clone https://github.com/bibekyess/dacon-hansol-deco-challenge.git
cd dacon-hansol-deco-challenge
hansolrag --text "면진장치가 뭐야?"
hansolrag --file hansolrag/data/mini_test.csv --output-file hansolrag/deliverable/mini_test_result.json
hansolrag --file hansolrag/data/test.csv --output-file hansolrag/deliverable/test_result.json --submission-file hansolrag/deliverable/test_result.csv
Review hansolrag/config/config.yaml and adjust the settings to your preference.
The config lists three generation-model modes:
skt/kogpt2-base-v2 is intended for quick debugging and testing. Its results are not satisfactory.
For the best results on CPU, download gemma-2b-it-GGUF from https://huggingface.co/google/gemma-2b-it-GGUF and place it at hansolrag/model_checkpoints/gemma/gemma-2b-it.gguf.
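Before running the CPU mode, it can help to confirm that the downloaded file sits where the instructions above say it should. The following is a minimal sketch using only the Python standard library. The helper name `find_checkpoint` and the constant `EXPECTED_CHECKPOINT` are hypothetical and not part of the hansolrag package, and the script assumes it is run from the repository root.

```python
from pathlib import Path

# Path taken from the instructions above, relative to the repository root.
EXPECTED_CHECKPOINT = Path("hansolrag/model_checkpoints/gemma/gemma-2b-it.gguf")


def find_checkpoint(repo_root="."):
    """Return the resolved checkpoint path if the file exists, else None.

    Hypothetical helper for illustration; hansolrag itself may locate
    the checkpoint differently.
    """
    candidate = Path(repo_root) / EXPECTED_CHECKPOINT
    return candidate.resolve() if candidate.is_file() else None


if __name__ == "__main__":
    found = find_checkpoint()
    if found is None:
        print(f"Missing checkpoint: expected {EXPECTED_CHECKPOINT}")
    else:
        size_mb = found.stat().st_size / 1e6
        print(f"Found checkpoint at {found} ({size_mb:.1f} MB)")
```

If the script reports a missing checkpoint, check that the file name and the directory names match the path above exactly.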
For the best results on GPU, we use OrionStarAI/Orion-14B-Chat-Int4. Uncomment the corresponding section of the config and run.
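The choice among the three modes can be summarized as a small decision function. This is only a sketch of the guidance above: the model names come from this section, while `suggest_generation_model` and its parameters are hypothetical and do not exist in hansolrag. The actual switch is still made by editing hansolrag/config/config.yaml.

```python
# Model names as listed in this section; everything else is illustrative.
DEBUG_MODEL = "skt/kogpt2-base-v2"
CPU_MODEL = "gemma-2b-it-GGUF"
GPU_MODEL = "OrionStarAI/Orion-14B-Chat-Int4"


def suggest_generation_model(has_gpu, quick_debug=False):
    """Return the model this section recommends for the given situation.

    has_gpu:     True if a GPU is available for inference.
    quick_debug: True if only a fast smoke test is needed and answer
                 quality does not matter.
    """
    if quick_debug:
        return DEBUG_MODEL
    return GPU_MODEL if has_gpu else CPU_MODEL


if __name__ == "__main__":
    print(suggest_generation_model(has_gpu=False))
```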