Commit

build: Use python script to create and push dataset to huggingface
getwithashish authored Aug 1, 2024
1 parent 26a2613 commit f8c1047
Showing 1 changed file with 4 additions and 10 deletions.
14 changes: 4 additions & 10 deletions .github/workflows/push-dataset.yml
@@ -23,20 +23,14 @@ jobs:
           sudo apt-get install python3-pandas
           pip install pandas
           pip install pyarrow
+          pip install datasets
+          pip install huggingface_hub
+          pip install python-decouple
-      - name: Convert json to parquet
+      - name: Convert json to parquet and save it in huggingface
         run: |
           python3 parquet_dataset_generator.py
-      - name: Install dependencies to push to huggingface
-        run: |
-          pip install datasets huggingface-hub
-      - name: Push to huggingface
-        run: |
-          datasets-cli login --token ${{ secrets.HF_TOKEN }}
-          datasets-cli upload ./internal_dataset.parquet getwithashish/internal-dept-dataset
   # finetune:
   # runs-on: ubuntu-latest
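
The commit moves the upload out of the workflow and into parquet_dataset_generator.py, whose contents are not part of this diff. The following is only a minimal sketch of what such a script might do, not the author's implementation. Assumptions: the input file name internal_dataset.json and all function names are hypothetical, the input is a top-level JSON array of flat records, and the token is read from an HF_TOKEN environment variable via os.environ (the workflow installs python-decouple, so the real script may read it differently). The output file name and repository id are taken from the removed datasets-cli lines.

```python
import json
import os
from pathlib import Path

# Hypothetical input name; the diff does not show what the real script reads.
JSON_PATH = "internal_dataset.json"
# These two values appear in the removed datasets-cli upload step.
PARQUET_PATH = "internal_dataset.parquet"
REPO_ID = "getwithashish/internal-dept-dataset"


def load_records(json_path):
    """Read a JSON file that is assumed to hold a top-level array of records."""
    with open(json_path, encoding="utf-8") as fh:
        records = json.load(fh)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of records")
    return records


def convert_json_to_parquet(json_path, parquet_path):
    """Write the records to a parquet file (requires pandas and pyarrow)."""
    import pandas as pd  # imported lazily so the helpers above stay dependency-free

    frame = pd.DataFrame(load_records(json_path))
    frame.to_parquet(parquet_path, index=False)
    return parquet_path


def push_to_hub(parquet_path, repo_id, token):
    """Upload the parquet file to a Hugging Face dataset repository."""
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    api.upload_file(
        path_or_fileobj=str(parquet_path),
        path_in_repo=Path(parquet_path).name,
        repo_id=repo_id,
        repo_type="dataset",
    )


def main():
    # Assumes the workflow exposes the secret as an HF_TOKEN environment variable.
    token = os.environ["HF_TOKEN"]
    convert_json_to_parquet(JSON_PATH, PARQUET_PATH)
    push_to_hub(PARQUET_PATH, REPO_ID, token)
```

Calling main() from the workflow step would then cover both the conversion and the upload in a single run, which is why the separate login and upload steps can be dropped. Reading the token from the environment instead of passing it on a command line also keeps it out of the process arguments.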
