Merge branch 'develop' into feature/prepare-bulk-annotation
* develop: (21 commits)
  ✨ Fix error handling in axios plugin for 401 (#4362)
  docs: Change `telemetry` section in tutorials to directly executable cells (#4399)
  docs: add faq files (#4363)
  fix: pinning `pytest-asyncio` to version `0.21.1` to avoid problems running unit tests on GitHub workflows (#4395)
  docs: add making most of markdown to tutorial page (#4376)
  Fixing typo in Fine Tuning LLMs Practical Guides (#4392)
  Token Classification epochs parameter trainer changed (#4393)
  docs: align practical guidescreate datasethtml with end2end examples structure (#4375)
  docs: hugging face space url (#4379)
  docs: extend using proxy section (#4368)
  chore: update dev version
  chore: update CHANGELOG.md before release v1.20.0 (#4357)
  docs: temporal update to indicate persistent storage (#4355)
  docs: add suggestions and responses filters and sorting (#4345)
  docs: add end2end example on creating a basic text-classification dataset (#4208)
  Fix/responses suggestions filter fine tune (#4356)
  Fix/responses suggestions filter fine tune (#4356)
  fix: Accept draft responses on dataset records creation (#4354)
  Feature/responses operator (#4352)
  Feature/responses operator (#4352)
  ...
leire committed Dec 12, 2023
2 parents 4caf187 + 2729c76 commit ef50946
Showing 148 changed files with 9,900 additions and 1,587 deletions.
10 changes: 10 additions & 0 deletions .github/workflows/check-repo-files.yml
@@ -6,6 +6,9 @@ on:
      pythonChanges:
        description: "True if some files in python code have changed"
        value: ${{ jobs.check-repo-files.outputs.pythonChanges }}
      end2endChanges:
        description: "True if some files affecting the end2end examples have changed"
        value: ${{ jobs.check-repo-files.outputs.end2endChanges }}
      buildChanges:
        description: "True if some files affecting the build have changed"
        value: ${{ jobs.check-repo-files.outputs.buildChanges }}
@@ -17,6 +20,7 @@ jobs:
    outputs:
      pythonChanges: ${{ steps.path_filter.outputs.pythonChanges }}
      buildChanges: ${{ steps.path_filter.outputs.buildChanges }}
      end2endChanges: ${{ steps.path_filter.outputs.end2endChanges }}
    steps:
      - name: Checkout Code 🛎
        uses: actions/checkout@v3
@@ -30,6 +34,12 @@ jobs:
              - 'tests/**'
              - 'pyproject.toml'
              - 'setup.py'
            end2endChanges:
              - 'src/**'
              - 'pyproject.toml'
              - 'setup.py'
              - 'scripts/end2end_examples.py'
              - 'docs/_source/tutorials_and_integrations/tutorials/feedback/end2end_examples/**'
            buildChanges:
              - 'src/**'
              - 'frontend/**'
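The new `end2endChanges` filter can be read as "true when any changed file matches one of these globs". The sketch below approximates that rule in Python. It is only an illustration: the function names are hypothetical, and the standard-library `fnmatchcase` is assumed to be close enough to the glob matching used by the path filter action for these particular patterns, which may not hold for more complex ones.

```python
from fnmatch import fnmatchcase

# Patterns copied from the `end2endChanges` filter in check-repo-files.yml.
END2END_PATTERNS = [
    "src/**",
    "pyproject.toml",
    "setup.py",
    "scripts/end2end_examples.py",
    "docs/_source/tutorials_and_integrations/tutorials/feedback/end2end_examples/**",
]


def matches_any(path: str, patterns: list[str]) -> bool:
    """Return True if `path` matches at least one glob pattern."""
    # Assumption: fnmatch-style globbing, where `*` also matches `/`.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def end2end_changes(changed_files: list[str]) -> bool:
    """Approximate the `end2endChanges` output for a list of changed paths."""
    return any(matches_any(path, END2END_PATTERNS) for path in changed_files)
```

With this reading, a change under `src/` triggers the end2end job, while a change that only touches `frontend/` does not.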
98 changes: 98 additions & 0 deletions .github/workflows/end2end-examples.yml
@@ -0,0 +1,98 @@
name: Run end2end sdk examples

on:
  workflow_call:
    inputs:
      runsOn:
        required: false
        type: string
        default: extended-runner
      searchEngineDockerImage:
        description: "The name of the Docker image of the search engine to use."
        default: docker.elastic.co/elasticsearch/elasticsearch:8.8.2
        required: false
        type: string
      searchEngineDockerEnv:
        description: "JSON-encoded environment variables to pass to the search engine Docker container."
        default: '{"discovery.type": "single-node", "xpack.security.enabled": "false"}'
        required: false
        type: string
env:
  # Increase this value to reset cache if environment_dev.yml has not changed
  CACHE_NUMBER: 5

jobs:
  # Runs depending on the result from the check-repo-files.yml
  call-check-repo-files:
    uses: ./.github/workflows/check-repo-files.yml

  end2end-examples:
    name: end2end notebook examples, FeedbackDataset for text-classification
    runs-on: ${{ inputs.runsOn }}
    services:
      search_engine:
        image: ${{ inputs.searchEngineDockerImage }}
        ports:
          - 9200:9200
        env: ${{ fromJson(inputs.searchEngineDockerEnv) }}
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: Checkout Code 🛎
        uses: actions/checkout@v3
      - name: Setup Conda Env 🐍
        uses: conda-incubator/setup-miniconda@v2
        with:
          miniforge-variant: Mambaforge
          miniforge-version: latest
          use-mamba: true
          activate-environment: argilla
      - name: Get date for conda cache
        id: get-date
        # Write the step output through $GITHUB_OUTPUT; the `::set-output` command is deprecated
        run: echo "today=$(/bin/date -u '+%Y%m%d')" >> "$GITHUB_OUTPUT"
        shell: bash
      - name: Cache Conda env
        uses: actions/cache@v3
        id: cache
        with:
          path: ${{ env.CONDA }}/envs
          key: conda-${{ runner.os }}--${{ runner.arch }}--${{ steps.get-date.outputs.today }}-${{ hashFiles('environment_dev.yml') }}-${{ env.CACHE_NUMBER }}
      - name: Update environment
        if: steps.cache.outputs.cache-hit != 'true'
        run: mamba env update -n argilla -f environment_dev.yml
      - name: Cache pip 👜
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ env.CACHE_NUMBER }}-${{ hashFiles('pyproject.toml') }}
      - name: Set huggingface hub credentials
        if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop' || startsWith(github.ref, 'refs/heads/releases')
        run: |
          echo "HF_HUB_ACCESS_TOKEN=${{ secrets.HF_HUB_ACCESS_TOKEN }}" >> "$GITHUB_ENV"
          echo "Enable HF access token"
      - name: Set Argilla search engine env variable (Elasticsearch)
        if: startsWith(inputs.searchEngineDockerImage, 'docker.elastic.co')
        run: |
          echo "ARGILLA_SEARCH_ENGINE=elasticsearch" >> "$GITHUB_ENV"
          echo "Configure elasticsearch engine"
      - name: Set Argilla search engine env variable (OpenSearch)
        if: startsWith(inputs.searchEngineDockerImage, 'opensearchproject')
        run: |
          echo "ARGILLA_SEARCH_ENGINE=opensearch" >> "$GITHUB_ENV"
          echo "Configure opensearch engine"
      - name: Launch Argilla Server
        env:
          ARGILLA_ENABLE_TELEMETRY: 0
        run: |
          pip install -e .
          python -m argilla server database migrate
          python -m argilla server database users create_default
          uvicorn argilla.server.app:app --port 6900 --host 0.0.0.0 &
      - name: Run end2end examples 📈
        env:
          ARGILLA_ENABLE_TELEMETRY: 0
          HF_HUB_ACCESS_TOKEN: ${{ secrets.HF_HUB_ACCESS_TOKEN }}
        run: |
          pip install papermill
          python scripts/end2end_examples.py
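Two pieces of this workflow are easy to misread: the pair of `startsWith(...)` conditions that pick `ARGILLA_SEARCH_ENGINE`, and the `fromJson(...)` call that turns the `searchEngineDockerEnv` string into the service container environment. The sketch below restates both in Python. The function names are hypothetical and exist only for illustration; the workflow itself evaluates these expressions inside GitHub Actions.

```python
import json
from typing import Optional


def search_engine_for_image(image: str) -> Optional[str]:
    """Mirror the two `startsWith(...)` step conditions in the workflow."""
    if image.startswith("docker.elastic.co"):
        return "elasticsearch"
    if image.startswith("opensearchproject"):
        return "opensearch"
    # Neither step runs, so ARGILLA_SEARCH_ENGINE would stay unset.
    return None


def parse_service_env(raw: str) -> dict[str, str]:
    """Decode the JSON string input, roughly as `fromJson` does for `env`."""
    env = json.loads(raw)
    if not isinstance(env, dict):
        raise ValueError("searchEngineDockerEnv must be a JSON object")
    return {str(key): str(value) for key, value in env.items()}
```

One consequence worth noting: an image from any other registry prefix matches neither condition, so the server would start with whatever search engine default it has.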
22 changes: 22 additions & 0 deletions .github/workflows/package.yml
@@ -74,6 +74,28 @@ jobs:
      pytestArgs: tests/unit
    secrets: inherit

  run_end2end_tests:
    strategy:
      matrix:
        include:
          - searchEngineDockerImage: docker.elastic.co/elasticsearch/elasticsearch:8.8.2
            searchEngineDockerEnv: '{"discovery.type": "single-node", "xpack.security.enabled": "false"}'
            coverageReport: coverage-elasticsearch-8.8.2
            runsOn: extended-runner
          - searchEngineDockerImage: opensearchproject/opensearch:2.4.1
            searchEngineDockerEnv: '{"discovery.type": "single-node", "plugins.security.disabled": "true"}'
            coverageReport: coverage-opensearch-2.4.1
            runsOn: ubuntu-latest
    name: Run end2end tests
    uses: ./.github/workflows/end2end-examples.yml
    needs: check_repo_files
    if: needs.check_repo_files.outputs.end2endChanges == 'true'
    with:
      runsOn: ${{ matrix.runsOn }}
      searchEngineDockerImage: ${{ matrix.searchEngineDockerImage }}
      searchEngineDockerEnv: ${{ matrix.searchEngineDockerEnv }}
    secrets: inherit

  run_unit_test_with_extra_engines:
    strategy:
      matrix:
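The `run_end2end_tests` job combines a gate (`end2endChanges == 'true'`) with a two-entry `include` matrix, and forwards only three of the four matrix keys through `with:`. The Python sketch below models which reusable-workflow calls result. It is an illustration under stated assumptions: `planned_calls` is a hypothetical helper, and the matrix entries are copied from the diff above.

```python
# Matrix entries copied from the `run_end2end_tests` job in package.yml.
MATRIX_INCLUDE = [
    {
        "searchEngineDockerImage": "docker.elastic.co/elasticsearch/elasticsearch:8.8.2",
        "searchEngineDockerEnv": '{"discovery.type": "single-node", "xpack.security.enabled": "false"}',
        "coverageReport": "coverage-elasticsearch-8.8.2",
        "runsOn": "extended-runner",
    },
    {
        "searchEngineDockerImage": "opensearchproject/opensearch:2.4.1",
        "searchEngineDockerEnv": '{"discovery.type": "single-node", "plugins.security.disabled": "true"}',
        "coverageReport": "coverage-opensearch-2.4.1",
        "runsOn": "ubuntu-latest",
    },
]

# Only these keys appear under `with:`; `coverageReport` is not forwarded.
FORWARDED_INPUTS = ("runsOn", "searchEngineDockerImage", "searchEngineDockerEnv")


def planned_calls(end2end_changes: str) -> list[dict[str, str]]:
    """Return the inputs of each workflow call, or [] when the job is skipped."""
    # Job outputs are strings, which is why the workflow compares against 'true'.
    if end2end_changes != "true":
        return []
    return [{key: entry[key] for key in FORWARDED_INPUTS} for entry in MATRIX_INCLUDE]
```

Under this model, a commit that touches none of the `end2endChanges` paths produces no calls, and a matching commit produces one Elasticsearch run and one OpenSearch run.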
15 changes: 12 additions & 3 deletions CHANGELOG.md
@@ -18,8 +18,14 @@ These are the section headers that we use:

### Added

- Added `metadata_properties` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. ([#4192](https://github.com/argilla-io/argilla/pull/4192))
- Added strategy to handle and translate server errors with HTTP status code `401`. ([#4362](https://github.com/argilla-io/argilla/pull/4362))

## [1.20.0](https://github.com/argilla-io/argilla/compare/v1.19.0...v1.20.0)

### Added

- Added `GET /api/v1/datasets/:dataset_id/records/search/suggestions/options` endpoint to return suggestion available options for searching. ([#4260](https://github.com/argilla-io/argilla/pull/4260))
- Added `metadata_properties` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. ([#4192](https://github.com/argilla-io/argilla/pull/4192))
- Added `get_model_kwargs`, `get_trainer_kwargs`, `get_trainer_model`, `get_trainer_tokenizer` and `get_trainer` methods to the `ArgillaTrainer` to improve interoperability across frameworks. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
- Added additional formatting checks to the `ArgillaTrainer` to allow for better interoperability of `defaults` and `formatting_func` usage. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
- Added a warning to the `update_config` method of `ArgillaTrainer` to indicate whether the `kwargs` were updated correctly. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
@@ -33,16 +39,19 @@ These are the section headers that we use:
- Fixed error in the unification strategy for `RankingQuestion` ([#4295](https://github.com/argilla-io/argilla/pull/4295))
- Fixed `TextClassificationSettings.labels_schema` order not being preserved. Closes [#3828](https://github.com/argilla-io/argilla/issues/3828) ([#4332](https://github.com/argilla-io/argilla/pull/4332))
- Fixed error when requesting non-existing API endpoints. Closes [#4073](https://github.com/argilla-io/argilla/issues/4073) ([#4325](https://github.com/argilla-io/argilla/pull/4325))
- Fixed error when passing `draft` responses to the create records endpoint. ([#4354](https://github.com/argilla-io/argilla/pull/4354))

### Changed

- [breaking] Suggestions `agent` field now only accepts specific characters and a limited length. ([#4265](https://github.com/argilla-io/argilla/pull/4265))
- [breaking] Suggestions `score` field now only accepts float values in the range `0` to `1`. ([#4266](https://github.com/argilla-io/argilla/pull/4266))
- Updated `POST /api/v1/datasets/:dataset_id/records/search` endpoint to support optional `query` attribute. ([#4327](https://github.com/argilla-io/argilla/pull/4327))
- Updated `POST /api/v1/datasets/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. ([#4327](https://github.com/argilla-io/argilla/pull/4327))
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support optional `query` attribute. ([#4270](https://github.com/argilla-io/argilla/pull/4270))
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. ([#4270](https://github.com/argilla-io/argilla/pull/4270))
- Changed the logging style while pulling and pushing `FeedbackDataset` to Argilla from `tqdm` style to `rich`. ([#4267](https://github.com/argilla-io/argilla/pull/4267)). Contributed by @zucchini-nlp.
- Updated `push_to_argilla` to print `repr` of the pushed `RemoteFeedbackDataset` after push and changed `show_progress` to True by default. ([#4223](https://github.com/argilla-io/argilla/pull/4223))
- Changed `models` and `tokenizer` for the `ArgillaTrainer` to explicitly allow for changing them when needed. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
- Updated `POST /api/v1/datasets/:dataset_id/records/search` endpoint to support optional `query` attribute. ([#4327](https://github.com/argilla-io/argilla/pull/4327))
- Updated `POST /api/v1/datasets/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. ([#4327](https://github.com/argilla-io/argilla/pull/4327))
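
The breaking change to the suggestions `score` field means values outside `0` to `1` are no longer accepted. A client could check values before sending them, along the lines of the sketch below. This is a hypothetical helper, not part of the Argilla SDK; it assumes the bounds are inclusive and that integers `0` and `1` are acceptable, neither of which the changelog entry states explicitly.

```python
import math


def validate_suggestion_score(score: float) -> float:
    """Hypothetical client-side check for the `0` to `1` score range."""
    # Reject booleans explicitly, since `bool` is a subclass of `int` in Python.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise TypeError("score must be a number")
    value = float(score)
    # Assumption: both bounds are inclusive.
    if math.isnan(value) or not 0.0 <= value <= 1.0:
        raise ValueError(f"score must be in the range 0 to 1, got {score!r}")
    return value
```

Failing early on the client side gives a clearer error than waiting for the server to reject the record.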

## [1.19.0](https://github.com/argilla-io/argilla/compare/v1.18.0...v1.19.0)
