Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(weave): Split out Trace and Query tests; move trace tests into dedicated tox envs across py39-312 #2403

Open
wants to merge 35 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
213 changes: 101 additions & 112 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -41,15 +41,20 @@ jobs:

lint:
name: Python lint
timeout-minutes: 7
needs:
- build-container
# runs-on: [self-hosted, gke-runner]
runs-on: ubuntu-latest
container: us-east4-docker.pkg.dev/weave-support-367421/weave-images/weave-test-python-client:${{ github.sha }}
steps:
- uses: actions/checkout@v2
- run: source /root/venv/bin/activate && pre-commit run --hook-stage=pre-push --all-files
- name: Checkout
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.9
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install tox
- name: Run tox
run: tox -e lint

weavejs-lint-compile:
name: WeaveJS Lint and Compile
Expand All @@ -62,7 +67,7 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/setup-node@v1
with:
node-version: "18.x"
node-version: '18.x'
- run: |
set -e
cd weave-js
Expand All @@ -73,6 +78,88 @@ jobs:
yarn prettier
yarn run tsc

trace-tests:
name: Trace tox tests
runs-on: ubuntu-latest
strategy:
matrix:
python-version-major: ['3']
python-version-minor: [
'9',
'10',
'11',
# '12', # TODO: We have actual failing tests in 3.12, but commenting for simplicity for now.
#
]
tox-shard:
[
'trace',
'trace_server',
'anthropic',
'cerebras',
'cohere',
'dspy',
'groq',
'langchain',
'litellm',
'llamaindex',
'mistral0',
'mistral1',
'openai',
]
fail-fast: false
services:
wandbservice:
image: us-central1-docker.pkg.dev/wandb-production/images/local-testcontainer:master
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this base?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the container we use to test things that need gorilla.

Do you think there's a better container to use?

credentials:
username: _json_key
password: ${{ secrets.gcp_wb_sa_key }}
env:
CI: 1
WANDB_ENABLE_TEST_CONTAINER: true
ports:
- '8080:8080'
- '8083:8083'
- '9015:9015'
options: --health-cmd "curl --fail http://localhost:8080/healthz || exit 1" --health-interval=5s --health-timeout=3s
weave_clickhouse:
image: clickhouse/clickhouse-server
ports:
- '8123:8123'
options: --health-cmd "wget -nv -O- 'http://localhost:8123/ping' || exit 1" --health-interval=5s --health-timeout=3s

steps:
- name: Checkout
uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install tox
- name: Run tox (Clickhouse Only)
env:
WEAVE_SENTRY_ENV: ci
CI: 1
WB_SERVER_HOST: http://wandbservice
WF_CLICKHOUSE_HOST: weave_clickhouse
WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
run: |
tox -e ${{ matrix.tox-shard }}-py${{ matrix.python-version-major }}${{ matrix.python-version-minor }} -- \
-m "weave_client and not skip_clickhouse_client" \
--weave-server=clickhouse
- name: Run tox
env:
WEAVE_SENTRY_ENV: ci
CI: 1
WB_SERVER_HOST: http://wandbservice
WF_CLICKHOUSE_HOST: weave_clickhouse
WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
run: |
tox -e ${{ matrix.tox-shard }}-py${{ matrix.python-version-major }}${{ matrix.python-version-minor }}

test:
name: Python unit tests
timeout-minutes: 15 # do not raise! running longer than this indicates an issue with the tests. fix there.
Expand All @@ -97,14 +184,14 @@ jobs:
CI: 1
WANDB_ENABLE_TEST_CONTAINER: true
ports:
- "8080:8080"
- "8083:8083"
- "9015:9015"
- '8080:8080'
- '8083:8083'
- '9015:9015'
options: --health-cmd "curl --fail http://localhost:8080/healthz || exit 1" --health-interval=5s --health-timeout=3s
weave_clickhouse:
image: clickhouse/clickhouse-server
ports:
- "8123:8123"
- '8123:8123'
options: --health-cmd "wget -nv -O- 'http://localhost:8123/ping' || exit 1" --health-interval=5s --health-timeout=3s
steps:
# - uses: datadog/agent-github-action@v1.3
Expand All @@ -113,28 +200,7 @@ jobs:
- uses: actions/checkout@v2
- name: Verify wandb server is running
run: curl -s http://wandbservice:8080/healthz
- name: Run Python Unit Tests (Clickhouse Client Only)
env:
DD_SERVICE: weave-python
DD_ENV: ci
WEAVE_SENTRY_ENV: ci
CI: 1
WB_SERVER_HOST: http://wandbservice
WF_CLICKHOUSE_HOST: weave_clickhouse
WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
# This runner specifically runs the tests that use the `client` fixture (those that support clickhouse client tests)
# However, we skip tests marked with `skip_clickhouse_client`. These should be considered TODOs and an exception
run: |
source /root/venv/bin/activate && \
cd weave && \
pytest -m "weave_client and not skip_clickhouse_client" \
--weave-server=clickhouse \
--job-num=${{ matrix.job_num }} \
--timeout=90 \
--ddtrace \
--durations=5 \
./integrations ./legacy ./trace_server ./trace ./tests
- name: Run Python Unit Tests
- name: Run Legacy (Query Service) Python Unit Tests
env:
DD_SERVICE: weave-python
DD_ENV: ci
Expand All @@ -153,81 +219,4 @@ jobs:
--timeout=90 \
--ddtrace \
--durations=5 \
./integrations ./legacy ./trace_server ./trace ./tests

# nbmake:
# name: Run notebooks with nbmake
# runs-on: self-hosted
# container: us-east4-docker.pkg.dev/weave-support-367421/weave-images/weave-test:latest
# steps:
# - uses: actions/checkout@v2

# - name: Run notebooks
# run: source /root/venv/bin/activate && export PYTHONPATH=$(pwd) && pytest -n=4 --nbmake --overwrite examples
# - name: Upload executed notebooks
# uses: actions/upload-artifact@v3
# if: always()
# with:
# name: notebooks
# path: examples

cypress-run:
name: Notebook and UI tests
timeout-minutes: 25 # 15 minute timeout routinely trips on rerun
needs:
- build-container
# - lint
# - test
if: always()
# runs-on: [self-hosted, gke-runner]
runs-on: ubuntu-latest

container: us-east4-docker.pkg.dev/weave-support-367421/weave-images/weave-integration-test:${{ github.sha }}

strategy:
fail-fast: false
matrix:
containers: [1, 2, 3, 4, 5, 6]
steps:
- uses: actions/checkout@v3
- name: Setup W&B API key
run: echo "WANDB_API_KEY=${{ secrets.WANDB_API_KEY }}" >> $GITHUB_ENV
- name: Setup Replicate API token
run: echo "REPLICATE_API_TOKEN=${{ secrets.REPLICATE_API_TOKEN }}" > $GITHUB_ENV
- name: Copy node_modules from container to checkout
run: cp -R /root/integration_test/node_modules ./integration_test/
- name: Activate venv
run: source /root/venv/bin/activate && echo "PATH=$PATH" >> $GITHUB_ENV
- name: Make Log Dir
run: mkdir -p /tmp/weave/log
- name: Copy over built assets
run: cp -r /root/weave-js-build/* ./weave/frontend
- name: Start weave server
# github actions does something funky with the std file descriptors, they end up
# being closed. tqdm (for example) raises an exception when the descriptor it
# wants to write to is closed.
run: nohup ./scripts/weave_server_test.sh < /dev/null &> /tmp/weave/log/stdout.log &
shell: bash
- name: Cypress run
# Use the following to run just a single test
# run: export PYTHONPATH=$(pwd) && cd integration_test && npx cypress run --browser replay-chromium --spec "cypress/e2e/notebooks/Ops that return images.cy.ts"
# run: export PYTHONPATH=$(pwd) && cd integration_test && npx cypress run --browser replay-chromium
run: export PYTHONPATH=$(pwd) && cd integration_test && npx cypress run --browser chrome
# uses: cypress-io/github-action@v4
# with:
# working-directory: ./integration_testG
# record: true
# parallel: true
env:
REPLAY_API_KEY: ${{ secrets.REPLAY_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SPLIT: ${{ strategy.job-total }}
SPLIT_INDEX: ${{ strategy.job-index }}
# - uses: actions/upload-artifact@v3
- name: Upload logs
uses: actions/upload-artifact@v3
if: failure()
with:
name: weave-server-logs
path: /tmp/weave/log
retention-days: 3
./legacy
andrewtruong marked this conversation as resolved.
Show resolved Hide resolved
14 changes: 7 additions & 7 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -53,10 +53,10 @@ version = { attr = "weave.version.VERSION" }
dependencies = { file = ["requirements.txt"] }

[tool.setuptools.dynamic.optional-dependencies]
examples = { file = ["requirements.ecosystem.txt"] }
engine = { file = ["requirements.engine.txt"] }
ecosystem = { file = ["requirements.ecosystem.txt"] }
datadog = { file = ["requirements.datadog.txt"] }
examples = { file = ["requirements.legacy.ecosystem.txt"] }
engine = { file = ["requirements.legacy.engine.txt"] }
ecosystem = { file = ["requirements.legacy.ecosystem.txt"] }
datadog = { file = ["requirements.legacy.datadog.txt"] }
modal = { file = ["requirements.modal.txt"] }

[tool.pytest.ini_options]
Expand Down Expand Up @@ -127,8 +127,8 @@ parse = """(?x)
)? # pre-release section is optional
"""
serialize = [
"{major}.{minor}.{patch}-{pre_l}{pre_n}",
"{major}.{minor}.{patch}",
"{major}.{minor}.{patch}-{pre_l}{pre_n}",
"{major}.{minor}.{patch}",
]
search = "{current_version}"
replace = "{new_version}"
Expand All @@ -146,4 +146,4 @@ commit_args = ""

[tool.bumpversion.parts.pre_l]
values = ["dev", "final"]
optional_value = "final"
optional_value = "final"
7 changes: 0 additions & 7 deletions requirements.all.txt

This file was deleted.

File renamed without changes.
4 changes: 2 additions & 2 deletions requirements.dev.txt → requirements.legacy.dev.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
-r requirements.txt
-r requirements.engine.txt
-r requirements.legacy.txt
-r requirements.legacy.engine.txt
types-requests>=2.28.11.8
types-setuptools>=65.7.0.3
pre-commit>=3.3.3
Expand Down
File renamed without changes.
File renamed without changes.
7 changes: 4 additions & 3 deletions requirements.legacy.test.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
-r requirements.txt
-r requirements.engine.txt # This is "legacy" as well
-r requirements.datadog.txt # This is "legacy" as well
-r requirements.legacy.txt
-r requirements.legacy.engine.txt
-r requirements.legacy.datadog.txt
pytest>=8.2.0
pytest-watch>=4.2.0
pytest-timeout>=2.1.0
Expand All @@ -26,3 +26,4 @@ cryptography>=42.0.7 # CVE 2023-23931

# SQL Generation Tests
sqlparse
filelock
61 changes: 61 additions & 0 deletions requirements.legacy.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
# These are the base Weave requirements, enough for weave tracking and evaluation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be severely trimmed down - much of this is trace specific. fine to be a followup

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the minimal set described here? If yes we can replace:
https://github.com/wandb/core/blob/master/services/weave-python/requirements.txt

# to work.

# Type annotations, we need ParamSpec in python3.9
typing_extensions>=4.0.0

# Definitely need arrow
# TODO: Colab has 9.0.0, can we support?
# TODO: 17.0.0 breaks a bunch of tests - can we move this requirement to just the engine?
pyarrow>=14.0.1,<17.0.0

# pydantic integration, and required by openai anyway
openai>=1.0.0
tiktoken>=0.4.0
pydantic>=2.0.0

# evaluation framework uses this for logging/status line at the moment.
rich>=13.7.0

# IO service uses these. Could probably remove reliance on ioservice.
aiohttp>=3.8.3
aiofiles>=22.1.0
aioprocessing>=2.0.1
Werkzeug>=3.0.3 # CVE 2024-34069
janus>=1.0.0

# we use this for logger, could probably skip it
python-json-logger>=2.0.4

# Used in box and just a little in arrow code.
numpy>=1.21

# required for wandb
wandb>=0.16.4
graphql-core>3
gql[requests]>=3.4.1
# TEMPORARY: Up to, and including wandb==0.17.1, wandb does is not
# compatible with numpy >= 2.0.0. This is a temporary fix until wandb
# is updated to be compatible with numpy >= 2.0.0.
numpy<2.0.0

# Segment logging
analytics-python>=1.2.9

# Used for ISO date parsing.
python-dateutil>=2.8.2

# Used for version parsing in integrations.
packaging>=21.0

# Need to exclude the 8.4.0 version of tenacity because it has a bug
# on import of AsyncRetrying
tenacity>=8.3.0,!=8.4.0


# Used for emoji shortcode support in feedback
emoji>=2.12.1

# Used for ID Generation - remove once python's
# built-in uuid module is updated to support UUIDv7
uuid-utils>=0.9.0
Loading
Loading