A/B test jobs #314

Merged on Sep 9, 2022 (3 commits)

Conversation

crusaderky
Contributor

@crusaderky crusaderky commented Sep 8, 2022

This PR adds the ability to run the whole test suite on arbitrary additional coiled software environments.

@crusaderky crusaderky self-assigned this Sep 8, 2022
@crusaderky crusaderky marked this pull request as draft September 8, 2022 09:29
# Leave empty if you don't want to override anything.
distributed:
  scheduler:
    worker-saturation: 1.2
Contributor Author

Notably, if #235 goes through, we'll lose the ability to pass environment variables.

The alternative to forcing the developer to create an empty file every time was to ignore the file if it doesn't exist, but the risk of misspellings producing unexpected results was too high. Explicit is better than implicit.

@crusaderky crusaderky changed the title from "A/B testing jobs" to "A/B test jobs" Sep 8, 2022
@crusaderky crusaderky mentioned this pull request Sep 8, 2022
@crusaderky crusaderky closed this Sep 8, 2022
@crusaderky crusaderky reopened this Sep 8, 2022
@crusaderky crusaderky force-pushed the ab_testing branch 2 times, most recently from ff71818 to b008604 Compare September 9, 2022 09:41
@crusaderky
Contributor Author

Ready for final review.
Run with files in AB_environments/: https://github.com/coiled/coiled-runtime/actions/runs/3018905647

@crusaderky crusaderky marked this pull request as ready for review September 9, 2022 09:43
@crusaderky
Contributor Author

Question:

In the context of A/B performance tests, do we care about the runtime and stability test folders?
CC @ncclementi @ian-r-rose

@hayesgb
Contributor

hayesgb commented Sep 9, 2022

I can see a solid argument for skipping the runtime folder, but it seems to me it would be preferable to keep the stability folder in scope.

Member

@fjetter fjetter left a comment

Full disclosure: I haven't reviewed ab_tests in much detail, only the broad strokes.

My comments are mostly nitpicks. If there are issues or missing features we can iterate.

Looking forward to seeing this in action!

- dask==2022.8.1
- distributed=2022.8.1
```
- `AB_environments/AB_baseline.dask.yaml`: (empty file)
Member

Does there need to be an empty file?

Contributor Author

Yes. The alternative to forcing the developer to create an empty file every time was to ignore the file if it doesn't exist, but the risk of misspellings quietly resulting in unexpected results was too high. Explicit is better than implicit.
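
A minimal sketch of the "explicit is better than implicit" check described above, assuming environments are discovered by globbing `AB_*.conda.yaml` and that each one must have a sibling `.dask.yaml` file. The function name `discover_ab_environments` is hypothetical and is not taken from the PR's script:

```python
from __future__ import annotations

from pathlib import Path


def discover_ab_environments(root: Path) -> dict[str, tuple[Path, Path]]:
    """Map each environment name to its (conda file, dask config file) pair.

    Hypothetical helper: raises instead of silently skipping an environment
    whose dask config file is missing or misspelled.
    """
    envs: dict[str, tuple[Path, Path]] = {}
    for conda_file in sorted(root.glob("AB_*.conda.yaml")):
        # "AB_baseline.conda.yaml" -> "AB_baseline"
        name = conda_file.name[: -len(".conda.yaml")]
        dask_file = root / f"{name}.dask.yaml"
        if not dask_file.is_file():
            # An empty file is fine; a missing one is treated as an error.
            raise FileNotFoundError(
                f"{dask_file} is required (it may be empty) "
                f"alongside {conda_file.name}"
            )
        envs[name] = (conda_file, dask_file)
    return envs
```

With this shape, a typo such as `AB_basline.dask.yaml` surfaces as an immediate error rather than as a run that quietly uses no overrides.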

Member

Why not include an empty file with the appropriate name in the repo? That would mean you'd get no custom config by default, and it would also make it harder to accidentally create a new file with the wrong name (e.g., you'd either edit the existing file or use `cp your-file AB_[tab]` and autocomplete to the correct filename).

import yaml


def main(fname: str) -> None:
Member

I believe we're now able to ship dask config using the dask.config ctx manager. That might be a better interface than this. Is there a reason why we use env vars instead?

Member

> I believe we're now able to ship dask config using the dask.config ctx manager.

Just to be clear, change to ship dask config to cluster is not yet deployed, presumably will go out next week.

Contributor Author

? This is news to me.
Are you saying that when you call coiled.create_software_environment it will pick up the local config?
Also, I needed something that could be passed to coiled env create. Ideally one would want to pass it a dask config file directly, but I believe there's no such feature?

Member

Now I'm confused. The feature is that when you create cluster it ships local config. I don't see any connection between dask config and the software environment... am I missing something?

Contributor Author

> Just to be clear, change to ship dask config to cluster is not yet deployed, presumably will go out next week.

This is exactly what we need; please ping me when it's generally available.

Contributor Author

> Now I'm confused. The feature is that when you create cluster it ships local config. I don't see any connection between dask config and the software environment... am I missing something?

The script is currently calling coiled env create -e DASK_DISTRIBUTED_...=VALUE
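
As a rough illustration of what such a script might do (a sketch, not the PR's actual code), the nested dask config can be flattened into `DASK_*` environment variables and passed as `-e NAME=VALUE` arguments. It relies on dask's documented convention that a double underscore in an environment variable name marks nesting and that hyphens and underscores in keys are interchangeable. The helper names `to_env_vars` and `env_create_args` are hypothetical; the `--name`, `--conda` and `-e` flags are the ones used elsewhere in this thread:

```python
from __future__ import annotations

from collections.abc import Iterator, Mapping
from typing import Any


def _flatten(config: Mapping[str, Any], path: tuple[str, ...] = ()) -> Iterator[tuple[tuple[str, ...], Any]]:
    """Yield (key path, leaf value) pairs from a nested mapping."""
    for key, value in config.items():
        new_path = path + (str(key),)
        if isinstance(value, Mapping):
            yield from _flatten(value, new_path)
        else:
            yield new_path, value


def to_env_vars(config: Mapping[str, Any] | None) -> dict[str, str]:
    """Convert a parsed dask YAML config into DASK_* environment variables.

    An empty YAML file parses to None, which is treated as "no overrides".
    """
    env = {}
    for path, value in _flatten(config or {}):
        # distributed.scheduler.worker-saturation
        #   -> DASK_DISTRIBUTED__SCHEDULER__WORKER_SATURATION
        name = "DASK_" + "__".join(p.upper().replace("-", "_") for p in path)
        env[name] = str(value)
    return env


def env_create_args(name: str, conda_file: str, env: Mapping[str, str]) -> list[str]:
    """Build an argv list for `coiled env create` (hypothetical helper)."""
    args = ["coiled", "env", "create", "--name", name, "--conda", conda_file]
    for key, value in sorted(env.items()):
        args += ["-e", f"{key}={value}"]
    return args
```

Building an argv list rather than a shell string avoids quoting problems if a config value contains spaces.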

Member

> The script is currently calling coiled env create -e DASK_DISTRIBUTED_...=VALUE

I could be wrong but I think that just applies when creating the software environment and not when you make a cluster using that software environment.

Contributor Author

It works fine for me?

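# Run in an IPython/Jupyter session: the leading "!" executes a shell command.
import coiled
import dask.config
import distributed

# Bake DASK_TESTVAR=123 into the software environment.
!coiled env create --name crusaderky/test_vars --conda AB_environments/AB_baseline.conda.yaml -e DASK_TESTVAR=123
cluster = coiled.Cluster(name="test_vars", n_workers=0, software="crusaderky/test_vars")
client = distributed.Client(cluster)
# Read the config value back from the scheduler process.
client.run_on_scheduler(lambda: dask.config.get("testvar"))
# Output: 123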
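
For context on why the scheduler reports the integer 123 under the key `testvar`: dask reads `DASK_*` environment variables at startup, strips the prefix, lowercases the name, and tries to parse the value as a Python literal. The following is a simplified approximation of that behaviour, not dask's actual implementation, and `env_to_config_key` is a hypothetical name:

```python
import ast


def env_to_config_key(name: str, value: str, prefix: str = "DASK_"):
    """Approximate how a DASK_* environment variable maps to a config entry."""
    # DASK_FOO__BAR -> "foo.bar" (double underscore marks nesting)
    key = name[len(prefix):].lower().replace("__", ".")
    try:
        # "123" -> 123, "1.2" -> 1.2, "True" -> True
        parsed = ast.literal_eval(value)
    except (SyntaxError, ValueError):
        # Anything that is not a Python literal stays a plain string
        parsed = value
    return key, parsed


print(env_to_config_key("DASK_TESTVAR", "123"))
```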

Member

Oh, thanks, good to know!

@crusaderky crusaderky merged commit a08904c into main Sep 9, 2022
@crusaderky crusaderky deleted the ab_testing branch September 9, 2022 14:42
hendrikmakait pushed a commit that referenced this pull request Sep 14, 2022
Development

Successfully merging this pull request may close these issues.

Automation of benchmark comparison
4 participants