Feat/pf - End-to-End Github actions tests #61
…er, less steps, also positions for GH actions
Added extras ...
This PR finishes off the first draft of the end-to-end tests. See below for a summary from CONTRIBUTION.md. Note that it also includes a new demo data zipfile, a script to download it, and documentation, so that the assistant analysis tests can be run.

End-to-end tests

End-to-end tests have been configured in GitHub Actions. They use promptflow to call a wrapper around the chainlit UI in order to test cases where memories/recipes are used, as well as cases where the assistant does some on-the-fly analysis. To do this, the chainlit class is patched heavily, and there are limitations in how

Additionally, there were some limitations when implementing in GitHub Actions, where workarounds were implemented

Code for e2e tests can be found in

The tests work using promptflow evaluation and a call to an LLM to gauge groundedness, because LLM assistants can produce slightly different results when not providing answers from memory/recipes. The promptflow evaluation test data can be found in

A useful way to test a new scenario and to get the 'expected' output for

TODO, future work:
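For reviewers unfamiliar with the patching approach mentioned above, here is a minimal sketch of the general idea: replace a UI class's send method so that replies are captured in a list instead of being rendered, which lets a handler run headlessly in CI. This is not the code in this PR and does not use the real chainlit API; `FakeMessage`, `handle_question` and `run_headless` are hypothetical names used only for illustration.

```python
import asyncio
from unittest.mock import patch


class FakeMessage:
    """Hypothetical stand-in for a UI message class (not the real chainlit API)."""

    def __init__(self, content: str):
        self.content = content

    async def send(self):
        # In a real UI this would push the message to a connected session.
        raise RuntimeError("no UI session available")


async def handle_question(question: str) -> None:
    """Hypothetical assistant handler that replies through the UI class."""
    await FakeMessage(f"Answer to: {question}").send()


def run_headless(question: str) -> list[str]:
    """Run the handler with send() patched so replies are captured, not rendered."""
    captured: list[str] = []

    async def capture_send(self):
        # Record the message content instead of talking to a UI session.
        captured.append(self.content)

    # The patch is undone automatically when the with-block exits.
    with patch.object(FakeMessage, "send", capture_send):
        asyncio.run(handle_question(question))
    return captured
```

A test wrapper could then pass the captured text to an evaluation step, for example `run_headless("How many rows are in the dataset?")`, and compare the result against an expected answer.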
Approving this as the first iteration of promptflow testing/evaluation.
Reopened to respond to reviewer and add GH actions