feat(components): Use larger base reward model when tuning `text` and `chat` variants of `bison@001` with the `preview.llm.rlhf_pipeline` #10663

copybara-service · 2024-04-04T14:21:05Z

feat(components): Use larger base reward model when tuning text and chat variants of bison@001 with the preview.llm.rlhf_pipeline

google-cla · 2024-04-04T14:21:10Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-oss-prow · 2024-04-04T14:21:16Z

Hi @copybara-service[bot]. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow · 2024-04-04T14:21:46Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sinachavoshi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

components/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow bot added the size/S label Apr 4, 2024

google-oss-prow bot added the needs-ok-to-test label Apr 4, 2024

google-oss-prow bot requested a review from IronPan April 4, 2024 14:21

google-oss-prow bot requested a review from SinaChavoshi April 4, 2024 14:21

copybara-service bot force-pushed the test_615798870 branch from 2487faf to eda3979 Compare April 5, 2024 15:51

feat(components): Use larger base reward model when tuning text and…

ac39931

… `chat` variants of `bison@001` with the `preview.llm.rlhf_pipeline` PiperOrigin-RevId: 622229648

copybara-service bot force-pushed the test_615798870 branch from eda3979 to ac39931 Compare April 5, 2024 18:11

copybara-service bot merged commit ac39931 into master Apr 5, 2024

copybara-service bot deleted the test_615798870 branch April 5, 2024 18:11
