
Reproducing WMT14 #3175

Open
MaxHahnbueck opened this issue Nov 19, 2024 · 1 comment

Comments

@MaxHahnbueck

I want to reproduce and then adjust the data of the WMT2014 benchmark. Because of that, I can't use HELM directly (if I understand it correctly); instead I want to use the dataset and the way the prompt is constructed in my own pipeline. (The same applies to the evaluation code.)

Unfortunately, I can't find my way through the code.

I believe the PromptRunExpander is responsible for creating the prompt, but I can't figure out exactly what it does, what values are used, and where they are obtained from.

I would be happy for any explanation of the flow of information and how the final prompt is constructed.

@yifanmai
Collaborator

yifanmai commented Dec 7, 2024

Hi @MaxHahnbueck, here are some pointers to the code:

The main flow for the prompt generation happens in Runner.run_one here. Specifically, this calls WMT14Scenario.get_instances (link) to download and load instances into memory, and it calls GenerationAdapter.generate_requests (link) to turn them into prompt strings, which are placed in RequestState.request.input.text.
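If it helps, the flow above can be mimicked outside of HELM with a small standalone sketch. Everything here is an illustrative assumption rather than HELM's actual code: the `Instance` dataclass, the `build_prompt` helper, and the exact prompt template (instructions, few-shot "French:/English:" pairs, then the eval input with a trailing output prefix) are hypothetical stand-ins for what `WMT14Scenario.get_instances` and `GenerationAdapter.generate_requests` produce.

```python
# Hypothetical sketch of a HELM-style few-shot translation prompt.
# The class/function names and the template are assumptions for
# illustration, not HELM's real implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    """A single translation pair (stand-in for a HELM scenario instance)."""
    source: str     # e.g. the French sentence
    reference: str  # e.g. the English reference translation


def build_prompt(
    instructions: str,
    train_instances: List[Instance],
    eval_instance: Instance,
    input_prefix: str = "French: ",
    output_prefix: str = "English: ",
) -> str:
    """Assemble instructions, in-context examples, and the eval input.

    The prompt ends with the bare output prefix so the model is cued
    to produce the translation.
    """
    blocks = [instructions]
    for inst in train_instances:
        blocks.append(
            f"{input_prefix}{inst.source}\n{output_prefix}{inst.reference}"
        )
    # Eval input: source sentence followed by the output prefix only.
    blocks.append(f"{input_prefix}{eval_instance.source}\n{output_prefix}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    prompt = build_prompt(
        "Translate the following sentences from French to English.",
        [Instance("Bonjour.", "Hello.")],
        Instance("Merci beaucoup.", ""),
    )
    print(prompt)
```

In HELM itself, the resulting string would end up in `RequestState.request.input.text`; this sketch just shows the kind of assembly the adapter performs so you can replicate it in your own pipeline.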

You may also want to check out the documentation if you haven't already. Hope this helps.
