
Reproducing WMT14 #3175

Open
MaxHahnbueck opened this issue Nov 19, 2024 · 1 comment

Comments

@MaxHahnbueck

I want to reproduce and then adjust the data of the WMT2014 benchmark. Because of that, I can't use HELM directly (if I understand it correctly); instead I want to use the dataset and the way the prompt is constructed in my own pipeline. (The same applies to the evaluation code.)

Unfortunately, I can't find my way through the code.

I believe the PromptRunExpander is responsible for creating the prompt, but I can't figure out exactly what it does, what values are used, and where they are obtained from.

I would be happy for any explanation of the flow of information and how the final prompt is constructed.

@yifanmai
Collaborator

yifanmai commented Dec 7, 2024

Hi @MaxHahnbueck, here are some pointers to the code:

The main flow for the prompt generation happens in Runner.run_one here. Specifically, this calls WMT14Scenario.get_instances (link) to download and load instances into memory, and it calls GenerationAdapter.generate_requests (link) to turn them into prompt strings, which are placed in RequestState.request.input.text.
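If it helps, the flow above can be mimicked outside of HELM with a small standalone sketch. Everything here is an illustrative assumption rather than HELM's actual code: the `Instance` dataclass, the `build_prompt` helper, and the exact prompt template (instructions, few-shot "French:/English:" pairs, then the eval input with a trailing output prefix) are hypothetical stand-ins for what `WMT14Scenario.get_instances` and `GenerationAdapter.generate_requests` produce.

```python
# Hypothetical sketch of a HELM-style few-shot translation prompt.
# The class/function names and the template are assumptions for
# illustration, not HELM's real implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    """A single translation pair (stand-in for a HELM scenario instance)."""
    source: str     # e.g. the French sentence
    reference: str  # e.g. the English reference translation


def build_prompt(
    instructions: str,
    train_instances: List[Instance],
    eval_instance: Instance,
    input_prefix: str = "French: ",
    output_prefix: str = "English: ",
) -> str:
    """Assemble instructions, in-context examples, and the eval input.

    The prompt ends with the bare output prefix so the model is cued
    to produce the translation.
    """
    blocks = [instructions]
    for inst in train_instances:
        blocks.append(
            f"{input_prefix}{inst.source}\n{output_prefix}{inst.reference}"
        )
    # Eval input: source sentence followed by the output prefix only.
    blocks.append(f"{input_prefix}{eval_instance.source}\n{output_prefix}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    prompt = build_prompt(
        "Translate the following sentences from French to English.",
        [Instance("Bonjour.", "Hello.")],
        Instance("Merci beaucoup.", ""),
    )
    print(prompt)
```

In HELM itself, the resulting string would end up in `RequestState.request.input.text`; this sketch just shows the kind of assembly the adapter performs so you can replicate it in your own pipeline.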

You may also want to check out the documentation if you haven't already. Hope this helps.
