How shall I create the file "test.src"? #5

mriganktiwari · 2020-12-29T06:50:56Z

How shall I create the file "test.src" at the question generation step? And what does it contain?

sonsus · 2021-03-18T09:26:35Z

@mriganktiwari I guess it's from the step that prepares (tokenizes and binarizes) the inputs for the QG model, which corresponds to P(Q|Y). So I thought ~~this should be test.tgt of the CNN/DM dataset (and did it that way)~~

Correct me if I'm wrong =]

sonsus · 2021-03-19T01:56:02Z

@mriganktiwari I found that I need to tokenize and binarize test_w_10ans.txt instead of the original test target split (test.txt.tgt.tagged) in order to get 100 questions per test sample (that is, per doc-summary pair).

I guess the P(Q|Y) notation was a good way to avoid misleading readers about what QAGS measures, but for reproduction it leaves a question mark. Shouldn't it be P(Q|Y;A)? @W4ngatang
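For anyone following along, here is a minimal sketch of what "answer-conditioned" inputs, i.e. P(Q|Y;A), could look like before tokenizing and binarizing. Everything in it is an assumption for illustration: the function name `make_answer_conditioned_sources`, the `SEP` separator, and the one-line-per-(summary, answer) layout are hypothetical and are not taken from the repository, so check the author's preprocessing code for the real format of test_w_10ans.txt.

```python
from typing import Iterable, List

# Hypothetical separator between summary and answer; the real one may differ.
SEP = " </s> "


def make_answer_conditioned_sources(
    summaries: Iterable[str],
    answers_per_summary: Iterable[List[str]],
    sep: str = SEP,
) -> List[str]:
    """Return one source line per (summary, answer) pair."""
    lines = []
    for summary, answers in zip(summaries, answers_per_summary):
        # Collapse whitespace so every example stays on a single line.
        text = " ".join(summary.split())
        for answer in answers:
            ans = " ".join(answer.split())
            lines.append(f"{text}{sep}{ans}")
    return lines


if __name__ == "__main__":
    demo = make_answer_conditioned_sources(
        ["The cat sat on the mat ."],
        [["The cat", "the mat"]],
    )
    for line in demo:
        print(line)
```

Each summary is repeated once per candidate answer, so the QG model sees both Y and A in every source line. The resulting lines would then go through the usual tokenize and binarize steps.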

Zhou-Zoey · 2022-11-09T01:53:25Z

> How shall I create the file "test.src" at the question generation step? And what does it contain?

Have you figured it out? When reproducing this work, I was also confused by this problem.

Zhou-Zoey · 2022-11-09T01:54:44Z

> @mriganktiwari I found that I need to tokenize and binarize test_w_10ans.txt instead of the original test target split (test.txt.tgt.tagged) in order to get 100 questions per test sample (that is, per doc-summary pair).
>
> I guess the P(Q|Y) notation was a good way to avoid misleading readers about what QAGS measures, but for reproduction it leaves a question mark. Shouldn't it be P(Q|Y;A)? @W4ngatang

Do you mean that test_w_10ans.txt is the only file that needs to be tokenized?

dlaredo · 2023-09-04T22:57:28Z

Any new information on this one? I also don't know how to generate this file.

sonsus · 2023-09-26T19:13:08Z

> @mriganktiwari I found that I need to tokenize and binarize test_w_10ans.txt instead of the original test target split (test.txt.tgt.tagged) in order to get 100 questions per test sample (that is, per doc-summary pair).
>
> I guess the P(Q|Y) notation was a good way to avoid misleading readers about what QAGS measures, but for reproduction it leaves a question mark. Shouldn't it be P(Q|Y;A)? @W4ngatang

I don't really remember the details of the code, but I did succeed in reproducing it after writing the comment quoted above. I'm sorry that I can't share the actual code I ran for the experiment (it is lost). If you want to reproduce it, the original author's code is worth reading once you have read the paper. It was a good starting point for me, and filling the gaps did not take long. If you replace the generation models in this work with recent language models, it will definitely work better. IMHO, if I had to revisit this work, I wouldn't bother training small models as the original work did; I would adapt instruction-tuned LLMs with some good instructions instead. @dlaredo @Zhou-Zoey
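To make the "instruction-tuned LLM with good instructions" idea concrete, here is a rough sketch of an answer-conditioned question generation prompt. It is only an illustration under stated assumptions: `build_qg_prompt` and the prompt wording are hypothetical, not something from the paper or the repository, and the call to an actual model is left out on purpose.

```python
def build_qg_prompt(summary: str, answer: str) -> str:
    """Build a prompt asking for a question about `summary` answered by `answer`."""
    # Normalize whitespace so the prompt layout stays predictable.
    summary = " ".join(summary.split())
    answer = " ".join(answer.split())
    return (
        "Write one question about the passage below whose answer is exactly "
        "the given answer span.\n\n"
        f"Passage: {summary}\n"
        f"Answer: {answer}\n"
        "Question:"
    )


if __name__ == "__main__":
    print(build_qg_prompt("The cat sat on the mat .", "the mat"))
```

The returned string would be sent to whichever LLM you use, once per (summary, answer) pair, in place of the trained QG model.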

sonsus · 2023-09-26T19:26:28Z

I hope my email exchange with the author helps you reproduce it, or is useful in some other way.
