scriptData missing in Dropbox upload #4

Open
jonaschn opened this issue Feb 9, 2021 · 4 comments

@jonaschn

jonaschn commented Feb 9, 2021

I could not find the data that is expected to be in scriptDataPath.
To my knowledge, the Dropbox upload contains only the training data (originalDataPath) and the test data (testDataPath).

public static String scriptDataPath = "data/scriptData/ThreeM09/";

@ifonema

ifonema commented Mar 17, 2021

scriptDataPath is used only in some of the classes of the tem.script package and in the SimpleEvaluate class of the tem.main package. In particular, scriptDataPath is used to read the userID file. The export step and the training set creation step are both clear to me, but I can't figure out how to create the test set. Any ideas?

@jonaschn
Author

@ifonema Did you notice my fork https://github.com/jonaschn/TopicExpertiseModel?
I refactored the code a little and improved its documentation.
There I also mention how the test set is created:

testDocSet.readQATestDocs(testDataFolder, trainDocSet);
String testDocfile = testDataFolder + "QATest.data";
FileUtil.saveClass(testDocSet, testDocfile);

@ifonema

ifonema commented Mar 20, 2021

@jonaschn Thank you for your fork, I am using it now.

testDocSet.readQATestDocs(testDataFolder, trainDocSet);

The readQATestDocs method called in line 43 reads the questions from the testData.questions file, as can be seen in the following code snippet:

String questionFile = testDataFolder + "testData.questions";

FileUtil.readLines(questionFile, questionLines);

The testData.questions file is created in the main method of the ExportTestDataForRank class:

String questionIDFile = testDataFolder + "testDataQuestions.id";
FileUtil.readLines(questionIDFile, questionIDs);

Here the testDataQuestions.id file is used to export data from the database, but I don't know where the testDataQuestions.id file itself is created.
A naive solution would be to create a tab-separated text file whose second column contains the IDs of the questions belonging to the test set, but I'd like to know whether the repository contains code that generates the test set automatically.
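The naive workaround could look roughly like the sketch below. It is only an illustration, not code from the repository: the class name, the helper methods, and the example IDs are hypothetical, and filling the first column with a running index is an assumption, since only the second column (the question ID) is known to be read. The sketch writes to a temporary directory rather than to the real testDataFolder.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the TopicExpertiseModel repository.
class TestIdFileSketch {

    // Builds one tab-separated line per question ID.
    // Assumption: column 1 is a running index; only column 2 (the ID) matters.
    static List<String> formatLines(List<String> questionIds) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < questionIds.size(); i++) {
            lines.add(i + "\t" + questionIds.get(i));
        }
        return lines;
    }

    // Reads the file back and returns the second column of every line.
    static List<String> readIds(Path file) throws IOException {
        List<String> ids = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String[] columns = line.split("\t");
            if (columns.length >= 2) {
                ids.add(columns[1]);
            }
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder IDs; real ones would be the questions held out for testing.
        List<String> testQuestionIds = Arrays.asList("101", "202", "303");

        Path dir = Files.createTempDirectory("testData");
        Path idFile = dir.resolve("testDataQuestions.id");
        Files.write(idFile, formatLines(testQuestionIds), StandardCharsets.UTF_8);

        System.out.println(readIds(idFile)); // prints [101, 202, 303]
    }
}
```

Which questions go into the test set is still a manual choice here, so this does not replace an automatic generation step if one exists.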

@jonaschn
Author

To be honest, I don't know.
I didn't try to generate any test data myself.
