vowpal wabbit java: get raw predictions #3777

Open
onlynishant opened this issue Mar 7, 2022 · 3 comments
Labels
Feature Request New feature requested in system

Comments

@onlynishant

I am using the Java API of Vowpal Wabbit to get predictions. I need the raw predictions (the same output as -r output.txt), but I couldn't find any such method in the VWMulticlassLearner class. I train my model from Python via the command line with the following arguments:

vw -f model_filepath -c --cache_file cache_filepath -k --csoaa 40 -b 24 -q cd -q .... -q n: --ignore a --ignore x

and we use the following Java code to get predictions:

VWLearners.create("-i ./data/train.model  -t --quiet"); // VWMulticlassLearner
VWLearners.create("-i ./data/train.model  -t --quiet --csoaa_ldf=mc --loss_function=logistic --probabilities"); //VWProbLearner

Neither class has a method that returns raw predictions.

I want the same predictions as shown below:

$ echo ' .. sample string .. ' | vw -i data/train.model -t -r test -p /dev/stdout
creating quadratic features for pairs: cd ce cu cw de du dw eu ew uw n:
ignoring namespaces beginning with: a x
only testing
predictions = /dev/stdout
raw predictions = test
Num weight bits = 24
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile =
num sources = 1
average  since         example        example  current  current  current
loss     last          counter         weight    label  predict features
39
0.000000 0.000000            1            1.0    known       39      171

finished run
number of examples per pass = 1
passes used = 1
weighted example sum = 1.000000
weighted label sum = 0.000000
average loss = 0.000000
total feature number = 171

$ cat test
0:1.05645 1:0.83437 2:-0.210798 3:-2.81048 4:-4.47558 5:-4.45883 6:-3.65177 7:-3.71191 8:-2.96008 9:-2.82846 10:-2.31816 11:0.925984 12:3.28547 13:5.20375 14:6.34244 15:6.13525 16:1.65726 17:1.22801 18:1.35034 19:3.27091 20:2.94066 21:-0.0276409 22:0.391437 23:1.267 24:-0.689573 25:0.0171876 26:3.12935 27:3.95045 28:3.86978 29:1.18468 30:0.0921049 31:0.436564 32:0.98946 33:1.00963 34:-0.265355 35:-3.02128 36:-2.52846 37:-2.8066 38:-3.50639 39:-4.6184

How can I get the values that are in the file test as the return value of a Java method? I don't want to read the file from Java to get the response, since that would be slow.

@onlynishant onlynishant added the Feature Request New feature requested in system label Mar 7, 2022
@jackgerrits
Member

Raw predictions are only available via the command line currently. I can see this being useful, and there are similar situations such as getting the scores as well as probabilities when using contextual bandits. Patches are welcome, but for something of this nature proposing a design would be necessary before building it.
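
Until such an API exists, anyone falling back on the -r file would need to parse its lines by hand. The following is only a minimal sketch of that parsing step, assuming each line holds whitespace-separated index:score pairs as in the cat test output quoted in the issue. The class RawPredictionParser and its methods are hypothetical helpers written for illustration; they are not part of the Vowpal Wabbit Java bindings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the VW Java API.
public class RawPredictionParser {

    /** Parses one line of whitespace-separated index:score pairs. */
    public static Map<Integer, Double> parse(String line) {
        Map<Integer, Double> scores = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input line
            }
            int sep = token.indexOf(':');
            if (sep < 0) {
                throw new IllegalArgumentException("Expected index:score, got: " + token);
            }
            int index = Integer.parseInt(token.substring(0, sep));
            double score = Double.parseDouble(token.substring(sep + 1));
            scores.put(index, score);
        }
        return scores;
    }

    /** Returns the index with the lowest score. */
    public static int argMin(Map<Integer, Double> scores) {
        if (scores.isEmpty()) {
            throw new IllegalArgumentException("No scores to compare");
        }
        int best = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (Map.Entry<Integer, Double> e : scores.entrySet()) {
            if (e.getValue() < bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A shortened excerpt of the raw prediction line quoted in the issue.
        String line = "0:1.05645 1:0.83437 2:-0.210798 38:-3.50639 39:-4.6184";
        Map<Integer, Double> scores = parse(line);
        System.out.println(scores.size() + " scores, lowest at index " + argMin(scores));
    }
}
```

In the sample quoted in the issue, the lowest raw score sits at index 39, which matches the predicted label 39. This sketch does not address the performance concern raised above, since it still depends on the file being written and read back.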

@onlynishant
Author

@jackgerrits I saw there is already an old PR for it: #1244

Do you think it's still relevant and can be used?

@jackgerrits
Member

The fact that partial_prediction is stashed into the cost-sensitive label is already a pretty big hack, so I would prefer we don't go forward with that design. That said, raw predictions are not well represented in the prediction framework, as they are effectively all one-off situations, so I am not sure of a design at the moment.
