--invert_hash doesn't work with --dsjson #1614

Closed
marco-rossi29 opened this issue Oct 4, 2018 · 2 comments
Labels: Bug (bug in learning semantics, critical by default), Priority: High

Comments

@marco-rossi29
Collaborator

This issue is present in both version 8.4.0 (commit 78e4ae7) and 8.6.1 (commit 94a6298).

Issue: using the DSJSON file here as data.txt and running:
vw.exe --dsjson data.txt --readable_model model.readable --invert_hash model.readable_inv --cb_adf
gives only 1 feature and the constant in model.readable_inv:

Version 8.6.1
Id 
Min label:-1.22449
Max label:0
bits:18
lda:0
0 ngram:
0 skip:
options: --hash_seed 0 --cb_adf --cb_type ips --csoaa_ldf multiline --csoaa_rank --link identity
Checksum: 62936095
event_sum 0
action_sum 0
:0
:26051:-0.0168612
Constant:116060:-0.0168612

whereas model.readable has ~190 features:

Version 8.6.1
Id 
Min label:-1.22449
Max label:0
bits:18
lda:0
0 ngram:
0 skip:
options: --hash_seed 0 --cb_adf --cb_type ips --csoaa_ldf multiline --csoaa_rank --link identity
Checksum: 62936095
event_sum 0
action_sum 0
:0
175:0.0513654
1982:-1.08739
2089:0.0265077
3437:0.00171433
3620:0.0620581
3923:-11.723
4765:0.00162573
5289:0.00970562
7524:-0.0468675
12196:0.00228014
12373:0.0449017
12553:0.63988
13266:-0.0716713
14327:0.00162573
16001:0.00162573
16565:0.0254798
19539:0.0105616
20477:0.0294717
21080:0.022225
22454:594.956
22788:0.00554071
23093:17.1233
23283:855.118
24121:-8.95805
24174:0.597908
24823:0.0270979
26051:-0.0168612
26553:-0.0536117
27399:0.0137574
27910:0.0100833
35901:0.0257047
36441:0.0254798
36831:0.0016337
38323:0.00162573
40990:0.0251746
41407:0.0213216
42868:0.0581745
43143:0.0100802
45848:0.0167105
47813:0.0212803
48026:0.108382
49356:0.00162573
53057:0.0101623
55125:0.0017898
55156:0.0214329
55384:6.36968
59949:0.0544446
61595:0.0254798
61666:0.00974771
63915:0.0016419
64110:0.00959619
66549:-0.0190973
72377:0.00162573
74138:0.00162573
75106:0.0254999
75981:0.00180222
77105:-5.79003
78004:0.0214775
78815:-0.0451893
79695:0.00162573
79900:0.00162573
80802:0.0016269
83044:0.00162573
83205:0.0540289
83321:0.0164848
84231:0.0113576
86597:0.0395648
86739:0.00162573
89785:-0.0527661
93058:0.00162573
94653:8.66265
95054:0.00920749
96482:0.0281576
98563:0.013353
99815:0.0099329
101042:0.0256731
103135:0.0547361
104000:68.4282
104155:0.00165006
105990:2.36476
106680:0.0544446
110002:-0.0190973
110338:27.7843
112177:0.013353
112729:0.030168
113804:0.013353
115073:0.0257047
115344:3.54633
116060:-0.0168612
119392:0.00230874
119652:0.0212803
119717:0.0327092
123962:0.00162573
124098:0.0267705
125209:3.05408
126417:0.00789929
128361:0.0392587
130819:0.00164846
131812:0.00162573
134629:0.010112
136730:0.00268611
139337:2.59888
140026:0.175844
141252:3.19846
144732:0.00162573
145330:0.0492531
145888:0.00252063
149238:0.0186415
150232:-1.73182
151096:0.200728
152187:0.0979684
152289:0.116092
152810:-0.134205
152984:0.00164002
155643:0.0100106
157471:3.80609
160537:4.45124
163964:0.0016269
166742:27.8495
172767:-0.0468675
175990:2.16485
176598:-0.0336927
176813:0.384276
178610:-0.0468675
179126:-0.328854
179966:0.0258893
180391:0.00162573
181868:0.0486424
183350:0.0212803
184440:0.00162573
185815:0.0016269
185868:0.0224254
186906:-0.0168612
187185:0.0186326
187295:0.114245
188970:-8.48921
190538:0.00162573
190809:0.399807
191905:0.0198237
194265:-6.00704
194843:0.00162573
195140:1.01341
195681:0.013353
197213:28.3185
198738:0.00371133
199171:0.160765
200048:-23.8867
200737:0.0430822
204571:0.015883
206544:0.00729267
206768:0.0212803
206850:0.013353
209343:0.321475
210713:0.0285368
216855:0.0107094
217656:0.0254798
218742:0.0330709
219144:-0.0473967
219687:0.149107
221342:0.0642925
222195:0.030631
223236:0.0285168
227378:0.00606413
229175:0.00999028
230019:1.41894
235673:0.0100562
240558:0.0472738
241456:0.0218051
241995:0.00162573
244364:0.0016269
244396:0.0677292
245652:0.0644517
246135:0.0121952
246866:0.0254798
247145:0.3251
249709:0.580481
249778:-0.0168612
250620:7.09266
251413:0.768552
251509:0.013353
251635:-0.0190973
251874:-53.5849
252385:1.2489
252509:0.0258934
253098:7.09266
257210:0.104905
258018:0.010041
259943:0.0213216
261623:5.32071
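
The mismatch between the two dumps can be checked mechanically. Below is a minimal sketch, assuming each dump has a header that ends at a line containing exactly `:0`, followed by one `...:<weight>` entry per line, as in the listings above. The function name `count_weights` and the two sample strings are illustrative only and are not part of VW.

```python
# Excerpts copied from the two dumps in this report (headers shortened).
INVERT_HASH_SAMPLE = """Version 8.6.1
bits:18
lda:0
:0
:26051:-0.0168612
Constant:116060:-0.0168612
"""

READABLE_SAMPLE = """Version 8.6.1
bits:18
lda:0
:0
175:0.0513654
1982:-1.08739
2089:0.0265077
"""


def count_weights(model_text: str) -> int:
    """Count weight entries that follow the ':0' separator line."""
    lines = [line.strip() for line in model_text.splitlines()]
    # Assumption: the header ends at the first line that is exactly ':0'
    # (header lines such as 'lda:0' do not match).
    try:
        start = lines.index(":0") + 1
    except ValueError:
        return 0
    count = 0
    for line in lines[start:]:
        # Both formats end each entry with ':<weight>'; the --invert_hash
        # format additionally prefixes the feature name.
        _, sep, weight = line.rpartition(":")
        if not sep:
            continue
        try:
            float(weight)
        except ValueError:
            continue
        count += 1
    return count


print(count_weights(INVERT_HASH_SAMPLE), count_weights(READABLE_SAMPLE))
```

To run the same check on real output, pass the contents of model.readable and model.readable_inv to `count_weights`; the two counts are expected to agree when --invert_hash works correctly.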
jackgerrits added the Bug label on Dec 12, 2018
jackgerrits self-assigned this on Dec 20, 2018
jackgerrits added the In Progress and Priority: High labels on Dec 20, 2018
@jackgerrits
Member

This bug affects both --dsjson and --json. --invert_hash implicitly relies on some audit functionality: with the standard VW text format this works correctly, but the JSON parsing path does not use audit. While I work on a proper fix, the workaround is to explicitly add --audit to your command line.


Temporary Workaround

When using --json or --dsjson add --audit to the command line.
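
For scripts that assemble the vw command line, the workaround could be applied automatically. The sketch below is only an illustration: the helper name `with_audit_workaround` is hypothetical, and the flags and file names are taken from the command in this report.

```python
from typing import List


def with_audit_workaround(args: List[str]) -> List[str]:
    """Return a copy of args with --audit appended when the workaround applies.

    The workaround applies when --invert_hash is combined with --json or
    --dsjson and --audit is not already present.
    """
    uses_json = "--json" in args or "--dsjson" in args
    if uses_json and "--invert_hash" in args and "--audit" not in args:
        return [*args, "--audit"]
    return list(args)


# Command from the report, split into an argument list.
cmd = [
    "vw", "--dsjson", "data.txt",
    "--readable_model", "model.readable",
    "--invert_hash", "model.readable_inv",
    "--cb_adf",
]
print(" ".join(with_audit_workaround(cmd)))
```

The resulting list can be passed to `subprocess.run`. Commands that use the standard VW text format are returned unchanged, since the report says --invert_hash works correctly there.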

jackgerrits removed the In Progress label on Jan 16, 2019
ataymano self-assigned this on Apr 19, 2019
@ataymano
Member

Cannot reproduce on current master.
