
feat: spin off automl predict_only_model to standard cb model #4279

Merged
46 commits merged into VowpalWabbit:master, Dec 6, 2022

Conversation

bassmang (Member)

No description provided.

bassmang marked this pull request as draft November 10, 2022 22:17
bassmang requested a review from lalo November 15, 2022 19:04
bassmang marked this pull request as ready for review November 16, 2022 14:41
uint64_t multiplier = static_cast<uint64_t>(data.cm->wpp) << data.cm->weights.stride_shift();
for (uint32_t index = 0; index < data.cm->weights.mask(); index += multiplier)
{
if (data.cm->weights[index] != 0.0f)
Collaborator:

this might not be compatible with ftrl/coin - you would need to check all the other values in the same chunk just like here

if (*v != 0. || (&(*v))[1] != 0. || (&(*v))[2] != 0. || (&(*v))[3] != 0. || (&(*v))[4] != 0. ||

Collaborator:

also might be worth checking if GD writes to model if we have something like w[index] == 0.0f, w[index+1] != 0.0f, w[index+2] != 0.0f, w[index+3] != 0.0f.

bassmang (Member Author):

updated to check all weights within the chunk as opposed to just the first
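
For illustration, a minimal sketch of that chunk-wide check (not the exact code merged here); WeightsT stands in for a dense weight container assumed to expose operator[] and stride_shift() like the one quoted above:

#include <cstdint>

// Returns true if any slot in the stride-sized chunk starting at `index`
// is non-zero. Checking only weights[index] would miss the extra per-weight
// state that optimizers such as ftrl/coin keep in the remaining slots.
template <typename WeightsT>
bool chunk_has_nonzero(const WeightsT& weights, uint64_t index)
{
  const uint32_t chunk_size = static_cast<uint32_t>(1) << weights.stride_shift();
  for (uint32_t slot = 0; slot < chunk_size; ++slot)
  {
    if (weights[index + slot] != 0.0f) { return true; }
  }
  return false;
}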

]
},
{
"id": 426,
Collaborator:

you can also recycle this unit test:

BOOST_AUTO_TEST_CASE(automl_equal_no_automl_w_iterations)

the line 280 model is -b 20 and the line 279 one is -b 18, so if you run --predict_only_model on the line 280 model it should be equal to the line 279 one -- as in exactly the same, bit by bit.

bassmang (Member Author):

made a test automl_equal_spin_off_model which is similar and shows the weights matching
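
As a side note, the bit-by-bit comparison mentioned above could be expressed with a helper along these lines (hypothetical, assumed to sit inside the existing Boost.Test harness; it works on two flattened weight dumps rather than real VW model objects):

#include <boost/test/unit_test.hpp>
#include <cstring>
#include <vector>

// Hypothetical helper: require two flattened weight dumps to be exactly the
// same bit by bit, which is stricter than float equality (it also
// distinguishes -0.0f from 0.0f).
void check_weights_bitwise_equal(const std::vector<float>& lhs, const std::vector<float>& rhs)
{
  BOOST_REQUIRE_EQUAL(lhs.size(), rhs.size());
  BOOST_CHECK(std::memcmp(lhs.data(), rhs.data(), lhs.size() * sizeof(float)) == 0);
}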

clear_non_champ_weights(data.cm->weights, data.cm->estimators.size(), data.cm->wpp);

uint64_t multiplier = static_cast<uint64_t>(data.cm->wpp) << data.cm->weights.stride_shift();
for (uint32_t index = 0; index < data.cm->weights.mask(); index += multiplier)
Collaborator:

should this be index *= multiplier?

olgavrou (Collaborator), Nov 17, 2022:

also maybe create a function/iterator here that encapsulates this multiplier indexing

bassmang (Member Author):

it is supposed to be +=; maybe multiplier isn't the right word
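
To make the naming point concrete: multiplier is the width of one chunk (wpp weights per problem times the stride), so advancing the index by it jumps from the start of one chunk to the start of the next. A standalone sketch with made-up example values:

#include <cstdint>
#include <iostream>

int main()
{
  // Example values only, not taken from this PR: 4 weights per problem and
  // a stride_shift of 2, i.e. 4 slots per weight.
  const uint64_t wpp = 4;
  const uint64_t stride_shift = 2;

  // One chunk spans wpp << stride_shift contiguous slots, so stepping the
  // index by this amount lands on the first slot of each successive chunk.
  const uint64_t chunk_width = wpp << stride_shift;  // 16

  for (uint64_t index = 0; index < 4 * chunk_width; index += chunk_width)
  {
    std::cout << "chunk starts at slot " << index << '\n';  // 0, 16, 32, 48
  }
  return 0;
}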

if (data.cm->weights[index] != 0.0f)
{
uint32_t cb_ind = index / data.cm->wpp;
for (uint32_t stride = 0; stride < (static_cast<uint32_t>(1) << data.cm->weights.stride_shift()); ++stride)
Collaborator:

do you want the stride here or the indices between two strides? If it is the latter, using stride here for the indexing parameter may be misleading

bassmang (Member Author):

updated to stride_ind; also moved this function to array_parameters_dense.cc
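
Roughly, the loop being discussed remaps each surviving champ chunk into the layout of a standalone cb model. A simplified, hypothetical version over a flat float buffer (remap_champ_weights is not the merged function, and the destination arithmetic in the real code may differ):

#include <cstdint>
#include <vector>

// Hypothetical, simplified remap: copy the chunk starting at each wpp-group
// boundary down to the slot it would occupy in a single-model layout
// (index / wpp), clearing the old location afterwards.
void remap_champ_weights(std::vector<float>& weights, uint64_t wpp, uint32_t stride_shift)
{
  const uint64_t stride = static_cast<uint64_t>(1) << stride_shift;
  const uint64_t chunk_width = wpp * stride;
  for (uint64_t index = 0; index + chunk_width <= weights.size(); index += chunk_width)
  {
    const uint64_t cb_ind = index / wpp;
    for (uint64_t stride_ind = 0; stride_ind < stride; ++stride_ind)
    {
      weights[cb_ind + stride_ind] = weights[index + stride_ind];
      if (cb_ind != index) { weights[index + stride_ind] = 0.0f; }
    }
  }
}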

clear_non_champ_weights(data.cm->weights, data.cm->estimators.size(), data.cm->wpp);

uint64_t multiplier = static_cast<uint64_t>(data.cm->wpp) << data.cm->weights.stride_shift();
for (uint32_t index = 0; index < data.cm->weights.mask(); index += multiplier)
Collaborator:

are the iterators from array_parameters_dense useful here?

bassmang (Member Author):

this seems possible but might complicate things
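
For what it's worth, a tiny hypothetical wrapper shows what encapsulating the multiplier indexing (as suggested above) might look like; it is not based on the actual array_parameters_dense iterators:

#include <cstdint>
#include <vector>

// Hypothetical helper: walk a flat weight buffer one chunk at a time,
// hiding the "index += wpp << stride_shift" arithmetic behind next().
class chunk_cursor
{
public:
  chunk_cursor(std::vector<float>& weights, uint64_t wpp, uint32_t stride_shift)
      : _weights(weights), _chunk_width(wpp << stride_shift)
  {
  }

  bool done() const { return _index >= _weights.size(); }
  void next() { _index += _chunk_width; }
  uint64_t index() const { return _index; }
  float& first_slot() { return _weights[_index]; }

private:
  std::vector<float>& _weights;
  uint64_t _chunk_width;
  uint64_t _index = 0;
};

// Usage: for (chunk_cursor c(w, wpp, shift); !c.done(); c.next()) { ... }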

],
"depends_on": [
427
]
Collaborator:

it would be good if these model diffs used larger files with more features and interactions, to allow hash collisions to happen and check this more thoroughly (something like the ccb_lots_of_interactions.dat file)

bassmang (Member Author):

this feature isn't compatible with ccb yet. Also, it would be difficult to compare these in runtests since each interaction would need to be enumerated on the command line when comparing to a standard model

Collaborator:

I wasn't suggesting we use the ccb file, but a file like that one, with multiple namespaces to explore, that is a bit more complex than the one exercised here and could potentially surface issues

bassmang (Member Author):

updated the test file; it now has 290k interacted features across 6 namespaces:
4fb69a8#diff-94d0ac7a20310e17e8f59703288b6806f838c0d36c8893ac70385c2227ae62d8R1

Collaborator:

cool

@@ -89,6 +89,26 @@ void dense_parameters::clear_offset(const size_t offset, const size_t params_per
}
}

void dense_parameters::adjust_weights_single_model(const size_t params_per_problem, const size_t model_num)
Collaborator:

this method could also be unit tested in weights_test.cc; it seems like a prime candidate for some rigorous unit testing

bassmang (Member Author):

The weights are tested here:
https://github.com/VowpalWabbit/vowpal_wabbit/pull/4279/files#diff-9932fc58f28480ef64a94f6fc4cb26b0b50e7d94c621a3edb00fd0131fcc14dfR333

It doesn't make much sense to test this without initializing weights with both a standard and an automl model, since it converts one to the other
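
In that spirit, a unit test for the simplified remap sketched earlier in this thread might look roughly like the following (hypothetical; it exercises remap_champ_weights from the sketch above rather than the real dense_parameters::adjust_weights_single_model):

#define BOOST_TEST_MODULE adjust_weights_sketch
#include <boost/test/included/unit_test.hpp>
#include <cstdint>
#include <vector>

// Assumes remap_champ_weights from the earlier sketch is declared/included.
BOOST_AUTO_TEST_CASE(remap_champ_weights_moves_chunk_to_single_model_slot)
{
  const uint64_t wpp = 2;
  const uint32_t stride_shift = 1;  // 2 slots per weight
  std::vector<float> weights(16, 0.0f);

  // Place a recognizable chunk at the start of the second wpp-group (slot 4).
  weights[4] = 1.5f;
  weights[5] = 2.5f;

  remap_champ_weights(weights, wpp, stride_shift);

  // After the remap, that chunk should live at slot 4 / wpp == 2.
  BOOST_CHECK_EQUAL(weights[2], 1.5f);
  BOOST_CHECK_EQUAL(weights[3], 2.5f);
}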

bassmang merged commit 583ce44 into VowpalWabbit:master on Dec 6, 2022
bassmang deleted the spin_off_aml branch on December 6, 2022 16:31