
FEA Kernel decision trees with user-defined kernel dictionaries #302

Draft · wants to merge 12 commits into main

Conversation

adam2392
Collaborator

Reference Issues/PRs

Towards: #20
Closes: #269

What does this implement/fix? Explain your changes.

  • Adds kernel decision trees

Any other comments?

@adam2392
Collaborator Author

Notes:

  1. fused-type util functions that accept both `vector[intp_t]` and `intp_t[:]` are useful and should be merged upstream
  2. the performance penalty of MORF is most likely in both space and time. We iterate over the patches, which may be of non-trivial size, multiple times: the first pass generates the patch's vectorized indices and stores them in the projection matrix; the second pass applies those indices to obtain the feature values. First, we shouldn't need to store the entire projection matrix, since that potentially scales up the space complexity. Second, we should only need to iterate over each patch once.

To fix 2., we need to decouple the design from the assumptions of the oblique splitter, for which storing all of this state and iterating more than once has negligible cost.
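
A minimal NumPy sketch of the single-pass idea (all names here are hypothetical, not the actual splitter API): accumulate the projected feature value while generating the patch's vectorized indices, so the projection matrix is never materialized:

```python
import numpy as np


def sample_patch_value(X_row, top_left, patch_dims, data_dims, kernel):
    """Single pass over a patch: accumulate the feature value directly.

    X_row      -- one sample, the image flattened to a 1D vector
    top_left   -- vectorized index of the patch's top-left point
    patch_dims -- e.g. (3, 3) for a 3x3 patch
    data_dims  -- e.g. (28, 28), the original image shape
    kernel     -- weights with shape == patch_dims (all ones in the naive case)
    """
    value = 0.0
    for offset in np.ndindex(*patch_dims):
        # C-order raveling is linear in the coordinates, so the vectorized
        # index of (top_left + offset) is the sum of their raveled indices
        # (assuming the patch does not wrap around an image boundary).
        idx = top_left + np.ravel_multi_index(offset, data_dims)
        value += kernel[offset] * X_row[idx]
    return value


# e.g. a 3x3 all-ones patch anchored at pixel (1, 1) of a 28x28 image
X_row = np.arange(28 * 28, dtype=np.float64)
val = sample_patch_value(X_row, top_left=29, patch_dims=(3, 3),
                         data_dims=(28, 28), kernel=np.ones((3, 3)))
```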

Instead, for MORF we may want to instantiate a separate splitter subclass that isn't an extension of the oblique one. Here, we want to store the "parameters" of the projection matrix per split, since we need some way of indexing what the "best split" was and then recovering all the information needed to apply it. The current approach leverages the fact that one can simultaneously iterate over `projection_weights` and `projection_indices` to obtain the feature value during predict. So we could piggyback on that method, but only store the final `projection_indices` and weights once we've selected the "best split".
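
A rough sketch of that idea, with illustrative names (`BestSplitRecord` and `apply_split` are not the actual tree internals): during the split search only the winning candidate's projection is kept per node, and predict-time routing reuses the simultaneous iteration over indices and weights:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BestSplitRecord:
    """Only the winning split's projection is materialized per node."""
    threshold: float
    projection_indices: np.ndarray  # vectorized indices into the flattened X
    projection_weights: np.ndarray  # kernel weights aligned with the indices


def apply_split(record: BestSplitRecord, X_row: np.ndarray) -> bool:
    """Predict-time routing: iterate indices and weights simultaneously."""
    value = 0.0
    for idx, weight in zip(record.projection_indices,
                           record.projection_weights):
        value += weight * X_row[idx]
    return value <= record.threshold  # True -> send the sample left
```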

To parameterize a split's indices, we need the top-left point and the corresponding patch dims. To parameterize a split's weights, we need the relevant kernel. In the naive case, this is just all ones. In the case of a Gaussian kernel, for instance, the number of non-zeros would correspond to the number of split indices. So we can either i) store the parameters of the Gaussian kernel (mu, sigma, shape), ii) store the full set of weights, or iii) store the index of the kernel in the pre-generated kernel dictionary (passed in by the user). We obviously don't want to store the full set of weights, so ii) isn't viable. That leaves i) and iii): i) requires writing a function that regenerates the full kernel from its parameters, while iii) just requires keeping a reference to the original kernel dictionary generated by the user.
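
For concreteness, a sketch of option iii), again with hypothetical names (`make_gaussian_kernel`, `kernel_dictionary`, and the split-parameter layout are all illustrative): each split stores only `(top_left, patch_dims, kernel_idx)`, and the weights are recovered by indexing into the user-provided kernel dictionary:

```python
import numpy as np


def make_gaussian_kernel(shape, mu=0.0, sigma=1.0):
    """Example user-side kernel generator.

    Under option i), the tree would instead store (mu, sigma, shape) per
    split and call a function like this to regenerate the kernel.
    """
    coords = [np.linspace(-1.0, 1.0, n) for n in shape]
    grid = np.meshgrid(*coords, indexing="ij")
    sq_dist = sum((g - mu) ** 2 for g in grid)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


# user-provided kernel dictionary, generated once and shared by all splits
kernel_dictionary = [
    np.ones((3, 3)),                        # naive case: all ones
    make_gaussian_kernel((3, 3), sigma=0.5),
    make_gaussian_kernel((5, 5), sigma=1.0),
]

# per-split parameters: top-left point, patch dims, and a kernel index
split_params = {"top_left": 29, "patch_dims": (3, 3), "kernel_idx": 1}

# under option iii), recovering the weights is just a lookup into the
# dictionary; no regeneration function is needed
weights = kernel_dictionary[split_params["kernel_idx"]]
assert weights.shape == split_params["patch_dims"]
```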
