
FEA Kernel decision trees with user-defined kernel dictionaries #302

Draft · wants to merge 12 commits into main

Conversation

adam2392
Collaborator

Reference Issues/PRs

Towards: #20
Closes: #269

What does this implement/fix? Explain your changes.

  • Adds kernel decision trees

Any other comments?

@adam2392
Collaborator Author

Notes:

  1. fused-type util functions that accept both `vector[intp_t]` and `intp_t[:]` are useful and should be merged upstream
  2. the performance penalty of MORF is most likely in both space and time. We iterate over the patches, which may be of non-trivial size, multiple times: the first pass generates the patch's vectorized indices and stores them in the projection matrix; the second pass applies those indices to obtain the feature values. First, we shouldn't need to store the entire projection matrix, since that potentially scales up the space complexity. Second, we should only need to iterate over each patch once.

To fix 2., we need to decouple the design from the assumptions of the oblique splitter, for which storing all of this state and iterating more than once has negligible cost.
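
A minimal NumPy sketch of the single-pass idea (all names here are hypothetical, not the actual splitter API): accumulate the projected feature value while generating the patch's vectorized indices, so the projection matrix is never materialized:

```python
import numpy as np


def sample_patch_value(X_row, top_left, patch_dims, data_dims, kernel):
    """Single pass over a patch: accumulate the feature value directly.

    X_row      -- one sample, the image flattened to a 1D vector
    top_left   -- vectorized index of the patch's top-left point
    patch_dims -- e.g. (3, 3) for a 3x3 patch
    data_dims  -- e.g. (28, 28), the original image shape
    kernel     -- weights with shape == patch_dims (all ones in the naive case)
    """
    value = 0.0
    for offset in np.ndindex(*patch_dims):
        # C-order raveling is linear in the coordinates, so the vectorized
        # index of (top_left + offset) is the sum of their raveled indices
        # (assuming the patch does not wrap around an image boundary).
        idx = top_left + np.ravel_multi_index(offset, data_dims)
        value += kernel[offset] * X_row[idx]
    return value


# e.g. a 3x3 all-ones patch anchored at pixel (1, 1) of a 28x28 image
X_row = np.arange(28 * 28, dtype=np.float64)
val = sample_patch_value(X_row, top_left=29, patch_dims=(3, 3),
                         data_dims=(28, 28), kernel=np.ones((3, 3)))
```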

Instead, for MORF we may want to instantiate a separate splitter subclass that isn't an extension of the oblique one. Here, we want to store the "parameters" of the projection matrix per split, since we need some way of indexing what the "best split" was and then recovering all the information needed to apply it. The current approach leverages the fact that one can simultaneously iterate over `projection_weights` and `projection_indices` to obtain the feature value during predict. So we could piggyback on that method, but only store the final `projection_indices` and weights once we've selected the "best split".
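
A rough sketch of that idea, with illustrative names (`BestSplitRecord` and `apply_split` are not the actual tree internals): during the split search only the winning candidate's projection is kept per node, and predict-time routing reuses the simultaneous iteration over indices and weights:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BestSplitRecord:
    """Only the winning split's projection is materialized per node."""
    threshold: float
    projection_indices: np.ndarray  # vectorized indices into the flattened X
    projection_weights: np.ndarray  # kernel weights aligned with the indices


def apply_split(record: BestSplitRecord, X_row: np.ndarray) -> bool:
    """Predict-time routing: iterate indices and weights simultaneously."""
    value = 0.0
    for idx, weight in zip(record.projection_indices,
                           record.projection_weights):
        value += weight * X_row[idx]
    return value <= record.threshold  # True -> send the sample left
```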

To parameterize a split's indices, we need the top-left point and the corresponding patch dims. To parameterize a split's weights, we need the relevant kernel. In the naive case, this is just all ones. In the case of a Gaussian kernel, for instance, the number of non-zeros would correspond to the number of split indices. So we can either i) store the parameters of the Gaussian kernel (mu, sigma, shape), ii) store the full set of weights, or iii) store the index of the kernel in the pre-generated kernel dictionary (passed in by the user). We obviously don't want to store the full set of weights, so ii) isn't viable. That leaves i) and iii): i) requires writing a function that regenerates the full kernel from its parameters, while iii) just requires keeping a reference to the original kernel dictionary generated by the user.
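
For concreteness, a sketch of option iii), again with hypothetical names (`make_gaussian_kernel`, `kernel_dictionary`, and the split-parameter layout are all illustrative): each split stores only `(top_left, patch_dims, kernel_idx)`, and the weights are recovered by indexing into the user-provided kernel dictionary:

```python
import numpy as np


def make_gaussian_kernel(shape, mu=0.0, sigma=1.0):
    """Example user-side kernel generator.

    Under option i), the tree would instead store (mu, sigma, shape) per
    split and call a function like this to regenerate the kernel.
    """
    coords = [np.linspace(-1.0, 1.0, n) for n in shape]
    grid = np.meshgrid(*coords, indexing="ij")
    sq_dist = sum((g - mu) ** 2 for g in grid)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


# user-provided kernel dictionary, generated once and shared by all splits
kernel_dictionary = [
    np.ones((3, 3)),                        # naive case: all ones
    make_gaussian_kernel((3, 3), sigma=0.5),
    make_gaussian_kernel((5, 5), sigma=1.0),
]

# per-split parameters: top-left point, patch dims, and a kernel index
split_params = {"top_left": 29, "patch_dims": (3, 3), "kernel_idx": 1}

# under option iii), recovering the weights is just a lookup into the
# dictionary; no regeneration function is needed
weights = kernel_dictionary[split_params["kernel_idx"]]
assert weights.shape == split_params["patch_dims"]
```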
