FETA subsampling not working #160

Open
timokau opened this issue Sep 24, 2020 · 3 comments

Comments

@timokau
Collaborator

timokau commented Sep 24, 2020

While working on #116, I noticed that the sub_sampling function of feta_network is broken. It's not exercised in our standard test suite, since it's only needed when the number of objects is higher than the 5 objects our test suite uses.

The function is implemented as follows:

def sub_sampling(self, X, Y):
    if self.n_objects_fit_ > self.max_number_of_objects:
        bucket_size = int(self.n_objects_fit_ / self.max_number_of_objects)
        idx = self.random_state_.randint(
            bucket_size, size=(len(X), self.n_objects_fit_)
        )
        # TODO: subsampling multiple rankings
        idx += np.arange(start=0, stop=self.n_objects_fit_, step=bucket_size)[
            : self.n_objects_fit_
        ]
        X = X[np.arange(len(X))[:, None], idx]
        Y = Y[np.arange(len(X))[:, None], idx]
        tmp_sort = Y.argsort(axis=-1)
        Y = np.empty_like(Y)
        Y[np.arange(len(X))[:, None], tmp_sort] = np.arange(self.n_objects_fit_)
    return X, Y

and breaks at the idx += line because of a shape mismatch. It's trying to add arrays like

[[0 1 0 0 0]
 [0 0 1 1 0]]

and

[0 2 4]

i.e. a 2D array with a 1D array whose length does not match the 2D array's last dimension. I'm not sure how this sampling is supposed to work. Is the intention documented somewhere, @kiudee @prithagupta?
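
For reference, the failure can be reproduced outside the class with plain NumPy. This is only a minimal sketch: the sizes (2 instances, 5 objects, a hypothetical max_number_of_objects of 2) are assumptions chosen to match the arrays shown above, not values taken from the library.

```python
import numpy as np

# Hypothetical sizes, chosen to reproduce the arrays quoted in the issue.
n_instances = 2
n_objects_fit = 5
max_number_of_objects = 2

random_state = np.random.RandomState(0)
bucket_size = int(n_objects_fit / max_number_of_objects)  # 2

# Same calls as in sub_sampling: idx has shape (n_instances, n_objects_fit).
idx = random_state.randint(bucket_size, size=(n_instances, n_objects_fit))

# One offset per bucket start. The slice does not pad, so this has length 3.
offsets = np.arange(start=0, stop=n_objects_fit, step=bucket_size)[:n_objects_fit]

try:
    idx += offsets  # shapes (2, 5) and (3,) cannot be broadcast together
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(idx.shape, offsets, error_message)
```

The in-place addition only works when the number of offsets equals the last dimension of idx (or is 1), which is not the case once bucket_size is larger than 1.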

@prithagupta
Collaborator

@timokau The example you gave is for the choice function, for which this function is overridden in feta_choice.
For discrete choice, this function will produce an error, so we need to implement it for discrete choice as well.

@timokau
Collaborator Author

timokau commented Nov 4, 2020

So this implementation should always be overridden? Could we just remove it then?

@prithagupta
Collaborator

prithagupta commented Nov 5, 2020

I think we are using it for ranking, or we can move it into the FetaObjectRanking class. We should also think about the subsampling method for discrete choice.
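
To make the ranking case concrete, here is one possible reading of what the bucket-based subsampling might be meant to do: draw one object per bucket, so that max_number_of_objects objects remain, and then re-rank them. This is a guess at the intention, not the library's implementation. The standalone function sub_sample_rankings and its signature are hypothetical, and it assumes X has shape (n_instances, n_objects, n_features) and Y holds one ranking per row.

```python
import numpy as np


def sub_sample_rankings(X, Y, max_number_of_objects, random_state):
    """Hypothetical sketch: keep one randomly chosen object per bucket."""
    n_instances, n_objects = Y.shape
    if n_objects <= max_number_of_objects:
        return X, Y

    bucket_size = n_objects // max_number_of_objects
    # One random position inside each bucket, drawn separately per instance.
    idx = random_state.randint(
        bucket_size, size=(n_instances, max_number_of_objects)
    )
    # Shift each column to the start of its bucket: 0, bucket_size, ...
    idx = idx + np.arange(max_number_of_objects) * bucket_size

    rows = np.arange(n_instances)[:, None]
    X_sub = X[rows, idx]
    Y_sub = Y[rows, idx]

    # Map the surviving ranks back onto 0 .. max_number_of_objects - 1.
    order = Y_sub.argsort(axis=-1)
    Y_new = np.empty_like(Y_sub)
    Y_new[rows, order] = np.arange(max_number_of_objects)
    return X_sub, Y_new


# Usage with made-up data: 2 instances, 6 objects, 3 features.
X = np.arange(2 * 6 * 3).reshape(2, 6, 3)
Y = np.array([[0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0]])
X_sub, Y_sub = sub_sample_rankings(X, Y, 3, np.random.RandomState(0))
```

Note that in this sketch the buckets are formed by column position rather than by rank, and any trailing objects left over when n_objects is not a multiple of max_number_of_objects are never sampled. Whether that matches the intended behaviour is exactly the open question here.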
