[FEA] Add option to accept sample weight vectors to fit methods #669

Open
beckernick opened this issue Jun 11, 2019 · 7 comments · Fixed by #2057
Labels
? - Needs Triage (Need team to review and classify) · CUDA / C++ (CUDA issue) · Cython / Python (Cython or Python issue) · feature request (New feature or request)

Comments

@beckernick
Member

beckernick commented Jun 11, 2019

Is your feature request related to a problem? Please describe.
In sklearn, estimator.fit can (almost always?) accept a sample_weight parameter (defaulting to None). It lets users pass in a weights vector, with length equal to the number of samples, that determines how much weight each sample should receive.

This would be a useful feature for cuML estimators, too. As an example, see the sklearn KMeans documentation:

sample_weight : array-like, shape (n_samples,), optional
The weights for each observation in X. If None, all observations are assigned equal weight (default: None)
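
To make the semantics concrete, here is a minimal pure-Python sketch of how a per-sample weight vector could enter a k-means style centroid update. It is an illustration only, not sklearn's or cuML's implementation: `weighted_centroid` is a hypothetical helper, and treating `None` as equal weights simply mirrors the documented default.

```python
from typing import List, Optional, Sequence


def weighted_centroid(
    points: Sequence[Sequence[float]],
    sample_weight: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted mean of `points` (hypothetical helper)."""
    if not points:
        raise ValueError("points must not be empty")
    # None means every observation gets equal weight.
    if sample_weight is None:
        sample_weight = [1.0] * len(points)
    if len(sample_weight) != len(points):
        raise ValueError("sample_weight must have one entry per sample")

    total = sum(sample_weight)
    if total <= 0:
        raise ValueError("sample_weight must sum to a positive value")

    n_features = len(points[0])
    # Each coordinate is the weight-scaled sum divided by the total weight.
    return [
        sum(w * p[j] for w, p in zip(sample_weight, points)) / total
        for j in range(n_features)
    ]


X = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0]]
unweighted = weighted_centroid(X)
# Doubling the weight of the last sample pulls the centroid toward it.
weighted = weighted_centroid(X, sample_weight=[1.0, 1.0, 2.0])
```

With these inputs, giving the last sample a weight of 2 has the same effect on the centroid as including that sample twice.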
@beckernick added the feature request and ? - Needs Triage labels Jun 11, 2019
@beckernick changed the title from "[FEA] Add options to accept samples weight vectors to fit methods" to "[FEA] Add option to accept samples weight vectors to fit methods" Jun 11, 2019
@JohnZed
Contributor

JohnZed commented Jun 27, 2019

Agreed this will be useful for most estimators. It will be an estimator-by-estimator process to add it, but we could start with linear models and get some commonality there. Not going to make it to 0.9 given current load there, but we'll keep it for a near future release.

@JohnZed
Contributor

JohnZed commented Aug 8, 2019

Priority is for KMeans based on requests

@Denisevi4

Denisevi4 commented Sep 18, 2019

Linear models pretty please?

@JohnZed
Contributor

JohnZed commented Sep 19, 2019

Sorry, this didn't make it to the current release, but we'll add it to the list for an upcoming release.

@cjnolet added the CUDA / C++ and Cython / Python labels Jan 16, 2020
@JohnZed
Contributor

JohnZed commented Feb 3, 2020

Removing from 0.13 as we've added the k-means specific: #1625

@beckernick
Member Author

beckernick commented Feb 23, 2021

I think it may be worth re-opening this issue for tracking purposes.

A variety of issues exist requesting the ability to specify observation-level weights for various estimators and primitives. As the implementation may need to vary across estimators, it may make sense to keep these issues separate but linked together like an epic. Perhaps this issue can serve as that link, as it's the broadest and the oldest.

Estimators

Primitives

Additionally, as these are implemented, it will also unblock using the respective estimators inside the sklearn AdaBoostClassifier meta-estimator API (#2401 (comment)).
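
As a rough illustration of why this matters for meta-estimators: a boosting loop reweights the training samples each round and passes those weights to the base estimator's fit, so it first needs to know whether fit accepts sample_weight at all. Below is a minimal sketch of such a check using only the standard library. `accepts_sample_weight` and the two toy classes are hypothetical names made up for this example; they are not cuML or sklearn code.

```python
import inspect


def accepts_sample_weight(estimator) -> bool:
    """Return True if estimator.fit declares a `sample_weight` parameter.

    Note: a fit method that only takes **kwargs would report False here.
    """
    return "sample_weight" in inspect.signature(estimator.fit).parameters


class WeightedToyEstimator:
    """Toy estimator whose fit accepts per-sample weights."""

    def fit(self, X, y, sample_weight=None):
        return self


class UnweightedToyEstimator:
    """Toy estimator whose fit has no sample_weight parameter."""

    def fit(self, X, y):
        return self


supports = accepts_sample_weight(WeightedToyEstimator())
lacks = accepts_sample_weight(UnweightedToyEstimator())
```

An estimator shaped like `UnweightedToyEstimator` is the case this issue is about: without the parameter, a weight-based boosting wrapper has nothing to pass the per-round weights to.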

@beckernick reopened this Feb 23, 2021
@JohnZed
Contributor

JohnZed commented Feb 23, 2021

Long term definitely viable. We will evaluate in more detail whether it can make it into 0.19 and mark it as P1 or P0 if so.

@beckernick changed the title from "[FEA] Add option to accept samples weight vectors to fit methods" to "[FEA] Add option to accept sample weight vectors to fit methods" Feb 23, 2021
rapids-bot bot pushed a commit that referenced this issue Aug 31, 2022
…t) (#4867)

Linking #669.
This PR adds a `sample_weight` parameter to the C++ Coordinate Descent solver, which is used by Lasso and ElasticNet.
It includes tests at the C++ and Python levels.
I am also removing some cudaStream parameters where the raft handle can be used instead.

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4867
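
For readers unfamiliar with how sample weights fit into coordinate descent, the following is a minimal pure-Python sketch for a weighted Lasso. It is not the cuML C++ solver. It assumes the objective (1 / (2W)) * sum_i w_i * (y_i - x_i . beta)^2 + alpha * ||beta||_1 with W = sum_i w_i, no intercept, and a fixed number of sweeps. The function names and that normalization are assumptions made for illustration.

```python
from typing import List, Sequence


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink `value` toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso_cd(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    sample_weight: Sequence[float],
    alpha: float,
    n_iter: int = 100,
) -> List[float]:
    """Cyclic coordinate descent for the weighted Lasso objective above."""
    n_features = len(X[0])
    total = sum(sample_weight)
    beta = [0.0] * n_features
    residual = list(y)  # y - X @ beta, with beta starting at zero

    for _ in range(n_iter):
        for j in range(n_features):
            # Weighted second moment of feature j.
            z = sum(w * row[j] ** 2 for w, row in zip(sample_weight, X)) / total
            if z == 0.0:
                continue
            # Weighted correlation of feature j with the partial residual,
            # i.e. the residual with feature j's own contribution added back.
            rho = sum(
                w * row[j] * (r + row[j] * beta[j])
                for w, row, r in zip(sample_weight, X, residual)
            ) / total
            new_beta_j = soft_threshold(rho, alpha) / z
            delta = new_beta_j - beta[j]
            if delta != 0.0:
                residual = [r - row[j] * delta for r, row in zip(residual, X)]
            beta[j] = new_beta_j
    return beta


# A weight of 2 on a sample should match including that sample twice.
beta_weighted = weighted_lasso_cd([[1.0], [2.0]], [1.0, 3.0], [1.0, 2.0], alpha=0.1)
beta_repeated = weighted_lasso_cd(
    [[1.0], [2.0], [2.0]], [1.0, 3.0, 3.0], [1.0, 1.0, 1.0], alpha=0.1
)
```

The weights only enter through the two weighted sums, which is why a solver-level `sample_weight` parameter can be shared by every estimator built on the same solver.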
jakirkham pushed a commit to jakirkham/cuml that referenced this issue Feb 27, 2023
…t) (rapidsai#4867)
