Clarify conventions for public interfaces #8
Comments
I agree in principle @bplevin36, but I disagree in practice, if nothing else because I already tried to do this in rust-ml/classical-ml-discussion#2 and quickly realised that it's difficult to have design discussions when there are not enough concrete examples. For the current stage of the project, I would focus on algorithm implementation without spending too much time on traits/interfaces until we have at least three or four algorithms in each relevant class (supervised learning, unsupervised learning, pre-processing functions, etc.). I do understand, though, that a minimal set of directives/guidelines is required to avoid having contributors wander in incompatible directions.
I would add that imitating the API of Different story for |
My current heuristics (which might or might not survive the test of time as we start adding more models to
What are your thoughts on this @bplevin36? |
Just to chip in my two cents here; I think it's a good idea to get a couple of community contributed examples into the project so there can be some evaluation of what is working and what isn't with the initial structure and API for algorithm implementation. I just started on the implementation of NearestNeighbors and am currently roughly following the structure that I am going to open a work in progress PR as soon as I'm happy with the scaffolding so other members of the community can have a look and comment. To me this seemed like the most reasonable approach since the project is pretty new and there will be a handful of people doing parallel implementations and most likely hitting similar problems related to project structure and API design. |
These two comments alone help a lot in terms of a unifying strategy. The only thing I would caution against is postponing a finalization of public interface structure until too late. If there's one constant I've observed in the Rust ecosystem, it's that large teams have overlooked Rust for projects it would be really good for due to API instability both in the core language and in popular projects. I'll continue working on the PR I have in mind for linfa and we'll see where the discussion chains go :) |
If what we seek is adoption, following sklearn's API is probably the best idea. |
I think to some extent the terminology, function names, etc. should be kept, but as Rust gives us the opportunity to, for instance, guarantee that a model was fitted to call This would also allow for some nice abstractions for easily creating, say, model pipelines as tuples of models that would themselves implement This would imply deviating from the following pattern of sklearn:
clf = Something(alpha = 0.1)
clf.fit(X, y)
y2 = clf.predict(X2)
into something like:
let hp = Something { alpha: 0.1, ..Default::default() };
let clf = hp.fit(X, y);
let y2 = clf.predict(X2);
Given that typically the set of hyperparameters is rather small and that copying it will take a negligible amount of time compared to the fitting time, and because of the very good compromise in terms of simplicity/flexibility/performance compared to storing references/ The interface I have in mind would look something like this:
pub trait HyperParameters: Clone + Default {}
pub trait Model {
    type HyperParameters: HyperParameters;
    fn hyper_parameters(&self) -> &Self::HyperParameters;
}
#[derive(Clone, Debug, Default)]
struct SomeModelHyperParameters {}
struct SomeModel {
    hyper_parameters: SomeModelHyperParameters,
    /// When fitted, this and other similar variables get filled:
    parameters: (),
}
// These three could be part of derives
impl HyperParameters for SomeModelHyperParameters {}
impl Model for SomeModel {
    type HyperParameters = SomeModelHyperParameters;
    fn hyper_parameters(&self) -> &Self::HyperParameters {
        &self.hyper_parameters
    }
}
// For ease of access to parameters from rest of code, incl. `fit` & the like
impl std::ops::Deref for SomeModel {
    type Target = <Self as Model>::HyperParameters;
    fn deref(&self) -> &Self::Target {
        self.hyper_parameters()
    }
}
// This is how you would typically call fit after setting your parameters:
pub trait Fit<FD>: Model + Sized {
    fn fit(hyper_parameters: <Self as Model>::HyperParameters, fit_data: FD) -> Self;
}
pub trait TargetedHyperParameters: HyperParameters {
    type Model: Model<HyperParameters = Self>;
    fn fit<FD>(self, fit_data: FD) -> Self::Model
    where
        Self::Model: Fit<FD>,
    {
        <Self::Model as Fit<FD>>::fit(self, fit_data)
    }
}
Now given that, the model has the hyperparameters and it seems easy to factorize
I can see a few scenarios where fitting could fail: invalid hyperparameters, invalid data, using a distributed service as a back-end and it disconnects... We could have a What do you think? |
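To make the failure scenarios above a bit more concrete, here is a minimal sketch of a fallible `fit` on the hyperparameters struct. Everything in it is hypothetical and only for illustration: `FitError`, `MeanModelHyperParameters`, `MeanModel` and the `shrinkage` field are not part of linfa or of the traits sketched above. It only covers the "invalid hyperparameters" and "invalid data" cases, and it also shows the "cannot predict before fitting" guarantee, since a `MeanModel` can only be obtained through `fit`.

```rust
use std::fmt;

/// Hypothetical error type; the variants mirror two of the failure
/// scenarios mentioned in the discussion.
#[derive(Debug, PartialEq)]
pub enum FitError {
    InvalidHyperParameters(String),
    InvalidData(String),
}

impl fmt::Display for FitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FitError::InvalidHyperParameters(msg) => write!(f, "invalid hyperparameters: {}", msg),
            FitError::InvalidData(msg) => write!(f, "invalid data: {}", msg),
        }
    }
}

impl std::error::Error for FitError {}

/// Hypothetical hyperparameters for a toy model that predicts a shrunken mean.
#[derive(Clone, Debug, Default)]
pub struct MeanModelHyperParameters {
    /// Assumed to be valid only in the range [0, 1].
    pub shrinkage: f64,
}

/// A fitted model. Its fields are private and there is no public constructor,
/// so the only way to get one is a successful call to `fit`.
#[derive(Debug)]
pub struct MeanModel {
    hyper_parameters: MeanModelHyperParameters,
    mean: f64,
}

impl MeanModelHyperParameters {
    /// Consumes the hyperparameters and returns either a fitted model or an error.
    pub fn fit(self, y: &[f64]) -> Result<MeanModel, FitError> {
        if !(0.0..=1.0).contains(&self.shrinkage) {
            return Err(FitError::InvalidHyperParameters(format!(
                "shrinkage must be in [0, 1], got {}",
                self.shrinkage
            )));
        }
        if y.is_empty() {
            return Err(FitError::InvalidData("empty training set".to_string()));
        }
        let mean = y.iter().sum::<f64>() / y.len() as f64;
        let shrunk = mean * (1.0 - self.shrinkage);
        Ok(MeanModel { hyper_parameters: self, mean: shrunk })
    }
}

impl MeanModel {
    pub fn hyper_parameters(&self) -> &MeanModelHyperParameters {
        &self.hyper_parameters
    }

    /// Returns the fitted (shrunken) mean once per requested sample.
    pub fn predict(&self, n_samples: usize) -> Vec<f64> {
        vec![self.mean; n_samples]
    }
}
```

With this shape, a caller writes something like `let clf = hp.fit(&y)?;` and handles the error once, at fitting time; `predict` itself stays infallible. Whether a shared error enum or a per-model associated error type is the better fit for the project is left open here.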
I was referring to a different problem - the following calls to The sketch of those traits looks quite interesting - not so convinced by the
|
These are very complex questions! :) Let's see...
Saving the full history seems to me like a real edge case, so I wouldn't necessarily always store it in the model (would require a Then one could create a wrapper struct for any model that has
struct ModelWithHPHistory<M: Model> {
    model: M,
    previous_hyper_parameters: Vec<M::HyperParameters>,
}
impl<M: Model> Model for ModelWithHPHistory<M> {
    type HyperParameters = M::HyperParameters;
    fn hyper_parameters(&self) -> &Self::HyperParameters {
        self.model.hyper_parameters()
    }
}
impl<M: Model, FD> Fit<FD> for ModelWithHPHistory<M>
where
    M: Fit<FD>,
{
    fn fit(hyper_parameters: <Self as Model>::HyperParameters, fit_data: FD) -> Self {
        ModelWithHPHistory {
            model: M::fit(hyper_parameters, fit_data),
            previous_hyper_parameters: Vec::new(),
        }
    }
}
// etc.
But if it's enough of an edge case, users who need it could just store it manually. I don't think I've seen that feature anywhere in SKLearn.
We should note that in the SKLearn terminology (which I like around transform and predict):
I think we'd want both KMeans and DBSCAN to have the same trait related to their ability to
pub trait Predict<Input, Output>: Model {
    fn predict(&self, x: Input) -> Output;
}
pub trait FitPredict<FD, Output>: Model + Sized {
    fn fit_predict(
        hyper_parameters: <Self as Model>::HyperParameters,
        fit_data: FD,
    ) -> (Self, Output);
}
This would imply that even DBSCAN would output something, be it a nearly-empty struct containing just the hyperparameters but I think it's preferable: maybe later in the development we would want it to gain some other post-fit capabilities like some details about how it fitted stuff or even actually Once specialization is stabilized, we could provide a default impl for About
I agree we probably need to be able to customize RNGs, in particular for seeding for having reproducible results. I've started analyzing what
|
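As a rough illustration of how the `Predict` and `FitPredict` traits quoted above could be implemented together, here is a self-contained sketch on a deliberately trivial 1-D clusterer. The trait definitions are copied from the comment; `MeanSplit`, `MeanSplitHyperParameters` and the `offset` field are made up for this example and do not correspond to any existing linfa type. The model labels each point 0 or 1 depending on which side of the (offset) training mean it falls.

```rust
pub trait HyperParameters: Clone + Default {}

pub trait Model {
    type HyperParameters: HyperParameters;
    fn hyper_parameters(&self) -> &Self::HyperParameters;
}

pub trait Predict<Input, Output>: Model {
    fn predict(&self, x: Input) -> Output;
}

pub trait FitPredict<FD, Output>: Model + Sized {
    fn fit_predict(
        hyper_parameters: <Self as Model>::HyperParameters,
        fit_data: FD,
    ) -> (Self, Output);
}

/// Hypothetical hyperparameters: an offset added to the learned threshold.
#[derive(Clone, Debug, Default)]
pub struct MeanSplitHyperParameters {
    pub offset: f64,
}

impl HyperParameters for MeanSplitHyperParameters {}

/// Hypothetical toy clusterer that splits 1-D points around the training mean.
#[derive(Debug)]
pub struct MeanSplit {
    hyper_parameters: MeanSplitHyperParameters,
    threshold: f64,
}

impl Model for MeanSplit {
    type HyperParameters = MeanSplitHyperParameters;
    fn hyper_parameters(&self) -> &Self::HyperParameters {
        &self.hyper_parameters
    }
}

impl<'a> Predict<&'a [f64], Vec<usize>> for MeanSplit {
    /// Label 1 for points at or above the threshold, 0 otherwise.
    fn predict(&self, x: &'a [f64]) -> Vec<usize> {
        x.iter().map(|&v| usize::from(v >= self.threshold)).collect()
    }
}

impl<'a> FitPredict<&'a [f64], Vec<usize>> for MeanSplit {
    /// Fits the threshold, then labels the training data with the fitted model.
    fn fit_predict(
        hyper_parameters: MeanSplitHyperParameters,
        fit_data: &'a [f64],
    ) -> (Self, Vec<usize>) {
        // An empty training set falls back to a mean of 0.0 in this sketch.
        let mean = if fit_data.is_empty() {
            0.0
        } else {
            fit_data.iter().sum::<f64>() / fit_data.len() as f64
        };
        let model = MeanSplit {
            threshold: mean + hyper_parameters.offset,
            hyper_parameters,
        };
        let labels: Vec<usize> = model.predict(fit_data);
        (model, labels)
    }
}
```

A model that cannot label unseen points would implement only `FitPredict` and still return a (possibly nearly empty) fitted struct, which is the situation described for DBSCAN above.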
Let's give our minds permission to wander outside what has already been explored a little bit 😁
The
I think we will have to evaluate the ergonomics of
There is also another option: make |
I agree, this option looks better than all the other ones! :)
I agree that it wouldn't be bad to provide that feature, and that the Also, I guess saving the HP history would not be all you need to reproduce: you would also need the incremental data history synchronized with the HP history, which you probably don't want to store in the model without an opt-in because it would be too large/complex, so you would still have some more code to write, in the middle of which I also feel like it's conceptually closer to precisely what we need: sometimes, to require saving any change to the hyperparameters when we update them, for models where it's updatable, and this "being conceptually close" is probably my favorite way to design things that tend to scale well with new features.
You're right, supporting some rarely-useful edge-cases could hurt ergonomics in the most common case here. (Oh, looks a lot like the previous problem! ^^') I think in order to solve this we could have the "ability for output" and "typical output" be separate traits (so adding a |
Closed by #55 |
It would be useful, especially to potential contributors, to have a unified description of how public interfaces should be structured. At first glance, I assumed we would be attempting to stay as close to sklearn's conventions as is reasonable with Rust's conventions and syntax. Looking at the KMeans implementation, however, shows a significant departure from sklearn's parameter naming scheme, and it introduces several additional types for hyperparameter management.
I think it's important in the long run for the public interfaces to be both intuitive and consistent. With that in mind, I think we should:
Start a discussion about design choices for the public interfaces. I personally am not sure that the utility of introducing HyperParams structs and accompanying factory functions justifies the additional API complexity. But I could absolutely be wrong about that. I'd just like to see some rationale.
Write up the conclusions of that discussion in a design doc. This doesn't need to be that complex or in-depth, just a basic statement of conventions and design philosophy to make it easier for contributors.