[ENH] Expose seasonality parameters of ProphetPiecewiseLinearTrendForecaster #5834

Merged
merged 11 commits into from
Feb 18, 2024

Conversation

sbuse
Contributor

@sbuse sbuse commented Jan 25, 2024

The PR exposes the seasonal parameters of the ProphetPiecewiseLinearTrendForecaster.

Reference Issues/PRs

The PR is the result of the discussion in this (#5592).

What does this implement/fix? Explain your changes.

This is a simple change to allow the user to define what the seasonality parameters should be. @tpvasconcelos suggested they should be

daily_seasonality=False,
weekly_seasonality=False,
yearly_seasonality=False

but as the discussion (#5592) showed it is not clear if this is the best setting.

Does your contribution introduce a new dependency? If yes, which one?

No there are no new dependencies.

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

I added no new test since it is such a simple change.

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, I've added myself and possibly others to the CODEOWNERS file - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@fkiraly
Collaborator

fkiraly commented Jan 25, 2024

quick question, does this change the default behaviour? I would hope not?

@fkiraly fkiraly added module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting enhancement Adding new functionality labels Jan 25, 2024
@sbuse
Contributor Author

sbuse commented Jan 25, 2024

The default behavior does not change. It just makes the parameters accessible.

Collaborator

@fkiraly fkiraly left a comment

In-principle ok, though I would suggest to improve the docstring (non-blocking)

@sbuse
Contributor Author

sbuse commented Jan 26, 2024

Thanks @fkiraly for asking for a more precise description. Reading into the exact meaning of the parameters made me realise this change will just create a lot of confusion and constraining the model just to do a piece wise linear fit seems a lot clearer.

If you agree, I will create a new branch and constrain the model just to do the piece wise linear fit.

@sbuse sbuse closed this Jan 26, 2024
@tpvasconcelos
Contributor

If you agree, I will create a new branch and constrain the model just to do the piece wise linear fit.

I think that this is a better idea and would make this trend forecaster's behaviour more intuitive 👍

@fkiraly
Collaborator

fkiraly commented Jan 26, 2024

hmmmmm - could you explain the two options?

I will also reopen the PR even if we do not merge it, until the discussion is complete, as closed PRs have much lower visibility for developers. If the discussion gets longer, we should open an issue.

@fkiraly fkiraly reopened this Jan 26, 2024
Collaborator

@fkiraly fkiraly left a comment

should not be merged until discussion is complete

@tpvasconcelos
Contributor

@fkiraly fair enough (good policy!)

For the record, I will quickly re-iterate and extend on some of the points I mentioned in the original PR.

Why I'm not a fan of the current solution

I don't think that a separate ProphetPiecewiseLinearTrendForecaster class should have been introduced. Few reasons for this:

  1. Lots of code repetition between this and the original Prophet class implementation
  2. In the future, users will want more control over how this works internally and will ask for more parameters to be exposed (like the seasonality parameters @sbuse is suggesting in this PR).
    • To give another example, users might also ask to expose holiday-related parameters if, for instance, the effect size of some special days/periods in the series has a significant effect on the fitted trend component and the user doesn't want this since he/she is already modelling special days/periods some other way.
  3. Implementing this class, leaves a door open for someone to create yet more separate ProphetLogisticTrendForecaster, ProphetDeseasonalizer, ProphetRegressors, ProphetHolidays classes, and so on...
  4. All of the extra class implementations listed in 3. suffer from the same problems described in 1. and 2.

My preferred solution

I think that the solution to this problem that would be simplest, cleanest, and clearest-to-the-end-user, is to simply expose an extra parameter to the existing Prophet class that allows users to extract the structural component(s) they want to get out of the Prophet model.

For instance

# this:
forecaster = Prophet(extract_components="trend")
# would be the same as this:
forecaster = ProphetPiecewiseLinearTrendForecaster()

and

# this:
forecaster = Prophet(yearly_seasonality=False, extract_components="trend")
# would be the same as this:
forecaster = ProphetPiecewiseLinearTrendForecaster(yearly_seasonality=False)

and

# this:
forecaster = Prophet(yearly_seasonality=False, extract_components=["daily", "weekly"])
# would be the same as this
forecaster = ProphetDeseasonalizer(yearly_seasonality=False)

The implementation is very straightforward and clear:

  1. Add an extra extract_components parameter to the Prophet class
  2. Change this line in _fbprophet.py to extract the specified component instead of "yhat" (default)

y_pred = out.loc[:, "yhat"]

i.e.,

-  y_pred = out.loc[:, "yhat"] 
+  component = self.extract_components or "yhat"
+  y_pred = out.loc[:, component] 

This ☝️ could be extended to accept multiple components (e.g., extract_components=["daily", "weekly"]) but needs a bit more speccing, since it depends on whether the components are additive or multiplicative, which can be inferred from the fitted self._forecaster. The point is: it can be done!
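
To make the additive versus multiplicative question concrete, here is a minimal sketch of how several components might be combined. It is not part of the PR: `combine_components` is a hypothetical helper, and a plain dict of lists stands in for the forecast table. It assumes Prophet's composition rule yhat = trend * (1 + multiplicative terms) + additive terms, so that additive components sum to an absolute offset while multiplicative components sum to a relative effect on the trend.

```python
def combine_components(forecast, names, mode="additive"):
    """Combine selected component columns of a Prophet-style forecast table.

    forecast : dict mapping column name -> list of floats
        (a stand-in for the table a fitted model returns; hypothetical).
    names : list of component column names to combine.
    mode : "additive" returns the summed offset,
        "multiplicative" returns the factor 1 + summed relative effect.
    """
    missing = [name for name in names if name not in forecast]
    if missing:
        raise KeyError(f"unknown components: {missing}")

    length = len(forecast[names[0]])
    summed = [sum(forecast[name][i] for name in names) for i in range(length)]

    if mode == "additive":
        return summed
    if mode == "multiplicative":
        # Relative effects are applied to the trend as a factor.
        return [1.0 + value for value in summed]
    raise ValueError(f"mode must be 'additive' or 'multiplicative', got {mode!r}")


# Toy values chosen to be exactly representable as floats.
forecast = {
    "trend": [10.0, 11.0, 12.0],
    "daily": [0.25, -0.25, 0.0],
    "weekly": [0.5, 0.5, -0.5],
}

offset = combine_components(forecast, ["daily", "weekly"])
factor = combine_components(forecast, ["daily", "weekly"], mode="multiplicative")
```

In the additive case the result could be subtracted from the series directly; in the multiplicative case the series would be divided by the trend-scaled factor instead, which is the part that would need speccing.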

@sbuse
Contributor Author

sbuse commented Jan 27, 2024

I would like to answer why we should not expose the seasonal components and rather fix them all to False and disable any seasonal modeling. When I suggested implementing a piecewise linear detrender, it was because there is no such thing in the arsenal of sktime or scikit-learn. What I was looking for was a function that just models the trend, and I would then compose the result with another forecaster or deseasonalizer in a pipeline.

What the current implementation does, is to fit a complex model to the data and it is not clear if there are seasonal components added or not. I see two problems with this.

  1. This is confusing for the user. Why do I need to specify a seasonal behavior of my detrender when I want to compose it with another deseasonalizer later in my pipeline?
  2. The seasonal modeling could affect the result of the detrending (we saw this happening in the experiment I posted). I think the detrender should just model the trend and the residual will be handled by the next step in the pipeline.

To me, it was frustrating not finding a way to do this type of detrending and that is why I suggested it. How to best add this to the code base, I don't know but having something with a clear intended usage is appealing to me.

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

Hm, I think this is a very interesting discussion. Let me see if I understand things correctly, please let me know if not.

  • @sbuse's original motivation was to have a piecewise linear interpolator or forecaster - this did not exist, and the quickest way he thought was getting it from Prophet with most features turned off, since it is a component of prophet but not available easily on its own anywhere. @sbuse is not actually interested in Prophet as a whole, just in getting a component for use in a large pipeline.
  • @tpvasconcelos is concerned about multiplication of classes, but acknowledges that at the moment it is not easy to get components from the current Prophet interface in sktime. He proposes to add a feature to the existing Prophet that allows to obtain components, including a piecewise linear trend but also others.

Am I understanding well?

If yes, I do think both viewpoints are valid and not contradictory.
That is:

  • it makes sense to have a class that just does "piecewise linear trend forecast" - prophet or not
  • it would be nice to have trends obtainable from prophet

Some semi-ordered thoughts:

  • from a policy standpoint, sktime encourages contributing classes as long as it is well-described what they do. So even if classes were multiplicative, if someone thinks this is exactly what they need/want (and they commit to contribute/maintain), then it is fine to add it. This is slightly different from how sklearn manages contributions (there, the bar is quite high to add anything).
  • Personally, I do think piecewise linear trend as its own component makes sense. It is a bit unfortunate that there is no separate implementation, but using a constrained prophet instance is better than not having it.
  • It also makes sense to have a decomposition estimator for the prophet model. However, @tpvasconcelos, I wonder, would that not be more sth like a transformer? For example, look at variational mode decomposition, VmdTransformer.
  • I also agree that it makes sense to have seasonality defaults as False in the piecewise linear trend estimator, since users will expect it to not have any additional components by default. However, the estimator has been released, so if we want to change that, we need to go through a deprecation cycle. The easiest way would be to introduce it as a parameter, and change the default, the earliest point for such a change at current is 0.28.0, if a warning message and the parameter is added in 0.27.0 or earlier.

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

apologies for the ping, @hliebert, I meant @tpvasconcelos. The reason is very mundane, GitHub web API has an auto-complete dropdown menu where possible pings are ordered by - not sure - likelihood estimated by an AI or similar. I misclicked.

I wonder though whether the model in question knows something. I would find it scary if you actually end up finding this discussion highly relevant for you, @hliebert.

@tpvasconcelos
Contributor

@fkiraly good summary and I agree with your points!

i.e.,

  • It makes sense to keep this piecewise linear detrender in place since users don't care and don't need to know that the internal implementation is coming from Prophet. They simply want to use a piecewise linear detrender and it makes sense to make it available as a standalone class 👍
  • That said, I also agree with you and @sbuse that the seasonal components should be turned off by default since this is not obvious and not the expected default behaviour.
  • Shame for the deprecation cycle but it makes sense

@tpvasconcelos
Contributor

  • It also makes sense to have a decomposition estimator for the prophet model. However, @tpvasconcelos, I wonder, would that not be more sth like a transformer? For example, look at variational mode decomposition, VmdTransformer.

@fkiraly I think it would be used in a transformer however I'm not sure what the best way to implement this is. I usually use the Detrender transformer whenever I want to remove a component from a ts, even if that component is not a trend.

A bit unrelated to this conversation, but I think that it's a shame that the Detrender class was named this way 😄 because it really is much more generalised than that. Its implementation just fits any forecaster to the data and returns the in-sample residuals (using either an additive or multiplicative model). Sure it works as a "de-trender" if the forecaster is a trend-based forecaster but it works just as well with all other forecasters.
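
The "fit any forecaster, return in-sample residuals" idea can be sketched without sktime at all. The classes below are hypothetical toys written for illustration, not sktime's Detrender or any of its forecasters; they only show that the residual step does not care what kind of forecaster it wraps.

```python
class MeanForecaster:
    """Hypothetical toy forecaster: predicts the training mean everywhere."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict_in_sample(self, n):
        return [self.mean_] * n


class ResidualTransformer:
    """Toy stand-in for the residual logic: fit a forecaster, return residuals."""

    def __init__(self, forecaster, model="additive"):
        self.forecaster = forecaster
        self.model = model

    def fit_transform(self, y):
        fitted = self.forecaster.fit(y).predict_in_sample(len(y))
        if self.model == "additive":
            # Residual = observation minus fitted value.
            return [obs - fit for obs, fit in zip(y, fitted)]
        if self.model == "multiplicative":
            # Residual = observation divided by fitted value.
            return [obs / fit for obs, fit in zip(y, fitted)]
        raise ValueError(f"unknown model: {self.model!r}")


y = [2.0, 4.0, 6.0]
additive_residuals = ResidualTransformer(MeanForecaster()).fit_transform(y)
multiplicative_residuals = ResidualTransformer(
    MeanForecaster(), model="multiplicative"
).fit_transform(y)
```

Swapping `MeanForecaster` for a seasonal forecaster would turn the same transformer into a deseasonalizer, which is the point being made about the name.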

So, back to your question, here's a toy example of how I could use the Prophet components with the Detrender transformer:

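from sklearn.linear_model import Lasso
from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.forecasting.fbprophet import Prophet
from sktime.forecasting.statsforecast import StatsForecastAutoARIMA
from sktime.forecasting.trend import TrendForecaster
from sktime.transformations.series.detrend import Detrender

forecaster = TransformedTargetForecaster(
    [
        # Remove the trend component w/ a linear Lasso detrender
        ("detrender", Detrender(TrendForecaster(Lasso()))),

        # Remove daily, weekly, and yearly seasonal components using Prophet
        # (extract_components is the parameter proposed above; it does not exist yet)
        ("deseasonalizer-dwy", Detrender(Prophet(growth="flat", extract_components=["daily", "weekly", "yearly"]))),

        # Forecast the residuals
        ("forecaster", StatsForecastAutoARIMA()),
    ]
)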

I hope you see now why I wish Detrender was named something more generic like Remover (naming is hard!)

Does this make sense to you?

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

Shame for the deprecation cycle but it makes sense

Well, we try to not accidentally impact users' downstream code without giving advance warning. Not everyone has the capacity to run the full staging/testing/deploy/monitor mlops cycle, and with a sufficiently large user base there is always someone who (a) relies heavily on any given component and (b) ends up getting their pipeline killed if the change were breaking and unannounced...

On the other hand, two months are not as long as one might think.

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

I usually use the Detrender transformer whenever I want to remove a component from a ts, even if that component is not a trend.

Yes, "subtractor" would be more accurate, at the cost of being orders of magnitude more confusing...

We did have this conversation at the very start, btw - Residualator? TakeResiduals?
Remover sounds sth like out of a mafia movie.
A slightly better option would be RemoveNoun, but if Noun = Trend, then we're back at Detrender...

But agreed that naming is hard.

@tpvasconcelos
Contributor

Subtractor doesn't work when model="multiplicative" and my teammates would kill me if I ever named a class Residualator... DetrenderEtAl it is!

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

DetrenderButAlsoResidualsComputatorInGeneral

@sbuse
Contributor Author

sbuse commented Jan 27, 2024

Thanks @tpvasconcelos for the example. Now I get how you want to extract and use the other parts of the model. I wonder though if there are no tools for a Fourier sum in the scikit-learn toolbox.

Some semi-ordered thoughts:

  • Personally, I do think piecewise linear trend as its own component makes sense. It is a bit unfortunate that there is no separate implementation, but using a constrained prophet instance is better than not having it.

We could also try to wrap another implementation of a piecewise linear regression that is not from the prophet model. Personally, I would like that a lot more even though prophet has shown to work quite well. A quick search revealed this code base (https://github.com/chasmani/piecewise-regression)

@fkiraly
Collaborator

fkiraly commented Jan 27, 2024

A quick search revealed this code base (https://github.com/chasmani/piecewise-regression)

Hm, that seems not to be scikit-learn compliant.

If you can find a scikit-learn compliant piecewise linear regressor, you can plug it into TrendForecaster to get a piecewise linear forecaster (which you could then also use in a Detrender...)

@sbuse
Contributor Author

sbuse commented Jan 29, 2024

  • I also agree that it makes sense to have seasonality defaults as False in the piecewise linear trend estimator, since users will expect it to not have any additional components by default. However, the estimator has been released, so if we want to change that, we need to go through a deprecation cycle. The easiest way would be to introduce it as a parameter, and change the default, the earliest point for such a change at current is 0.28.0, if a warning message and the parameter is added in 0.27.0 or earlier.

@fkiraly I wonder how we should proceed to set the seasonality defaults to False. Adding the parameters, then changing the default and then hiding them again sounds cumbersome but if we have to do it we could use this PR to add the parameters.

@fkiraly
Collaborator

fkiraly commented Jan 29, 2024

Adding the parameters, then changing the default and then hiding them again sounds cumbersome but if we have to do it we could use this PR to add the parameters.

Imo that's the simplest compliant pathway to the end state where they are internally different and not exposed.

I would aim for a different end state, exposed but default as False, that's one step shorter.

Yes, this PR could be the start - we could even add a warning right away that the default will change to False in 0.28.0. The typical trick is to set a default of None, and raise the warning only if the value is None (because any other value means it was set by the user).
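
The None-default trick can be sketched in a few lines. This is an illustration only: `resolve_seasonality` is a hypothetical helper, and `"auto"` is an assumed stand-in for whatever users currently get when they do not set the parameter.

```python
import warnings

# Assumption: "auto" stands in for the behaviour users currently get by default.
CURRENT_DEFAULT = "auto"


def resolve_seasonality(name, value):
    """Return the effective setting, warning only when the user left it unset."""
    if value is None:
        warnings.warn(
            f"The default of {name} will change to False in 0.28.0. "
            f"Set {name} explicitly to keep the current behaviour "
            "and silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        return CURRENT_DEFAULT
    # Any non-None value was chosen by the user, so no warning is needed.
    return value


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unset = resolve_seasonality("yearly_seasonality", None)
    explicit = resolve_seasonality("yearly_seasonality", False)

n_warnings = len(caught)
```

Only the call that left the value unset triggers the warning, so users who already pass an explicit value are not bothered.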

Collaborator

@fkiraly fkiraly left a comment

Thanks - almost done, but I think the warning trigger condition is not right.

The warning should be triggered in every case where the user has code that changes logic with 0.28.0. I've added a recipe here on how to ensure that: #5875

The case we need to cover is where the user does not set the seasonality parameter explicitly. Under the current condition, no user would see the warning with the version that contains this PR.

Collaborator

@fkiraly fkiraly left a comment

Thx

Collaborator

@fkiraly fkiraly left a comment

The condition is now right, but now the requirement "values of self params should always mirror values of init params" is violated.

I've created an example specifically for this case and would appreciate feedback on how easy it is to understand!
https://www.sktime.net/en/latest/developer_guide/deprecation.html#id1

Of course I can also make the change (it is small), but (a) it's probably interesting to do and (b) it would be great if we can "test" the new example in the developer guide.

@sbuse
Contributor Author

sbuse commented Feb 14, 2024

@fkiraly Thanks for the example and the doc extension. It makes the deprecation very clear and I will change the script accordingly.

Could you elaborate why the trick with self._parameter is necessary? What do you gain compared to overriding self.parameter? Maybe you could also put a short explanation in the template description.

@fkiraly
Collaborator

fkiraly commented Feb 14, 2024

Could you elaborate why the trick with self._parameter is necessary?

That's coming from an overriding sklearn interface expectation, namely that __init__ params are (a) immediately written to self, and (b) never changed from their original value. If we would not do the "trick", then at some point self.parameter has a different value than what was passed to __init__.
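
A minimal sketch of that expectation, using a hypothetical `TrendSketch` class rather than the real estimator, and assuming `"auto"` as the current effective default: the public attribute keeps exactly what was passed to `__init__`, and the resolved value lives in a private attribute, so rebuilding the estimator from its parameters reproduces the original call.

```python
import warnings


class TrendSketch:
    """Hypothetical estimator illustrating the public/private attribute split."""

    def __init__(self, yearly_seasonality=None):
        # (a) written to self immediately, (b) never changed afterwards
        self.yearly_seasonality = yearly_seasonality

        if yearly_seasonality is None:
            warnings.warn(
                "The default of yearly_seasonality will change to False "
                "in 0.28.0.",
                FutureWarning,
                stacklevel=2,
            )
            # Assumption: "auto" is the current effective default.
            self._yearly_seasonality = "auto"
        else:
            self._yearly_seasonality = yearly_seasonality

    def get_params(self):
        # Reports the values exactly as they were passed to __init__.
        return {"yearly_seasonality": self.yearly_seasonality}

    def clone(self):
        # Rebuilds the estimator from its init params, as cloning would.
        return type(self)(**self.get_params())


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    est = TrendSketch()
    twin = est.clone()
    pinned = TrendSketch(yearly_seasonality=False)
```

Had `self.yearly_seasonality` been overwritten with `"auto"`, the clone would have been constructed with an explicit value, would skip the warning, and would silently keep the old behaviour after the default changes.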

Good idea to add that to the docs. Would you like to add an explanatory sentence at the end of the first example, or elsewhere where it might be useful (and not too distracting)? The best place would be the point at which the reader starts to ask the question, but late enough so they have had time to digest the example.

That is, the best place might be the point at which you started to wonder.

fkiraly
fkiraly previously approved these changes Feb 17, 2024
@fkiraly fkiraly merged commit 9bb3766 into sktime:main Feb 18, 2024
1 check passed
Labels
enhancement Adding new functionality module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting
3 participants