example setting the penalty parameter? #4

Closed
wqp89324 opened this issue Jan 18, 2019 · 5 comments
Comments

@wqp89324

Is there an example about how to set the pen parameter for Pelt?

@deepcharles
Owner

Hello,

Finding a proper value for the pen parameter heavily depends on the signal at hand. As a rule of thumb, the more noise, samples or dimensions, the larger this parameter should be.
For parametric changes (such as mean-shifts, scale-shifts,...), the Bayesian Information Criterion (BIC) is a good starting point. For instance, detecting mean-shifts with BIC yields:

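import numpy as np
import ruptures as rpt

# signal: array of shape (T, d)
T, d = signal.shape  # number of samples, dimension
sigma = ...  # noise standard deviation
bic = sigma * sigma * np.log(T) * d  # BIC-style penalty for mean-shifts
algo = rpt.Pelt(model="l2").fit(signal)  # l2 cost targets mean-shifts
my_bkps = algo.predict(pen=bic)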
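
The snippet above leaves the noise standard deviation open. When it is not known in advance, one common option (not something prescribed in this thread) is a robust estimate from first differences: for a piecewise-constant signal with Gaussian noise, the median absolute deviation (MAD) of consecutive differences is barely affected by the few differences that straddle a change point. The sketch below assumes a univariate signal given as a plain sequence; `estimate_noise_std` is a hypothetical helper name, not part of ruptures.

```python
import math
import statistics


def estimate_noise_std(values):
    """Robust noise estimate from first differences (MAD-based).

    Assumes a piecewise-constant mean with i.i.d. Gaussian noise,
    so each difference has standard deviation sigma * sqrt(2)
    except at the (few) change points.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    center = statistics.median(diffs)
    mad = statistics.median([abs(x - center) for x in diffs])
    # 1.4826 rescales the MAD to a Gaussian standard deviation;
    # dividing by sqrt(2) undoes the variance doubling of differencing.
    return 1.4826 * mad / math.sqrt(2)
```

For a multivariate signal, the same estimate could be computed per dimension and then averaged, which is consistent with an isotropic noise assumption.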

However, BIC tends to produce penalty values that are too low. When that happens, the simplest procedure is to test several values, as below:

pen_values = np.logspace(0, 3, 10)  # for instance
algo = rpt.Pelt().fit(signal)
bkps_list = [algo.predict(pen=pen) for pen in pen_values]
# then compare elements of bkps_list
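
One possible way to compare the elements of `bkps_list` is to count the change points found for each penalty and look for the longest run of consecutive penalties that agree on that count. This "plateau" heuristic is an assumption on my part, not a method stated in this thread, and both helper names below are hypothetical. The only ruptures fact it relies on is that `predict` returns a sorted list whose last element is the number of samples.

```python
from itertools import groupby


def count_changes(bkps_list):
    """Number of change points per result.

    Each result ends with the signal length, which is not a change point.
    """
    return [len(bkps) - 1 for bkps in bkps_list]


def longest_plateau(pen_values, n_changes):
    """Longest run of consecutive penalties giving the same count.

    Returns (count, first_pen, last_pen); ties go to the earliest run.
    """
    best = None
    start = 0
    for count, run in groupby(n_changes):
        length = len(list(run))
        if best is None or length > best[0]:
            best = (length, count, pen_values[start], pen_values[start + length - 1])
        start += length
    return best[1:]
```

With the variables from the snippet above, this would be called as `longest_plateau(list(pen_values), count_changes(bkps_list))`. A wide plateau suggests the segmentation is stable over that penalty range, but the result should still be checked visually against the signal.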

Cheers,

Charles

@mylife126

Hello, thanks for sharing the BIC theory. Could you please let me know why you calculate BIC as bic = sigma*sigma*np.log(T)*d? I could not find such a derivation in the Wiki. Thanks!

@deepcharles
Owner

Hello,

There might be a mistake indeed. The formula should be

This is only valid for the cost function l2 (see Table 1 of this article).
To verify it, assume the signal is multivariate Gaussian with isotropic variance $\sigma^2$, use the usual formula for BIC and drop all terms that do not depend on $K$, the number of change points.
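
The verification described here can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the exact formula intended in this comment: $K$ change points split the $T$ samples $y_t \in \mathbb{R}^d$ into $K+1$ segments with boundaries $0 = t_0 < t_1 < \dots < t_{K+1} = T$, each segment has its own mean (estimated by the segment average $\bar{y}_k$), $\sigma$ is known, and BIC is taken as $-2\log\hat{L} + p\log T$ with $p$ free parameters. The symbols $y_t$, $\bar{y}_k$, $t_k$ and $p$ are notation introduced for this sketch.

```latex
\begin{align*}
-2\log\hat{L} &= \frac{1}{\sigma^2}\sum_{k=0}^{K}\sum_{t=t_k+1}^{t_{k+1}}
                 \lVert y_t - \bar{y}_k \rVert^2 + T d \log(2\pi\sigma^2), \\
p &= (K+1)\,d + K
     \quad \text{(segment means plus change-point locations)}, \\
\sigma^2\,\mathrm{BIC} &= \underbrace{\sum_{k=0}^{K}\sum_{t=t_k+1}^{t_{k+1}}
                 \lVert y_t - \bar{y}_k \rVert^2}_{\text{sum of l2 costs}}
                 + \sigma^2 (d+1)\log(T)\,K
                 + \text{terms independent of } K .
\end{align*}
```

Under this parameter count, the penalty per change point is $(d+1)\,\sigma^2\log T$. If the change-point locations are not counted as parameters, it reduces to $d\,\sigma^2\log T$, which is the expression used in the first snippet of this thread.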

@deepcharles
Owner

deepcharles commented Nov 3, 2020

If you would like to correct it in the docs, do not hesitate to make a pull request ;)

oh it is not in the docs yet, never mind then.

@DPTPaul

DPTPaul commented Dec 21, 2021

Hello,

Thanks a lot for this very useful package!
I'm currently working on a signal and have no idea about the number of breakpoints.
I understand that the BIC approximation can only be used with an l2 cost function. As I am trying to detect both mean and variance shifts, I would like to use an `rbf` cost function.

Is there any trick to set a good penalty value (or at least a sense of scale)?
