Complex penalties #131
Moved from an issue.
I note that in the paper you discuss different penalties in section 6.1. However, from looking through the library, it seems that ruptures only supports a fixed linear penalty (i.e. beta). Am I right to assume that it doesn't work with penalties more complex than linear ones, such as AIC? Further, if I wanted to implement a method where you would normally calculate a p-value for splitting (i.e. a likelihood ratio test following a chi-squared distribution), is the idea that we just have the error() method return the test statistic without testing for significance (i.e. the raw likelihood ratio), and that the penalty constant implies a p-value? I suppose this makes segmentation fast and flexible, but highly dependent on the choice of penalty?
Replies: 1 comment 6 replies
Hello Michael,
AIC is also a linear penalty (in the context of change point detection), so you could use it as well (you only need to set beta accordingly). However, you are right to assume that ruptures can only deal with linear penalties. For general penalty formulas, there is no efficient way to find the best segmentation. If you have a specific case in mind that you think might be worth integrating into ruptures, I would be glad to hear it.
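To make the point about linear penalties concrete, here is a minimal pure-Python sketch of why they are tractable. It is not the ruptures implementation. With a penalty of beta per change point, the penalized cost splits into a sum over segments, so a simple dynamic program finds the best segmentation. The names l2_cost and penalized_segmentation are made up for this illustration, the cost is assumed to be the sum of squared deviations from the segment mean, and the returned list ends with the signal length.

```python
# Illustrative sketch only: optimal partitioning under a linear penalty,
# i.e. minimise (sum of segment costs) + beta * (number of change points).
# Function names are hypothetical and not part of the ruptures API.

def l2_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean on signal[start:end]."""
    n = end - start
    s = prefix[end] - prefix[start]
    sq = prefix_sq[end] - prefix_sq[start]
    return sq - s * s / n


def penalized_segmentation(signal, beta):
    """Return segment end indexes; the last one is len(signal)."""
    n = len(signal)
    # Prefix sums let us evaluate any segment cost in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for x in signal:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    # best[t] is the minimal penalized cost of signal[:t].
    # best[0] = -beta so that the first segment is not charged a penalty.
    best = [-beta] + [float("inf")] * n
    last = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            cand = best[s] + l2_cost(prefix, prefix_sq, s, t) + beta
            if cand < best[t]:
                best[t], last[t] = cand, s

    # Walk the back-pointers to recover the segment ends.
    bkps = []
    t = n
    while t > 0:
        bkps.append(t)
        t = last[t]
    return bkps[::-1]


# Noise-free toy signal with two mean shifts, at indexes 20 and 40.
signal = [0.0] * 20 + [5.0] * 20 + [1.0] * 20
bkps_small = penalized_segmentation(signal, beta=1.0)   # small penalty
bkps_large = penalized_segmentation(signal, beta=1e6)   # very large penalty
print(bkps_small, bkps_large)
```

The recursion only works because the penalty adds the same beta for every extra segment. A penalty that depends on the segmentation in a non-additive way would not decompose like this, which is the difficulty with general penalty formulas mentioned above.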
As for your second question, the .error(start, end) method only returns the cost on a given sub-signal signal[start:end]. It is not always equal to the likelihood because constant terms were sometimes discarded (because they did not change …