docs(costcosine): add entry for CostCosine in docs (#93)
deepcharles authored Dec 5, 2020
1 parent b4abc34 commit da7544f
Showing 4 changed files with 84 additions and 4 deletions.
5 changes: 5 additions & 0 deletions docs/code-reference/costs/costcosine-reference.md
@@ -0,0 +1,5 @@
# Kernelized mean change (CostCosine)

::: ruptures.costs.costcosine.CostCosine
    rendering:
        show_root_heading: true
70 changes: 70 additions & 0 deletions docs/user-guide/costs/costcosine.md
@@ -0,0 +1,70 @@
# Kernelized mean change (`CostCosine`)

## Description

Given a positive semi-definite kernel $k(\cdot, \cdot) : \mathbb{R}^d\times \mathbb{R}^d \mapsto \mathbb{R}$ and its associated feature map $\Phi:\mathbb{R}^d \mapsto \mathcal{H}$ (where $\mathcal{H}$ is an appropriate Hilbert space), this cost function detects changes in the mean of the embedded signal $\{\Phi(y_t)\}_t$ [[Arlot2019](#Arlot2019)].
Formally, for a signal $\{y_t\}_t$ and indexes $a < b$, the cost on the sub-signal $y_{a..b}$ is

$$
c(y_{a..b}) = \sum_{t=a}^{b-1} \| \Phi(y_t) - \bar{\mu}_{a..b} \|_{\mathcal{H}}^2
$$

where $\bar{\mu}_{a..b}$ is the empirical mean of the embedded sub-signal $\{\Phi(y_t)\}_{a\leq t < b}$.
Here the kernel is the cosine similarity:

$$
k(x, y) = \frac{\langle x\mid y\rangle}{\|x\|\|y\|}
$$

where $\langle \cdot\mid\cdot \rangle$ and $\| \cdot \|$ are the Euclidean scalar product and norm respectively.
In other words, it is equal to the dot product of the L2-normalized vectors.
This cost function has been used for music segmentation tasks [[Cooper2002](#Cooper2002)] and topic segmentation of text [[Hearst1994](#Hearst1994)].
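
Expanding the squared norm shows that the cost above depends on the signal only through kernel evaluations: $c(y_{a..b}) = \sum_{t=a}^{b-1} k(y_t, y_t) - \frac{1}{b-a}\sum_{s=a}^{b-1}\sum_{t=a}^{b-1} k(y_s, y_t)$. The following is a minimal pure-Python sketch of that identity with the cosine kernel. The helper names `cosine_kernel` and `segment_cost` are hypothetical and chosen for illustration; this is not necessarily how `ruptures` computes the cost internally, and it assumes that no sample is the zero vector.

```python
import math


def cosine_kernel(x, y):
    """Cosine similarity between two non-zero vectors (illustrative helper)."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)


def segment_cost(signal, a, b):
    """Kernel-trick form of the cost on signal[a:b] (illustrative helper)."""
    sub = signal[a:b]
    # sum over t of k(y_t, y_t)
    diag = sum(cosine_kernel(y, y) for y in sub)
    # sum over all pairs (s, t) of k(y_s, y_t)
    full = sum(cosine_kernel(s, t) for s in sub for t in sub)
    return diag - full / (b - a)


# Two samples pointing along the first axis, then two along the second axis.
signal = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 1.0]]
print(segment_cost(signal, 0, 2))  # samples share one direction: cost is 0
print(segment_cost(signal, 0, 4))  # two orthogonal directions: cost is 2
```

Because the cosine kernel ignores vector norms, the first segment has zero cost even though its two samples differ in magnitude; only a change in direction increases the cost.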

## Usage

Start with the usual imports and create a signal.

```python
import numpy as np
import matplotlib.pyplot as plt
import ruptures as rpt

# create the data
n, dim = 500, 3  # number of samples, dimension
n_bkps, sigma = 3, 5  # number of change points, noise standard deviation
signal, bkps = rpt.pw_constant(n, dim, n_bkps, noise_std=sigma)
```

Then create a [`CostCosine`][ruptures.costs.costcosine.CostCosine] instance and print the cost of the sub-signal `signal[50:150]`.

```python
c = rpt.costs.CostCosine().fit(signal)
print(c.error(50, 150))
```

You can also compute the sum of costs for a given list of change points.

```python
print(c.sum_of_costs(bkps))
print(c.sum_of_costs([10, 100, 200, 250, n]))
```
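
Conceptually, the sum of costs adds up the segment costs over the consecutive intervals delimited by the change points, with the last index equal to the number of samples (as in `[10, 100, 200, 250, n]` above). The sketch below illustrates that bookkeeping in plain Python. The function `sum_of_costs` here is a hypothetical stand-alone helper, and `toy_cost` is a stand-in cost (squared deviation from the segment mean), not `CostCosine`.

```python
def sum_of_costs(segment_cost, bkps):
    """Sum segment_cost(start, end) over the partition defined by bkps.

    Assumes the last entry of bkps is the number of samples.
    """
    bounds = [0] + sorted(bkps)
    return sum(
        segment_cost(start, end) for start, end in zip(bounds[:-1], bounds[1:])
    )


# Toy one-dimensional signal with changes after index 2 and index 5.
values = [1.0, 1.0, 5.0, 5.0, 5.0, 2.0]


def toy_cost(start, end):
    """Stand-in cost: sum of squared deviations from the segment mean."""
    seg = values[start:end]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


print(sum_of_costs(toy_cost, [2, 5, 6]))  # true change points: total cost is 0
print(sum_of_costs(toy_cost, [3, 6]))  # misplaced change point: larger cost
```

A segmentation that matches the true change points yields the smallest total, which is the quantity a detection algorithm tries to minimize.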

In order to use this cost class in a change point detection algorithm (inheriting from [`BaseEstimator`][ruptures.base.BaseEstimator]), either pass a [`CostCosine`][ruptures.costs.costcosine.CostCosine] instance (through the argument `custom_cost`) or set `model="cosine"`.

```python
c = rpt.costs.CostCosine()
algo = rpt.Dynp(custom_cost=c)
# is equivalent to
algo = rpt.Dynp(model="cosine")
```

## References

<a id="Hearst1994">[Hearst1994]</a>
Hearst, M. A. (1994). Multi-paragraph segmentation of expository text. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 9–16). Las Cruces, New Mexico, USA.

<a id="Cooper2002">[Cooper2002]</a>
Cooper, M., & Foote, J. (2002). Automatic music summarization via similarity analysis. In Proceedings of the International Conference on Music Information Retrieval (ISMIR) (pp. 81–85). Paris, France.

<a id="Arlot2019">[Arlot2019]</a>
Arlot, S., Celisse, A., & Harchaoui, Z. (2019). A kernel multiple change-point algorithm via model selection. Journal of Machine Learning Research, 20(162), 1–56.
11 changes: 7 additions & 4 deletions docs/user-guide/detection/kernelcpd.md
@@ -48,11 +48,14 @@ The exact optimization procedure is described in [[Killick2012]](#Killick2012).

## Available kernels
We list below a number of kernels that are already implemented in `ruptures`.
In the following, $u$ and $v$ are two d-dimensional vectors.
In the following, $u$ and $v$ are two d-dimensional vectors and $\|\cdot\|$ is the Euclidean norm.

| Kernel | Description | Cost function |
| -------------------------- | --------------------------------------------------------------------------------------------------- | ---------------------------------------------------- |
| Linear<br>`model="linear"` | $k_{\text{linear}}(u, v) = u^T v$                                                                   | [`CostL2`](../../user-guide/costs/costl2.md)         |
| Gaussian<br>`model="rbf"` | $k_{\text{Gaussian}}(u,v)=\exp(-\gamma \|u-v\|^2)$<br>where $\gamma>0$ is a user-defined parameter. | [`CostRbf`](../../user-guide/costs/costrbf.md) |
| Cosine<br>`model="cosine"` | $k_{\text{cosine}}(u, v) = (u^T v)/(\|u\|\|v\|)$ | [`CostCosine`](../../user-guide/costs/costcosine.md) |

- **Linear kernel:** $k_{\text{linear}}(u, v) = u^T v$ and the induced norm is the Euclidean norm.
- **Gaussian kernel:** (also known as radial basis function, rbf), $k_{\text{Gaussian}}(u,v)=\exp(-\gamma \|u-v\|^2)$ where $\|\cdot\|$ is the Euclidean norm and $\gamma>0$ is a user-defined parameter.
- **Cosine similarity:** $k_{\text{cosine}}(u, v) = (u^T v)/(\|u\|\|v\|)$ (scaled version of the linear kernel).

## Implementation and usage

2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -50,6 +50,7 @@ nav:
- 'CostL2': user-guide/costs/costl2.md
- 'CostNormal': user-guide/costs/costnormal.md
- 'CostRbf': user-guide/costs/costrbf.md
- 'CostCosine': user-guide/costs/costcosine.md
- 'CostLinear': user-guide/costs/costlinear.md
- 'CostRank': user-guide/costs/costrank.md
- 'CostMl': user-guide/costs/costml.md
@@ -93,6 +94,7 @@ nav:
- 'CostL2': code-reference/costs/costl2-reference.md
- 'CostNormal': code-reference/costs/costnormal-reference.md
- 'CostRbf': code-reference/costs/costrbf-reference.md
- 'CostCosine': code-reference/costs/costcosine-reference.md
- 'CostLinear': code-reference/costs/costlinear-reference.md
- 'CostRank': code-reference/costs/costrank-reference.md
- 'CostMl': code-reference/costs/costml-reference.md