% Copyright (c) 2012-2020, Yegor Bugayenko
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions
% are met: 1) Redistributions of source code must retain the above
% copyright notice, this list of conditions and the following
% disclaimer. 2) Redistributions in binary form must reproduce the above
% copyright notice, this list of conditions and the following
% disclaimer in the documentation and/or other materials provided
% with the distribution. 3) Neither the name of Yegor Bugayenko nor
% the names of other contributors may be used to endorse or promote
% products derived from this software without specific prior written
% permission.
%
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
% "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
% NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
% FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
% THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
% INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
% (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
% SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
% HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
% STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
% OF THE POSSIBILITY OF SUCH DAMAGE.
%
\documentclass[12pt]{article}
\usepackage{amsmath}
\begin{document}
\title{Source Code Volatility (SCV)\\to Spot Dead Code}
\author{Yegor Bugayenko}
\maketitle
\section{Introduction}
Source code volatility is an experimental metric that
shows how large the difference is between actively changed and rarely
changed (possibly dead) code. The assumption is that a large share of
dead code indicates maintainability problems in the project.
\section{Details}
First, the Git history is inspected to count
how many times each of the $N$ source code files was touched
during the lifetime of the repository (files that no longer exist
in the repository are excluded):
\begin{eqnarray}
T = [t_1, t_2, \dots, t_N]
\end{eqnarray}
Then, the entire interval between $\hat{T}$ (the minimum value)
and $\check{T}$ (the maximum value) is divided into $Z$ groups of equal width $\delta$:
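\begin{align}
G &= [g_1, g_2, \dots, g_{Z}] \\
\delta &= ( \check{T} - \hat{T} ) / Z \\
g_j &= \sum_{i=1}^N [ \hat{T} + (j-1)\delta \le t_i < \hat{T} + j\delta ]
\end{align}
Here $[\cdot]$ equals 1 when the condition inside holds and 0 otherwise,
so $g_j$ is the number of files that fall into the $j$-th group; the maximum
value $\check{T}$ is counted in the last group $g_Z$, so that every file
belongs to exactly one group.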
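As an illustration with made-up numbers (not taken from any real repository),
take $N = 6$ files and $Z = 3$ groups, and assume that group $j$ covers the
values from $\hat{T} + (j-1)\delta$ up to, but not including,
$\hat{T} + j\delta$, with the maximum $\check{T}$ counted in the last group:
\begin{align*}
T &= [1, 2, 2, 3, 9, 13] \\
\check{T} &= 13, \quad \hat{T} = 1, \quad \delta = (13 - 1) / 3 = 4 \\
G &= [4, 0, 2]
\end{align*}
Under this reading, the four files touched fewer than five times fall into $g_1$,
no file falls into $g_2$ (five to eight touches), and the two files
touched 9 and 13 times fall into $g_3$.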
Then, the mean $\mu$ is calculated as:
\begin{eqnarray}
\mu = \frac{1}{Z}\sum_{j=1}^{Z}{g_j}
\end{eqnarray}
Finally, the variance is calculated as:
\begin{eqnarray}
Var(g) = \frac{1}{Z}\sum_{j=1}^{Z}{|g_j - \mu|^2}
\end{eqnarray}
The variance $Var(g)$ is the volatility of the source code. The smaller
the volatility, the more cohesive the repository is and the smaller
the amount of dead code inside it.
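
A small worked example, using hypothetical group counts rather than
measured data, may help to read the number. For $Z = 3$ and $G = [4, 0, 2]$:
\begin{align*}
\mu &= (4 + 0 + 2) / 3 = 2 \\
Var(g) &= \left( |4 - 2|^2 + |0 - 2|^2 + |2 - 2|^2 \right) / 3 = 8/3 \approx 2.67
\end{align*}
For comparison, six files spread evenly as $G = [2, 2, 2]$ give
$\mu = 2$ and $Var(g) = 0$.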
\end{document}