Update README.md
Cleaned up the intro by removing documentation that's duplicated below.
Alex Olivas authored Jan 14, 2022
1 parent 4273b3c commit 3a98fc0
Showing 1 changed file with 19 additions and 5 deletions.
```
$ coverage run --source=voka -m unittest
```

# Introduction

Nearly all modern HPC applications execute in a distributed system.  Whether the system is a commercial cloud, public grid,
or private cluster, application execution represents a significant cost of time, energy, and money.  Continuous Integration
(CI) systems have long served as the first stage of code validation.  More recently, Continuous Delivery (CD) systems have
become popular as a means to autonomously deliver validated software.  A new stage in this pipeline is emerging, called
Continuous Benchmarking (CB), which attempts to look beyond the traditional unit and integration tests common to all
CI/CD systems. The technique presented here is currently used by IceCube in a nightly CI/CD/CB system to validate its
physics codebase before release and subsequent petabyte-scale mass production.

The method described here was inspired by F. Porter's paper[1], which describes various methods for testing the consistency
of two histograms. It can be considered an extension: instead of asking whether two histograms are consistent with each
other, we ask whether one test histogram is consistent with an ensemble of N benchmark histograms.
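As a rough sketch of the framing only (the toy observable, sample sizes, and N below are invented for illustration, and
this is not voka's API), the inputs are a single test histogram and an ensemble of N benchmark histograms of the same
quantity:

```python
import numpy as np

rng = np.random.default_rng(42)
N_BENCHMARKS = 10                      # N validated production runs
BINS = np.linspace(0.0, 10.0, 21)      # common binning for all histograms

# Each benchmark histogram comes from an independent, previously validated run.
benchmark_ensemble = [
    np.histogram(rng.exponential(2.0, size=5_000), bins=BINS)[0]
    for _ in range(N_BENCHMARKS)
]

# The test histogram comes from the candidate release.
test_histogram = np.histogram(rng.exponential(2.0, size=5_000), bins=BINS)[0]

# The question the method answers: is `test_histogram` consistent with the
# ensemble as a whole, rather than with any one benchmark in isolation?
```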

## Advantages
- No prior detailed knowledge of the distribution is required.
- Works on distributions of naturally human-interpretable metrics (i.e. histograms).

## Difficulty with Traditional Methods
The chi^2 test statistic computed between two histograms does not itself follow a chi^2 distribution; it has long tails,
so a p-value cannot easily be derived from it with standard tools. Instead, the test-statistic distribution must be
determined empirically for each physical distribution, which is impractical. Fitting has the same problem[3]: it is
impractical, and the distributions often do not fit analytic functions well, since there is no guarantee they were
sampled from a function with continuous first derivatives.
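A minimal simulation sketch of that point, under assumed conditions (an exponential toy observable, 20 bins, and 1,000
events per histogram are all invented for illustration), shows what "empirically determining the test-statistic
distribution" entails for even a single distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
BINS = np.linspace(0.0, 10.0, 21)

def chi2_between(h1, h2):
    """Symmetric chi^2-style statistic between two histograms."""
    mask = (h1 + h2) > 0
    return np.sum((h1[mask] - h2[mask]) ** 2 / (h1[mask] + h2[mask]))

# Build the statistic's distribution empirically under the null hypothesis:
# both histograms are drawn from the same underlying process.
null_samples = [
    chi2_between(
        np.histogram(rng.exponential(2.0, 1_000), bins=BINS)[0],
        np.histogram(rng.exponential(2.0, 1_000), bins=BINS)[0],
    )
    for _ in range(2_000)
]

# Compare the empirical 95th percentile with the tail probability a naive
# chi^2 lookup would assign to the same value; sparsely populated bins are
# where the two diverge most.
q95 = np.percentile(null_samples, 95)
print(f"empirical 95th percentile of the statistic: {q95:.1f}")
print(f"naive chi^2 tail probability at that value: {stats.chi2.sf(q95, df=len(BINS) - 1):.3f}")
```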

In practice, physical distributions are not guaranteed to be Poissonian [2]. IceCube has non-Poissonian noise [?].

## Basics
The method presented here uses traditional test statistics, such as chi^2, Kolmogorov-Smirnov (KS), and Anderson-Darling
(AD), as inputs to an outlier detection algorithm (specifically, a Local Outlier Factor calculation) to determine whether
the test histogram is consistent with the benchmark ensemble.
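A minimal sketch of that pipeline, assuming scipy and scikit-learn are available (the toy samples, statistic choices, and
LOF settings are illustrative, not voka's actual implementation), could look like this:

```python
import numpy as np
from scipy import stats
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
BINS = np.linspace(0.0, 10.0, 21)

def draw_samples(scale=2.0, n=2_000):
    """Toy observable standing in for a real physics distribution."""
    return rng.exponential(scale, size=n)

def feature_vector(test_samples, benchmark_samples):
    """Summarize one test/benchmark comparison with several traditional test statistics."""
    ks = stats.ks_2samp(test_samples, benchmark_samples).statistic
    ad = stats.anderson_ksamp([test_samples, benchmark_samples]).statistic
    h_test, _ = np.histogram(test_samples, bins=BINS)
    h_bench, _ = np.histogram(benchmark_samples, bins=BINS)
    mask = (h_test + h_bench) > 0
    chi2 = np.sum((h_test[mask] - h_bench[mask]) ** 2 / (h_test[mask] + h_bench[mask]))
    return [ks, ad, chi2]

# Benchmark ensemble: samples from previously validated runs.
benchmarks = [draw_samples() for _ in range(10)]

# Benchmark-vs-benchmark comparisons define the "inlier" cloud in statistic space.
X_bench = [feature_vector(benchmarks[i], benchmarks[j])
           for i in range(len(benchmarks))
           for j in range(len(benchmarks)) if i != j]

# The test run is compared against every benchmark in the same way.
test = draw_samples(scale=2.2)   # deliberately shifted to mimic a regression
X_test = [feature_vector(test, b) for b in benchmarks]

# Local Outlier Factor flags test comparisons that fall outside the inlier cloud.
lof = LocalOutlierFactor(n_neighbors=10, novelty=True).fit(X_bench)
print("LOF scores for the test run:", lof.score_samples(X_test))
```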

### Getting Started
