Update: This project has been transferred to the official Azure.IoT.TypeEdge repo as an implementation example.

TypeEdge.AnomalyDetection

High Frequency Unsupervised Anomaly Detection on the Edge using TypeEdge.

Prerequisites

The minimum requirements to get started with TypeEdge.AnomalyDetection are:

Quickstart

  1. Clone this repo:

     git clone https://github.com/paloukari/TypeEdge.AnomalyDetection
    
  2. Edit the IotHubConnectionString value in the /Thermostat.Emulator/appsettings.json file (see the configuration sketch after this list). You need to use the iothubowner connection string from your Azure IoT Hub.

  3. Build and run the Thermostat.Emulator console app. To observe the generated waveform and its Fast Fourier Transform, visit the visualization URL at http://localhost:5001. This is a visualization web application that runs on the Edge and helps you understand the data stream characteristics. You will see the live waveform and its frequency spectrum.
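For reference, a minimal /Thermostat.Emulator/appsettings.json for step 2 might look like the sketch below. Only the IotHubConnectionString key is confirmed by this README; the placeholder value and any other settings in the real file are assumptions.

```json
{
  // Hypothetical sketch: replace the placeholder with the iothubowner
  // connection string copied from your IoT Hub's Shared access policies.
  "IotHubConnectionString": "HostName=<your-hub>.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=<key>"
}
```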

Preface

Although Anomaly Detection is a well-studied AI/ML field, the Edge introduces a multi-constraint hosting environment that needs to be investigated en bloc. Anomaly Detection, like any other compute-intensive processing on the Edge, is a highly tailored, scenario-specific balance of trade-offs such as performance, accuracy, latency, robustness, and maintainability. These constraints arise from the nature of the Edge: in contrast to Cloud ML, Edge ML usually has a fixed-size, non-scalable compute capacity and limited memory and storage, with possibly unreliable network connectivity.

As with most ML-on-the-Edge scenarios, Anomaly Detection on the Edge is assumed to be part of a bigger, composite Cloud+Edge solution. The main reason for having a composite ML solution is the clear need to leverage the scaling flexibility of the Cloud for model training, and then ship the pre-trained ML models down to the Edge. This new composite ML application paradigm introduces complex requirements for solution operationalization and the associated DevOps practices.

Homoiconicity and ML on the Edge

ML relies heavily on data and data transformations. These transformations, in their most abstract form, can be annotated as:
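One plausible way to write such an annotation (an assumption, not necessarily the original notation) is as a pure function $f_{\theta}$, with learned parameters $\theta$, that maps a set (window) of inputs to an output with no hidden state:

$$f_{\theta}\colon \{x_{t-w+1}, \dots, x_{t}\} \longmapsto y_{t}$$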

Note: for temporal input processing, using a set input eliminates the need for state.

This annotation is a clear indication that trained ML models should not be considered application data, but rather code, raising the need for a DevOps pipeline that includes the ML models as first-class citizens. In fact, there are cases where a typical ML model, defined as a data structure, is transformed into native, highly optimized machine code to achieve better performance.

Currently, the deployment mechanism of an Azure IoT Edge application uses container images. When looking at the image layers of the container definition of an ML-on-the-Edge application, it is apparent that some of the application components are expected to change more frequently than the rest. Most container engines, such as Docker, offer an I/O optimization based on these layers: when pulling a container update, only the layers that have changed are downloaded.

This difference in update frequency between layers can be used to define a more optimized stacking of the container layers, and a deployment strategy that leverages existing tools for an effective CI/CD pipeline.
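As an illustration of this layering idea, a module image can stack the rarely changing pieces first and add the frequently retrained model last, so that shipping a new model only invalidates the topmost layer. The base image, paths, and file names below are hypothetical and not taken from this repo.

```dockerfile
# Hypothetical layer stacking for an ML-on-the-Edge module image.
# Base image, paths, and names are illustrative assumptions.

# Rarely changes: the runtime base image.
FROM mcr.microsoft.com/dotnet/core/runtime:2.1

# Changes occasionally: the application binaries.
WORKDIR /app
COPY ./bin/Release/publish/ ./

# Changes frequently: the trained model goes into the last layer,
# so a retrained model only requires pulling this single layer.
COPY ./models/anomaly-model.onnx ./models/

ENTRYPOINT ["dotnet", "AnomalyDetectionModule.dll"]
```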

Continuous training and DevOps

A defined DevOps pipeline that includes the ML model as part of the ML Edge application simplifies the Anomaly Detection ALM model in continuous-training scenarios. The basic premise here is that these ML applications evolve over time, perhaps on a faster cadence compared to traditional non-ML apps. Retraining can happen both on the Edge and in the Cloud, usually with different datasets and frequencies. The hypothesis behind training on the Edge is that the Edge needs to be able to minimize false positives by recognizing the normal (non-anomalous) changes of a signal's behavior, while maintaining the same accuracy. This decision, of course, cannot be generalized to all ML cases, but is part of the aforementioned trade-off balance that depends on the scenario-specific constraints (connectivity, latency, accuracy, etc.).

High Level Architecture

The high-level architecture diagram depicts all of the logical components of a complete Anomaly Detection on the Edge application, including a cloud pipeline as a reference.

There are many factors to weigh when designing the Edge software architecture based on this logical design. The prime design principle was to define an abstract Edge application composed of optional, generic microservices, where each service can easily be replaced by a scenario-specific module. The trade-off of this highly decoupled design is a performance penalty in the form of increased inter-process communication on the Edge.

This abstract design serves a second purpose: it defines a benchmarking mechanism for evaluating multiple options against different device-specific capabilities.

Injecting ad hoc Anomalies

To inject anomalies into the waveform, run the Thermostat.ServiceApp console app, after first editing the IotHubConnectionString value in the /Thermostat.ServiceApp/appsettings.json file. As before, you need to use the iothubowner connection string from your Azure IoT Hub.

This is a Service (cloud) side application that sends twin updates and calls direct methods of the IoT Edge application modules.

When you call the Anomaly Direct Method, an ad hoc anomaly value is generated. This anomaly is a Dirac delta function (Impulse), added to the normal waveform.
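For illustration, a service-side invocation of such a direct method could look like the sketch below. It uses the azure-iot-hub Python package rather than the C# ServiceApp, and the device and module identifiers are assumptions; only the method name Anomaly comes from the description above.

```python
# Sketch only: invoking an Edge module's direct method from the service (cloud) side.
from azure.iot.hub import IoTHubRegistryManager
from azure.iot.hub.models import CloudToDeviceMethod

IOTHUB_CONNECTION_STRING = "<iothubowner connection string>"
DEVICE_ID = "<edge-device-id>"   # assumption: your Edge device id
MODULE_ID = "<module-name>"      # assumption: the module that hosts the method
METHOD_NAME = "Anomaly"          # the direct method described above

registry_manager = IoTHubRegistryManager(IOTHUB_CONNECTION_STRING)
method = CloudToDeviceMethod(method_name=METHOD_NAME,
                             payload="{}",
                             response_timeout_in_seconds=30)

# Invoke the direct method on the Edge module and print the result.
response = registry_manager.invoke_device_module_method(DEVICE_ID, MODULE_ID, method)
print(response.status, response.payload)
```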

A Dirac delta distribution is defined by its sifting property:

$$\int_{-\infty}^{\infty} f(t)\,\delta(t - T)\,dt = f(T),$$

where $f(t)$ is a smooth function.

The Fourier transform of the Dirac delta function is:

$$\mathcal{F}\{\delta(t - T)\}(\omega) = \int_{-\infty}^{\infty} \delta(t - T)\,e^{-i\omega t}\,dt = e^{-i\omega T},$$

which in our case equals 1 in magnitude: the impulse contributes equally to every frequency.

You can observe this anomaly in the real-time visualization page.

Note: It is interesting to observe the impact this spike has on the frequency spectrum. The discrepancy between the theoretical and the computed results is apparent. This happens because the implementation computes an FFT over discretely sampled, finite-length data, rather than an actual Fourier transform of a theoretical input over an infinite time scale.
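The effect described in this note can be reproduced offline with a short, self-contained sketch (not part of this repo): sample a sine wave, add a single-sample impulse, and compare the magnitude spectra.

```python
# Sketch (not part of this repo): effect of a single-sample impulse on the FFT.
import numpy as np

fs = 1000                             # sampling rate in Hz (illustrative)
t = np.arange(0, 1.0, 1 / fs)         # one second of samples
signal = np.sin(2 * np.pi * 50 * t)   # a 50 Hz waveform standing in for the sensor data

impulse = np.zeros_like(signal)
impulse[500] = 5.0                    # discrete stand-in for the Dirac delta (impulse)
anomalous = signal + impulse

# Magnitude spectra (rfft returns the non-negative frequency bins).
clean_spectrum = np.abs(np.fft.rfft(signal))
anomalous_spectrum = np.abs(np.fft.rfft(anomalous))
impulse_spectrum = np.abs(np.fft.rfft(impulse))

# The impulse alone has a perfectly flat magnitude spectrum (the discrete
# analogue of the flat |F{delta}| derived above), which is why the spike
# lifts the whole spectrum in the visualization rather than a single bin.
print("impulse spectrum min/max:", impulse_spectrum.min(), impulse_spectrum.max())
```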