The Institute for Ethical AI & ML

The state of Production ML in 2020



Alejandro Saucedo | a@ethical.institute

Twitter: @AxSaucedo

[NEXT]

The Institute for Ethical AI & ML

The state of Production ML in 2020


![portrait](images/aletechuk.png)
Alejandro Saucedo
Twitter: @AxSaucedo
    <br>
    Chief Scientist
    <br>
    <a style="color: cyan" href="http://e-x.io">The Institute for Ethical AI & ML</a
    <br>
    <br>
    <br>
    Engineering Director
    <br>
    <a style="color: cyan" href="#">Seldon Technologies</a>
    <br>
    <br>
    <hr>
    <br>
    Head of Solutions Eng. & Sci.
    <br>
    <a style="color: cyan" href="http://eigentech.com">Eigen Technologies</a>
    <br>
    <br>
    Software Engineer
    <br>
    <a style="color: cyan" href="#">Bloomberg LP.</a>


[NEXT]

classification_large

OSS ML Serving in k8s

classification_large

We're hiring: seldon.io

[NEXT]

The Institute for Ethical AI & Machine Learning

classification_large

[NEXT]

We are part of the Linux Foundation AI

classification_large

[NEXT]

Small data science projects

classification_large

Works relatively well

[NEXT]

However

As our data science requirements grow...

We face new issues

[NEXT]

Increasing complexity in flow of data

classification_large

[NEXT]

Each data scientist has their own set of tools

  • Some ♥ TensorFlow
  • Some ♥ R
  • Some ♥ Spark
![classification_large](images/mlibs.jpg)

### Some ♥ all of them

[NEXT]

Serving models becomes increasingly harder

classification_large

[NEXT]

When stuff goes wrong it's hard to trace back

classification_large

[NEXT]

As your technical functions grow...

classification_large

[NEXT]

So should your infrastructure

classification_large

[NEXT]

It's challenging

full_height

[NEXT]

Mapping the Ecosystem

[NEXT]

Principles today


  • Orchestration
  • Explainability
  • Reproducibility

[NEXT SECTION]

2.1 Model Orchestration

classification_large

Training & serving at scale

[NEXT]

Computational Resource allocation

Services with different computational requirements

With often complex computational graphs

We need to be able to allocate the right resources


### This is a hard problem

[NEXT]

Adding Governance/Compliance

classification_large

[NEXT]

Standardisation of metrics

classification_large

[NEXT]

Standardisation of errors

classification_large

[NEXT]

Complex Deployment Strategies

classification_large

[NEXT]

Hands on example using:

Seldon Core is an OSS library for machine learning orchestration and monitoring in production

[NEXT]

Basic Example:

Wrapping an income classifier Python model

classification_large
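
As a rough sketch of what the wrapping step looks like: Seldon Core's Python wrapper expects a class exposing a `predict` method. The class and artifact names below are illustrative, not the exact code from the example.

```python
# Minimal sketch of a Seldon Core Python model wrapper (names are illustrative).
# The s2i Python wrapper looks for a class whose predict() serves the request payload.
import joblib

class IncomeClassifier:
    def __init__(self):
        # Load a pre-trained scikit-learn income classifier baked into the image
        self.model = joblib.load("model.joblib")

    def predict(self, X, features_names=None):
        # X arrives as a numpy array; return class probabilities
        return self.model.predict_proba(X)
```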

[NEXT]

GitOps Strategies for ML

classification_large

[NEXT]

More advanced Example:

PyTorch Hub Deployment: https://bit.ly/pytorchseldon

classification_large
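
A hedged sketch of the same wrapper pattern applied to a PyTorch Hub model; the model choice and preprocessing are illustrative, not the exact code behind the link above.

```python
# Illustrative sketch: serving a PyTorch Hub model with the Seldon wrapper pattern.
import torch

class PyTorchHubClassifier:
    def __init__(self):
        # Pull a pre-trained model from PyTorch Hub (resnet18 chosen for illustration)
        self.model = torch.hub.load("pytorch/vision", "resnet18", pretrained=True)
        self.model.eval()

    def predict(self, X, features_names=None):
        # Assumes X is already a preprocessed batch of image arrays
        with torch.no_grad():
            return self.model(torch.tensor(X, dtype=torch.float32)).numpy()
```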

[NEXT]

Other libraries to watch

[NEXT]

KFServing

Serverless machine learning inference on Kubernetes, built on Knative

classification_large

[NEXT]

DeepDetect

Unifying multiple external machine learning libraries behind a single API

classification_large

[NEXT SECTION]

2.2 Explainability

Tackling "black box model" situations

classification_large

[NEXT]

Going beyond the algorithms

Explainability through tools, process and domain expertise.

classification_large

[Our talk on Explainability of Tensorflow Models]

[NEXT]

Data assessment


  • Class imbalances
  • Protected features
  • Correlations
  • Data representability
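
A minimal sketch of the first two checks above with pandas; the dataset and column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical income dataset with a binary "income" label and a "gender" column
df = pd.read_csv("income-data.csv")

# Class imbalance: how skewed is the target label?
print(df["income"].value_counts(normalize=True))

# Protected features & correlations: how do numeric features relate to each other
# (and to an encoded protected attribute)?
df["gender_encoded"] = df["gender"].astype("category").cat.codes
print(df.select_dtypes("number").corr())
```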

[NEXT]

Model assessment


  • Feature importance
  • Model specific methods
  • Domain knowledge abstraction
  • Model metrics analysis
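
For the feature-importance point above, a hedged sketch using scikit-learn's permutation importance; the toy dataset and model stand in for your own pipeline.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy dataset used purely for illustration; swap in your own model and data
data = load_breast_cancer()
X_train, X_val, y_train, y_val = train_test_split(data.data, data.target, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)

# Rank features by how much shuffling them degrades validation performance
ranked = sorted(zip(data.feature_names, result.importances_mean), key=lambda p: -p[1])
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```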

[NEXT]

Production monitoring

  • Evaluation of metrics
  • Manual human review
  • Monitoring of anomalies
  • Setting thresholds for divergence
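
A minimal sketch of the "thresholds for divergence" idea: compare the live distribution of a feature against its training distribution and alert when a KL-divergence estimate crosses a threshold. All numbers here are illustrative.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes the KL divergence KL(p || q)

def feature_drift(train_values, live_values, bins=20, threshold=0.1):
    # Bin both samples on the same edges so the distributions are comparable
    hist_train, edges = np.histogram(train_values, bins=bins, density=True)
    hist_live, _ = np.histogram(live_values, bins=edges, density=True)
    # Small epsilon avoids division by zero for empty bins
    kl = entropy(hist_train + 1e-9, hist_live + 1e-9)
    return kl, kl > threshold

# Illustrative check: live traffic shifted relative to training data
train = np.random.normal(0.0, 1.0, 10_000)
live = np.random.normal(0.5, 1.2, 1_000)
kl, drifted = feature_drift(train, live)
print(f"KL divergence: {kl:.3f}, drift alert: {drifted}")
```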

[NEXT]

Infrastructure level XAI Design patterns

classification_large

[NEXT]

Hands on example using:

Alibi is a library that contains production-level black box model explainability techniques

[NEXT]

Example

Deploying Explainer Modules: http://bit.ly/seldonexplainer

classification_large
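
As a rough sketch of what an Alibi explainer looks like in code (the classifier and data are toy assumptions; the linked example wires the same idea into a Seldon deployment):

```python
from alibi.explainers import AnchorTabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Toy model standing in for the deployed classifier
data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Anchor explanations: human-readable rules that "anchor" a single prediction
explainer = AnchorTabular(clf.predict, feature_names=list(data.feature_names))
explainer.fit(data.data)

explanation = explainer.explain(data.data[0])
print("Anchor:", " AND ".join(explanation.anchor))
print("Precision:", explanation.precision)
```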

[NEXT]

Other OSS libraries to watch

[NEXT]

ELI5

classification_large

[NEXT]

SHAP

Unifying multiple model explainability techniques

classification_large
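
A hedged sketch of SHAP's unified API on a toy tree model; the dataset and model are illustrative.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Toy tree-based model; TreeExplainer is SHAP's fast path for tree ensembles
data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data)

# Global view: which features drive predictions across the whole dataset
shap.summary_plot(shap_values, data.data, feature_names=data.feature_names)
```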

[NEXT]

XAI

Analyse datasets, evaluate models and monitor production

classification_large

[NEXT SECTION]

2.3 Reproducibility

classification_large

Model & data versioning

[NEXT]

Abstracting individual steps

classification_large

Data in


$ cat data-input.csv

>            Date    Open    High     Low   Close     Market Cap
> 1608 2013-04-28  135.30  135.98  132.10  134.21  1,500,520,000
> 1607 2013-04-29  134.44  147.49  134.00  144.54  1,491,160,000
> 1606 2013-04-30  144.00  146.93  134.05  139.00  1,597,780,000

Code / Config


$ cat feature-extractor.py

def open_norm_feature_extractor(df):
    feature = some_lib.get_open(df)
    return feature

Data out


$ cat data-output.csv

Open
0.57
0.59
0.47

[NEXT]

![classification_large](images/versioning.jpg)
## Going one level higher

We can abstract our entire pipeline and data flows

classification_large

[NEXT]

Hands on example using:

Kubeflow is a Cloud Native platform for reusable machine learning pipelines in Kubernetes

[NEXT]

Example

Reusable NLP Pipelines: https://bit.ly/seldon-kf-nlp

classification_large
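
A minimal sketch of how a pipeline is defined with the Kubeflow Pipelines SDK (kfp v1 style); the step images and arguments are hypothetical, not the ones from the linked NLP example.

```python
import kfp
from kfp import dsl

@dsl.pipeline(name="nlp-pipeline", description="Illustrative reusable NLP pipeline")
def nlp_pipeline(raw_data: str = "raw.csv"):
    # Each step runs as its own container, so steps can be reused across pipelines
    clean = dsl.ContainerOp(
        name="clean-text",
        image="myrepo/clean-text:0.1",          # hypothetical image
        arguments=["--input", raw_data, "--output", "clean.csv"],
    )
    train = dsl.ContainerOp(
        name="train-classifier",
        image="myrepo/train-classifier:0.1",    # hypothetical image
        arguments=["--input", "clean.csv"],
    )
    train.after(clean)                          # explicit ordering between steps

if __name__ == "__main__":
    # Compile to an Argo workflow that Kubeflow Pipelines can run
    kfp.compiler.Compiler().compile(nlp_pipeline, "nlp_pipeline.yaml")
```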

[NEXT]

Other OSS libraries to watch

[NEXT]

Data Version Control (DVC)

Add your data

dvc add images.zip

Track the data input, model output and code as a pipeline stage

dvc run -d images.zip -o model.p ./cnn.py

Add a remote storage location (here S3)

dvc remote add myrepo s3://mybucket

Push to the location specified

dvc push

Check it out at dvc.org

[NEXT]

MLFlow

classification_large

[NEXT]

Pachyderm

classification_large

[NEXT SECTION]

Much more content

🔍 Explainability 🔏 Privacy 📜 Versioning
🏁 Orchestration 🌀 FeaturEng 🤖 AutoML
📓 Notebooks 📊 Visualisation 🔠 NLP
🧡 ETL 🗞️ Storage 📑 FaaS
🗺️ Computation 📥 Serialisation 🎁 Compiler
💸 CommercialML 💰 CommercialETL

### Check it out & add more libraries

[NEXT]

The Institute for Ethical AI & ML

The state of Production ML in 2020


![portrait](images/aletechuk.png)
Alejandro Saucedo
Twitter: @AxSaucedo
    <br>
    Chief Scientist
    <br>
    <a style="color: cyan" href="http://e-x.io">The Institute for Ethical AI & ML</a
    <br>
    <br>
    <br>
    Engineering Director
    <br>
    <a style="color: cyan" href="#">Seldon Technologies</a>
    <br>
    <br>
    <hr>
    <br>
    Head of Solutions Eng. & Sci.
    <br>
    <a style="color: cyan" href="http://eigentech.com">Eigen Technologies</a>
    <br>
    <br>
    Software Engineer
    <br>
    <a style="color: cyan" href="#">Bloomberg LP.</a>
