docs: Add page on decentralized AI inference #3407

Open · wants to merge 7 commits into base: master
56 changes: 56 additions & 0 deletions docs/developer-docs/ai/inference.mdx
@@ -0,0 +1,56 @@
---
keywords: [intermediate, concept, AI, ai, deAI, deai]
---

import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow";

# Decentralized AI inference

<MarkdownChipRow labels={["Intermediate", "Concept", "DeAI" ]} />

## Overview

Inference in the context of decentralized AI refers to using a trained model to draw conclusions about new data.
It's possible for canister smart contracts to run inference in a number of ways, depending on the decentralization and performance requirements.
Canisters can utilize inference run on-chain, on-device, or through HTTPS outcalls.


## Inference on-chain

Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly.
Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works.
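To make the pattern concrete, here is a minimal, hypothetical sketch in Rust: a toy linear classifier whose weights are compiled directly into the canister's Wasm binary, so a query method can run the forward pass deterministically on-chain. The weights and shapes are invented for illustration; real projects use engines such as Tract, Burn, or Candle (listed below) to run ONNX models.

```rust
// Toy stand-in for on-chain inference: the "model" is a tiny linear
// classifier whose weights are baked into the Wasm binary at compile time.
// In a real canister this function would be exposed as a query method,
// and an engine such as Tract would replace the hand-rolled forward pass.
const WEIGHTS: [[f32; 3]; 2] = [[0.5, -1.0, 0.25], [-0.5, 1.0, 0.75]];
const BIAS: [f32; 2] = [0.1, -0.1];

// Forward pass: logits = W·x + b, then return the highest-scoring class.
fn infer(input: [f32; 3]) -> usize {
    let mut best = 0;
    let mut best_score = f32::NEG_INFINITY;
    for (class, (row, b)) in WEIGHTS.iter().zip(BIAS.iter()).enumerate() {
        let score: f32 = row
            .iter()
            .zip(input.iter())
            .map(|(w, x)| w * x)
            .sum::<f32>()
            + b;
        if score > best_score {
            best_score = score;
            best = class;
        }
    }
    best
}
```

Because the computation is pure and deterministic, every replica on the subnet reaches the same result, which is what makes on-chain inference verifiable.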

### Examples

- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain using Rust.
- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/).
- [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses TypeScript and a pre-trained model for making predictions.
- [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Try it here](https://icgpt.icpp.world/).
- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent written in Rust using chain of thoughts for reasoning and actions. [Try it here](https://arcmindai.app).

### On-chain inference frameworks

- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, and PyTorch models, and compiles to WebAssembly.
- [MotokoLearn](https://github.com/ildefons/motokolearn): A Motoko package that enables on-chain machine learning.
[The image classification example](https://github.com/dfinity/examples/tree/master/rust/image-classification) explains how to integrate it into a canister to run on ICP.
- [Rust-Connect-Py-AI-to-IC](https://github.com/jeshli/rust-connect-py-ai-to-ic): Open-source tool for deploying and running Python AI models on-chain using Sonos Tract.
- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX and PyTorch models and compiles to WebAssembly.
[The MNIST example](https://github.com/smallstepman/ic-mnist) explains how to integrate it into a canister to run on ICP. [Try it here](https://jsi2g-jyaaa-aaaam-abnia-cai.icp0.io/).
- [Candle](https://github.com/huggingface/candle): A minimalist ML framework for Rust that compiles to WebAssembly.
[An AI chatbot example](https://github.com/ldclabs/ic-panda/tree/main/src/ic_panda_ai) shows how to run a Qwen 0.5B model in a canister on ICP.


## Inference on-device

An alternative to running the model on-chain would be to download the model from a canister, then run the inference on the local device. If the user trusts their own device, then they can trust that the inference ran correctly.
A disadvantage of this workflow is that the model needs to be downloaded to the user's device, resulting in less confidentiality of the model and decreased user experience due to increased latency.
ICP supports this workflow for most existing models because a smart contract on ICP can store models up to 400GiB.
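The serving side of this workflow can be sketched as a chunked download: since a single reply cannot carry gigabytes of data, the canister exposes the model as fixed-size chunks that the client fetches in sequence and reassembles. Everything below (the placeholder `MODEL` bytes, the chunk size, the function name) is hypothetical; in a real canister `model_chunk` would be a query method over the stored weights.

```rust
// Hypothetical sketch of a canister serving a stored model in chunks.
// A tiny placeholder stands in for the (up to 400GiB) model bytes, and the
// chunk size is kept small for illustration; real canisters use a few MiB.
static MODEL: [u8; 5] = [1, 2, 3, 4, 5];
const CHUNK_SIZE: usize = 2;

// Returns the `index`-th chunk of the model, or None past the end.
fn model_chunk(index: usize) -> Option<&'static [u8]> {
    let start = index.checked_mul(CHUNK_SIZE)?;
    if start >= MODEL.len() {
        return None;
    }
    let end = (start + CHUNK_SIZE).min(MODEL.len());
    Some(&MODEL[start..end])
}
```

The client keeps requesting chunks with an increasing index until it receives `None`, then concatenates the chunks and loads the model locally.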


### Examples

- [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Try it here](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/).


## Inference with HTTP calls

Smart contracts running on ICP can make [HTTP requests through HTTP outcalls](/docs/current/developer-docs/smart-contracts/advanced-features/https-outcalls/https-outcalls-overview) to Web2 services including OpenAI and Claude.
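The request body for such an outcall is ordinary JSON. As a hedged sketch (the function name, model id, and message shape are illustrative, not an ICP API), here is how a canister might assemble a chat-completion payload before handing it to the HTTPS-outcall machinery:

```rust
// Build a minimal JSON body for a hypothetical chat-completion outcall.
// The actual HTTPS outcall is made via the management canister's
// `http_request` (exposed by ic-cdk); only body construction is shown here.
fn chat_body(model: &str, prompt: &str) -> String {
    // Escape backslashes and quotes so the prompt stays valid JSON.
    let escaped = prompt.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        r#"{{"model":"{}","messages":[{{"role":"user","content":"{}"}}]}}"#,
        model, escaped
    )
}
```

Note that because HTTPS outcalls are executed by every replica and the responses must agree, calls to nondeterministic APIs such as OpenAI typically need a transform function that strips volatile response fields before consensus.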

### Examples

- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts. [Try it here](https://pycrs-xiaaa-aaaal-ab6la-cai.icp0.io/).
48 changes: 8 additions & 40 deletions docs/developer-docs/ai/overview.mdx
@@ -33,7 +33,7 @@ Inference happens on the user's device after downloading the model.
If the user trusts their own device, then they can trust that the inference ran correctly.
A disadvantage here is that the model needs to be downloaded to the user's device with corresponding drawbacks of less confidentiality of the model and decreased user experience due to increased latency.
ICP supports this use case for practically all existing models because a smart contract on ICP can store models up to 400GiB.
-See [an example]((https://github.com/patnorris/DecentralizedAIonIC) of an in-browser AI chatbot that uses an open-source LLM model served from ICP.
+See [an example](https://github.com/patnorris/DecentralizedAIonIC) of an in-browser AI chatbot that uses an open-source LLM model served from ICP.

4. **Tokenization, marketplaces, orchestration**:
This refers to using smart contracts as the tokenization, marketplace, and orchestration layer for AI models and AI hardware.
@@ -61,10 +61,10 @@ Running AI models on-chain is too compute and memory-intensive for traditional blockchains
2. Deterministic time slicing that automatically splits long-running computation over multiple blocks.
3. Powerful node hardware with a standardized specification. Nodes have 32-core CPUs, 512GiB RAM, and 30TB NVMe.

-Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly.
-Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works.
The long-term [vision of DeAI on ICP](https://internetcomputer.org/roadmap#Decentralized%20AI-start) is to support on-chain GPU compute to enable both training and inference of larger models.

+You can learn more about running AI inference on ICP [here](./inference.mdx).

## Technical working group: DeAI

A technical working group dedicated to discussing decentralized AI and related projects meets bi-weekly on Thursdays at 5pm UTC. You can join via the [community Discord server](https://discord.gg/jnjVVQaE2C).
@@ -75,15 +75,12 @@ You can learn more about the group, review the notes from previous meetings, and

Several community projects that showcase how to use AI on ICP are available.

-### On-chain inference frameworks
+### Language models, agents, and chatbots

-- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, PyTorch models and compiles to WebAssembly.
-[The image classification example](https://github.com/dfinity/examples/tree/master/rust/image-classification) explains how to integrate it into a canister to run on ICP.
-- [Rust-Connect-Py-AI-to-IC](https://github.com/jeshli/rust-connect-py-ai-to-ic): Open-source tool for deploying and running Python AI models on-chain using Sonos Tract.
-- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX, PyTorch models and compiles to WebAssembly.
-[The MNIST example](https://github.com/smallstepman/ic-mnist) explains how to integrate it into a canister to run on ICP. [Try it here](https://jsi2g-jyaaa-aaaam-abnia-cai.icp0.io/).
-- [Candle](https://github.com/huggingface/candle): a minimalist ML framework for Rust that compiles to WebAssembly.
-[An AI chatbot example](https://github.com/ldclabs/ic-panda/tree/main/src/ic_panda_ai) shows how to run a Qwen 0.5B model in a canister on ICP.
+- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain.
+- [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Check out the canister yourself](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/).
+- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. Try the [app in-browser](https://arcmindai.app).
+- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/).

### Vector databases

@@ -99,32 +96,3 @@ See their [forum post](https://forum.dfinity.org/t/blueband-vector-database/33934) for additional details on how it works and demos.
See their [forum post](https://forum.dfinity.org/t/blueband-vector-database/33934) for additional details on how it works and demos.
- [ELNA Vector DB](https://github.com/elna-ai/elna-vector-db): An open-source and fully on-chain vector database and vector similarity search engine primarily used to power the [ELNA.ai](https://elna.ai/) application.

-### Language models, agents, and chatbots
-
-- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain.
-- [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Check out the canister yourself](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/).
-- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. Try the [app in-browser](https://arcmindai.app).
-- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/).
-
-### Calling OpenAI from a canister
-
-- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts.
-
-### Programming language specific resources
-
-- **Rust**:
-  - See the links above for Rust examples.
-
-- **Motoko**:
-  - [MotokoLearn](https://github.com/ildefons/motokolearn): A Motoko package that enables on-chain machine learning.
-  - [In-browser AI chat](https://github.com/patnorris/DecentralizedAIonIC).
-
-- **C++**:
-  - [icpp-pro Getting Started](https://docs.icpp.world/getting-started.html).
-  - [icpp_llm](https://github.com/icppWorld/icpp_llm).
-  - [icpp-llama2 Deployment Tutorial](https://github.com/icppWorld/icpp_llm/blob/main/icpp_llama2/README.md).
-  - [icgpt](https://github.com/icppWorld/icgpt).
-
-- **TypeScript/JavaScript**:
-  - [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses a pre-trained model for making predictions.
-  - [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Check it out yourself](https://icgpt.icpp.world/).
15 changes: 13 additions & 2 deletions sidebars.js
@@ -1092,8 +1092,19 @@ const sidebars = {
label: "Overview",
id: "developer-docs/ai/overview",
},
-"developer-docs/ai/ai-on-chain",
-"developer-docs/ai/machine-learning-sample",
+{
+  type: "category",
+  label: "Inference",
+  items: [
+    {
+      type: "doc",
+      label: "Overview",
+      id: "developer-docs/ai/inference",
+    },
+    "developer-docs/ai/ai-on-chain",
+    "developer-docs/ai/machine-learning-sample",
+  ]
+},
],
},
{