From 17094095135c10ed3a67338905ebc96efd1f8a8d Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Thu, 29 Aug 2024 17:17:54 +0200 Subject: [PATCH 1/7] docs: Add page on decentralized AI inference Most of the content is taken from the overview page and moved/reorganized into an inference page that we can evolve over time. I moved the examples into the inference tab in the sidebar as well, as they are inference examples. --- docs/developer-docs/ai/overview.mdx | 54 ++++++----------------------- sidebars.js | 15 ++++++-- 2 files changed, 24 insertions(+), 45 deletions(-) diff --git a/docs/developer-docs/ai/overview.mdx b/docs/developer-docs/ai/overview.mdx index fea4e8ac3f..7e9b4a1440 100644 --- a/docs/developer-docs/ai/overview.mdx +++ b/docs/developer-docs/ai/overview.mdx @@ -4,7 +4,7 @@ keywords: [intermediate, concept, AI, ai, deAI, deai] import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow"; -# Decentralized AI overview +# Decentralized AI Overview @@ -33,7 +33,7 @@ Inference happens on the user's device after downloading the model. If the user trusts their own device, then they can trust that the inference ran correctly. A disadvantage here is that the model needs to be downloaded to the user's device with corresponding drawbacks of less confidentiality of the model and decreased user experience due to increased latency. ICP supports this use case for practically all existing models because a smart contract on ICP can store models up to 400GiB. -See [an example]((https://github.com/patnorris/DecentralizedAIonIC) of an in-browser AI chatbot that uses an open-source LLM model served from ICP. +See [an example](https://github.com/patnorris/DecentralizedAIonIC) of an in-browser AI chatbot that uses an open-source LLM model served from ICP. 4. **Tokenization, marketplaces, orchestration**: This refers to using smart contracts as the tokenization, marketplace, and orchestration layer for AI models and AI hardware. 
@@ -61,10 +61,10 @@ Running AI models on-chain is too compute and memory-intensive for traditional b 2. Deterministic time slicing that automatically splits long-running computation over multiple blocks. 3. Powerful node hardware with a standardized specification. Nodes have 32-core CPUs, 512GiB RAM, and 30TB NVMe. -Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly. -Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works. The long-term [vision of DeAI on ICP](https://internetcomputer.org/roadmap#Decentralized%20AI-start) is to support on-chain GPU compute to enable both training and inference of larger models. +You can learn more about running AI inference on ICP [here](./inference.mdx). + ## Technical working group: DeAI A technical working group dedicated to discussing decentralized AI and related projects meets bi-weekly on Thursdays at 5pm UTC. You can join via the [community Discord server](https://discord.gg/jnjVVQaE2C). @@ -75,54 +75,22 @@ You can learn more about the group, review the notes from previous meetings, and Several community projects that showcase how to use AI on ICP are available. -### On-chain inference frameworks +### Language models, agents, and chatbots -- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, PyTorch models and compiles to WebAssembly. - [The image classification example](https://github.com/dfinity/examples/tree/master/rust/image-classification) explains how to integrate it into a canister to run on ICP. -- [Rust-Connect-Py-AI-to-IC](https://github.com/jeshli/rust-connect-py-ai-to-ic): Open-source tool for deploying and running Python AI models on-chain using Sonos Tract. 
-- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX, PyTorch models and compiles to WebAssembly. - [The MNIST example](https://github.com/smallstepman/ic-mnist) explains how to integrate it into a canister to run on ICP. [Try it here](https://jsi2g-jyaaa-aaaam-abnia-cai.icp0.io/). -- [Candle](https://github.com/huggingface/candle): a minimalist ML framework for Rust that compiles to WebAssembly. - [An AI chatbot example](https://github.com/ldclabs/ic-panda/tree/main/src/ic_panda_ai) shows how to run a Qwen 0.5B model in a canister on ICP. +- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain. +- [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Check out the canister yourself](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/). +- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. Try the [app in-browser](https://arcmindai.app). +- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). -### Vector databases +## Vector databases - [Vectune](https://github.com/ClankPan/Vectune): Vectune is a lightweight VectorDB with incremental indexing, based on FreshVamana written in Rust. See [a forum post](https://forum.dfinity.org/t/worlds-largest-web3-vector-database/33309) that explains how it works. -#### Developed on ICP +### Developed on ICP - [ArcMind Vector DB](https://github.com/arcmindai/arcmindvector): A vector database that supports text, image, and audio embedding. 
- [KinicDAO Vector DB](https://xcvai-qiaaa-aaaak-afowq-cai.icp0.io/): A high-performance, completely on-chain, tamper-proof vector database specifically built for decentralized apps. - [Blueband](https://github.com/highfeast/ic-use-blueband-db): A vector database built based on [Vectra](https://www.vectra.ai/), a local vector database for [Node.js](https://nodejs.org/en). See their [forum post](https://forum.dfinity.org/t/blueband-vector-database/33934) for additional details on how it works and demos. - [ELNA Vector DB](https://github.com/elna-ai/elna-vector-db): An open-source and fully on-chain vector database and vector similarity search engine primarily used to power the [ELNA.ai](https://elna.ai/) application. -### Language models, agents, and chatbots - -- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain. -- [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Check out the canister yourself](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/). -- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. Try the [app in-browser](https://arcmindai.app). -- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). - -### Calling OpenAI from a canister - -- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts. - -### Programming language specific resources - - - **Rust**: - - See the links above for Rust examples. - - - **Motoko**: - - [MotokoLearn](https://github.com/ildefons/motokolearn): A Motoko package that enables on-chain machine learning. - - [In-browser AI chat](https://github.com/patnorris/DecentralizedAIonIC).
- - - **C++**: - - [icpp-pro Getting Started](https://docs.icpp.world/getting-started.html). - - [icpp_llm](https://github.com/icppWorld/icpp_llm). - - [icpp-llama2 Deployment Tutorial](https://github.com/icppWorld/icpp_llm/blob/main/icpp_llama2/README.md). - - [icgpt](https://github.com/icppWorld/icgpt). - - - **TypeScript/JavaScript**: - - [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses a pre-trained model for making predictions. - - [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Check it out yourself](https://icgpt.icpp.world/). diff --git a/sidebars.js b/sidebars.js index a38d70e540..cbd7c5d9f1 100644 --- a/sidebars.js +++ b/sidebars.js @@ -1093,8 +1093,19 @@ const sidebars = { label: "Overview", id: "developer-docs/ai/overview", }, - "developer-docs/ai/ai-on-chain", - "developer-docs/ai/machine-learning-sample", + { + type: "category", + label: "Inference", + items: [ + { + type: "doc", + label: "Overview", + id: "developer-docs/ai/inference", + }, + "developer-docs/ai/ai-on-chain", + "developer-docs/ai/machine-learning-sample", + ] + }, ], }, { From 1548c21d2cfd553f8ad8357042d58d5357d530a6 Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Thu, 29 Aug 2024 17:19:24 +0200 Subject: [PATCH 2/7] . 
--- docs/developer-docs/ai/inference.mdx | 63 ++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100644 docs/developer-docs/ai/inference.mdx diff --git a/docs/developer-docs/ai/inference.mdx b/docs/developer-docs/ai/inference.mdx new file mode 100644 index 0000000000..69419fff4d --- /dev/null +++ b/docs/developer-docs/ai/inference.mdx @@ -0,0 +1,63 @@ +--- +keywords: [intermediate, concept, AI, ai, deAI, deai] +--- + +import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow"; + +# Decentralized AI Inference + + + +It's possible for canister smart contracts to run inference in a number of ways, depending on the decentralization and performance requirements. + +## 1. Inference on-chain + +Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly. +Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works. +The long-term [vision of DeAI on ICP](https://internetcomputer.org/roadmap#Decentralized%20AI-start) is to support on-chain GPU compute to enable both training and inference of larger models. + +### Examples + +- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain. `Rust` +- [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). +- [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses a pre-trained model for making predictions. `TypeScript` +- [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Try it here](https://icgpt.icpp.world/). 
`TypeScript` `C++` +- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. [Try it here](https://arcmindai.app). `Rust` + +### On-chain inference frameworks + +- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, PyTorch models and compiles to WebAssembly. `Rust` +- [MotokoLearn](https://github.com/ildefons/motokolearn): A Motoko package that enables on-chain machine learning. + [The image classification example](https://github.com/dfinity/examples/tree/master/rust/image-classification) explains how to integrate it into a canister to run on ICP. + `Motoko` +- [Rust-Connect-Py-AI-to-IC](https://github.com/jeshli/rust-connect-py-ai-to-ic): Open-source tool for deploying and running Python AI models on-chain using Sonos Tract. + `Rust` +- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX, PyTorch models and compiles to WebAssembly. + [The MNIST example](https://github.com/smallstepman/ic-mnist) explains how to integrate it into a canister to run on ICP. [Try it here](https://jsi2g-jyaaa-aaaam-abnia-cai.icp0.io/). + `Rust` +- [Candle](https://github.com/huggingface/candle): a minimalist ML framework for Rust that compiles to WebAssembly. + [An AI chatbot example](https://github.com/ldclabs/ic-panda/tree/main/src/ic_panda_ai) shows how to run a Qwen 0.5B model in a canister on ICP. + `Rust` + +---- + +## 2. Inference on-device + +An alternative to running the model on-chain would be for the user to download the model from a canister smart contract, and the inference then happens on the user's device. +If the user trusts their own device, then they can trust that the inference ran correctly. 
+A disadvantage here is that the model needs to be downloaded to the user's device with corresponding drawbacks of less confidentiality of the model and decreased user experience due to increased latency. +ICP supports this use case for practically all existing models because a smart contract on ICP can store models up to 400GiB. + +### Examples + + - [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Try it here](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/). `Motoko` + +---- + +## 3. Inference with HTTP calls + +Smart contracts running on ICP can make [HTTP requests](https://internetcomputer.org/docs/current/tutorials/developer-journey/level-3/3.2-https-outcalls/) to Web2 services including OpenAI and Claude. + +### Examples + +- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts. [Try it here](https://pycrs-xiaaa-aaaal-ab6la-cai.icp0.io/). `TypeScript` `Rust` From 521d8b859ae77a116ce6129ad1ee35c38a3a6d23 Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Thu, 29 Aug 2024 17:20:08 +0200 Subject: [PATCH 3/7] . --- docs/developer-docs/ai/overview.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/developer-docs/ai/overview.mdx b/docs/developer-docs/ai/overview.mdx index 7e9b4a1440..8144d9e7f1 100644 --- a/docs/developer-docs/ai/overview.mdx +++ b/docs/developer-docs/ai/overview.mdx @@ -82,12 +82,12 @@ Several community projects that showcase how to use AI on ICP are available. - [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. Try the [app in-browser](https://arcmindai.app). - [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). 
-## Vector databases +### Vector databases - [Vectune](https://github.com/ClankPan/Vectune): Vectune is a lightweight VectorDB with incremental indexing, based on FreshVamana written in Rust. See [a forum post](https://forum.dfinity.org/t/worlds-largest-web3-vector-database/33309) that explains how it works. -### Developed on ICP +#### Developed on ICP - [ArcMind Vector DB](https://github.com/arcmindai/arcmindvector): A vector database that supports text, image, and audio embedding. - [KinicDAO Vector DB](https://xcvai-qiaaa-aaaak-afowq-cai.icp0.io/): A high-performance, completely on-chain, tamper-proof vector database specifically built for decentralized apps. - [Blueband](https://github.com/highfeast/ic-use-blueband-db): A vector database built based on [Vectra](https://www.vectra.ai/), a local vector database for [Node.js](https://nodejs.org/en). See their [forum post](https://forum.dfinity.org/t/blueband-vector-database/33934) for additional details on how it works and demos. From e7d1f28693c82f30928b5eb2631a031620a2a162 Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Wed, 11 Sep 2024 11:48:10 +0200 Subject: [PATCH 4/7] Apply suggestions from code review Co-authored-by: Jessie Mongeon <133128541+jessiemongeon1@users.noreply.github.com> --- docs/developer-docs/ai/inference.mdx | 32 +++++++++++----------------- docs/developer-docs/ai/overview.mdx | 2 +- 2 files changed, 14 insertions(+), 20 deletions(-) diff --git a/docs/developer-docs/ai/inference.mdx b/docs/developer-docs/ai/inference.mdx index 69419fff4d..af973b6f53 100644 --- a/docs/developer-docs/ai/inference.mdx +++ b/docs/developer-docs/ai/inference.mdx @@ -4,13 +4,13 @@ keywords: [intermediate, concept, AI, ai, deAI, deai] import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow"; -# Decentralized AI Inference +# Decentralized AI inference It's possible for canister smart contracts to run inference in a number of ways, depending on the decentralization and performance requirements. -## 1. 
Inference on-chain +## Inference on-chain Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly. Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works. @@ -18,30 +18,25 @@ The long-term [vision of DeAI on ICP](https://internetcomputer.org/roadmap#Decen ### Examples -- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain. `Rust` +- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain using `Rust`. - [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). -- [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses a pre-trained model for making predictions. `TypeScript` -- [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Try it here](https://icgpt.icpp.world/). `TypeScript` `C++` -- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent using Chain of Thoughts for reasoning and actions. [Try it here](https://arcmindai.app). `Rust` +- [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses TypeScript and a pre-trained model for making predictions. +- [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Try it here](https://icgpt.icpp.world/). +- [ArcMind AI](https://github.com/arcmindai/arcmindai): An autonomous agent written in Rust using chain of thoughts for reasoning and actions. [Try it here](https://arcmindai.app). 
### On-chain inference frameworks -- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, PyTorch models and compiles to WebAssembly. `Rust` +- [Sonos Tract](https://github.com/sonos/tract): An open-source AI inference engine written in Rust that supports ONNX, TensorFlow, and PyTorch models, and compiles to WebAssembly. - [MotokoLearn](https://github.com/ildefons/motokolearn): A Motoko package that enables on-chain machine learning. [The image classification example](https://github.com/dfinity/examples/tree/master/rust/image-classification) explains how to integrate it into a canister to run on ICP. - `Motoko` - [Rust-Connect-Py-AI-to-IC](https://github.com/jeshli/rust-connect-py-ai-to-ic): Open-source tool for deploying and running Python AI models on-chain using Sonos Tract. - `Rust` -- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX, PyTorch models and compiles to WebAssembly. +- [Burn](https://github.com/tracel-ai/burn): An open-source deep learning framework written in Rust that supports ONNX and PyTorch models, and compiles to WebAssembly. [The MNIST example](https://github.com/smallstepman/ic-mnist) explains how to integrate it into a canister to run on ICP. [Try it here](https://jsi2g-jyaaa-aaaam-abnia-cai.icp0.io/). - `Rust` - [Candle](https://github.com/huggingface/candle): a minimalist ML framework for Rust that compiles to WebAssembly. [An AI chatbot example](https://github.com/ldclabs/ic-panda/tree/main/src/ic_panda_ai) shows how to run a Qwen 0.5B model in a canister on ICP. - `Rust` ----- -## 2. Inference on-device +## Inference on-device An alternative to running the model on-chain would be for the user to download the model from a canister smart contract, and the inference then happens on the user's device. If the user trusts their own device, then they can trust that the inference ran correctly.
@@ -50,14 +45,13 @@ ICP supports this use case for practically all existing models because a smart c ### Examples - - [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Try it here](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/). `Motoko` + - [DeVinci](https://github.com/patnorris/DecentralizedAIonIC): An in-browser AI chatbot that uses an open-source LLM model served from ICP. [Try it here](https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/). ----- -## 3. Inference with HTTP calls +## Inference with HTTP calls -Smart contracts running on ICP can make [HTTP requests](https://internetcomputer.org/docs/current/tutorials/developer-journey/level-3/3.2-https-outcalls/) to Web2 services including OpenAI and Claude. +Smart contracts running on ICP can make [HTTP requests through HTTP outcalls](/docs/current/developer-docs/smart-contracts/advanced-features/https-outcalls/https-outcalls-overview) to Web2 services including OpenAI and Claude. ### Examples -- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts. [Try it here](https://pycrs-xiaaa-aaaal-ab6la-cai.icp0.io/). `TypeScript` `Rust` +- [Juno + OpenAI](https://github.com/peterpeterparker/juno-openai): An example using Juno and OpenAI to generate images from prompts. [Try it here](https://pycrs-xiaaa-aaaal-ab6la-cai.icp0.io/). 
diff --git a/docs/developer-docs/ai/overview.mdx b/docs/developer-docs/ai/overview.mdx index 8144d9e7f1..8fbee848f0 100644 --- a/docs/developer-docs/ai/overview.mdx +++ b/docs/developer-docs/ai/overview.mdx @@ -4,7 +4,7 @@ keywords: [intermediate, concept, AI, ai, deAI, deai] import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow"; -# Decentralized AI Overview +# Decentralized AI overview From 824657b6a9d2788192e5f8ab186b7ad253b2470a Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Wed, 11 Sep 2024 11:58:04 +0200 Subject: [PATCH 5/7] . --- docs/developer-docs/ai/inference.mdx | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/developer-docs/ai/inference.mdx b/docs/developer-docs/ai/inference.mdx index af973b6f53..606753c95f 100644 --- a/docs/developer-docs/ai/inference.mdx +++ b/docs/developer-docs/ai/inference.mdx @@ -14,11 +14,10 @@ It's possible for canister smart contracts to run inference in a number of ways, Currently, ICP supports on-chain inference of small models using AI libraries such as [Sonos Tract](https://github.com/sonos/tract) that compile to WebAssembly. Check out the [image classification example](/docs/current/developer-docs/ai/ai-on-chain) to learn how it works. -The long-term [vision of DeAI on ICP](https://internetcomputer.org/roadmap#Decentralized%20AI-start) is to support on-chain GPU compute to enable both training and inference of larger models. ### Examples -- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain using `Rust`. +- [GPT2](https://github.com/modclub-app/rust-connect-py-ai-to-ic/tree/main/internet_computer/examples/gpt2): An example of GPT2 running on-chain using Rust. - [ELNA AI](https://github.com/elna-ai): A fully on-chain AI agent platform and marketplace. Supports both on-chain and off-chain LLMs. [Try it here](https://dapp.elna.ai/). 
- [Tensorflow on ICP](https://github.com/carlosarturoceron/decentAI): An Azle example that uses TypeScript and a pre-trained model for making predictions. - [ICGPT](https://github.com/icppWorld/icgpt): A React frontend that uses a C/C++ backend running an LLM fully on-chain. [Try it here](https://icgpt.icpp.world/). From da77bb43dff6a555ee4385a3817f5e08d2d476f2 Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Wed, 25 Sep 2024 11:03:45 +0200 Subject: [PATCH 6/7] Update docs/developer-docs/ai/inference.mdx Co-authored-by: Jessie Mongeon <133128541+jessiemongeon1@users.noreply.github.com> --- docs/developer-docs/ai/inference.mdx | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/developer-docs/ai/inference.mdx b/docs/developer-docs/ai/inference.mdx index 606753c95f..96afdd3462 100644 --- a/docs/developer-docs/ai/inference.mdx +++ b/docs/developer-docs/ai/inference.mdx @@ -8,7 +8,13 @@ import { MarkdownChipRow } from "/src/components/Chip/MarkdownChipRow"; -It's possible for canister smart contracts to run inference in a number of ways, depending on the decentralization and performance requirements. + + +## Overview + +Inference in the context of decentralized AI refers to using a trained model to draw conclusions about new data. +It's possible for canister smart contracts to run inference in a number of ways depending on the decentralization and performance requirements. + +Canisters can run inference on-chain, on-device, or through HTTPS outcalls. 
## Inference on-chain From 1eb46e35178b131a7c5b33616c699b6c34d4b818 Mon Sep 17 00:00:00 2001 From: Islam El-Ashi Date: Wed, 25 Sep 2024 11:03:52 +0200 Subject: [PATCH 7/7] Update docs/developer-docs/ai/inference.mdx Co-authored-by: Jessie Mongeon <133128541+jessiemongeon1@users.noreply.github.com> --- docs/developer-docs/ai/inference.mdx | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/developer-docs/ai/inference.mdx b/docs/developer-docs/ai/inference.mdx index 96afdd3462..0c895e42d8 100644 --- a/docs/developer-docs/ai/inference.mdx +++ b/docs/developer-docs/ai/inference.mdx @@ -43,10 +43,11 @@ Check out the [image classification example](/docs/current/developer-docs/ai/ai- ## Inference on-device -An alternative to running the model on-chain would be for the user to download the model from a canister smart contract, and the inference then happens on the user's device. -If the user trusts their own device, then they can trust that the inference ran correctly. -A disadvantage here is that the model needs to be downloaded to the user's device with corresponding drawbacks of less confidentiality of the model and decreased user experience due to increased latency. -ICP supports this use case for practically all existing models because a smart contract on ICP can store models up to 400GiB. +An alternative to running the model on-chain would be to download the model from a canister, then run the inference on the local device. If the user trusts their own device, then they can trust that the inference ran correctly. + +A disadvantage of this workflow is that the model needs to be downloaded to the user's device, resulting in less confidentiality of the model and decreased user experience due to increased latency. + +ICP supports this workflow for most existing models because a smart contract on ICP can store models up to 400GiB. ### Examples
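
The on-device section above notes that a canister can store models up to 400GiB, but a single ICP message carries far less than that (on the order of 2 MiB), so a client such as DeVinci has to pull the model down in pieces. The following is a minimal, dependency-free Rust sketch of that chunked-download bookkeeping; the names `CHUNK_SIZE` and `model_chunk` are illustrative only, not the API of DeVinci or any project listed in the patches.

```rust
// Sketch of serving a large model blob out of a canister in fixed-size
// chunks. A single ICP query response is capped at roughly 2 MiB, so we
// leave a little headroom for the response envelope.
const CHUNK_SIZE: usize = 2 * 1024 * 1024 - 4096;

/// Returns the `index`-th chunk of `model`, or `None` once the client
/// has read past the end (its signal to stop fetching).
fn model_chunk(model: &[u8], index: usize) -> Option<&[u8]> {
    let start = index.checked_mul(CHUNK_SIZE)?;
    if start >= model.len() {
        return None;
    }
    let end = start.saturating_add(CHUNK_SIZE).min(model.len());
    Some(&model[start..end])
}

fn main() {
    // A pretend 5 MiB "model": the client keeps calling until `None`.
    let model = vec![0u8; 5 * 1024 * 1024];
    let mut fetched = Vec::new();
    let mut i = 0;
    while let Some(chunk) = model_chunk(&model, i) {
        fetched.extend_from_slice(chunk);
        i += 1;
    }
    assert_eq!(fetched.len(), model.len());
    assert_eq!(i, 3); // 5 MiB fits in three ~2 MiB chunks
}
```

In a real canister, `model_chunk` would be a query method reading from stable memory; the pure indexing logic is the same either way.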
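
For the "Inference with HTTP calls" section, only the pure request-building step of an HTTPS outcall can be sketched portably; actually dispatching it goes through the management canister's `http_request` API via ic-cdk, requires attached cycles, and needs a transform function so all replicas agree on the (otherwise non-deterministic) response. The endpoint path and JSON shape below follow OpenAI's chat completions API; the function name `chat_request_body` is illustrative, and the JSON is assembled with `format!` only to stay dependency-free (a real canister would use serde_json).

```rust
/// Builds the JSON body for a chat completions request that a canister
/// could send to OpenAI through an HTTPS outcall. Quotes and backslashes
/// in the prompt are escaped so user input cannot break out of the JSON
/// string literal.
fn chat_request_body(model: &str, prompt: &str) -> String {
    let escaped = prompt.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        r#"{{"model":"{}","messages":[{{"role":"user","content":"{}"}}]}}"#,
        model, escaped
    )
}

fn main() {
    let body = chat_request_body("gpt-4o-mini", "Summarize ICP in one line.");
    assert!(body.contains(r#""model":"gpt-4o-mini""#));
    assert!(body.contains(r#""role":"user""#));
    println!("{body}");
}
```

The body string would then be attached, together with a `Content-Type: application/json` header and an `Authorization` bearer token, to the outcall argument before calling `http_request`.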