From 42b51a301c0aa7e645b8482a40d1f7d8e0e07208 Mon Sep 17 00:00:00 2001
From: Georges Lorre
Date: Fri, 18 Aug 2023 14:10:31 +0200
Subject: [PATCH 1/2] Update docs with the new CLI commands

---
 docs/getting_started.md |  7 +++++
 docs/pipeline.md        | 67 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/docs/getting_started.md b/docs/getting_started.md
index 1ef68e440..fe8328695 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -309,3 +309,10 @@ fondant explore --data-directory "path/to/your/data"
 ```
 Note that if you use a remote path (S3, GCS) you can also pass credentials using the `--credentials` flag.
 For all the options of the data explorer run `fondant explore --help`.
+
+
+## Running at scale
+
+You can find more information on how to configure and run your pipeline on different runners [here](pipeline.md).
+
diff --git a/docs/pipeline.md b/docs/pipeline.md
index beb355601..1919f8696 100644
--- a/docs/pipeline.md
+++ b/docs/pipeline.md
@@ -115,15 +115,67 @@ where processing one row significantly increases the number of rows in the datas
 By setting a lower value for input partition rows, you can mitigate issues where the processed data grows larger than the available memory before being written to disk.
 
-## Compiling a pipeline
+## Compiling and Running a pipeline
 
 Once all your components are added to your pipeline you can use different compilers to run your pipeline:
 
+!!! note "IMPORTANT"
+    When using other runners you will need to make sure that your new environment has access to:
+
+    - The base_path of your pipeline (can be a storage bucket like S3, GCS, etc.)
+    - The images used in your pipeline (make sure you have access to the registries where the images are stored)
+
 ### Kubeflow
 
-TODO: update this once kubeflow compiler is implemented
-~~Once the pipeline is built, you need to initialize the client with the kubeflow host path (more info about the host path can be found in the [infrastructure documentation](https://github.com/ml6team/fondant/blob/main/docs/infrastructure.md))
-and use it to compile and run the pipeline with the `compile_and_run()` method. This performs static checking to ensure that all required arguments are provided to the components and that the required input data subsets are available. If the checks pass, a URL will be provided, allowing you to visualize and monitor the execution of your pipeline.~~
+The Kubeflow compiler takes your pipeline and compiles it to a Kubeflow pipeline spec, which can then be used to run your pipeline on a Kubeflow cluster. There are two ways to compile your pipeline to a Kubeflow spec:
+
+- Using the CLI:
+
+```bash
+fondant compile <pipeline_ref> --kubeflow --output <output_path>
+```
+
+- Using the compiler directly:
+
+```python
+from fondant.compiler import KubeFlowCompiler
+
+
+pipeline = ...
+
+compiler = KubeFlowCompiler()
+compiler.compile(pipeline=pipeline, output_path="pipeline.yaml")
+```
+
+Both of these options produce a Kubeflow specification as a file. If you also want to immediately start a run, you can use the runner we provide (see below).
+
+### Running a Kubeflow compiled pipeline
+
+You will need a Kubeflow cluster to run your pipeline on, and you will need to specify the host of that cluster. More info on setting up a Kubeflow Pipelines deployment and the host path can be found in the [infrastructure documentation](infrastructure.md).
+
+There are two ways to run a Kubeflow compiled pipeline:
+
+- Using the CLI:
+
+```bash
+fondant run <pipeline_ref> --kubeflow --host <host>
+```
+
+Note that the pipeline ref is either the path to the compiled pipeline spec or a reference to a fondant pipeline, in which case the compiler will compile the pipeline first before running.
+
+- Using the compiler directly:
+
+```python
+from fondant.compiler import KubeFlowCompiler
+from fondant.runner import KubeflowRunner
+
+# Your pipeline definition here
+
+if __name__ == "__main__":
+    compiler = KubeFlowCompiler()
+    compiler.compile(pipeline=pipeline, output_path="pipeline.yaml")
+    runner = KubeflowRunner(
+        host="YOUR KUBEFLOW HOST",
+    )
+    runner.run(input_spec="pipeline.yaml")
+```
+
+Once your pipeline is running you can monitor it using the Kubeflow UI.
 
 ### Docker-Compose
 
@@ -188,4 +240,11 @@ Navigate to the folder where your docker compose is located and run (you need to
 docker compose up
 ```
 
+Or you can use the fondant CLI to run the pipeline:
+
+```bash
+fondant run <pipeline_ref> --local
+```
+
+Note that the pipeline ref is either the path to the compiled pipeline spec or a reference to a fondant pipeline, in which case the compiler will compile the pipeline first before running.
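+By analogy with the Kubeflow example above, the local flow can also be driven from Python. This is a sketch, not part of the original patch: the `DockerCompiler` and `DockerRunner` names and their interface are assumed to mirror the `KubeFlowCompiler`/`KubeflowRunner` classes shown earlier.
+
+```python
+from fondant.compiler import DockerCompiler
+from fondant.runner import DockerRunner
+
+# Your pipeline definition here
+
+if __name__ == "__main__":
+    # Compile the pipeline to a docker-compose spec, then run it locally.
+    # DockerCompiler/DockerRunner are assumed counterparts of the Kubeflow classes.
+    compiler = DockerCompiler()
+    compiler.compile(pipeline=pipeline, output_path="docker-compose.yml")
+    runner = DockerRunner()
+    runner.run(input_spec="docker-compose.yml")
+```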
+
+This will start the pipeline and provide logs per component (service)
\ No newline at end of file

From 448280826f5c31b1b824039736457719caccabfe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Georges=20Lorr=C3=A9?= <35808396+GeorgesLorre@users.noreply.github.com>
Date: Fri, 18 Aug 2023 15:46:28 +0200
Subject: [PATCH 2/2] Update docs/pipeline.md

Co-authored-by: Philippe Moussalli
---
 docs/pipeline.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/pipeline.md b/docs/pipeline.md
index 1919f8696..756830b72 100644
--- a/docs/pipeline.md
+++ b/docs/pipeline.md
@@ -247,4 +247,4 @@ fondant run --local
 
 Note that the pipeline ref is either the path to the compiled pipeline spec or a reference to a fondant pipeline, in which case the compiler will compile the pipeline first before running.
 
-This will start the pipeline and provide logs per component (service)
\ No newline at end of file
+This will start the pipeline and provide logs per component (service).