
Merge pull request #85 from truefoundry/docs_branch
Separated docs to different files
innoavator authored Feb 17, 2025
2 parents 4077414 + c1b5706 commit 37ea4ee
Showing 7 changed files with 488 additions and 0 deletions.
131 changes: 131 additions & 0 deletions docs/architecture.md
---
title: Elasti Architecture
---

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Elasti Project Documentation](#elasti-project-documentation)
- [1. Introduction](#1-introduction)
- [Overview](#overview)
- [Key Components](#key-components)
- [2. Architecture](#2-architecture)
- [Flow Description](#flow-description)
- [3. Controller](#3-controller)
- [4. Resolver](#4-resolver)
- [5. Helm Values](#5-helm-values)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# Elasti Project Documentation

## 1. Introduction

### Overview
The Elasti project enables serverless capability for Kubernetes services by dynamically scaling them based on incoming requests. It comprises two main components: the operator and the resolver. The elasti-operator manages the scaling of target services, while the resolver intercepts and queues requests when the target service is scaled down to zero replicas.

### Key Components
- **Operator**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales target services as needed.
- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-operator to scale up the target service.

<div align="center">
<img src="./assets/components.png" width="500px">
</div>


## 2. Architecture
<div align="center">
<img src="./assets/hld.png" width="1000px">
</div>

### Flow Description

- **[CRD Created]** The Operator fetches details from the CRD.
  1. Adds a finalizer to the CRD, ensuring it is only deleted by the Operator for proper cleanup.
  2. Fetches the `ScaleTargetRef` and initiates a watch on it.
  3. Adds the CRD details to a `crdDirectory`, caching the details of all CRDs.
- **[ScaleTargetRef Watch]** When a watch is added to the `ScaleTargetRef`:
  1. Identifies the kind of target and checks the available ready pods.
  2. If `replicas == 0` -> switches to **Proxy Mode**.
  3. If `replicas > 0` -> switches to **Serve Mode**.
  4. Currently, only `deployments` and `rollouts` are supported.

- **When pods scale to 0**

  - **[Switch to Proxy Mode]**
    1. Creates a private service for the target service. This allows the resolver to reach the target pod even after the public service has been modified, as described in the following steps.
    2. Creates a watch on the public service to monitor changes in ports or selectors.
    3. Creates a new `EndpointSlice` for the public service to redirect any traffic to the resolver.
    4. Creates a watch on the resolver to monitor the addition of new pods.

  - **[In Proxy Mode]**
    1. Traffic reaching the target service, which has no pods, is sent to the resolver, which is capable of handling requests for any endpoint.
    2. [**In Resolver**]
       1. Once traffic hits the resolver, it reaches the `handleAnyRequest` handler.
       2. The host is extracted from the request. If it's a known host, its details are retrieved from the `hostManager` cache. If not, the service name is extracted from the host and saved in `hostManager`.
       3. The service name is used to identify the private service.
       4. Using `operatorRPC`, the controller is informed about the incoming request.
       5. The request is sent to the `throttler`, which queues the requests and checks whether the pods for the private service are up.
          1. If yes, a proxy request is made, and the response is sent back.
          2. If no, the request is re-enqueued, and the check is retried after a configurable time interval (set in the Helm values file).
       6. If the request is successful, traffic for this host is temporarily disabled (configurable). This prevents new incoming requests from reaching the resolver, as the target is now verified to be up.
    3. [**In Controller/Operator**]
       1. The ElastiServer processes requests from the resolver, each identifying the service experiencing traffic.
       2. Matches the service with its `crdDirectory` entry to retrieve the `ScaleTargetRef`, which is then used to scale the target.
       3. Evaluates the triggers defined in the ElastiService:
          - If **any** trigger indicates that the service should be scaled up -> scales to `minTargetReplicas`.
       4. Once scaled up, switches to **Serve Mode**.

- **When pods scale to 1**

  - **[Switch to Serve Mode]**
    1. The Operator stops the informer/watch on the resolver.
    2. The Operator deletes the `EndpointSlice` pointing to the resolver.
    3. The system switches to **Serve Mode**.
  - **[In Serve Mode]**
    1. Traffic hits the gateway and is routed to the target service, then to the target pod, which resolves the request.
    2. The Operator periodically evaluates the triggers defined in the ElastiService.
    3. If **all** triggers indicate that the service should be scaled down and the `cooldownPeriod` has elapsed since the last scale-up:
       - Scales down the target service to zero replicas.
       - Switches to **Proxy Mode**.
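To make the Proxy Mode switch concrete, the `EndpointSlice` the operator creates could look roughly like the sketch below (all names, ports, and addresses here are hypothetical; the actual objects are created and managed by the operator):

```yaml
# Hypothetical sketch of the EndpointSlice created in Proxy Mode.
# It attaches to the public service via the kubernetes.io/service-name
# label, but its endpoint addresses point at a resolver pod instead of
# the (scaled-down) target pods. Names, ports, and IPs are illustrative.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: httpbin-to-resolver              # hypothetical name
  namespace: demo
  labels:
    kubernetes.io/service-name: httpbin  # binds the slice to the public service
addressType: IPv4
ports:
  - name: http
    port: 8012                           # resolver port (assumed)
    protocol: TCP
endpoints:
  - addresses:
      - 10.0.0.25                        # resolver pod IP (illustrative)
```

Because the slice carries the public service's `kubernetes.io/service-name` label, kube-proxy routes the service's traffic to the resolver without the service object itself being rewritten.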


## 3. Controller

<div align="center">
<img src="./assets/lld-operator.png" width="1000px">
</div>

## 4. Resolver

<div align="center">
<img src="./assets/lld-resolver.png" width="800px">
</div>

## 5. Helm Values

Values you can pass to the elastiResolver environment:
```yaml

# HeaderForHost is the header to look for to get the host. X-Envoy-Decorator-Operation is the key for istio
headerForHost: X-Envoy-Decorator-Operation
# InitialCapacity is the initial capacity of the semaphore
initialCapacity: "500"
maxIdleProxyConns: "100"
maxIdleProxyConnsPerHost: "500"
# MaxQueueConcurrency is the maximum number of concurrent requests
maxQueueConcurrency: "100"
# OperatorRetryDuration is the duration for which we don't inform the operator
# about the traffic on the same host
operatorRetryDuration: "10"
# QueueRetryDuration is the duration after which we retry the requests in the queue
queueRetryDuration: "3"
# QueueSize is the size of the queue
queueSize: "50000"
# ReqTimeout is the timeout for each request
reqTimeout: "120"
# TrafficReEnableDuration is the duration for which traffic is disabled for a host
# This is also the duration for which we don't recheck readiness of the service
trafficReEnableDuration: "5"
```
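To override these at install time, you could use a small values file. The `elastiResolver.env` key path below is an assumption, so verify the actual structure against the chart's `values.yaml` before using it:

```yaml
# my-values.yaml -- hypothetical override file for the elasti chart.
# The "elastiResolver.env" key path is an assumption; check the chart's
# values.yaml for the real structure.
elastiResolver:
  env:
    queueRetryDuration: "5"
    trafficReEnableDuration: "10"
```

Apply it with `helm upgrade --install <release-name> oci://tfy.jfrog.io/tfy-helm/elasti -f my-values.yaml -n <namespace>`.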
54 changes: 54 additions & 0 deletions docs/comparisons.md
# Comparisons with Other Solutions

This document compares Elasti with other popular serverless and scale-to-zero solutions in the Kubernetes ecosystem.

## Knative

### Overview
Knative is a comprehensive platform for deploying and managing serverless workloads on Kubernetes. It provides a complete serverless experience with features like scale-to-zero, request-based autoscaling, and traffic management.

### Key Differences
- **Complexity**: Knative is a full-featured platform that requires significant setup and maintenance. Elasti is focused solely on scale-to-zero functionality and can be added to existing services with minimal configuration.
- **Integration**: Knative requires services to be deployed as Knative services. Elasti works with existing Kubernetes deployments and Argo Rollouts without modification.
- **Learning Curve**: Knative has a steeper learning curve due to its many concepts and components. Elasti follows familiar Kubernetes patterns with simple CRD-based configuration.

## OpenFaaS

### Overview
OpenFaaS is a framework for building serverless functions with Docker and Kubernetes, making it easy to deploy serverless functions to any cloud or on-premises.

### Key Differences
- **Purpose**: OpenFaaS is primarily designed for Function-as-a-Service (FaaS) workloads. Elasti is built for existing HTTP services.
- **Architecture**: OpenFaaS requires functions to be written and packaged in a specific way. Elasti works with any HTTP service without code changes.
- **Scaling**: OpenFaaS uses its own scaling mechanisms. Elasti integrates with existing autoscalers (HPA/KEDA) while adding scale-to-zero capability.

## KEDA HTTP Add-on

### Overview
KEDA HTTP Add-on is an extension to KEDA that enables HTTP-based scaling, including scale-to-zero functionality.

### Key Differences
- **Maturity**: KEDA HTTP Add-on is in beta and is not recommended for production use.
- **Request Handling**:
  - KEDA HTTP Add-on inserts itself into the HTTP path and continues to handle requests even after the service has been scaled up.
  - Elasti takes itself out of the HTTP path once the service has been scaled up.
- **Integration**:
- KEDA HTTP Add-on requires KEDA installation and configuration.
- Elasti can work standalone or integrate with KEDA if needed.

## Feature Comparison Table

| Feature | Elasti | Knative | OpenFaaS | KEDA HTTP Add-on |
|---------|--------|---------|----------|------------------|
| Scale to Zero | ✓ | ✓ | ✓ | ✓ |
| Works with Existing Services | ✓ | ✗ | ✗ | ✓ |
| Resource Footprint | Low | High | Medium | Low |
| Setup Complexity | Low | High | Medium | Medium |

## When to Choose Elasti

Elasti is the best choice when you:
1. Need to add scale-to-zero capability to existing HTTP services
2. Want to ensure zero request loss during scaling operations
3. Prefer a lightweight solution with minimal configuration
4. Need integration with existing autoscalers (HPA/KEDA)
126 changes: 126 additions & 0 deletions docs/getting-started.md
# Getting Started

With Elasti, you can easily manage and scale your Kubernetes services by using a proxy mechanism that queues and holds requests for scaled-down services, bringing them up only when needed. Get started by following the steps below:

## Prerequisites

- **Kubernetes Cluster:** You should have a running Kubernetes cluster. You can use any cloud-based or on-premises Kubernetes distribution.
- **kubectl:** Installed and configured to interact with your Kubernetes cluster.
- **Helm:** Installed for managing Kubernetes applications.

## Install

### 1. Install Elasti using helm

Use Helm to install elasti into your Kubernetes cluster. Replace `<release-name>` with your desired release name and `<namespace>` with the Kubernetes namespace you want to use:

```bash
helm install <release-name> oci://tfy.jfrog.io/tfy-helm/elasti --namespace <namespace> --create-namespace
```
Check out [values.yaml](./charts/elasti/values.yaml) for the configuration options available in the Helm values file.

### 2. Verify the Installation

Check the status of your Helm release and ensure that the elasti components are running:

```bash
helm status <release-name> --namespace <namespace>
kubectl get pods -n <namespace>
```

You should see two components running:

1. **Controller/Operator:** `elasti-operator-controller-manager-...` switches the traffic, watches resources, and handles scaling.
2. **Resolver:** `elasti-resolver-...` proxies the requests.

Refer to the [docs](./docs/architecture) to learn how it works.

## Configuration

To route a service's traffic via elasti, you'll need to create and apply an `ElastiService` custom resource:

### 1. Define an ElastiService

```yaml
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: <service-name>
  namespace: <service-namespace>
spec:
  minTargetReplicas: <min-target-replicas>
  service: <service-name>
  cooldownPeriod: <cooldown-period>
  scaleTargetRef:
    apiVersion: <apiVersion>
    kind: <kind>
    name: <deployment-or-rollout-name>
  triggers:
    - type: <trigger-type>
      metadata:
        <trigger-metadata>
  autoscaler:
    name: <autoscaler-object-name>
    type: <autoscaler-type>
```
- `<service-name>`: Replace with the name of the service you want managed by elasti.
- `<min-target-replicas>`: Minimum replicas to bring up when the first request arrives.
- `<service-namespace>`: Replace with the namespace of the service.
- `<scaleTargetRef>`: Reference to the scale target, similar to the one used in a HorizontalPodAutoscaler.
  - `<kind>`: Replace with `rollouts` or `deployments`.
  - `<apiVersion>`: Replace with `argoproj.io/v1alpha1` or `apps/v1`.
  - `<deployment-or-rollout-name>`: Replace with the name of the rollout or deployment for the service. This will be scaled up to `minTargetReplicas` when the first request comes in.
- `cooldownPeriod`: Minimum time (in seconds) to wait after scaling up before considering scale down.
- `triggers`: List of conditions that determine when to scale down (currently only Prometheus metrics are supported).
- `autoscaler`: **Optional** integration with an external autoscaler (HPA/KEDA), if needed.
  - `<autoscaler-type>`: `hpa` or `keda`.
  - `<autoscaler-object-name>`: Name of the KEDA ScaledObject or the HorizontalPodAutoscaler object.
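For illustration, a filled-in `autoscaler` block for a service whose above-zero scaling is handled by an existing KEDA ScaledObject might look like this (the ScaledObject name is hypothetical):

```yaml
# Hypothetical fragment of an ElastiService spec: elasti coordinates
# with an existing KEDA ScaledObject named "httpbin-scaledobject".
autoscaler:
  name: httpbin-scaledobject  # your KEDA ScaledObject's name (assumed)
  type: keda                  # or "hpa" for a HorizontalPodAutoscaler
```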

Below is an example configuration for an ElastiService.
```yaml
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: httpbin
spec:
  service: httpbin
  minTargetReplicas: 1
  cooldownPeriod: 300
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: httpbin
  triggers:
    - type: prometheus
      metadata:
        query: sum(rate(istio_requests_total{destination_service="httpbin.demo.svc.cluster.local"}[1m])) or vector(0)
        serverAddress: http://prometheus-server.prometheus.svc.cluster.local:9090
        threshold: "0.01"
```

### 2. Apply the configuration

Apply the configuration to your Kubernetes cluster:

```bash
kubectl apply -f <service-name>-elasti-CRD.yaml
```

### 3. Check Logs

You can view logs from the controller to watch out for any errors.

```bash
kubectl logs -f deployment/elasti-operator-controller-manager -n <namespace>
```

## Uninstall

To uninstall Elasti, **you must first remove all installed ElastiServices.** Then delete the Helm release and, if desired, the namespace:

```bash
kubectl delete elastiservices --all
helm uninstall <release-name> -n <namespace>
kubectl delete namespace <namespace>
```
14 changes: 14 additions & 0 deletions docs/index.md
---
layout: default
---

# Elasti


- [Introduction to Elasti](introduction.md)
- [Getting Started](getting-started.md)
- [Monitoring Elasti](monitoring.md)
- [Architecture](architecture.md)
- [Integrations](integrations.md)
- [Comparisons](comparisons.md)
- [Development](../DEVELOPMENT.md)