
Merge pull request #85 from truefoundry/docs_branch
Separated docs to different files
innoavator authored Feb 17, 2025
2 parents 4077414 + c1b5706 commit 37ea4ee
Showing 7 changed files with 488 additions and 0 deletions.
131 changes: 131 additions & 0 deletions docs/architecture.md
---
title: Elasti Architecture
---

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Elasti Project Documentation](#elasti-project-documentation)
- [1. Introduction](#1-introduction)
- [Overview](#overview)
- [Key Components](#key-components)
- [2. Architecture](#2-architecture)
- [Flow Description](#flow-description)
- [3. Controller](#3-controller)
- [4. Resolver](#4-resolver)
- [5. Helm Values](#5-helm-values)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# Elasti Project Documentation

## 1. Introduction

### Overview
The Elasti project enables serverless capability for Kubernetes services by dynamically scaling them based on incoming requests. It comprises two main components: the operator and the resolver. The elasti-operator manages the scaling of target services, while the resolver intercepts and queues requests when the target service is scaled down to zero replicas.

### Key Components
- **Operator**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales target services as needed.
- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-operator to scale up the target service.

<div align="center">
<img src="./assets/components.png" width="500px">
</div>


## 2. Architecture
<div align="center">
<img src="./assets/hld.png" width="1000px">
</div>

### Flow Description

- **[CRD Created]** The Operator fetches details from the CRD.
  1. Adds a finalizer to the CRD, ensuring it is only deleted by the Operator for proper cleanup.
  2. Fetches the `ScaleTargetRef` and initiates a watch on it.
  3. Adds the CRD details to a `crdDirectory`, caching the details of all CRDs.
- **[ScaleTargetRef Watch]** When a watch is added to the `ScaleTargetRef`:
  1. Identifies the kind of target and checks the available ready pods.
  2. If `replicas == 0` -> switches to **Proxy Mode**.
  3. If `replicas > 0` -> switches to **Serve Mode**.
  4. Currently, only `deployments` and `rollouts` are supported.

- **When pods scale to 0**

  - **[Switch to Proxy Mode]**
    1. Creates a private service for the target service. This allows the resolver to reach the target pod even after the public service has been modified, as described in the following steps.
    2. Creates a watch on the public service to monitor changes in ports or selectors.
    3. Creates a new `EndpointSlice` for the public service to redirect any traffic to the resolver.
    4. Creates a watch on the resolver to monitor the addition of new pods.

  - **[In Proxy Mode]**
    1. Traffic reaching the target service, which has no pods, is sent to the resolver, which is capable of handling requests for any endpoint.
    2. [**In Resolver**]
       1. Once traffic hits the resolver, it reaches the `handleAnyRequest` handler.
       2. The host is extracted from the request. If it's a known host, its details are retrieved from the `hostManager` cache. If not, the service name is extracted from the host and saved in `hostManager`.
       3. The service name is used to identify the private service.
       4. Using `operatorRPC`, the controller is informed about the incoming request.
       5. The request is sent to the `throttler`, which queues the requests and checks whether the pods for the private service are up.
          1. If yes, a proxy request is made, and the response is sent back.
          2. If no, the request is re-enqueued, and the check is retried after a configurable time interval (set in the Helm values file).
       6. If the request is successful, traffic for this host is temporarily disabled (configurable). This prevents new incoming requests from reaching the resolver, as the target is now verified to be up.
    3. [**In Controller/Operator**]
       1. The ElastiServer processes requests from the resolver, each identifying the service experiencing traffic.
       2. Matches the service with its `crdDirectory` entry to retrieve the `ScaleTargetRef`, which is then used to scale the target.
       3. Evaluates the triggers defined in the ElastiService:
          - If **any** trigger indicates that the service should be scaled up -> scales to `minTargetReplicas`.
       4. Once scaled up, switches to **Serve Mode**.

- **When pods scale to 1**

  - **[Switch to Serve Mode]**
    1. The Operator stops the informer/watch on the resolver.
    2. The Operator deletes the `EndpointSlice` pointing to the resolver.
    3. The system switches to **Serve Mode**.
  - **[In Serve Mode]**
    1. Traffic hits the gateway and is routed to the target service, then to the target pod, which resolves the request.
    2. The Operator periodically evaluates the triggers defined in the ElastiService.
    3. If **all** triggers indicate that the service should be scaled down and the `cooldownPeriod` has elapsed since the last scale-up:
       - Scales down the target service to zero replicas.
       - Switches to **Proxy Mode**.
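To make the Proxy Mode switch concrete, the `EndpointSlice` the operator creates could look roughly like the sketch below (all names, ports, and addresses here are hypothetical; the actual objects are created and managed by the operator):

```yaml
# Hypothetical sketch of the EndpointSlice created in Proxy Mode.
# It attaches to the public service via the kubernetes.io/service-name
# label, but its endpoint addresses point at a resolver pod instead of
# the (scaled-down) target pods. Names, ports, and IPs are illustrative.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: httpbin-to-resolver              # hypothetical name
  namespace: demo
  labels:
    kubernetes.io/service-name: httpbin  # binds the slice to the public service
addressType: IPv4
ports:
  - name: http
    port: 8012                           # resolver port (assumed)
    protocol: TCP
endpoints:
  - addresses:
      - 10.0.0.25                        # resolver pod IP (illustrative)
```

Because the slice carries the public service's `kubernetes.io/service-name` label, kube-proxy routes the service's traffic to the resolver without the service object itself being rewritten.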


## 3. Controller

<div align="center">
<img src="./assets/lld-operator.png" width="1000px">
</div>

## 4. Resolver

<div align="center">
<img src="./assets/lld-resolver.png" width="800px">
</div>

## 5. Helm Values

Values you can pass to the elastiResolver environment:
```yaml

# HeaderForHost is the header to look for to get the host. X-Envoy-Decorator-Operation is the key for istio
headerForHost: X-Envoy-Decorator-Operation
# InitialCapacity is the initial capacity of the semaphore
initialCapacity: "500"
maxIdleProxyConns: "100"
maxIdleProxyConnsPerHost: "500"
# MaxQueueConcurrency is the maximum number of concurrent requests
maxQueueConcurrency: "100"
# OperatorRetryDuration is the duration for which we don't inform the operator
# about the traffic on the same host
operatorRetryDuration: "10"
# QueueRetryDuration is the duration after which we retry the requests in the queue
queueRetryDuration: "3"
# QueueSize is the size of the queue
queueSize: "50000"
# ReqTimeout is the timeout for each request
reqTimeout: "120"
# TrafficReEnableDuration is the duration for which traffic is disabled for a host
# This is also the duration for which we don't recheck readiness of the service
trafficReEnableDuration: "5"
```
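To override these at install time, you could use a small values file. The `elastiResolver.env` key path below is an assumption, so verify the actual structure against the chart's `values.yaml` before using it:

```yaml
# my-values.yaml -- hypothetical override file for the elasti chart.
# The "elastiResolver.env" key path is an assumption; check the chart's
# values.yaml for the real structure.
elastiResolver:
  env:
    queueRetryDuration: "5"
    trafficReEnableDuration: "10"
```

Apply it with `helm upgrade --install <release-name> oci://tfy.jfrog.io/tfy-helm/elasti -f my-values.yaml -n <namespace>`.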
54 changes: 54 additions & 0 deletions docs/comparisons.md
# Comparisons with Other Solutions

This document compares Elasti with other popular serverless and scale-to-zero solutions in the Kubernetes ecosystem.

## Knative

### Overview
Knative is a comprehensive platform for deploying and managing serverless workloads on Kubernetes. It provides a complete serverless experience with features like scale-to-zero, request-based autoscaling, and traffic management.

### Key Differences
- **Complexity**: Knative is a full-featured platform that requires significant setup and maintenance. Elasti is focused solely on scale-to-zero functionality and can be added to existing services with minimal configuration.
- **Integration**: Knative requires services to be deployed as Knative services. Elasti works with existing Kubernetes deployments and Argo Rollouts without modification.
- **Learning Curve**: Knative has a steeper learning curve due to its many concepts and components. Elasti follows familiar Kubernetes patterns with simple CRD-based configuration.

## OpenFaaS

### Overview
OpenFaaS is a framework for building serverless functions with Docker and Kubernetes, making it easy to deploy serverless functions to any cloud or on-premises.

### Key Differences
- **Purpose**: OpenFaaS is primarily designed for Function-as-a-Service (FaaS) workloads. Elasti is built for existing HTTP services.
- **Architecture**: OpenFaaS requires functions to be written and packaged in a specific way. Elasti works with any HTTP service without code changes.
- **Scaling**: OpenFaaS uses its own scaling mechanisms. Elasti integrates with existing autoscalers (HPA/KEDA) while adding scale-to-zero capability.

## KEDA HTTP Add-on

### Overview
KEDA HTTP Add-on is an extension to KEDA that enables HTTP-based scaling, including scale-to-zero functionality.

### Key Differences
- **Maturity**: KEDA HTTP Add-on is in beta and is not recommended for production use.
- **Request Handling**:
  - KEDA HTTP Add-on inserts itself into the HTTP path and continues to handle requests even after the service has been scaled up.
  - Elasti takes itself out of the HTTP path once the service has been scaled up.
- **Integration**:
- KEDA HTTP Add-on requires KEDA installation and configuration.
- Elasti can work standalone or integrate with KEDA if needed.

## Feature Comparison Table

| Feature | Elasti | Knative | OpenFaaS | KEDA HTTP Add-on |
|---------|--------|---------|----------|------------------|
| Scale to Zero | ✓ | ✓ | ✓ | ✓ |
| Works with Existing Services | ✓ | ✗ | ✗ | ✓ |
| Resource Footprint | Low | High | Medium | Low |
| Setup Complexity | Low | High | Medium | Medium |

## When to Choose Elasti

Elasti is the best choice when you:
1. Need to add scale-to-zero capability to existing HTTP services
2. Want to ensure zero request loss during scaling operations
3. Prefer a lightweight solution with minimal configuration
4. Need integration with existing autoscalers (HPA/KEDA)
126 changes: 126 additions & 0 deletions docs/getting-started.md
# Getting Started

With Elasti, you can easily manage and scale your Kubernetes services by using a proxy mechanism that queues and holds requests for scaled-down services, bringing them up only when needed. Get started by following the steps below:

## Prerequisites

- **Kubernetes Cluster:** You should have a running Kubernetes cluster. You can use any cloud-based or on-premises Kubernetes distribution.
- **kubectl:** Installed and configured to interact with your Kubernetes cluster.
- **Helm:** Installed for managing Kubernetes applications.

## Install

### 1. Install Elasti using helm

Use Helm to install elasti into your Kubernetes cluster. Replace `<release-name>` with your desired release name and `<namespace>` with the Kubernetes namespace you want to use:

```bash
helm install <release-name> oci://tfy.jfrog.io/tfy-helm/elasti --namespace <namespace> --create-namespace
```
Check out [values.yaml](./charts/elasti/values.yaml) for the configuration options available in the Helm values file.

### 2. Verify the Installation

Check the status of your Helm release and ensure that the elasti components are running:

```bash
helm status <release-name> --namespace <namespace>
kubectl get pods -n <namespace>
```

You should see two components running:

1. **Controller/Operator:** `elasti-operator-controller-manager-...` switches the traffic, watches resources, and handles scaling.
2. **Resolver:** `elasti-resolver-...` proxies the requests.

Refer to the [docs](./docs/architecture) to learn how it works.

## Configuration

To route a service's traffic via elasti, you'll need to create and apply an `ElastiService` custom resource:

### 1. Define an ElastiService

```yaml
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: <service-name>
  namespace: <service-namespace>
spec:
  minTargetReplicas: <min-target-replicas>
  service: <service-name>
  cooldownPeriod: <cooldown-period>
  scaleTargetRef:
    apiVersion: <apiVersion>
    kind: <kind>
    name: <deployment-or-rollout-name>
  triggers:
    - type: <trigger-type>
      metadata:
        <trigger-metadata>
  autoscaler:
    name: <autoscaler-object-name>
    type: <autoscaler-type>
```
- `<service-name>`: Replace with the name of the service you want managed by elasti.
- `<min-target-replicas>`: Minimum replicas to bring up when the first request arrives.
- `<service-namespace>`: Replace with the namespace of the service.
- `<scaleTargetRef>`: Reference to the scale target, similar to the one used in a HorizontalPodAutoscaler.
  - `<kind>`: Replace with `rollouts` or `deployments`.
  - `<apiVersion>`: Replace with `argoproj.io/v1alpha1` or `apps/v1`.
  - `<deployment-or-rollout-name>`: Replace with the name of the rollout or deployment for the service. This will be scaled up to `minTargetReplicas` when the first request comes in.
- `cooldownPeriod`: Minimum time (in seconds) to wait after scaling up before considering scale down.
- `triggers`: List of conditions that determine when to scale down (currently only Prometheus metrics are supported).
- `autoscaler`: **Optional** integration with an external autoscaler (HPA/KEDA), if needed.
  - `<autoscaler-type>`: `hpa` or `keda`.
  - `<autoscaler-object-name>`: Name of the KEDA ScaledObject or the HorizontalPodAutoscaler object.
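For illustration, a filled-in `autoscaler` block for a service whose above-zero scaling is handled by an existing KEDA ScaledObject might look like this (the ScaledObject name is hypothetical):

```yaml
# Hypothetical fragment of an ElastiService spec: elasti coordinates
# with an existing KEDA ScaledObject named "httpbin-scaledobject".
autoscaler:
  name: httpbin-scaledobject  # your KEDA ScaledObject's name (assumed)
  type: keda                  # or "hpa" for a HorizontalPodAutoscaler
```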

Below is an example configuration for an ElastiService.
```yaml
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: httpbin
spec:
  service: httpbin
  minTargetReplicas: 1
  cooldownPeriod: 300
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: httpbin
  triggers:
    - type: prometheus
      metadata:
        query: sum(rate(istio_requests_total{destination_service="httpbin.demo.svc.cluster.local"}[1m])) or vector(0)
        serverAddress: http://prometheus-server.prometheus.svc.cluster.local:9090
        threshold: "0.01"
```

### 2. Apply the configuration

Apply the configuration to your Kubernetes cluster:

```bash
kubectl apply -f <service-name>-elasti-CRD.yaml
```

### 3. Check Logs

You can view logs from the controller to watch out for any errors.

```bash
kubectl logs -f deployment/elasti-operator-controller-manager -n <namespace>
```

## Uninstall

To uninstall Elasti, **you must first remove all installed ElastiServices.** Then delete the Helm release and, if desired, the namespace:

```bash
kubectl delete elastiservices --all
helm uninstall <release-name> -n <namespace>
kubectl delete namespace <namespace>
```
14 changes: 14 additions & 0 deletions docs/index.md
---
layout: default
---

# Elasti


- [Introduction to Elasti](introduction.md)
- [Getting Started](getting-started.md)
- [Monitoring Elasti](monitoring.md)
- [Architecture](architecture.md)
- [Integrations](integrations.md)
- [Comparisons](comparisons.md)
- [Development](../DEVELOPMENT.md)