Cross cloud providers tests #1583

Merged on Oct 21, 2024
67 commits
d0a569b
init commit
Oct 10, 2024
5085be2
feat: initial commit
Oct 13, 2024
c725c8f
feat: trigger on PR for testing proposes
Oct 13, 2024
fe3cdf4
fix: upgrade confgiure-aws-creds to v4
Oct 13, 2024
6d0a650
fix: adding token-id permissions
Oct 13, 2024
2682ad7
fix: adding permissions
Oct 13, 2024
ef9a79a
fix: adding permissions
Oct 13, 2024
684dd66
fix: permissions
Oct 13, 2024
188bc87
fix: permissions
Oct 13, 2024
94d5d4b
fix: permissions
Oct 13, 2024
e01b348
fix: debug
Oct 13, 2024
251857f
fix: adding env to push step
Oct 13, 2024
d6aae0a
fix: fix opentofu syntax
Oct 13, 2024
853527f
fix: fix connect to docker registry
Oct 13, 2024
f47b28f
fix: docker login in the test job
Oct 13, 2024
fe96535
fix: closing fi
Oct 13, 2024
fdc9ae6
fix: move the tofu global option
Oct 13, 2024
9c96314
fix: fix the way importinv az envs
Oct 13, 2024
e8b15ad
fix: remove debug info
Oct 13, 2024
cbc9550
fix: move the get-creds post apply az resource
Oct 13, 2024
19636c2
fix: identation fix
Oct 13, 2024
685b03d
fix: build cli and add execution permissions to it
Oct 13, 2024
7e1b554
fix: chainsaw not see the env var
Oct 14, 2024
3062c44
fix: change e2e version no not null
Oct 14, 2024
baffde4
fix: change traceql to get also aks nodes names
Oct 14, 2024
cbfa333
fix: expand timeout
Oct 14, 2024
99647cf
fix: aks connection
Oct 14, 2024
2913286
fix: fix identiatioj
Oct 14, 2024
07e7867
fix: instlal FE using yarn
Oct 14, 2024
7c0685c
fix: remove debug comments
Oct 14, 2024
fb4cca4
feat: adding helm tests for multicloud
Oct 14, 2024
0279cde
fix: set kubeconfig post tofu apply
Oct 14, 2024
ada4f9a
fix: change assertions in e2e tests
Oct 14, 2024
d62e30b
fix: add support for multiple state files
Oct 14, 2024
78b508e
fix: identation fix
Oct 14, 2024
b39ebd6
fix: identation fix
Oct 14, 2024
9614d98
fix: remove the backend-config from tf init
Oct 14, 2024
c38d242
fix: different state file for each run
Oct 14, 2024
7482002
fix: adding env var for local backend
Oct 14, 2024
3b9505d
fix: created different rg for each run
Oct 14, 2024
21aa00e
fix: improve readbility
Oct 14, 2024
b82893b
fix: chainsaw to verify cross cloud provider version differentely
Oct 14, 2024
1be5039
fix: test helm chart
Oct 20, 2024
ca80afa
fix
Oct 20, 2024
25468c2
avoid deleting cluster to observ issues
Oct 20, 2024
2a36347
push new image
Oct 20, 2024
7114cd1
fix identation
Oct 20, 2024
55e8d46
fix no space left on device
Oct 20, 2024
53b5baa
fix: build latest image
Oct 20, 2024
a3e7640
test: test all flow
Oct 20, 2024
c2213b7
fix: run on spot instances
Oct 20, 2024
ff1725d
add-sleep-to-avoid-aks-failures
Oct 20, 2024
dc41300
remove-duplicate
Oct 20, 2024
4b94dec
fix: remove destroy for debbuging
Oct 20, 2024
e452ff0
increase the sleep for aks
Oct 20, 2024
0215251
fix: timeout increase
Oct 20, 2024
ea827ed
replace sleep with kubectl wait
Oct 20, 2024
8f04fd6
adjust demo timeouts
Oct 20, 2024
acd2b65
fix webhook
Oct 20, 2024
3f0712f
fix: increase timeout inastlling demo
Oct 20, 2024
a7ba8f8
fix: increase timeout waiting for demo
Oct 20, 2024
5c5f537
fix: adding back the destroy part
Oct 20, 2024
8f95d49
fix :remove the pr trigger
Oct 20, 2024
c0b29af
fix: missing pricing
Oct 21, 2024
beb6826
fix: bump steps version
Oct 21, 2024
f723b20
fix: bump steps versions
Oct 21, 2024
f7ef8f9
Merge branch 'main' into cross-cloud-providers-tests
tamirdavid1 Oct 21, 2024
226 changes: 226 additions & 0 deletions .github/workflows/cross-cloud-tests.yaml
@@ -0,0 +1,226 @@
name: Cross-Cloud Chainsaw Tests

on:
schedule:
- cron: '0 0 * * *' # Nightly run at midnight UTC
workflow_dispatch: # Manual trigger


permissions:
id-token: write
contents: read

jobs:
build-and-push-images:
permissions:
id-token: write
contents: read
name: Build and Push Docker Images
runs-on: warp-ubuntu-latest-x64-8x-spot
steps:
- name: Checkout Code
uses: actions/checkout@v4

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3


- name: Configure AWS credentials from OIDC
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::061717858829:role/ecr-pull-push-role
aws-region: us-east-1


- name: Login to Amazon ECR
run: |
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

- name: Build and Tag Docker Images
env:
COMMIT_HASH: ${{ github.sha }}
run: |
# Build images
make build-images TAG=${COMMIT_HASH}
# Tag images for public ECR
docker tag keyval/odigos-collector:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-collector:${COMMIT_HASH}
docker tag keyval/odigos-instrumentor:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-instrumentor:${COMMIT_HASH}
docker tag keyval/odigos-ui:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-ui:${COMMIT_HASH}
docker tag keyval/odigos-scheduler:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-scheduler:${COMMIT_HASH}
docker tag keyval/odigos-autoscaler:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-autoscaler:${COMMIT_HASH}
docker tag keyval/odigos-odiglet:${COMMIT_HASH} public.ecr.aws/y2v0v6s7/keyval/odigos-odiglet:${COMMIT_HASH}

docker push public.ecr.aws/y2v0v6s7/keyval/odigos-collector:${COMMIT_HASH}
docker push public.ecr.aws/y2v0v6s7/keyval/odigos-instrumentor:${COMMIT_HASH}
docker push public.ecr.aws/y2v0v6s7/keyval/odigos-ui:${COMMIT_HASH}
docker push public.ecr.aws/y2v0v6s7/keyval/odigos-scheduler:${COMMIT_HASH}
docker push public.ecr.aws/y2v0v6s7/keyval/odigos-autoscaler:${COMMIT_HASH}
docker push public.ecr.aws/y2v0v6s7/keyval/odigos-odiglet:${COMMIT_HASH}

test:
permissions:
id-token: write
contents: read
needs: build-and-push-images
runs-on: warp-ubuntu-latest-x64-8x-spot
strategy:
matrix:
cloud-provider: [aks] # Add or remove providers as needed [TODO: later add -> eks + gke]
test-scenario: [multi-apps, helm-chart] # Add or remove scenarios as needed

steps:

- name: Configure AWS credentials from OIDC
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::061717858829:role/ecr-pull-push-role
aws-region: us-east-1

- name: Login to Amazon ECR
run: |
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

- name: Checkout Code
uses: actions/checkout@v4

- name: Set Environment Variables for Terraform
run: |
CLUSTER_NAME="${{ matrix.test-scenario }}-${{ github.run_id }}"
echo "CLUSTER_NAME=${CLUSTER_NAME}" >> $GITHUB_ENV
echo "TF_VAR_cluster_name=${CLUSTER_NAME}" >> $GITHUB_ENV
echo "TF_VAR_resource_group_name=${CLUSTER_NAME}" >> $GITHUB_ENV
echo "TF_VAR_test_scenario=${{ matrix.test-scenario }}" >> $GITHUB_ENV
echo "TF_VAR_run_id=${{ github.run_id }}" >> $GITHUB_ENV


- name: Configure Cloud Provider
run: |
if [ "${{ matrix.cloud-provider }}" = "aks" ]; then
echo "Configuring for AKS"

# Set environment variables for Azure provider
echo "ARM_CLIENT_ID=${{ secrets.AZURE_CLIENT_ID }}" >> $GITHUB_ENV
echo "ARM_CLIENT_SECRET=${{ secrets.AZURE_CLIENT_SECRET }}" >> $GITHUB_ENV
echo "ARM_TENANT_ID=${{ secrets.AZURE_TENANT_ID }}" >> $GITHUB_ENV
echo "ARM_SUBSCRIPTION_ID=${{ secrets.AZURE_SUBSCRIPTION_ID }}" >> $GITHUB_ENV

az login --service-principal -u ${{ secrets.AZURE_CLIENT_ID }} -p ${{ secrets.AZURE_CLIENT_SECRET }} --tenant ${{ secrets.AZURE_TENANT_ID }}
az account set --subscription ${{ secrets.AZURE_SUBSCRIPTION_ID }}

elif [ "${{ matrix.cloud-provider }}" = "eks" ]; then
echo "Configuring for EKS"
aws configure set aws_access_key_id ${{ secrets.AWS_ACCESS_KEY_ID }}
aws configure set aws_secret_access_key ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws configure set region us-east-1

elif [ "${{ matrix.cloud-provider }}" = "gke" ]; then
echo "Configuring for GKE"
echo "${{ secrets.GCP_SERVICE_ACCOUNT_KEY }}" | base64 --decode > gcp-key.json
gcloud auth activate-service-account --key-file=gcp-key.json
gcloud config set project ${{ secrets.GCP_PROJECT_ID }}

else
echo "Unknown cloud provider: ${{ matrix.cloud-provider }}"
exit 1
fi

- uses: opentofu/setup-opentofu@v1

- name: Set Terraform Directory Based on Cloud Provider
run: |
if [ "${{ matrix.cloud-provider }}" == "aks" ]; then
echo "TF_DIR=./tests-infrastructure/terraform/aks" >> $GITHUB_ENV
elif [ "${{ matrix.cloud-provider }}" == "eks" ]; then
echo "TF_DIR=./tests-infrastructure/terraform/eks" >> $GITHUB_ENV
elif [ "${{ matrix.cloud-provider }}" == "gke" ]; then
echo "TF_DIR=./tests-infrastructure/terraform/gke" >> $GITHUB_ENV
else
echo "Unknown cloud provider"
exit 1
fi

- name: Initialize OpenTofu
run: tofu -chdir=$TF_DIR init


- name: Plan OpenTofu
run: tofu -chdir=$TF_DIR plan

- name: Apply OpenTofu Configuration
run: |
tofu -chdir=$TF_DIR apply -auto-approve

- name: Get kubeconfig for AKS/EKS/GKE
run: |
if [ "${{ matrix.cloud-provider }}" == "aks" ]; then
echo "Fetching AKS kubeconfig..."
az aks get-credentials --resource-group $CLUSTER_NAME --name $CLUSTER_NAME
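elif [ "${{ matrix.cloud-provider }}" == "eks" ]; then
# TODO: not implemented yet; this branch only logs and does not fetch a kubeconfig
echo "Fetching EKS kubeconfig..."
elif [ "${{ matrix.cloud-provider }}" == "gke" ]; then
# TODO: not implemented yet; this branch only logs and does not fetch a kubeconfig
echo "Fetching GKE kubeconfig..."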
else
echo "Unknown cloud provider"
exit 1
fi

- name: Verify cluster Access
run: |
kubectl get nodes || exit 1

- name: Install Chainsaw
uses: kyverno/action-install-chainsaw@v0.2.8

- name: Build CLI
run: |
cd cli
go build -tags=embed_manifests -o odigos
chmod +x odigos

- name: Install FE
# Frontend dependencies are needed for the Cypress tests, which do not run in every scenario
if: matrix.test-scenario == 'multi-apps' || matrix.test-scenario == 'helm-chart' || matrix.test-scenario == 'fe-synthetic'
run: |
cd frontend/webapp
yarn install

- name: Run E2E Tests
run: |

# Chainsaw uses MODE to distinguish the cross-cloud tests from the regular e2e tests
export MODE=cross-cloud-tests

# Chainsaw uses COMMIT_HASH to verify that the installed Odigos version matches this commit
export COMMIT_HASH=${{ github.sha }}

chainsaw test tests/e2e/${{ matrix.test-scenario }}

- name: Destroy Resources
if: always() # Ensures this runs even if earlier steps fail
run: |
tofu -chdir=$TF_DIR destroy -auto-approve

- name: Extract Tag
id: extract_tag
run: echo "tag=${GITHUB_REF#refs/*/}" >> $GITHUB_OUTPUT

# Notify Slack on Failure or Cancellation
- name: Notify Slack on Failure or Cancellation
if: ${{ failure() || cancelled() }}
env:
SLACK_WEBHOOK_URL: ${{ secrets.CLOUD_PROVIDERS_TESTS_WEBHOOK_URL }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
curl -X POST -H 'Content-type: application/json' --data '{"blocks":[{"type":"section","text":{"type":"mrkdwn","text":"*ERROR*: Providers tests fail > `${{ matrix.cloud-provider }} - ${{ matrix.test-scenario }}`"}},{"type":"section","fields":[{"type":"mrkdwn","text":"*Link:*\n<https://github.com/${{ env.GITHUB_REPOSITORY }}/actions/runs/${{ env.GITHUB_RUN_ID }}|View the GitHub Run>"},{"type":"mrkdwn","text":"*Tag:*\n`${{ steps.extract_tag.outputs.tag }}`"}]}]}' ${{ env.SLACK_WEBHOOK_URL }}

# Notify Slack on Success
- name: Notify Slack on Success
if: ${{ success() }}
env:
SLACK_WEBHOOK_URL: ${{ secrets.CLOUD_PROVIDERS_TESTS_WEBHOOK_URL }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
curl -X POST -H 'Content-type: application/json' --data '{"blocks":[{"type":"section","text":{"type":"mrkdwn","text":"*SUCCESS*: Providers tests succeed > `${{ matrix.cloud-provider }} - ${{ matrix.test-scenario }}`"}},{"type":"section","fields":[{"type":"mrkdwn","text":"*Link:*\n<https://github.com/${{ env.GITHUB_REPOSITORY }}/actions/runs/${{ env.GITHUB_RUN_ID }}|View the GitHub Run>"},{"type":"mrkdwn","text":"*Tag:*\n`${{ steps.extract_tag.outputs.tag }}`"}]}]}' ${{ env.SLACK_WEBHOOK_URL }}
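
Review note: the workflow above repeats the same `aks`/`eks`/`gke` branching in several steps. As a rough sketch only, the directory lookup could be collapsed into a single `case` statement. The function name `tf_dir_for_provider` is hypothetical and not part of this PR; the paths are the ones the workflow already uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): map a matrix cloud provider
# to its OpenTofu directory and fail on any unknown value.
tf_dir_for_provider() {
  case "$1" in
    aks|eks|gke)
      echo "./tests-infrastructure/terraform/$1"
      ;;
    *)
      echo "Unknown cloud provider: $1" >&2
      return 1
      ;;
  esac
}

# Possible use inside a workflow step (GITHUB_ENV is provided by GitHub Actions):
#   echo "TF_DIR=$(tf_dir_for_provider "${{ matrix.cloud-provider }}")" >> "$GITHUB_ENV"
tf_dir_for_provider aks
```

Keeping the mapping in one place would mean a new provider only needs to be added to the matrix and to this one pattern, assuming its directory follows the same naming scheme.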

6 changes: 5 additions & 1 deletion .gitignore
@@ -7,4 +7,8 @@ cli/odigos
.venv
**/__pycache__/
**/*.pyc
serving-certs/
serving-certs/

**.tfstate
**.tfstate.backup
**.terraform**
27 changes: 27 additions & 0 deletions tests-infrastructure/terraform/aks/main.tf
@@ -0,0 +1,27 @@
terraform {
backend "local" {
path = "terraform-${var.test_scenario}-${var.run_id}.tfstate"
}
}

resource "azurerm_kubernetes_cluster" "aks" {
name = var.cluster_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
dns_prefix = var.cluster_name

default_node_pool {
name = "default"
node_count = var.node_count
vm_size = "Standard_B2s"
}

identity {
type = "SystemAssigned"
}
}

resource "azurerm_resource_group" "rg" {
name = var.resource_group_name
location = "East US"
}
4 changes: 4 additions & 0 deletions tests-infrastructure/terraform/aks/outputs.tf
@@ -0,0 +1,4 @@
output "kube_config" {
value = azurerm_kubernetes_cluster.aks.kube_config_raw
sensitive = true
}
3 changes: 3 additions & 0 deletions tests-infrastructure/terraform/aks/provider.tf
@@ -0,0 +1,3 @@
provider "azurerm" {
features {}
}
24 changes: 24 additions & 0 deletions tests-infrastructure/terraform/aks/variables.tf
@@ -0,0 +1,24 @@
variable "resource_group_name" {
description = "Name of the resource group"
default = "tests-rg"
}

variable "cluster_name" {
description = "Name of the AKS cluster"
default = "tests-aks"
}

variable "node_count" {
description = "Number of nodes in the cluster"
default = 1
}

variable "test_scenario" {
description = "Test scenario to differentiate state files"
type = string
}

variable "run_id" {
description = "GitHub run ID for uniquely identifying state files"
type = string
}
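
Review note: the `test_scenario` and `run_id` variables exist so that each matrix job gets its own cluster, resource group, and state file. A minimal sketch of the naming scheme, derived from the workflow's `CLUSTER_NAME` assignment and the backend `path` in `main.tf`; the function name `derive_run_names` and the sample values are illustrative assumptions, not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): print the per-run names that
# the workflow and main.tf derive from the test scenario and run ID.
#   $1 = test scenario (e.g. multi-apps), $2 = GitHub run ID
derive_run_names() {
  # Used for the cluster name and the resource group name alike.
  echo "CLUSTER_NAME=$1-$2"
  # Mirrors the backend path "terraform-<scenario>-<run_id>.tfstate".
  echo "STATE_FILE=terraform-$1-$2.tfstate"
}

# Example with made-up values:
derive_run_names multi-apps 123
```

Because both values vary per matrix entry and per run, two scenarios running in parallel should not collide on Azure resource names or on the local state file.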
2 changes: 1 addition & 1 deletion tests/e2e/helm-chart/assert-instrumented-and-pipeline.yaml
@@ -306,7 +306,7 @@ status:
- key: process.runtime.version
(value != null): true
- key: telemetry.distro.version
value: e2e-test
(value != null): true
- key: process.pid
(value != null): true
---
58 changes: 44 additions & 14 deletions tests/e2e/helm-chart/chainsaw-test.yaml
@@ -32,17 +32,35 @@ spec:
P="../../.."
# "build" complete helm chart by copying CRDs into the template folder
cp -r $P/api/config/crd/bases/* $P/helm/odigos/templates/crds/
helm upgrade --install odigos $P/helm/odigos --create-namespace --namespace odigos-test-ns --set image.tag=e2e-test
if [ "$MODE" = "cross-cloud-tests" ]; then
helm upgrade --install odigos $P/helm/odigos --create-namespace --namespace odigos-test-ns --set image.tag="$COMMIT_HASH" --set imagePrefix=public.ecr.aws/y2v0v6s7
else
helm upgrade --install odigos $P/helm/odigos --create-namespace --namespace odigos-test-ns --set image.tag=e2e-test
fi
kubectl label namespace odigos-test-ns odigos.io/system-object="true"
timeout: 60s
- name: Verify Odigos Installation
try:
- script:
timeout: 200s
content: |
echo "Starting Odigos version check..."
export ACTUAL_VERSION=$(../../../cli/odigos version --cluster)
if [ "$ACTUAL_VERSION" != "e2e-test" ]; then
echo "Odigos version is not e2e-test, got $ACTUAL_VERSION"
exit 1
echo "Actual Version: $ACTUAL_VERSION"

if [ "$MODE" = "cross-cloud-tests" ]; then
if [ "$ACTUAL_VERSION" != "$COMMIT_HASH" ]; then
echo "Odigos version is not the expected commit hash, got $ACTUAL_VERSION"
exit 1
fi

kubectl wait --for=condition=ready pods --all -n odigos-test-ns --timeout=40s

else
if [ "$ACTUAL_VERSION" != "e2e-test" ]; then
echo "Odigos version is not e2e-test, got $ACTUAL_VERSION"
exit 1
fi
fi
- assert:
file: assert-odigos-installed.yaml
@@ -51,20 +69,32 @@ spec:
- script:
timeout: 100s
content: |
docker pull keyval/odigos-demo-inventory:v0.1
docker pull keyval/odigos-demo-membership:v0.1
docker pull keyval/odigos-demo-coupon:v0.1
docker pull keyval/odigos-demo-inventory:v0.1
docker pull keyval/odigos-demo-frontend:v0.2
kind load docker-image keyval/odigos-demo-inventory:v0.1
kind load docker-image keyval/odigos-demo-membership:v0.1
kind load docker-image keyval/odigos-demo-coupon:v0.1
kind load docker-image keyval/odigos-demo-inventory:v0.1
kind load docker-image keyval/odigos-demo-frontend:v0.2
if [ "$MODE" != "cross-cloud-tests" ]; then
docker pull keyval/odigos-demo-inventory:v0.1
docker pull keyval/odigos-demo-membership:v0.1
docker pull keyval/odigos-demo-coupon:v0.1
docker pull keyval/odigos-demo-frontend:v0.2
kind load docker-image keyval/odigos-demo-inventory:v0.1
kind load docker-image keyval/odigos-demo-membership:v0.1
kind load docker-image keyval/odigos-demo-coupon:v0.1
kind load docker-image keyval/odigos-demo-frontend:v0.2
else
echo "Skipping docker pull and kind load for cross-cloud-tests mode"
fi
- apply:
file: 02-install-simple-demo.yaml
- script:
timeout: 100s
content: |
# Wait for the pods to be ready
kubectl wait --for=condition=ready pod -l app=frontend --timeout=50s
kubectl wait --for=condition=ready pod -l app=coupon --timeout=50s
kubectl wait --for=condition=ready pod -l app=inventory --timeout=50s
kubectl wait --for=condition=ready pod -l app=pricing --timeout=50s
kubectl wait --for=condition=ready pod -l app=membership --timeout=50s
- assert:
file: assert-apps-installed.yaml

- name: Detect Languages
try:
- apply:
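
Review note: the version check in this test now branches on `MODE` in two places (the Helm install and the verification script). One possible way to keep the expected value in a single spot is sketched below; `expected_odigos_version` is a hypothetical name, and the sketch assumes `MODE` and `COMMIT_HASH` are exported by the workflow as shown in the `Run E2E Tests` step.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): print the Odigos version the
# test should expect, depending on the MODE exported by the workflow.
expected_odigos_version() {
  if [ "${MODE:-}" = "cross-cloud-tests" ]; then
    # Cross-cloud runs install images tagged with the commit hash.
    if [ -z "${COMMIT_HASH:-}" ]; then
      echo "COMMIT_HASH must be set in cross-cloud-tests mode" >&2
      return 1
    fi
    echo "$COMMIT_HASH"
  else
    # Regular e2e runs use the locally built e2e-test tag.
    echo "e2e-test"
  fi
}

# Possible use in the verification script:
#   ACTUAL_VERSION=$(../../../cli/odigos version --cluster)
#   [ "$ACTUAL_VERSION" = "$(expected_odigos_version)" ] || exit 1
```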