E2E test for cpu scaler #2441
Conversation
Hi @tomkerkhove @JorTurFer, PTAL.
LGTM!
Thanks for doing it ❤️
Please update the changelog (adding this PR under the Improvements section) and let me trigger it before merging :)
/run-e2e cpu.test.*
Hi @JorTurFer, @zroubalik, I've changed the image and updated the changelog.
/run-e2e cpu.test.*
@Ritikaa96 seems like the test is failing ^
Hi @zroubalik, I tried the same file and ran …
I don't think that …
/run-e2e cpu.test.*
@zroubalik, it seems the job is now deployed successfully, whereas it wasn't in the previous run. But now the next step is failing. Any suggestions on where it could be going wrong?
I can reproduce the error locally; let me check whether I can figure out a tip or a solution.
On my local setup, the generated load expires too early, and that's why it never reaches 5 instances.
Another tip to avoid waiting for 8 minutes is to use the advanced section of the ScaledObject; I reduced it to 3 minutes just by setting scaleDown.stabilizationWindowSeconds to 0. I have also been troubleshooting the test, and my solution is this:

import * as fs from 'fs'
import * as sh from 'shelljs'
import * as tmp from 'tmp'
import test from 'ava'
import { waitForDeploymentReplicaCount } from './helpers'

const testNamespace = 'cpu-test'
const deploymentFile = tmp.fileSync()
const triggerFile = tmp.fileSync()

test.before(t => {
  sh.config.silent = true
  sh.exec(`kubectl create namespace ${testNamespace}`)
  fs.writeFileSync(deploymentFile.name, deploymentYaml)
  t.is(
    0,
    sh.exec(`kubectl apply -f ${deploymentFile.name} --namespace ${testNamespace}`).code,
    'Deploying php deployment should work.'
  )
  t.is(0, sh.exec(`kubectl rollout status deploy/php-apache -n ${testNamespace}`).code, 'Deployment php rolled out successfully')
})
test.serial('Deployment should have 1 replica on start', t => {
  const replicaCount = sh.exec(
    `kubectl get deployment.apps/php-apache --namespace ${testNamespace} -o jsonpath="{.spec.replicas}"`
  ).stdout
  t.is(replicaCount, '1', 'replica count should start out as 1')
})
test.serial(`Creating Job should work`, async t => {
  fs.writeFileSync(triggerFile.name, triggerJob)
  t.is(
    0,
    sh.exec(`kubectl apply -f ${triggerFile.name} --namespace ${testNamespace}`).code,
    'creating job should work.'
  )
})
test.serial(`Deployment should scale to 4 after 3 minutes`, async t => {
  // check replica count under constant load:
  t.true(await waitForDeploymentReplicaCount(4, 'php-apache', testNamespace, 18, 10000), 'Replica count should be 4 after 3 minutes')
})
test.serial(`Deleting Job should work`, async t => {
  fs.writeFileSync(triggerFile.name, triggerJob)
  t.is(
    0,
    sh.exec(`kubectl delete -f ${triggerFile.name} --namespace ${testNamespace}`).code,
    'Deleting job should work.'
  )
})
test.serial(`Deployment should scale back to 1 after 3 minutes`, async t => {
  // check for the scale down:
  t.true(await waitForDeploymentReplicaCount(1, 'php-apache', testNamespace, 18, 10000), 'Replica count should be 1 after 3 minutes')
})
test.after.always.cb('clean up workload test related deployments', t => {
  const resources = [
    'deployment.apps/php-apache',
    'jobs.batch/trigger-job',
  ]
  for (const resource of resources) {
    sh.exec(`kubectl delete ${resource} --namespace ${testNamespace}`)
  }
  sh.exec(`kubectl delete namespace ${testNamespace}`)
  t.end()
})
const deploymentYaml = `apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
        imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaledobject
  labels:
    run: php-apache
spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 0
  maxReplicaCount: 4
  scaleTargetRef:
    name: php-apache
  triggers:
  - type: cpu
    metadata:
      type: Utilization
      value: "50"`
const triggerJob = `apiVersion: batch/v1
kind: Job
metadata:
  name: trigger-job
  namespace: cpu-test
spec:
  template:
    spec:
      containers:
      - image: busybox
        name: test
        command: ["/bin/sh"]
        args: ["-c", "for i in $(seq 1 600);do wget -q -O- http://php-apache.cpu-test.svc/;sleep 0.1;done"]
      restartPolicy: Never
  activeDeadlineSeconds: 600
  backoffLimit: 3`

Basically, I have limited the max instances to 4, increased the load duration, deleted the load job after scaling out, and modified the HPA behavior to scale in faster, reducing the timeout from 8 minutes to 3. Feel free to use it, ignore it, or take some parts. BTW, thanks for helping with e2e tests ❤️❤️❤️❤️
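For context, waitForDeploymentReplicaCount is imported from the suite's ./helpers module; the calls above suggest an (expectedReplicas, name, namespace, iterations, intervalMs) signature. A minimal sketch of what such a polling helper could look like, assuming that signature (the repo's actual implementation may differ):

import * as sh from 'shelljs'

// Hypothetical sketch: poll the deployment's replica count until it equals
// `expected`, checking up to `iterations` times with `intervalMs` between tries.
export async function waitForDeploymentReplicaCount(
  expected: number, name: string, namespace: string,
  iterations: number, intervalMs: number): Promise<boolean> {
  for (let i = 0; i < iterations; i++) {
    const out = sh.exec(
      `kubectl get deployment.apps/${name} --namespace ${namespace} -o jsonpath="{.spec.replicas}"`,
      { silent: true }
    ).stdout
    if (parseInt(out, 10) === expected) return true
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  return false
}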
Hi @JorTurFer, thanks for taking the time to test it locally. Deleting the job midway and modifying the HPA behavior to scale in faster helped a lot in reducing the time; after the corrections, the whole test takes merely a few minutes. Thanks a lot for the suggestions, I wonder why I didn't think of this before. As the more logical bet, I've changed the constant expected-final-replica-count check to one that focuses on whether the deployment is scaling out or not, with: …
Also, on setting scaleDown.stabilizationWindowSeconds to 0 in the ScaledObject, the deployment immediately scales down to 0, consequently creating a problem with its scaling up on the trigger, though adding … PTAL! I hope this crosses out the apparent reason the test failed last time.
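The scale-out-focused check referenced above is elided from the thread; the review below calls it waitForDeploymentScaleUp. A plausible sketch, assuming it passes as soon as the replica count exceeds its starting value rather than matching an exact target (hypothetical, not necessarily the PR's actual code):

import * as sh from 'shelljs'

// Hypothetical variant of the polling helper sketched earlier: succeed as soon
// as the replica count grows past `startingCount`, instead of waiting for an
// exact final value.
export async function waitForDeploymentScaleUp(
  startingCount: number, name: string, namespace: string,
  iterations: number, intervalMs: number): Promise<boolean> {
  for (let i = 0; i < iterations; i++) {
    const out = sh.exec(
      `kubectl get deployment.apps/${name} --namespace ${namespace} -o jsonpath="{.spec.replicas}"`,
      { silent: true }
    ).stdout
    if (parseInt(out, 10) > startingCount) return true
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  return false
}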
/run-e2e cpu.test.*
LGTM, ❤️
Let's wait for the test execution 🤞
Only 2 small things:
- Please add the missing resources that I mentioned in the comment.
- Is waitForDeploymentScaleUp really needed? I mean, you could set maxReplicaCount to 2, and you would not need it anymore.
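For illustration, if maxReplicaCount were lowered to 2 as suggested, the scale-out assertion could presumably reuse the exact-count helper already used elsewhere in the test file above, along these lines (a sketch, assuming maxReplicaCount: 2 in the ScaledObject):

// With maxReplicaCount set to 2, the exact-count helper suffices;
// no separate scale-up check is needed.
test.serial(`Deployment should scale to 2 under load`, async t => {
  t.true(await waitForDeploymentReplicaCount(2, 'php-apache', testNamespace, 18, 10000),
    'Replica count should reach maxReplicaCount (2)')
})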
updated cpu.test.ts, Resolved conflicts from changelog.md
Hi @JorTurFer, PTAL.
/run-e2e cpu.test.*
LGTM!
Thanks for your contribution ❤️ ❤️ ❤️ ❤️
This PR adds E2E tests for the CPU scaler.
Checklist
Fixes #2219