Add option to enable worker graceful shutdown
sdaberdaku committed Oct 3, 2024
1 parent 1956935 commit 35d729c
Showing 11 changed files with 242 additions and 13 deletions.
15 changes: 13 additions & 2 deletions charts/trino/README.md
@@ -503,9 +503,11 @@ Fast distributed SQL query engine for big data analytics that helps you explore

Allows mounting additional Trino configuration files from Kubernetes secrets on the coordinator node.
Example:
```yaml
- name: sample-secret
  secretName: sample-secret
  path: /secrets/sample.json
```
* `worker.jvm.maxHeapSize` - string, default: `"8G"`
* `worker.jvm.gcMethod.type` - string, default: `"UseG1GC"`
* `worker.jvm.gcMethod.g1.heapRegionSize` - string, default: `"32M"`
@@ -559,12 +561,21 @@ Fast distributed SQL query engine for big data analytics that helps you explore
```
* `worker.lifecycle` - object, default: `{}`

To enable [graceful shutdown](https://trino.io/docs/current/admin/graceful-shutdown.html), define a lifecycle preStop like below. Set the `terminationGracePeriodSeconds` to a value greater than or equal to the configured `shutdown.grace-period`. Configure `shutdown.grace-period` in `additionalConfigProperties` as `shutdown.grace-period=2m` (default is 2 minutes). Also configure `accessControl` because the `default` system access control does not allow graceful shutdowns.
Worker container [lifecycle events](https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/). Setting `worker.lifecycle` conflicts with `worker.gracefulShutdown`.
Example:
```yaml
preStop:
  exec:
    command: ["/bin/sh", "-c", "curl -v -X PUT -d '\"SHUTTING_DOWN\"' -H \"Content-type: application/json\" http://localhost:8081/v1/info/state"]
    command: ["/bin/sh", "-c", "sleep 120"]
```
* `worker.gracefulShutdown` - object, default: `{"enabled":false,"gracePeriodSeconds":120}`

Configure [graceful shutdown](https://trino.io/docs/current/admin/graceful-shutdown.html). Enabling this feature will: 1) Add a `preStop` lifecycle event to all worker Pods; 2) Set the `shutdown.grace-period` configuration property to `gracePeriodSeconds`; 3) Configure the workers' `accessControl`, since the `default` system access control [does not allow graceful shutdowns](https://trino.io/docs/current/admin/graceful-shutdown.html). Set `worker.terminationGracePeriodSeconds` to at least twice the configured `gracePeriodSeconds`, because the worker that receives the graceful shutdown request [sleeps for the grace period twice](https://trino.io/docs/current/admin/graceful-shutdown.html#shutdown-behavior). Enabling `worker.gracefulShutdown` conflicts with `worker.lifecycle`. If you need a custom `worker.lifecycle` configuration together with graceful shutdown, leave `worker.gracefulShutdown` disabled and apply the equivalent configuration manually.
Example:
```yaml
gracefulShutdown:
  enabled: true
  gracePeriodSeconds: 120
```
* `worker.terminationGracePeriodSeconds` - int, default: `30`
* `worker.nodeSelector` - object, default: `{}`
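
Note that the `worker.gracefulShutdown` example in this README leaves `worker.terminationGracePeriodSeconds` at its default of `30`, while the check added to `deployment-worker.yaml` in this commit fails rendering when that value is less than twice `gracePeriodSeconds`. A minimal values sketch that should satisfy the check, using only keys documented in the list above (the value 240 is simply 2 x 120):

```yaml
worker:
  gracefulShutdown:
    enabled: true
    gracePeriodSeconds: 120
  # Must be at least 2 x gracePeriodSeconds; otherwise the template
  # check in deployment-worker.yaml aborts rendering via `fail`.
  terminationGracePeriodSeconds: 240
```

The `test-graceful-shutdown-values.yaml` file added by this commit follows the same pattern with 60 and 120.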
22 changes: 22 additions & 0 deletions charts/trino/templates/configmap-access-control-worker.yaml
@@ -0,0 +1,22 @@
{{- if .Values.worker.gracefulShutdown.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "trino.fullname" . }}-access-control-volume-worker
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "trino.labels" . | nindent 4 }}
    app.kubernetes.io/component: worker
data:
  graceful-shutdown-rules.json: >-
    {
      "system_information": [
        {
          "allow": [
            "write"
          ],
          "user": "admin"
        }
      ]
    }
{{- end }}
3 changes: 3 additions & 0 deletions charts/trino/templates/configmap-coordinator.yaml
@@ -77,6 +77,9 @@ data:
    jmx.rmiregistry.port={{- $coordinatorJmx.registryPort }}
    jmx.rmiserver.port={{- $coordinatorJmx.serverPort }}
    {{- end }}
    {{- if .Values.worker.gracefulShutdown.enabled }}
    shutdown.grace-period={{- .Values.worker.gracefulShutdown.gracePeriodSeconds -}}s
    {{- end }}
    {{- if .Values.server.coordinatorExtraConfig }}
    {{- .Values.server.coordinatorExtraConfig | nindent 4 }}
    {{- end }}
9 changes: 9 additions & 0 deletions charts/trino/templates/configmap-worker.yaml
@@ -65,10 +65,19 @@ data:
    jmx.rmiregistry.port={{- $workerJmx.registryPort }}
    jmx.rmiserver.port={{- $workerJmx.serverPort }}
    {{- end }}
    {{- if .Values.worker.gracefulShutdown.enabled }}
    shutdown.grace-period={{- .Values.worker.gracefulShutdown.gracePeriodSeconds -}}s
    {{- end }}
    {{- if .Values.server.workerExtraConfig }}
    {{- .Values.server.workerExtraConfig | nindent 4 }}
    {{- end }}
  {{- if .Values.worker.gracefulShutdown.enabled }}
  access-control.properties: |
    access-control.name=file
    security.config-file={{ .Values.server.config.path }}/access-control/graceful-shutdown-rules.json
  {{- end }}

  {{- if .Values.server.exchangeManager }}
  exchange-manager.properties: |
    exchange-manager.name={{ .Values.server.exchangeManager.name }}
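
For orientation, this is roughly what the two additions in this template render to when `gracePeriodSeconds` is `60`. It is a hand-rendered sketch, not captured `helm template` output; it assumes `server.config.path` is `/etc/trino` and omits the surrounding keys and properties.

```yaml
# Hand-rendered sketch; surrounding keys and properties are omitted.
# Line appended to the worker's existing properties block:
#   shutdown.grace-period=60s
# New key added to the ConfigMap data (path prefix is an assumed value
# of server.config.path):
access-control.properties: |
  access-control.name=file
  security.config-file=/etc/trino/access-control/graceful-shutdown-rules.json
```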
2 changes: 1 addition & 1 deletion charts/trino/templates/deployment-coordinator.yaml
@@ -19,7 +19,7 @@ spec:
    metadata:
      annotations:
        {{- if and (eq .Values.accessControl.type "configmap") (not .Values.accessControl.refreshPeriod) }}
        checksum/access-control-config: {{ include (print $.Template.BasePath "/configmap-access-control.yaml") . | sha256sum }}
        checksum/access-control-config: {{ include (print $.Template.BasePath "/configmap-access-control-coordinator.yaml") . | sha256sum }}
        {{- end }}
        {{- if or .Values.catalogs .Values.additionalCatalogs }}
        checksum/catalog-config: {{ include (print $.Template.BasePath "/configmap-catalog.yaml") . | sha256sum }}
33 changes: 33 additions & 0 deletions charts/trino/templates/deployment-worker.yaml
@@ -26,6 +26,9 @@ spec:
        checksum/catalog-config: {{ include (print $.Template.BasePath "/configmap-catalog.yaml") . | sha256sum }}
        {{- end }}
        checksum/worker-config: {{ include (print $.Template.BasePath "/configmap-worker.yaml") . | sha256sum }}
        {{- if .Values.worker.gracefulShutdown.enabled }}
        checksum/access-control-config: {{ include (print $.Template.BasePath "/configmap-access-control-worker.yaml") . | sha256sum }}
        {{- end }}
        {{- if .Values.worker.annotations }}
        {{- tpl (toYaml .Values.worker.annotations) . | nindent 8 }}
        {{- end }}
@@ -61,6 +64,11 @@ spec:
          configMap:
            name: {{ template "trino.fullname" . }}-jmx-exporter-config-worker
        {{- end }}
        {{- if .Values.worker.gracefulShutdown.enabled }}
        - name: access-control-volume
          configMap:
            name: {{ template "trino.fullname" . }}-access-control-volume-worker
        {{- end }}
        {{- range .Values.configMounts }}
        - name: {{ .name }}
          configMap:
@@ -92,7 +100,11 @@ spec:
      imagePullSecrets:
        {{- toYaml .Values.imagePullSecrets | nindent 8 }}
      {{- end }}
      {{- if and .Values.worker.gracefulShutdown.enabled (gt (mulf 2.0 .Values.worker.gracefulShutdown.gracePeriodSeconds) .Values.worker.terminationGracePeriodSeconds) }}
      {{- fail "The user must set the `worker.terminationGracePeriodSeconds` to a value of at least two times the configured `gracePeriodSeconds`." }}
      {{- else }}
      terminationGracePeriodSeconds: {{ .Values.worker.terminationGracePeriodSeconds }}
      {{- end }}
      containers:
        - name: {{ .Chart.Name }}-worker
          image: {{ include "trino.image" . }}
@@ -112,6 +124,10 @@ spec:
            {{- end }}
            - mountPath: {{ .Values.kafka.mountPath }}
              name: schemas-volume
            {{- if .Values.worker.gracefulShutdown.enabled }}
            - mountPath: {{ .Values.server.config.path }}/access-control
              name: access-control-volume
            {{- end }}
            {{- range .Values.configMounts }}
            - name: {{ .name }}
              mountPath: {{ .path }}
@@ -166,7 +182,24 @@ spec:
            failureThreshold: {{ .Values.worker.readinessProbe.failureThreshold | default 6 }}
            successThreshold: {{ .Values.worker.readinessProbe.successThreshold | default 1 }}
          lifecycle:
            {{- if .Values.worker.lifecycle }}
            {{- if .Values.worker.gracefulShutdown.enabled }}
            {{- fail "The `worker.lifecycle` configuration conflicts with `worker.gracefulShutdown`. Either disable `worker.gracefulShutdown` and apply the related configurations manually, or remove `worker.lifecycle`." }}
            {{- end }}
            {{- toYaml .Values.worker.lifecycle | nindent 12 }}
            {{- else if .Values.worker.gracefulShutdown.enabled }}
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - >-
                    curl -v -X PUT
                    -d '"SHUTTING_DOWN"'
                    -H 'Content-type: application/json'
                    -H 'X-Trino-User: admin'
                    http://localhost:{{- .Values.service.port -}}/v1/info/state
            {{- end }}
          resources:
            {{- toYaml .Values.worker.resources | nindent 12 }}
        {{- if $workerJmx.exporter.enabled }}
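
When a custom `worker.lifecycle` is needed, the template above renders it verbatim and refuses to combine it with `worker.gracefulShutdown.enabled`. The following is a sketch of a hand-written `preStop` hook that sends the same request as the generated one. The `echo` step is a hypothetical placeholder for custom logic, port `8080` is an assumed value of `service.port`, and a 120-second grace period is assumed; the `shutdown.grace-period` property and the worker access control rules would still have to be configured separately.

```yaml
worker:
  gracefulShutdown:
    enabled: false                     # the generated preStop hook is not used
  terminationGracePeriodSeconds: 240   # at least 2 x the assumed 120s grace period
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          # Folded scalar: the lines below are joined into one shell command.
          - >-
            echo "custom pre-stop step" &&
            curl -v -X PUT
            -d '"SHUTTING_DOWN"'
            -H 'Content-type: application/json'
            -H 'X-Trino-User: admin'
            http://localhost:8080/v1/info/state
```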
120 changes: 120 additions & 0 deletions charts/trino/templates/tests/test-graceful-shutdown.yaml
@@ -0,0 +1,120 @@
{{- if .Values.worker.gracefulShutdown.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ include "trino.fullname" . }}-pod-manager
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "trino.labels" . | nindent 4 }}
    app.kubernetes.io/component: test
    test: graceful-shutdown
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": hook-succeeded
rules:
  - apiGroups: [ "" ]
    resources: [ "pods" ]
    verbs: [ "get", "list", "delete" ]
  - apiGroups: [ "" ]
    resources: [ "pods/log" ]
    verbs: [ "get" ]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "trino.fullname" . }}-pod-manager-sa
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "trino.labels" . | nindent 4 }}
    app.kubernetes.io/component: test
    test: graceful-shutdown
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": hook-succeeded
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ include "trino.fullname" . }}-pod-manager-binding
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "trino.labels" . | nindent 4 }}
    app.kubernetes.io/component: test
    test: graceful-shutdown
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "1"
    "helm.sh/hook-delete-policy": hook-succeeded
subjects:
  - kind: ServiceAccount
    name: {{ include "trino.fullname" . }}-pod-manager-sa
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: Role
  name: {{ include "trino.fullname" . }}-pod-manager
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "trino.fullname" . }}-test-graceful-shutdown
  labels:
    {{- include "trino.labels" . | nindent 4 }}
    app.kubernetes.io/component: test
    test: graceful-shutdown
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "2"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  serviceAccountName: {{ include "trino.fullname" . }}-pod-manager-sa
  initContainers:
    - name: get-worker-pod
      image: bitnami/kubectl:latest
      command: [ "sh", "-c" ]
      args:
        - >-
          kubectl get pods
          --selector="app.kubernetes.io/name={{ include "trino.name" . }},app.kubernetes.io/instance={{ .Release.Name }},app.kubernetes.io/component=worker"
          --output=jsonpath="{.items[0].metadata.name}"
          --namespace={{ .Release.Namespace }}
          > /pods/worker-pod.txt
      volumeMounts:
        - mountPath: /pods
          name: worker-pod
  containers:
    - name: check-logs
      image: bitnami/kubectl:latest
      command: [ "sh", "-c" ]
      args:
        - >-
          WORKER_POD=$(cat /pods/worker-pod.txt) &&
          kubectl logs ${WORKER_POD}
          --follow
          --container=trino-worker
          --namespace={{ .Release.Namespace }}
          | grep --max-count=1 "Shutdown requested"
      volumeMounts:
        - mountPath: /pods
          name: worker-pod
    - name: trigger-graceful-shutdown
      image: bitnami/kubectl:latest
      command: [ "sh", "-c" ]
      args:
        - >-
          sleep 5 &&
          WORKER_POD=$(cat /pods/worker-pod.txt) &&
          kubectl delete pod
          ${WORKER_POD}
          --namespace={{ .Release.Namespace }}
      volumeMounts:
        - mountPath: /pods
          name: worker-pod
  restartPolicy: Never
  volumes:
    - name: worker-pod
      emptyDir: {}

{{- end }}
42 changes: 33 additions & 9 deletions charts/trino/values.yaml
@@ -586,9 +586,11 @@ coordinator:
  # files from Kubernetes secrets on the coordinator node.
  # @raw
  # Example:
  # ```yaml
  # - name: sample-secret
  #   secretName: sample-secret
  #   path: /secrets/sample.json
  # ```

worker:
  jvm:
@@ -661,20 +663,42 @@ worker:
  # ```

  lifecycle: {}
  # worker.lifecycle -- To enable [graceful
  # shutdown](https://trino.io/docs/current/admin/graceful-shutdown.html),
  # define a lifecycle preStop like below. Set the
  # `terminationGracePeriodSeconds` to a value greater than or equal to the
  # configured `shutdown.grace-period`. Configure `shutdown.grace-period` in
  # `additionalConfigProperties` as `shutdown.grace-period=2m` (default is 2
  # minutes). Also configure `accessControl` because the `default` system
  # access control does not allow graceful shutdowns.
  # worker.lifecycle -- Worker container [lifecycle
  # events](https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/).
  #
  # Setting `worker.lifecycle` conflicts with `worker.gracefulShutdown`.
  #
  # @raw
  # Example:
  # ```yaml
  # preStop:
  #   exec:
  #     command: ["/bin/sh", "-c", "curl -v -X PUT -d '\"SHUTTING_DOWN\"' -H \"Content-type: application/json\" http://localhost:8081/v1/info/state"]
  #     command: ["/bin/sh", "-c", "sleep 120"]
  # ```

  gracefulShutdown:
    enabled: false
    gracePeriodSeconds: 120
  # worker.gracefulShutdown -- Configure [graceful
  # shutdown](https://trino.io/docs/current/admin/graceful-shutdown.html).
  #
  # Enabling this feature will:
  # 1) Add a `preStop` lifecycle event to all worker Pods;
  # 2) Set the `shutdown.grace-period` configuration property to `gracePeriodSeconds`;
  # 3) Configure the workers' `accessControl`, since the `default` system access control [does not allow graceful
  # shutdowns](https://trino.io/docs/current/admin/graceful-shutdown.html).
  #
  # Set `worker.terminationGracePeriodSeconds` to at least twice the configured `gracePeriodSeconds`, because
  # the worker that receives the graceful shutdown request [sleeps for the grace period twice](https://trino.io/docs/current/admin/graceful-shutdown.html#shutdown-behavior).
  #
  # Enabling `worker.gracefulShutdown` conflicts with `worker.lifecycle`. If you need a custom
  # `worker.lifecycle` configuration together with graceful shutdown, leave `worker.gracefulShutdown`
  # disabled and apply the equivalent configuration manually.
  #
  # @raw
  # Example:
  # ```yaml
  # gracefulShutdown:
  #   enabled: true
  #   gracePeriodSeconds: 120
  # ```

  terminationGracePeriodSeconds: 30
6 changes: 6 additions & 0 deletions test-graceful-shutdown-values.yaml
@@ -0,0 +1,6 @@
worker:
  gracefulShutdown:
    enabled: true
    gracePeriodSeconds: 60

  terminationGracePeriodSeconds: 120
3 changes: 2 additions & 1 deletion test.sh
@@ -9,6 +9,7 @@ declare -A testCases=(
[overrides]="--set coordinatorNameOverride=coordinator-overridden,workerNameOverride=worker-overridden,nameOverride=overridden"
[access_control_properties_values]="--values test-access-control-properties-values.yaml"
[exchange_manager_values]="--values test-exchange-manager-values.yaml"
[graceful_shutdown]="--values test-graceful-shutdown-values.yaml"
)

function join_by {
@@ -23,7 +24,7 @@ NAMESPACE=trino-$(LC_ALL=C tr -dc 'a-z0-9' </dev/urandom | head -c 6 || true)
HELM_EXTRA_SET_ARGS=
CT_ARGS=(--charts=charts/trino --skip-clean-up --helm-extra-args="--timeout 2m")
CLEANUP_NAMESPACE=true
TEST_NAMES=(default single_node complete_values access_control_properties_values exchange_manager_values)
TEST_NAMES=(default single_node complete_values access_control_properties_values exchange_manager_values graceful_shutdown)

usage() {
cat <<EOF 1>&2