Upgrade from 0.15.0 or fresh install dynatrace-operator 1.0.0: oneagent unable to start container process #2987

Closed
dhorner71 opened this issue Apr 9, 2024 · 1 comment
Labels
support request: request for further assistance with an issue

dhorner71 commented Apr 9, 2024

Dynatrace support ticket #306247

Describe the bug
Upgrading from 0.15.0 to 1.0.0 with Helm, or doing a fresh install of 1.0.0, produces warnings on the OneAgent DaemonSet pods:

pod event:
OCI runtime exec failed: exec failed: unable to start container process: exec: "/usr/bin/watchdog-healthcheck64": stat /usr/bin/watchdog-healthcheck64: no such file or directory: unknown

pod status:
type: Ready
status: 'False'
reason: ContainersNotReady
message: 'containers with unready status: [dynatrace-oneagent]'
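For anyone reproducing this, here is a minimal shell sketch of how the failure could be narrowed down. It assumes the failing exec comes from a health probe on the OneAgent container (the event does not say so explicitly) and that the container is named dynatrace-oneagent, as in the pod status above. The function names and the <pod> placeholder are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of any Dynatrace tooling.

# Read an "OCI runtime exec failed" event message on stdin and print the
# path the runtime could not stat (prints nothing if the message has another shape).
missing_probe_path() {
  sed -n 's/.*stat \([^:]*\): no such file or directory.*/\1/p'
}

# check_binary NAMESPACE POD PATH
# List PATH inside the dynatrace-oneagent container to see whether the
# image actually ships that binary. Requires kubectl access to the cluster.
check_binary() {
  kubectl -n "$1" exec "$2" -c dynatrace-oneagent -- ls -l "$3"
}
```

Usage could look like `check_binary dynatrace <pod> /usr/bin/watchdog-healthcheck64`. If the listing fails, that would suggest the OneAgent image pulled from the private repo does not contain the healthcheck binary; this is a hypothesis to verify, not a confirmed cause.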

To Reproduce
Steps to reproduce the behavior:

  1. Either do a fresh install of the 1.0.0 Helm chart or upgrade from the 0.15.0 Helm chart.

helm sets:
apiUrl = <our_customer_url>
apiToken =
dataIngestToken =
installCRD = true
webhook.hostNetwork = true
image = <private_repo>
customPullSecret = <private_repo_secret>
csidriver.enabled = false

dynakube 1.0.0:
apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  annotations:
    feature.dynatrace.com/automatic-kubernetes-api-monitoring: 'true'
  name:
  namespace: dynatrace
spec:
  activeGate:
    capabilities:
      - routing
      - kubernetes-monitoring
      - dynatrace-api
      - metrics-ingest
    group: aws-eks
    image: <private_repo>/docker/dynatrace/linux/activegate:latest
    resources:
      limits:
        cpu: 1000m
        memory: 1.5Gi
      requests:
        cpu: 500m
        memory: 512Mi
  apiUrl: https://<customer_id>.live.dynatrace.com/api
  customPullSecret: <private_repo_secret_name>
  networkZone: <custom_zone_name>
  oneAgent:
    classicFullStack:
      args:
        - '--set-host-group=<custom_host_group>'
      env:
        - name: ONEAGENT_ENABLE_VOLUME_STORAGE
          value: 'false'
      image: <private_repo>/docker/dynatrace/linux/oneagent:latest
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists
  skipCertCheck: true

dynakube 0.15.0:
apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  annotations:
    feature.dynatrace.com/automatic-kubernetes-api-monitoring: 'true'
  name:
  namespace: dynatrace
spec:
  activeGate:
    capabilities:
      - routing
      - kubernetes-monitoring
      - dynatrace-api
      - metrics-ingest
    group: <custom_group>
    image: <private_repo>/docker/dynatrace/linux/activegate:latest
    resources:
      limits:
        cpu: 1000m
        memory: 1.5Gi
      requests:
        cpu: 500m
        memory: 512Mi
  apiUrl: https://<customer_id>.live.dynatrace.com/api
  customPullSecret: <private_repo_secret_name>
  networkZone: <custom_zone>
  oneAgent:
    classicFullStack:
      args:
        - '--set-host-group=<custom_host_group>'
      env:
        - name: ONEAGENT_INSTALLER_SCRIPT_URL
          value: https://<customer_id>.live.dynatrace.com/api/v1/deployment/installer/agent/unix/default/latest?arch=x86
        - name: ONEAGENT_INSTALLER_DOWNLOAD_TOKEN
          value:
        - name: ONEAGENT_INSTALLER_SKIP_CERT_CHECK
          value: 'true'
      image: <private_repo>/docker/dynatrace/linux/oneagent:latest
      oneAgentResources:
        limits:
          cpu: 300m
          memory: 1.5Gi
        requests:
          cpu: 100m
          memory: 512Mi
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists
  skipCertCheck: true
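
The two DynaKube specs above differ mainly in the OneAgent env entries and in oneAgentResources, which is easier to see in a diff. A small sketch, assuming each manifest has been saved to a local file; the file names and the function name are hypothetical:

```shell
#!/bin/sh
# compare_dynakubes OLD_FILE NEW_FILE
# Print a unified diff of two saved DynaKube manifests.
# diff exits with 1 when the files differ, which is the expected case here,
# so only exit codes above 1 (real errors such as a missing file) are treated as failure.
compare_dynakubes() {
  status=0
  diff -u "$1" "$2" || status=$?
  [ "$status" -le 1 ]
}
```

For example, `compare_dynakubes dynakube-0.15.0.yaml dynakube-1.0.0.yaml`, with each file exported beforehand via `kubectl -n dynatrace get dynakube <name> -o yaml`.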

Expected behavior
After a Helm install, whether an upgrade or a fresh install, the pods should report healthy with no warnings, just as they did with 0.15.0.

Screenshots
n/a

Environment (please complete the following information):

  • EKS/Kubernetes 1.27
  • terraform provider hashicorp/helm v2.13.0
  • helm v3.9.4
  • dynatrace-operator 1.0.0 helm chart
  • oneagent imageID oneagent@sha256:35b90670a961e5fa932cbee14a87b60d6e583bb9883bb05dfbbe7e180f7131b2
  • activegate imageID activegate@sha256:2a1277acacfc81dc7e1a39bf24577c900026def26b740e61013549a0d17ddc0e

Additional context
- Probably unimportant, but with the EKS autoscaler, version 1.0.0 requires an additional node (4 total) compared with 0.15.0 (3 total).
- Possibly related: even with the latest tag on the images and Helm chart, the dashboard still shows this warning: "The ActiveGate monitoring this Kubernetes cluster is outdated. Please make sure that all ActiveGates have [a minimum version of 1.279] to get the latest enhancements in Kubernetes monitoring."

luhi-DT added the "support request" label (request for further assistance with an issue) on Apr 9, 2024

github-actions bot commented Apr 9, 2024

Thank you for opening a Dynatrace Operator Issue. We've identified and tagged the issue as a "Support request".

Dynatrace responds to requests like these via Dynatrace ONE support rather than GitHub. This helps our team respond as quickly as possible using the support team's tools and procedures.

Thanks for your help!

github-actions bot closed this as completed on Apr 9, 2024