Troubleshooting

Steve Salas edited this page Dec 11, 2020 · 4 revisions

Use this page to resolve issues you may encounter when running Code Dx on Kubernetes (k8s).

Note: Commands on this page use the following placeholders. Replace them with the values that match your deployment.

  • cdx-svc is the tool orchestration k8s namespace
  • minio-pod is the MinIO pod name
  • codedx-tool-orchestration-minio is the MinIO deployment name

MinIO is already stopped

If the MinIO pod does not come online, run the following command to view and follow the MinIO log (replace cdx-svc and minio-pod with your tool orchestration namespace and MinIO pod name).

kubectl -n cdx-svc logs -f minio-pod

If the log output ends with a final message that reads 'MinIO is already stopped', as in the example below, try removing the .minio.sys directory.

 18:08:49.99
 18:08:49.99 Welcome to the Bitnami minio container
 18:08:49.99 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-minio
 18:08:49.99 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-minio/issues
 18:08:50.00 Send us your feedback at containers@bitnami.com
 18:08:50.00
 18:08:50.00 INFO  ==> ** Starting MinIO setup **
 18:08:50.03 INFO  ==> Starting MinIO in background...
 18:09:00.03 INFO  ==> Adding local Minio host to 'mc' configuration...
 18:09:00.14 INFO  ==> MinIO is already stopped...
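
To check for this condition without reading the log by eye, a small filter along the following lines may help. It is only a sketch: the function name minio_already_stopped is made up for illustration, and it assumes the message appears on the last non-empty line of the log, as in the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Code Dx): reads a MinIO log on stdin and
# succeeds when the last non-empty line contains 'MinIO is already stopped'.
minio_already_stopped() {
  local last
  # Drop blank lines, then keep only the final line of the log.
  last="$(grep -v '^[[:space:]]*$' | tail -n 1)"
  case "$last" in
    *'MinIO is already stopped'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (replace cdx-svc and minio-pod with your values; no -f, so the command exits):
#   kubectl -n cdx-svc logs minio-pod | minio_already_stopped && echo "try renaming .minio.sys"
```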

To remove the .minio.sys directory and restart MinIO:

  1. Scale the MinIO deployment to 0 replicas with this command (replace cdx-svc and codedx-tool-orchestration-minio with your tool orchestration namespace and MinIO deployment name):
     kubectl -n cdx-svc scale --replicas=0 deployment/codedx-tool-orchestration-minio
  2. Access the MinIO Persistent Volume data and rename the .minio.sys directory with this command:
     mv .minio.sys .minio.sys.old
  3. Scale the MinIO deployment to 1 replica with this command (replace cdx-svc and codedx-tool-orchestration-minio with your tool orchestration namespace and MinIO deployment name):
     kubectl -n cdx-svc scale --replicas=1 deployment/codedx-tool-orchestration-minio
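
The two scaling commands above can be wrapped in a pair of helper functions, sketched here under a few assumptions: the function names are hypothetical, the namespace and deployment name are passed as arguments, and the 120-second rollout timeout is an arbitrary choice. The directory rename is still done by hand, because how you reach the Persistent Volume data depends on your storage.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Code Dx) wrapping the kubectl commands above.

# stop_minio NAMESPACE DEPLOYMENT: scale the MinIO deployment to 0 replicas.
stop_minio() {
  kubectl -n "$1" scale --replicas=0 "deployment/$2"
}

# start_minio NAMESPACE DEPLOYMENT: scale back to 1 replica, then wait for the
# rollout to finish (the timeout value is an assumption; adjust as needed).
start_minio() {
  kubectl -n "$1" scale --replicas=1 "deployment/$2" &&
    kubectl -n "$1" rollout status "deployment/$2" --timeout=120s
}

# Usage:
#   stop_minio cdx-svc codedx-tool-orchestration-minio
#   (rename .minio.sys to .minio.sys.old on the Persistent Volume)
#   start_minio cdx-svc codedx-tool-orchestration-minio
```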

Code Dx is using certificates that will expire soon

Certificates requested by the setup script will have an expiration date, and Code Dx will stop working when the certificates expire. The setup script displays certificate expiration dates as it finishes.

To refresh certificates for Code Dx components before they expire:

  1. Determine whether your cluster's CA certificate has expired. That certificate is referenced by the clusterCertificateAuthorityCertPath parameter in your run-setup.ps1 command. If necessary, obtain a new version from your cluster.

  2. Rerun your run-setup.ps1 command, which will generate new certificates for your Code Dx components.

  3. Restart the Code Dx, MariaDB, Tool Service, and MinIO components by running the Code Dx Console script (at admin/console.ps1), with script parameter values suitable for your environment, and enter commands R1, R3, R2, and R4 respectively. If you are not using Code Dx Tool Orchestration, you can ignore the Tool Service (R2) and MinIO (R4) components.
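
For step 1, one way to check a CA certificate file is with openssl, assuming it is installed and the certificate is PEM-encoded. The sketch below is illustrative only: cert_expires_within is a hypothetical helper name, and ca.crt stands in for whatever file your clusterCertificateAuthorityCertPath parameter references.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Code Dx).
# cert_expires_within CERT_PATH DAYS
#   returns 0 if the certificate expires within DAYS days (or already has),
#   returns 1 if it is still valid after DAYS days,
#   returns 2 if the file cannot be read.
cert_expires_within() {
  local cert_path="$1" days="$2"
  if [ ! -r "$cert_path" ]; then
    echo "cannot read certificate: $cert_path" >&2
    return 2
  fi
  # 'openssl x509 -checkend N' exits 0 only when the certificate is still
  # valid N seconds from now.
  if openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_path" >/dev/null; then
    return 1
  fi
  return 0
}

# Usage:
#   openssl x509 -enddate -noout -in ca.crt      # print the expiration date
#   cert_expires_within ca.crt 30 && echo "certificate expires within 30 days"
```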

Code Dx is using expired certificates

Certificates requested by the setup script will have an expiration date, and Code Dx will stop working when the certificates expire.

An expired MariaDB certificate will cause entries in the Code Dx log that look like this:

java.sql.SQLNonTransientConnectionException: ...
Could not connect to codedx-mariadb:3306 : ...
sun.security.validator.ValidatorException: ...
PKIX path validation failed: ...
java.security.cert.CertPathValidatorException: validity check failed

An expired MinIO certificate will cause entries in the Tool Service log that look like this:

mc: <ERROR> Cannot get service status. ...
Get https://codedx-tool-orchestration-minio.cdx-svc.svc.cluster.local:9000/minio/admin/v2/info: ...
x509: certificate has expired or is not yet valid.
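
To search a saved or streamed log for either of these signatures, a filter like the following may be enough. It is a sketch rather than a supported tool: has_expired_cert_error is a hypothetical name, and the match strings are copied from the two examples above, so other expired-certificate messages will not be detected.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Code Dx): reads log text on stdin and
# succeeds if it contains either expired-certificate message shown above.
has_expired_cert_error() {
  grep -q -e 'CertPathValidatorException: validity check failed' \
          -e 'x509: certificate has expired or is not yet valid'
}

# Usage (replace <namespace> and <pod> with the values for the component you are checking):
#   kubectl -n <namespace> logs <pod> | has_expired_cert_error && echo "expired certificate detected"
```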

If you see either of these errors, one or more certificates generated by the setup script may have expired.

To restart Code Dx components with new certificates:

  1. Shut down the Code Dx, MariaDB, Tool Service, and MinIO components by running the Code Dx Console script (at admin/console.ps1), with script parameter values suitable for your environment, and enter commands S1, S3, S2, and S4 respectively. If you are not using Code Dx Tool Orchestration, you can ignore the Tool Service (S2) and MinIO (S4) components.

  2. Determine whether your cluster's CA certificate has expired. That certificate is referenced by the clusterCertificateAuthorityCertPath parameter in your run-setup.ps1 command. If it's expired, obtain a new version from your cluster.

  3. Rerun your run-setup.ps1 command, which will generate new certificates for the Code Dx, MariaDB, Tool Service, and MinIO components.

  4. When the setup process reaches the point where it's waiting for the Tool Orchestration deployment to succeed (look for the "Fetching status of deployment named codedx-tool-orchestration in namespace..." message), rerun the Code Dx Console script (at admin/console.ps1), with script parameter values suitable for your environment, and enter R4 to start up the MinIO pod. The MinIO pod will not start up on its own, and the tool service pod(s) cannot initialize until MinIO is available.

The setup script will generate new certificates each time it runs, and new Code Dx pods will mount certificate data when they initialize. When the setup script completes, Code Dx will be back online using certificates that will expire in the future. The setup script will print certificate expiration dates post-setup. You can rerun the setup script and restart Code Dx pods before the expiration times to avoid entering an expired-certificate state again.