
Error: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index') #3358

Open · edhenry opened this issue Jun 25, 2024 · 1 comment
Labels: bug (Something isn't working)

edhenry commented Jun 25, 2024

Describe the bug

When deploying the Actions Runner Controller and Runner Sets on a K8s cluster using the provided documentation, I am met with the following error:

(Screenshot of the failing workflow step; the error text is reproduced below.)

```
Run '/home/runner/k8s/index.js'
  shell: /home/runner/externals/node16/bin/node {0}
Error: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
```

To Reproduce
Steps to reproduce the behavior:

  1. Install Actions Runner Controller and Runner Sets using the Helm charts as outlined here. The configuration for the runner set is below; an illustrative sketch of the corresponding Helm commands follows the values file.
    ```yaml
    gha-runner-scale-set:
    ## githubConfigUrl is the GitHub url for where you want to configure runners
    ## ex: https://github.com/myorg/myrepo or https://github.com/myorg
    githubConfigUrl: **REDACTED**
    
    ## githubConfigSecret is the k8s secrets to use when auth with GitHub API.
    ## You can choose to use GitHub App or a PAT token
    # githubConfigSecret:
    ### GitHub Apps Configuration
    ## NOTE: IDs MUST be strings, use quotes
    #github_app_id: ""
    #github_app_installation_id: ""
    #github_app_private_key: |
    
    ### GitHub PAT Configuration
    # github_token: ""
    ## If you have a pre-define Kubernetes secret in the same namespace the gha-runner-scale-set is going to deploy,
    ## you can also reference it via `githubConfigSecret: pre-defined-secret`.
    ## You need to make sure your predefined secret has all the required secret data set properly.
    ##   For a pre-defined secret using GitHub PAT, the secret needs to be created like this:
    ##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_token='ghp_your_pat'
    ##   For a pre-defined secret using GitHub App, the secret needs to be created like this:
    ##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_app_id=123456 --from-literal=github_app_installation_id=654321 --from-literal=github_app_private_key='-----BEGIN CERTIFICATE-----*******'
    githubConfigSecret: **REDACTED**
    
    ## proxy can be used to define proxy settings that will be used by the
    ## controller, the listener and the runner of this scale set.
    #
    # proxy:
    #   http:
    #     url: http://proxy.com:1234
    #     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
    #   https:
    #     url: http://proxy.com:1234
    #     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
    #   noProxy:
    #     - example.com
    #     - example.org
    
    ## maxRunners is the max number of runners the autoscaling runner set will scale up to.
    maxRunners: 5
    
    ## minRunners is the min number of idle runners. The target number of runners created will be
    ## calculated as a sum of minRunners and the number of jobs assigned to the scale set.
    minRunners: 1
    
    # runnerGroup: "default"
    
    ## name of the runner scale set to create.  Defaults to the helm release name
    # runnerScaleSetName: ""
    
    ## A self-signed CA certificate for communication with the GitHub server can be
    ## provided using a config map key selector. If `runnerMountPath` is set, for
    ## each runner pod ARC will:
    ## - create a `github-server-tls-cert` volume containing the certificate
    ##   specified in `certificateFrom`
    ## - mount that volume on path `runnerMountPath`/{certificate name}
    ## - set NODE_EXTRA_CA_CERTS environment variable to that same path
    ## - set RUNNER_UPDATE_CA_CERTS environment variable to "1" (as of version
    ##   2.303.0 this will instruct the runner to reload certificates on the host)
    ##
    ## If any of the above had already been set by the user in the runner pod
    ## template, ARC will observe those and not overwrite them.
    ## Example configuration:
    #
    githubServerTLS:
      certificateFrom:
        configMapKeyRef:
          name: ca-cm
          key: ca.crt
      runnerMountPath: /usr/local/share/ca-certificates/
    
    ## Container mode is an object that provides out-of-box configuration
    ## for dind and kubernetes mode. Template will be modified as documented under the
    ## template object.
    ##
    ## If any customization is required for dind or kubernetes mode, containerMode should remain
    ## empty, and configuration should be applied to the template.
    containerMode:
      type: "kubernetes" ## type can be set to dind or kubernetes
      ## the following is required when containerMode.type=kubernetes
      kubernetesModeWorkVolumeClaim:
        accessModes: ["ReadWriteOnce"]
        # For local testing, use https://github.com/openebs/dynamic-localpv-provisioner/blob/develop/docs/quickstart.md to provide dynamic provision volume with storageClassName: openebs-hostpath
        storageClassName: "local-path"
        resources:
          requests:
            storage: 10Gi
      # kubernetesModeServiceAccount:
      #   annotations:
    
    ## listenerTemplate is the PodSpec for each listener Pod
    ## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
    # listenerTemplate:
    #   spec:
    #     containers:
    #     # Use this section to append additional configuration to the listener container.
    #     # If you change the name of the container, the configuration will not be applied to the listener,
    #     # and it will be treated as a side-car container.
    #     - name: listener
    #       securityContext:
    #         runAsUser: 1000
    #     # Use this section to add the configuration of a side-car container.
    #     # Comment it out or remove it if you don't need it.
    #     # Spec for this container will be applied as is without any modifications.
    #     - name: side-car
    #       image: example-sidecar
    
    ## template is the PodSpec for each runner Pod
    ## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
    template:
      ## template.spec will be modified if you change the container mode
      ## with containerMode.type=dind, we will populate the template.spec with following pod spec
      ## template:
      ##   spec:
      ##     initContainers:
      ##     - name: init-dind-externals
      ##       image: ghcr.io/actions/actions-runner:latest
      ##       command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
      ##       volumeMounts:
      ##         - name: dind-externals
      ##           mountPath: /home/runner/tmpDir
      ##     containers:
      ##     - name: runner
      ##       image: ghcr.io/actions/actions-runner:latest
      ##       command: ["/home/runner/run.sh"]
      ##       env:
      ##         - name: DOCKER_HOST
      ##           value: unix:///var/run/docker.sock
      ##       volumeMounts:
      ##         - name: work
      ##           mountPath: /home/runner/_work
      ##         - name: dind-sock
      ##           mountPath: /var/run
      ##     - name: dind
      ##       image: docker:dind
      ##       args:
      ##         - dockerd
      ##         - --host=unix:///var/run/docker.sock
      ##         - --group=$(DOCKER_GROUP_GID)
      ##       env:
      ##         - name: DOCKER_GROUP_GID
      ##           value: "123"
      ##       securityContext:
      ##         privileged: true
      ##       volumeMounts:
      ##         - name: work
      ##           mountPath: /home/runner/_work
      ##         - name: dind-sock
      ##           mountPath: /var/run
      ##         - name: dind-externals
      ##           mountPath: /home/runner/externals
      ##     volumes:
      ##     - name: work
      ##       emptyDir: {}
      ##     - name: dind-sock
      ##       emptyDir: {}
      ##     - name: dind-externals
      ##       emptyDir: {}
      ######################################################################################################
      ## with containerMode.type=kubernetes, we will populate the template.spec with following pod spec
      ## template:
      # spec:
      #   metadata:
      #     annotations:
      #       sidecar.istio.io/inject: "false"
      #   securityContext:
      #     runAsUser: 1001
      #     runAsGroup: 123
      #   initContainers:
      #     - name: kube-init
      #       image: ghcr.io/actions/actions-runner:latest
      #       command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
      #       volumeMounts:
      #         - name: work
      #           mountPath: /home/runner/_work
      #   containers:
      #     - name: runner
      #       image: ghcr.io/actions/actions-runner:2.316.1
      #       command: ["/home/runner/run.sh"]
      #       env:
      #         - name: ACTIONS_RUNNER_CONTAINER_HOOKS
      #           value: /home/runner/k8s/index.js
      #         - name: ACTIONS_RUNNER_POD_NAME
      #           valueFrom:
      #             fieldRef:
      #               fieldPath: metadata.name
      #         - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
      #           value: "true"
      #         - name: ACTIONS_RUNNER_FORCED_INTERNAL_NODE_VERSION
      #           value: node20
      #       volumeMounts:
      #         - name: work
      #           mountPath: /home/runner/_work
      #   volumes:
      #     - name: work
      #       ephemeral:
      #         volumeClaimTemplate:
      #           spec:
      #             accessModes: ["ReadWriteOnce"]
      #             storageClassName: "local-path"
      #             resources:
      #               requests:
      #                 storage: 1Gi
      spec:
        containers:
          - name: runner
            image: ghcr.io/actions/actions-runner:latest
            command: ["/home/runner/run.sh"]
    
    ## Optional controller service account that needs to have required Role and RoleBinding
    ## to operate this gha-runner-scale-set installation.
    ## The helm chart will try to find the controller deployment and its service account at installation time.
    ## In case the helm chart can't find the right service account, you can explicitly pass in the following value
    ## to help it finish RoleBinding with the right service account.
    ## Note: if your controller is installed to only watch a single namespace, you have to pass these values explicitly.
    controllerServiceAccount:
      namespace: arc-systems
      name: gh-arc-gha-rs-controller
    ```
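
    For reference, a minimal sketch of how these charts are typically installed is below. The release names, namespaces, and `values.yaml` filename are assumptions inferred from the configuration above (e.g. `gh-arc` from `controllerServiceAccount.name: gh-arc-gha-rs-controller`, and `arc-runner-set` from the `runs-on` label in the test job); they are not stated explicitly in the report. The values shown are also nested under a top-level `gha-runner-scale-set:` key, which suggests an umbrella chart; when passing the file directly to the `gha-runner-scale-set` chart, that top-level key would be dropped.

    ```shell
    # Sketch only: release names, namespaces, and the values file path are
    # assumptions inferred from the configuration above.

    # Install the controller (namespace matches controllerServiceAccount.namespace above).
    helm install gh-arc \
      --namespace arc-systems \
      --create-namespace \
      oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller

    # Install the runner scale set; the release name becomes the runs-on label
    # used by the test job in step 2.
    helm install arc-runner-set \
      --namespace arc-runners \
      --create-namespace \
      -f values.yaml \
      oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
    ```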
    
  2. Create a test job to list directories. The job definition is below; a sketch of a complete workflow file wrapping it follows the snippet.
    ```yaml
    container-test-job:
      runs-on: [ arc-runner-set ]
      container:
        image: **REDACTED**
        env:
          NODE_ENV: development
        ports:
          - 80
        volumes:
          - my_docker_volume:/volume_mount
        options: --cpus 1
      steps:
        - uses: actions/checkout@v3
        - name: Check for dockerenv file
          run: (ls /.dockerenv && echo Found dockerenv) || (echo No dockerenv)
    ```
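
    For context, a minimal complete workflow file wrapping this job might look like the sketch below. The workflow name, the `workflow_dispatch` trigger, and the image placeholder are illustrative assumptions (the image was redacted in the report), and `runs-on: [ arc-runner-set ]` assumes the scale set was installed under that release name.

    ```yaml
    # Sketch of a full workflow around the job above; the name, trigger, and
    # image placeholder are assumptions, not taken from the original report.
    name: arc-container-mode-test
    on: workflow_dispatch

    jobs:
      container-test-job:
        runs-on: [ arc-runner-set ]
        container:
          image: <your-container-image>   # redacted in the report above
          env:
            NODE_ENV: development
          ports:
            - 80
          volumes:
            - my_docker_volume:/volume_mount
          options: --cpus 1
        steps:
          - uses: actions/checkout@v3
          - name: Check for dockerenv file
            run: (ls /.dockerenv && echo Found dockerenv) || (echo No dockerenv)
    ```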

Expected behavior

The expected behavior is for jobs to run successfully.

Runner Version and Platform

Runner: 2.317.0 (tested with latest, as well)
Kubernetes: 1.29.3+k3s1

OS of the machine running the runner?

Rocky Linux with K8s layered on top

What's not working?

Jobs are failing out of the gate. See error logs and screenshot above in the bug description.

Job Log Output

We never get to job execution, as container initialization fails.

Runner and Worker's Diagnostic Logs

The only thing that stands out in the runner logs is shown below, though according to this issue it might be a red herring.

```
runner [WORKER 2024-06-25 17:55:32Z INFO ScriptHandler] Fully qualified path: '/home/runner/externals/node16/bin/node'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] Starting process:
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   File name: '/home/runner/externals/node16/bin/node'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Arguments: '/home/runner/k8s/index.js'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Working directory: '/home/runner/_work/**redacted**/**redacted**'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Require exit code zero: 'False'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Encoding web name:  ; code page: ''
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Force kill process on cancellation: 'False'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Redirected STDIN: 'True'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Persist current code page: 'False'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   Keep redirected STDIN open: 'False'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]   High priority process: 'False'
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] Failed to update oom_score_adj for PID: 1125.
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] System.UnauthorizedAccessException: Access to the path '/proc/1125/oom_score_adj' is denied.
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]  ---> System.IO.IOException: Permission denied
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    --- End of inner exception stack trace ---
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.Strategies.OSFileStreamStrategy.Write(ReadOnlySpan`1 buffer)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.Strategies.BufferedFileStreamStrategy.Dispose(Boolean disposing)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.StreamWriter.CloseStreamFromDispose(Boolean disposing)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.StreamWriter.Dispose(Boolean disposing)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at System.IO.File.WriteAllText(String path, String contents)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper]    at GitHub.Runner.Sdk.ProcessInvoker.WriteProcessOomScoreAdj(Int32 processId, Int32 oomScoreAdj)
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] Process started with process id 1125, waiting for process exit.
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] Close STDIN after the first redirect finished.
runner [WORKER 2024-06-25 17:55:32Z INFO ProcessInvokerWrapper] STDIN stream write finished.
runner [WORKER 2024-06-25 17:55:32Z INFO JobServerQueue] Try to append 1 batches web console lines for record '9d9c286a-505c-47d6-8e59-2aa09f9f588e', success rate: 1/1.
runner [WORKER 2024-06-25 17:55:32Z INFO JobServerQueue] Try to append 1 batches web console lines for record '604dede9-3987-4a10-92b3-3012e1d84a28', success rate: 1/1.
runner [WORKER 2024-06-25 17:55:33Z INFO JobServerQueue] Try to upload 1 log files or attachments, success rate: 1/1.
```

Because the jobs never run, there are no job logs to collect.

edhenry added the bug label Jun 25, 2024

ekaley commented Jun 25, 2024

👀
