This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

pod-checkpointer: Update to pod-checkpointer image that fixes #432 #498

Merged: ipochi merged 1 commit into master from imran/lokomotive-issue-432 on Jun 2, 2020

Conversation

ipochi
Member

@ipochi ipochi commented May 28, 2020

  • Fixes PodCheckpointer cannot talk to kubelet and logs errors #432.

  • The pod-checkpointer pod logs errors because it cannot talk to the
    kubelet API, as described in the issue.

    This is because the pod-checkpointer queries localhost on ports 10250
    and 10255 to reach the kubelet API.

    Since Lokomotive runs the kubelet as a pod, this causes
    connection refused errors, as nothing is listening on localhost.

    Rectifying this requires changes in two
    places:

    1. Provide a way to access the pod's status.hostIP as an environment
      variable named HOST_IP to use instead of localhost. This is done by
      modifying the daemonset configuration of kubelet.

    2. Change the pod-checkpointer code to use the HOST_IP from above.
      This is done in the PR: https://github.com/kinvolk/bootkube/pull/2

  • Next we update the image tag to use the new pod-checkpointer image.
    The tag includes the commit id which made this change.

  • Lastly we update the generated assets.

Signed-off-by: Imran Pochi <imran@kinvolk.io>
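
For readers following along, the second change can be sketched roughly as below. This is a minimal illustration and not the code from kinvolk/bootkube#2: the helper names `kubeletHost` and `kubeletURL`, the fallback to `localhost` when `HOST_IP` is unset, and the `/pods` path are all assumptions made for the example. Only the `HOST_IP` variable name and the ports 10250 and 10255 come from the description above.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// hostIPEnv is the environment variable named in the PR description. It is
// expected to be populated from the pod's status.hostIP.
const hostIPEnv = "HOST_IP"

// kubeletHost returns the address used to reach the kubelet API.
// The lookup function is injected so the logic can be tested; pass
// os.Getenv in real use. Falling back to "localhost" is an assumption
// of this sketch, not something the PR states.
func kubeletHost(getenv func(string) string) string {
	if ip := getenv(hostIPEnv); ip != "" {
		return ip
	}
	return "localhost"
}

// kubeletURL builds a URL for the given scheme, host, port and path.
// net.JoinHostPort adds brackets around IPv6 addresses where needed.
func kubeletURL(scheme, host string, port int, path string) string {
	return fmt.Sprintf("%s://%s%s", scheme, net.JoinHostPort(host, strconv.Itoa(port)), path)
}

func main() {
	host := kubeletHost(os.Getenv)
	// 10250 is the kubelet's secure port, 10255 its read-only port.
	// The "/pods" path is used here purely for illustration.
	fmt.Println(kubeletURL("https", host, 10250, "/pods"))
	fmt.Println(kubeletURL("http", host, 10255, "/pods"))
}
```

The first change (exposing `status.hostIP` as `HOST_IP`) lives in the kubelet DaemonSet manifest rather than in Go code, so it is not shown here.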

@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch from b6a1130 to ce9041a Compare May 28, 2020 13:05
@ipochi ipochi requested a review from invidian May 28, 2020 13:05
@ipochi ipochi requested a review from surajssd May 28, 2020 13:05
@ipochi ipochi requested a review from iaguis May 28, 2020 13:05
@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch from ce9041a to 29e2a28 Compare May 28, 2020 13:06
@ipochi ipochi changed the title pod-checkpointer: Update to pod-checkpointer image that fixes #342 pod-checkpointer: Update to pod-checkpointer image that fixes #432 May 28, 2020
@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch from 29e2a28 to e622e63 Compare May 28, 2020 15:06
invidian
invidian previously approved these changes May 28, 2020
iaguis
iaguis previously approved these changes May 28, 2020
Contributor

@iaguis iaguis left a comment

Added a small comment to https://github.com/kinvolk/bootkube/pull/2 but otherwise this LGTM.

@iaguis
Contributor

iaguis commented May 28, 2020

You might wanna mention in the commit message the Kubelet running as a DaemonSet listens on the Host IP of the node and that's why you need the pod checkpointer to access the Host IP.

@ipochi ipochi dismissed stale reviews from iaguis and invidian via 7ea91f3 May 29, 2020 10:29
@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch 2 times, most recently from 7ea91f3 to c94ed88 Compare May 29, 2020 10:32
@ipochi
Member Author

ipochi commented May 29, 2020

> You might wanna mention in the commit message the Kubelet running as a DaemonSet listens on the Host IP of the node and that's why you need the pod checkpointer to access the Host IP.

done.

@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch from c94ed88 to e650fcc Compare May 29, 2020 12:05
@ipochi ipochi requested a review from iaguis May 29, 2020 12:05
iaguis
iaguis previously approved these changes May 29, 2020
Contributor

@iaguis iaguis left a comment

lgtm

invidian
invidian previously approved these changes May 29, 2020
@ipochi ipochi dismissed stale reviews from invidian and iaguis via 6015488 May 29, 2020 15:06
@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch 2 times, most recently from 6015488 to 3dbfbfb Compare May 29, 2020 15:18
@surajssd
Member

Baremetal pipeline has been a PITA, @ipochi can you rebase and trigger CI again?

- Fixes #432.

- The pod-checkpointer pod logs errors because it cannot talk to the
  kubelet API, as described in the issue.

  This is because the pod-checkpointer queries localhost on ports 10250
  and 10255 to reach the kubelet API.

  Since Lokomotive runs the kubelet as a pod, this causes
  `connection refused` errors, as nothing is listening on localhost.

  Rectifying this requires changes in two
  places:

  1. Provide a way to access the pod's `status.hostIP` as an environment
     variable named `HOST_IP` to use instead of localhost. This is done by
     modifying the daemonset configuration of kubelet.

  2. Change the pod-checkpointer code to use the `HOST_IP` from above.
     This is done in the PR: https://github.com/kinvolk/bootkube/pull/2

- Next we update the image tag to use the new pod-checkpointer image.
  The tag includes the commit id which made this change.

- Lastly we update the generated assets.

Signed-off-by: Imran Pochi <imran@kinvolk.io>
@ipochi ipochi force-pushed the imran/lokomotive-issue-432 branch from 3dbfbfb to 8714e85 Compare June 2, 2020 06:11
@ipochi
Member Author

ipochi commented Jun 2, 2020

@surajssd @invidian @iaguis CI passes. Please re-review.

@ipochi ipochi requested review from iaguis and invidian June 2, 2020 06:44
@ipochi ipochi merged commit 5f73c24 into master Jun 2, 2020
@ipochi ipochi deleted the imran/lokomotive-issue-432 branch June 2, 2020 08:58