Add ImageListPullJob to simplify ImagePullJob #1222
Codecov Report
@@            Coverage Diff             @@
##           master    #1222      +/-   ##
==========================================
+ Coverage   50.04%   50.11%   +0.07%
==========================================
  Files         143      146       +3
  Lines       19898    20131     +233
==========================================
+ Hits         9958    10089     +131
- Misses       8844     8928      +84
- Partials     1096     1114      +18
... and 1 file with indirect coverage changes
@diannaowa Would it be convenient for you to discuss this at the Zoom community meeting at 7:30 p.m. on 3.23?
Thanks for your reply. If I have time next Thursday, I will attend the meeting.
This feature is still somewhat complicated, so I would still prefer to discuss it in a meeting.
Thanks, I will try to make time for the meeting. I have added the topic to the biweekly meeting document.
As discussed on the biweekly, I updated the design of ImageListPullJobStatus:
// ImageListPullJobStatus defines the observed state of ImageListPullJob
type ImageListPullJobStatus struct {
	// Represents the time when the job was acknowledged by the job controller.
	// It is not guaranteed to be set in happens-before order across separate operations.
	// It is represented in RFC3339 form and is in UTC.
	// +optional
	StartTime *metav1.Time `json:"startTime,omitempty"`
	// Represents the time when all the image pull jobs were completed. It is not guaranteed to
	// be set in happens-before order across separate operations.
	// It is represented in RFC3339 form and is in UTC.
	// +optional
	CompletionTime *metav1.Time `json:"completionTime,omitempty"`
	// The desired number of ImagePullJobs; this is typically equal to len(spec.Images).
	Desired int32 `json:"desired"`
	// The number of actively running ImagePullJobs (status.Active>0).
	// +optional
	Active int32 `json:"active"`
	// The number of ImagePullJobs whose status.CompletionTime!=nil.
	// +optional
	Completed int32 `json:"completed"`
	// The number of ImagePullJobs whose status.Failed==0.
	// +optional
	Succeeded int32 `json:"succeeded"`
	// Succeeded/Completed
	// +optional
	Status string `json:"status"`
	// The statuses of the ImagePullJobs that have failed nodes (status.Failed>0).
	// +optional
	FailedImagePullJobStatuses []FailedImagePullJobStatus `json:"failedImagePullJobs,omitempty"`
}
// FailedImagePullJobStatus is the state of an ImagePullJob that has failed nodes (status.Failed>0)
type FailedImagePullJobStatus struct {
	// The name of the ImagePullJob that has failed nodes (status.Failed>0).
	// +optional
	Name string `json:"name,omitempty"`
	// The number of pulling tasks of this ImagePullJob that reached phase Failed.
	// +optional
	Failed int32 `json:"failed"`
	// The nodes that failed to pull the image of this ImagePullJob.
	// +optional
	FailedNodes []string `json:"failedNodes,omitempty"`
}
Any other ideas on this would be helpful. Thanks.
How about:
failedNodes:
- name: 32.234.2.1
  images:
  - busybox:1.32
  - centos:latest
Moreover, we should consider limiting the number of failed nodes in status.
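A minimal sketch of what such a cap could look like, in Go. The helper name `capFailedNodes` and the limit value are hypothetical; the thread does not settle on a number here, so treat both as assumptions.

```go
package main

import "fmt"

// maxFailedNodes is a hypothetical cap chosen only for this illustration.
const maxFailedNodes = 3

// capFailedNodes returns a copy of at most limit node names, plus the number
// of names that were left out, so the status stays bounded in size.
func capFailedNodes(nodes []string, limit int) (kept []string, omitted int) {
	if limit < 0 {
		limit = 0
	}
	if len(nodes) <= limit {
		return append([]string(nil), nodes...), 0
	}
	return append([]string(nil), nodes[:limit]...), len(nodes) - limit
}

func main() {
	nodes := []string{"32.234.2.1", "32.234.2.2", "32.234.2.3", "32.234.2.4"}
	kept, omitted := capFailedNodes(nodes, maxFailedNodes)
	fmt.Println("reported:", kept, "omitted:", omitted)
}
```

Returning the omitted count alongside the truncated list would let the status still convey how many nodes failed in total.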
If the design is:
failedImagePullJobStatuses:
- name: imagepulljob-xxx-xxxx
  failed: 27
  failedNodes:
  - name: 32.234.2.1
    images:
    - busybox:1.32
  - name: 32.234.2.2
    images:
    - busybox:1.32
  - name: 32.234.2.3
    images:
    - busybox:1.32
How about:
failedImagePullJobStatuses:
- name: imagepulljob-xxx-xxxx
  image: busybox:1.32
  failed: 27
  failedNodes:
  - name: 32.234.2.1
  - name: 1.1.1.1
  - name: 2.2.2.2
The failedNodes will be the same as ImagePullJob.Status.FailedNodes.
About the limit on the maximum number of failed nodes:
@diannaowa good idea!
@veophi To make ImageListPullJob more flexible:
imageListPullJobSpec:
  pullSecrets:
  selector:
  podSelector:
  sandboxConfig:
  completionPolicy:
  imagePullJobTemplate:
  - image: nginx:1.1.1
    pullSecrets:
    selector:
    podSelector:
    sandboxConfig:
    completionPolicy:
    pullPolicy:
    parallelism:
  - image: httpd:2.34
    pullSecrets:
    selector:
    podSelector:
    sandboxConfig:
    completionPolicy:
    pullPolicy:
    parallelism:
  - image: busybox
    pullSecrets: null
    selector: null
    podSelector: null
    sandboxConfig: null
    completionPolicy:
    pullPolicy:
    parallelism:
The imagePullJobTemplate.pullSecrets will override imageListPullJobSpec.pullSecrets.
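The override rule can be sketched in Go as follows. The types `PullSettings` and `ImageTemplate` and the function `resolve` are hypothetical, simplified stand-ins (only two of the shared fields are modeled), and treating a null per-image value as "inherit the spec-level value" is an assumption about the intent of this proposal.

```go
package main

import "fmt"

// PullSettings is a hypothetical, reduced stand-in for the fields that can be
// set both at the spec level and per image in the proposal.
type PullSettings struct {
	PullSecrets []string
	Selector    map[string]string
}

// ImageTemplate is a hypothetical per-image entry of imagePullJobTemplate.
type ImageTemplate struct {
	Image string
	PullSettings
}

// resolve starts from the spec-level defaults and lets any field that the
// per-image template sets (non-nil) take precedence.
func resolve(defaults PullSettings, tpl ImageTemplate) PullSettings {
	out := defaults
	if tpl.PullSecrets != nil {
		out.PullSecrets = tpl.PullSecrets
	}
	if tpl.Selector != nil {
		out.Selector = tpl.Selector
	}
	return out
}

func main() {
	defaults := PullSettings{PullSecrets: []string{"shared-secret"}}
	nginx := ImageTemplate{
		Image:        "nginx:1.1.1",
		PullSettings: PullSettings{PullSecrets: []string{"nginx-secret"}},
	}
	busybox := ImageTemplate{Image: "busybox"} // nothing set: inherits defaults

	fmt.Println(nginx.Image, resolve(defaults, nginx).PullSecrets)
	fmt.Println(busybox.Image, resolve(defaults, busybox).PullSecrets)
}
```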
Force-pushed from 5fe4e82 to aa2db64.
@zmberg PTAL
Force-pushed from 8b832f8 to 3ec1630.
/lgtm
Commits (each signed off with "Signed-off-by: liuzhenwei <dui_zhang@163.com>"):
calculate status for imagelistpulljob
make generate manifests
add imagelistpulljob.status.status
make generate manifests
regist webhook handler delete image pull job which is not existed in ImageListPullJob.Spec.Images
support the same behavior as image pull job for TTLSecondsAfterFinished and CompletionTime fields
resourceVersionExpectations
add ut
verify the maximum number of images cannot > 255
make generate manifests
add failled image pull job status
simplify imageListPullJobStatus and spec
fix mdlint
define ImagePullJobTemplate & fix imageliststatus when completionPolicy.Type is Never
fix,some print info
trigger ci
fix some issues of code
fix some logic of Expectations
Check for duplicate values of spec.images
move proposal doc to other PR
trigger ci&& modify comment
add e2e
remove phase field from status and and remove the unnecessary deepcopy add ut for computeImagePullJobActions and fix some bugs
@zmberg PTAL
/lgtm
/lgtm
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: zmberg
Define ImageListPullJob
ImageListPullJob:
Define ImageListPullJobStatus
About the deletion of ImageListPullJob
ImageListPullJob will be deleted when this condition is met.
Webhook
Added webhooks for ImageListPullJob: the validating webhook verifies the validity of some values (such as the spec.Images field), and the mutating webhook sets some default values. The webhook behavior is largely the same as that of ImagePullJob.
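As an illustration of the kind of check the validating webhook performs on spec.Images, here is a minimal Go sketch. The 255-image limit and the duplicate check come from the commit messages of this PR; the function name `validateImages`, the error wording, and the rejection of empty lists and empty entries are assumptions, not the actual handler code.

```go
package main

import (
	"errors"
	"fmt"
)

// maxImages mirrors the limit named in the commit messages
// ("verify the maximum number of images cannot > 255").
const maxImages = 255

// validateImages is a hypothetical helper: it rejects an empty list (an
// assumption), more than maxImages entries, empty entries (an assumption),
// and duplicate values.
func validateImages(images []string) error {
	if len(images) == 0 {
		return errors.New("spec.images must not be empty")
	}
	if len(images) > maxImages {
		return fmt.Errorf("spec.images has %d entries, maximum is %d", len(images), maxImages)
	}
	seen := make(map[string]struct{}, len(images))
	for i, img := range images {
		if img == "" {
			return fmt.Errorf("spec.images[%d] must not be empty", i)
		}
		if _, dup := seen[img]; dup {
			return fmt.Errorf("spec.images[%d]: duplicate value %q", i, img)
		}
		seen[img] = struct{}{}
	}
	return nil
}

func main() {
	fmt.Println(validateImages([]string{"nginx:1.1.1", "busybox"}))
	fmt.Println(validateImages([]string{"busybox", "busybox"}))
}
```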
Controller: Watch & Generate ImagePullJob
Implementation
ImageListPullJob Event Handler: watches the add, delete, and update events.
ImagePullJob Event Handler: watches ImagePullJobs whose status has changed.
Create a new ImagePullJob according to spec.Images, or delete redundant ImagePullJobs (when the job is completed).
When the current ImageListPullJob.Status.CompletionTime != nil, CompletionPolicy.Type == Always, and CompletionPolicy.TTLSecondsAfterFinished is set, try to delete the current ImageListPullJob.
Get the ImagePullJobs currently owned by the ImageListPullJob.
Calculate the status of the current ImageListPullJob, the ImagePullJobs that need to be created, and the ImagePullJobs that need to be deleted (those whose spec.image is not in ImageListPullJob.Spec.Images).
Synchronize the ImagePullJobs (create the new ones and delete those whose spec.image is not in ImageListPullJob.Spec.Images).
Update the ImageListPullJob to the latest status.
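The "calculate what to create and what to delete" step can be sketched as a set difference between spec.Images and the images of the owned ImagePullJobs. This is a simplified Go illustration: the type `childJob` and the function `computeActions` are hypothetical, and it does not reproduce the PR's computeImagePullJobActions or handle several owned jobs for the same image.

```go
package main

import "fmt"

// childJob is a hypothetical, reduced view of an owned ImagePullJob.
type childJob struct {
	Name  string
	Image string
}

// computeActions compares the desired images with the existing child jobs and
// returns the images that still need a job and the jobs whose image is no
// longer listed in the spec.
func computeActions(desired []string, existing []childJob) (toCreate []string, toDelete []childJob) {
	want := make(map[string]struct{}, len(desired))
	for _, img := range desired {
		want[img] = struct{}{}
	}
	have := make(map[string]struct{}, len(existing))
	for _, job := range existing {
		if _, ok := want[job.Image]; !ok {
			toDelete = append(toDelete, job)
			continue
		}
		have[job.Image] = struct{}{}
	}
	for _, img := range desired {
		if _, ok := have[img]; !ok {
			toCreate = append(toCreate, img)
		}
	}
	return toCreate, toDelete
}

func main() {
	desired := []string{"nginx:1.1.1", "busybox"}
	existing := []childJob{
		{Name: "job-a", Image: "nginx:1.1.1"},
		{Name: "job-b", Image: "centos:latest"},
	}
	toCreate, toDelete := computeActions(desired, existing)
	fmt.Println("create:", toCreate, "delete:", toDelete)
}
```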
The example is as follows:
Create a new ImageListPullJob based on the above YAML.
The controller will create ImagePullJobs according to Spec.Images.
TOTAL: the number of ImagePullJobs.
ACTIVE: indicates that there are 3 active ImagePullJobs.
STATUS: the format is Succeeded/Completed, which means {the number of successful ImagePullJobs}/{the number of completed ImagePullJobs}.
Note:
Definition of a successful ImagePullJob: ImagePullJob.Status.Desired == ImagePullJob.Status.Succeeded (and ImagePullJob.Status.CompletionTime != nil).
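The counting rules above can be sketched in Go. The types `jobStatus` and `listStatus` and the function `summarize` are hypothetical simplifications; in particular, the boolean `Completed` stands in for ImagePullJob.Status.CompletionTime != nil.

```go
package main

import "fmt"

// jobStatus is a hypothetical, reduced view of ImagePullJob.Status.
type jobStatus struct {
	Desired   int32
	Active    int32
	Succeeded int32
	Completed bool // stands in for Status.CompletionTime != nil
}

// listStatus is a hypothetical, reduced view of ImageListPullJob.Status.
type listStatus struct {
	Desired, Active, Completed, Succeeded int32
	Status                                string
}

// summarize counts active, completed, and successful child jobs and renders
// the STATUS column as "Succeeded/Completed".
func summarize(jobs []jobStatus, imageCount int) listStatus {
	s := listStatus{Desired: int32(imageCount)}
	for _, j := range jobs {
		if j.Active > 0 {
			s.Active++
		}
		if j.Completed {
			s.Completed++
			// Successful: completed and every desired pull succeeded.
			if j.Desired == j.Succeeded {
				s.Succeeded++
			}
		}
	}
	s.Status = fmt.Sprintf("%d/%d", s.Succeeded, s.Completed)
	return s
}

func main() {
	jobs := []jobStatus{
		{Desired: 3, Succeeded: 3, Completed: true},
		{Desired: 3, Succeeded: 2, Completed: true},
		{Desired: 3, Active: 1},
	}
	fmt.Printf("%+v\n", summarize(jobs, 3))
}
```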
Ⅱ. Does this pull request fix one issue?
fixes #1211
Ⅲ. Describe how to verify it
Ⅳ. Special notes for reviews