chore: Use PodTemplates to support ImageJobs #555

Merged
8 changes: 0 additions & 8 deletions api/v1alpha1/imagejob_types.go
@@ -17,7 +17,6 @@ limitations under the License.
package v1alpha1

import (
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

@@ -30,12 +29,6 @@ type Image struct {
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

// ImageJobSpec defines the desired state of ImageJob.
type ImageJobSpec struct {
// Specifies the job that will be created when executing an ImageJob.
JobTemplate v1.PodTemplateSpec `json:"template"`
}

// JobPhase defines the phase of an ImageJob status.
type JobPhase string

@@ -74,7 +67,6 @@ type ImageJob struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

Spec ImageJobSpec `json:"spec,omitempty"`
Status ImageJobStatus `json:"status,omitempty"`
}

17 changes: 0 additions & 17 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

7,137 changes: 0 additions & 7,137 deletions config/crd/bases/eraser.sh_imagejobs.yaml

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions config/rbac/role.yaml
@@ -32,6 +32,18 @@ rules:
- list
- update
- watch
- apiGroups:
- ""
resources:
- podtemplates
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- batch
resources:
157 changes: 91 additions & 66 deletions controllers/imagecollector/imagecollector_controller.go
@@ -63,6 +63,7 @@ var (
collectorImage = flag.String("collector-image", "", "collector image, empty value disables collect feature")
log = logf.Log.WithName("controller").WithValues("process", "imagecollector-controller")
repeatPeriod = flag.Duration("repeat-period", time.Hour*24, "repeat period for collect/scan process")
repeatImmediate = flag.Bool("repeat-immediate", true, "begin collect/scan process immediately")
deleteScanFailedImages = flag.Bool("delete-scan-failed-images", true, "whether or not to delete images for which scanning has failed")
scannerArgs = utils.MultiFlag([]string{})
collectorArgs = utils.MultiFlag([]string{})
@@ -154,7 +155,13 @@ func add(mgr manager.Manager, r reconcile.Reconciler) error {
return err
}

go func() {
delay := *repeatPeriod
if *repeatImmediate {
sozercan (Member): what is this for? is this different than setting repeat period to 0?

pmengelbert (Contributor, Author): The collect-scan-erase process can either begin immediately or after the specified delay. Our default has been to start immediately, with no option to change that behavior. This change makes it possible to do either.

The only reason I added this to this PR is that the collector pods were interfering with the tests. Some tests (for example imagelist_rm_images) test the behavior of the imagelist and have to jump through hoops to avoid conflicting with the collector, which begins immediately after deployment. Example: https://github.com/Azure/eraser/blob/76aa58d70fb1cce6f2c61f466dd4b34dceb470b5/test/e2e/tests/imagelist_rm_images/eraser_test.go#L51

Ideally I'd like to move to a more robust scheduling system (for example, allowing the user to specify a cron string).

pmengelbert (Contributor, Author, Jan 3, 2023): Sorry, I misunderstood your question. Yes, it's different from setting repeat period to 0. Here's how it works (illustrated by the sketch after the add function below):

  • If repeatPeriod = 30 min and repeatImmediate = true, one job runs at startup and the next runs 30 minutes after that.
  • If repeatPeriod = 30 min and repeatImmediate = false, the first job runs 30 minutes after startup.

On the other hand, setting repeatPeriod to 0 would start immediately and repeat endlessly with no delay.

sozercan (Member, Jan 4, 2023): I see; "repeat" seems a bit misleading. Perhaps we can name it something like schedule-immediate, queue-immediate, or erase-immediate (or anything else you can think of)?

pmengelbert (Contributor, Author): done

delay = 0 * time.Second
}

// runs the provided function after the specified delay
_ = time.AfterFunc(delay, func() {
log.Info("Queueing first ImageCollector reconcile...")
ch <- event.GenericEvent{
Object: &eraserv1alpha1.ImageJob{
@@ -163,13 +170,15 @@ func add(mgr manager.Manager, r reconcile.Reconciler) error {
},
},
}
log.Info("Queued first ImageCollector reconcile")
}()

log.Info("...Queued first ImageCollector reconcile")
})

return nil
}
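For illustration, here is a minimal, runnable sketch of the repeat-period / repeat-immediate semantics described in the thread above. The schedule helper, the example flag values, and the runJob placeholder are assumptions made for this sketch, not the controller's actual wiring, which only schedules the first reconcile via time.AfterFunc.

package main

import (
	"fmt"
	"time"
)

// schedule mirrors the behavior described above: when immediate is true the
// first run starts right away; otherwise it waits one full period first.
// Repetition after the first run is outside the scope of this sketch.
func schedule(period time.Duration, immediate bool, runJob func()) *time.Timer {
	delay := period
	if immediate {
		delay = 0
	}
	// run the provided function after the chosen delay
	return time.AfterFunc(delay, runJob)
}

func main() {
	// With immediate=true the job fires right away even though the period is 30 minutes.
	t := schedule(30*time.Minute, true, func() { fmt.Println("queueing collect/scan job") })
	defer t.Stop()
	time.Sleep(100 * time.Millisecond) // give the timer goroutine time to fire
}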

//+kubebuilder:rbac:groups="batch",resources=jobs,verbs=get;list;create;delete;watch
sozercan (Member, Jan 3, 2023): can we remove this, or is it used?

pmengelbert (Contributor, Author): We can remove it.

sozercan (Member): I still see this; do we want to remove it in this PR?

pmengelbert (Contributor, Author, Jan 5, 2023): For clarity, did you mean the kubebuilder comment or the log.Info? Both have been removed.

sozercan (Member): Sorry, I should have clarified. I meant the kubebuilder marker.

//+kubebuilder:rbac:groups="",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
@@ -233,75 +242,74 @@ func (r *Reconciler) createImageJob(ctx context.Context, req ctrl.Request, argsC
scanDisabled := *scannerImage == ""
startTime = time.Now()

job := &eraserv1alpha1.ImageJob{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "imagejob-",
Labels: map[string]string{
util.ImageJobOwnerLabelKey: ownerLabelValue,
jobTemplate := corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
{
// EmptyDir default
Name: "shared-data",
},
},
},
Spec: eraserv1alpha1.ImageJobSpec{
JobTemplate: corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
{
// EmptyDir default
Name: "shared-data",
RestartPolicy: corev1.RestartPolicyNever,
Containers: []corev1.Container{
{
Name: "collector",
Image: *collectorImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Args: append(collectorArgs, "--scan-disabled="+strconv.FormatBool(scanDisabled)),
VolumeMounts: []corev1.VolumeMount{
{MountPath: "/run/eraser.sh/shared-data", Name: "shared-data"},
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
"cpu": resource.MustParse("7m"),
sozercan (Member): are we hardcoding these?

pmengelbert (Contributor, Author, Jan 3, 2023): These have been hardcoded for some time; the code just moved due to indentation changes, etc.

sozercan (Member): is this part of the config we talked about before?

pmengelbert (Contributor, Author, Jan 5, 2023): It should be. I'll leave a link to this comment in the configmap issue #446.

"memory": resource.MustParse("25Mi"),
},
Limits: corev1.ResourceList{
"memory": resource.MustParse("30Mi"),
},
},
RestartPolicy: corev1.RestartPolicyNever,
Containers: []corev1.Container{
},
{
Name: "eraser",
Image: *util.EraserImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Args: append(util.EraserArgs, "--log-level="+logger.GetLevel()),
VolumeMounts: []corev1.VolumeMount{
{MountPath: "/run/eraser.sh/shared-data", Name: "shared-data"},
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
"cpu": resource.MustParse("7m"),
"memory": resource.MustParse("25Mi"),
},
Limits: corev1.ResourceList{
"memory": resource.MustParse("30Mi"),
},
},
SecurityContext: utils.SharedSecurityContext,
Env: []corev1.EnvVar{
{
Name: "collector",
Image: *collectorImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Args: append(collectorArgs, "--scan-disabled="+strconv.FormatBool(scanDisabled)),
VolumeMounts: []corev1.VolumeMount{
{MountPath: "/run/eraser.sh/shared-data", Name: "shared-data"},
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
"cpu": resource.MustParse("7m"),
"memory": resource.MustParse("25Mi"),
},
Limits: corev1.ResourceList{
"memory": resource.MustParse("30Mi"),
},
},
Name: "OTEL_EXPORTER_OTLP_ENDPOINT",
Value: *util.OtlpEndpoint,
},
{
Name: "eraser",
Image: *util.EraserImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Args: append(util.EraserArgs, "--log-level="+logger.GetLevel()),
VolumeMounts: []corev1.VolumeMount{
{MountPath: "/run/eraser.sh/shared-data", Name: "shared-data"},
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
"cpu": resource.MustParse("7m"),
"memory": resource.MustParse("25Mi"),
},
Limits: corev1.ResourceList{
"memory": resource.MustParse("30Mi"),
},
},
SecurityContext: utils.SharedSecurityContext,
Env: []corev1.EnvVar{
{
Name: "OTEL_EXPORTER_OTLP_ENDPOINT",
Value: *util.OtlpEndpoint,
},
{
Name: "OTEL_SERVICE_NAME",
Value: "eraser",
},
},
Name: "OTEL_SERVICE_NAME",
Value: "eraser",
},
},
ServiceAccountName: "eraser-imagejob-pods",
},
},
ServiceAccountName: "eraser-imagejob-pods",
},
}

job := &eraserv1alpha1.ImageJob{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "imagejob-",
Labels: map[string]string{
util.ImageJobOwnerLabelKey: ownerLabelValue,
},
},
}

@@ -344,7 +352,7 @@ func (r *Reconciler) createImageJob(ctx context.Context, req ctrl.Request, argsC
},
},
}
job.Spec.JobTemplate.Spec.Containers = append(job.Spec.JobTemplate.Spec.Containers, scannerContainer)
jobTemplate.Spec.Containers = append(jobTemplate.Spec.Containers, scannerContainer)
}

configmapList := &corev1.ConfigMapList{}
@@ -359,18 +367,35 @@ func (r *Reconciler) createImageJob(ctx context.Context, req ctrl.Request, argsC
return reconcile.Result{}, err
}

for i := range job.Spec.JobTemplate.Spec.Containers {
job.Spec.JobTemplate.Spec.Containers[i].VolumeMounts = append(job.Spec.JobTemplate.Spec.Containers[i].VolumeMounts, exclusionMount...)
for i := range jobTemplate.Spec.Containers {
jobTemplate.Spec.Containers[i].VolumeMounts = append(jobTemplate.Spec.Containers[i].VolumeMounts, exclusionMount...)
}

job.Spec.JobTemplate.Spec.Volumes = append(job.Spec.JobTemplate.Spec.Volumes, exclusionVolume...)
jobTemplate.Spec.Volumes = append(jobTemplate.Spec.Volumes, exclusionVolume...)

err = r.Create(ctx, job)
if err != nil {
log.Info("Could not create collector ImageJob")
return reconcile.Result{}, err
}

namespace := utils.GetNamespace()
template := corev1.PodTemplate{
ObjectMeta: metav1.ObjectMeta{
Name: job.GetName(),
Namespace: namespace,
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(job, eraserv1alpha1.GroupVersion.WithKind("ImageJob")),
},
},
Template: jobTemplate,
}
err = r.Create(ctx, &template)
if err != nil {
log.Info("Could not create collector PodTemplate")
return reconcile.Result{}, err
}

log.Info("Successfully created collector ImageJob", "job", job.Name)
return reconcile.Result{}, nil
}
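As a hypothetical illustration of how the new PodTemplate is meant to be consumed, the sketch below fetches the template by the ImageJob's name in the given namespace, mirroring how createImageJob names it above. The function name, the namespace parameter, and the github.com/Azure/eraser module path are assumptions for the example, not an API defined in this PR.

package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"

	eraserv1alpha1 "github.com/Azure/eraser/api/v1alpha1"
)

// getJobPodTemplate looks up the PodTemplate that createImageJob creates for a
// given ImageJob: same name as the job, in the supplied namespace.
// Illustrative sketch only.
func getJobPodTemplate(ctx context.Context, c client.Client, job *eraserv1alpha1.ImageJob, namespace string) (*corev1.PodTemplate, error) {
	template := &corev1.PodTemplate{}
	key := types.NamespacedName{Name: job.GetName(), Namespace: namespace}
	if err := c.Get(ctx, key, template); err != nil {
		return nil, err
	}
	return template, nil
}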