kim built images are asynchronously replicated into k8s.io namespace #84

milas · 2021-10-21T13:11:38Z

Current Behavior

The kim agent watches for containerd image events:

Lines 82 to 105 in e597b95

    
    func (a *Agent) syncImageContent(ctx context.Context, ctr *containerd.Client) {
    	events, errors := ctr.EventService().Subscribe(ctx, `topic~="/images/"`)
    	for {
    		select {
    		case <-ctx.Done():
    			return
    		case err, ok := <-errors:
    			if !ok {
    				return
    			}
    			logrus.Errorf("sync-image-content: %v", err)
    		case evt, ok := <-events:
    			if !ok {
    				return
    			}
    			if evt.Namespace != buildkitNamespace {
    				continue
    			}
    			if err := handleImageEvent(ctx, ctr, evt.Event); err != nil {
    				logrus.Errorf("sync-image-content: handling %#v returned %v", evt, err)
    			}
    		}
    	}
    }

On image create/update events, the handler copies the new/updated image to the k8s.io namespace so that it's visible to CRI/usable by kubelet.

This all happens asynchronously / in its own goroutine. kim build is unaware this is happening and does not block on it.

Desired Behavior

It'd be nice to be able to (optionally?) wait for the sync to have finished when calling kim build to guarantee the image is ready for use.

Context

As it stands, it's possible to build an image with kim and attempt to use it in a Deployment before the sync has finished, resulting in errors/retries on the K8s side.

We're seeing this with Tilt, where we have a kim_build extension - Tilt calls kim and then applies the updated YAML to the cluster, resulting in some retries/backoff because the sync might not be done yet.

dweomer · 2021-10-22T04:50:20Z

I was thinking about this while working on #79 to fix #74. It is one of the reasons I hadn't merged #79 yet: for the edge case I was attempting to fix, this asynchronicity became more pronounced. My idea to fix #79 is to refactor the content copy so that it happens on a calling context initiated by the client but still mediated by the backend agent, similar to how pull/fetch works. The default client implementation would then block on copy progress (with a reasonable timeout), which would address the problem you have encountered.
