Update api call returns error frequently #403

Closed
jakesjohn opened this issue Apr 22, 2019 · 10 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@jakesjohn

Referring to #245, is it still the case that "Create/Update/Delete operations are performed using non-caching clients and return before any caches are likely to have seen the changes"?

In my reconciler function, the workflow is: Get instance -> modify instance -> Update instance.
I am seeing a similar problem, where Update in the reconciler function frequently returns the error "the object has been modified; please apply your changes to the latest version and try again".
Is this expected by design, given that Get returns a stale entry and the update is then attempted on that stale entry?

If this is expected, and the reconciler function returns an error because of it, the container logs get flooded with error messages from https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/internal/controller/controller.go#L212

Is there any way to handle this? Any help is appreciated.
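
One common way to handle this kind of conflict is to re-read the object and reapply the change whenever the write is rejected. The sketch below is a toy model, not controller-runtime code: `store`, `object`, `errConflict`, and `updateWithRetry` are hypothetical names made up for illustration, and the in-memory store only imitates the resource-version check the API server performs. In a real controller, `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` plays a similar role around a Get/modify/Update closure.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's "object has been modified" error.
var errConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// object is a hypothetical resource carrying only the fields needed here.
type object struct {
	ResourceVersion int
	Replicas        int
}

// store is a toy API server that enforces optimistic concurrency.
type store struct{ current object }

func (s *store) Get() object { return s.current }

// Update succeeds only if the caller's copy is based on the latest version.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// updateWithRetry re-reads the object and reapplies mutate after every
// conflict, giving up after the given number of attempts (at least 1).
func updateWithRetry(s *store, attempts int, mutate func(*object)) error {
	var err error
	for i := 0; i < attempts; i++ {
		obj := s.Get() // always start from the latest version
		mutate(&obj)
		if err = s.Update(obj); !errors.Is(err, errConflict) {
			return err // success, or a non-conflict error
		}
	}
	return err
}

func main() {
	s := &store{current: object{ResourceVersion: 606382, Replicas: 1}}

	// Take a copy, then let another writer land before we use it.
	stale := s.Get()
	_ = updateWithRetry(s, 3, func(o *object) { o.Replicas = 2 })

	// Writing the stale copy is rejected.
	stale.Replicas = 5
	fmt.Println("stale update:", s.Update(stale))

	// Re-reading first lets the same change go through.
	err := updateWithRetry(s, 3, func(o *object) { o.Replicas = 5 })
	fmt.Println("retried update:", err, s.Get())
}
```

The key design point is that the mutation is expressed as a function applied to a freshly read copy, so nothing computed from the stale copy is ever written back.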

@jakesjohn changed the title from "Update" to "Update api call returns error frequently" on Apr 22, 2019
@ichekrygin
Contributor

Hi, I am seeing a similar issue.
Update increments the object's ResourceVersion, while the subsequent reconcile loop reads the "stale" value:

2019-05-14T13:36:22.396-0700    DEBUG    test.controller.test-controller    reconciling resource claims
2019-05-14T13:36:24.186-0700    INFO    test.controller.test-controller    *************
2019-05-14T13:36:24.186-0700    INFO    test.controller.test-controller    Before Update    {"meta": "606382"}
2019-05-14T13:36:24.187-0700    INFO    test.controller.test-controller    After Update   {"meta": "606386"} <--- NEW 
2019-05-14T13:36:24.187-0700    DEBUG    test.controller.test-controller    reconciling resource claims
2019-05-14T13:36:25.990-0700    INFO    test.controller.test-controller    *************
2019-05-14T13:36:25.991-0700    INFO    test.controller.test-controller    Before Update   {"meta": "606382"} <--- STALE
2019-05-14T13:36:25.995-0700    INFO    test.controller.test-controller    After Update   {"meta": "606382"}
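
The pattern in the log above is what a lagging read cache would produce: the write goes straight to the API server, but reads are served from a cache that only changes once the corresponding watch event arrives. The sketch below is a toy model of that behaviour under those assumptions; `apiServer`, `cache`, and `sync` are hypothetical names, not controller-runtime types, and the version arithmetic is simplified to a plain increment.

```go
package main

import "fmt"

// object is a hypothetical resource; only the version matters here.
type object struct{ ResourceVersion int }

// apiServer is a toy stand-in for the API server. Each write bumps the
// version and queues a watch event for the cache.
type apiServer struct {
	current object
	events  []object
}

// Update performs a write and returns the object as stored by the server.
func (a *apiServer) Update() object {
	a.current.ResourceVersion++
	a.events = append(a.events, a.current)
	return a.current
}

// cache models an informer-backed read cache: it only changes when a
// queued watch event is delivered.
type cache struct{ seen object }

// sync delivers all pending watch events to the cache.
func (c *cache) sync(a *apiServer) {
	for _, ev := range a.events {
		c.seen = ev
	}
	a.events = nil
}

func main() {
	server := &apiServer{current: object{ResourceVersion: 606382}}
	informer := &cache{seen: server.current}

	written := server.Update()
	fmt.Println("server after update:    ", written.ResourceVersion)
	// A reconcile that starts now reads from the cache and sees the old version.
	fmt.Println("cached read before sync:", informer.seen.ResourceVersion)

	informer.sync(server)
	fmt.Println("cached read after sync: ", informer.seen.ResourceVersion)
}
```

A reconcile that reads between the write and the sync will work from the old ResourceVersion, and any Update it sends from that copy will be rejected as a conflict.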

@DirectXMan12
Contributor

Hmm... what's triggering the reconcile where you read stale data? I wouldn't expect stale reads to be super common unless two controllers are reconciling the same resource based on the exact same conditions.

One thing to check is what you're doing on an out-of-date error. If you're returning that error, you're going to see those as you try to back off. What you really want is to return a nil error and wait till you get a requeue from seeing the update. This is a bit of a foot-gun, and it's something I really want to fix, but it requires a bit of thought. See #377

Does that answer your question?
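
As a rough illustration of the suggestion above, a reconcile function can treat a conflict as "nothing to report" and let the requeue triggered by the newer version do the work. This is a stdlib-only sketch: `reconcile`, `errConflict`, and `errUnavailable` are hypothetical names, and the `update` callback stands in for whatever performs the write. In a real controller the equivalent check would be `apierrors.IsConflict(err)` from `k8s.io/apimachinery/pkg/api/errors`.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's conflict error.
var errConflict = errors.New("the object has been modified")

// errUnavailable is a hypothetical non-conflict failure.
var errUnavailable = errors.New("connection refused")

// reconcile runs one hypothetical reconcile pass. update is whatever
// performs the write; its error decides what the caller sees.
func reconcile(update func() error) error {
	err := update()
	if errors.Is(err, errConflict) {
		// Our copy was stale. The write that beat us also produces a watch
		// event, which queues a fresh reconcile, so report success here
		// instead of triggering error logging and backoff.
		return nil
	}
	return err // nil on success, or a real failure worth surfacing
}

func main() {
	// A conflict, even when wrapped with context, is swallowed.
	fmt.Println(reconcile(func() error {
		return fmt.Errorf("update failure for IpPool: %w", errConflict)
	}))

	// Any other error is still returned to the caller.
	fmt.Println(reconcile(func() error { return errUnavailable }))
}
```

Wrapping with `%w` matters here: an error that only embeds the conflict message as text would not match `errors.Is` and would still be reported.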

@DirectXMan12
Contributor

/triage support

@rohantmp

rohantmp commented Aug 23, 2019

I'm seeing this issue. I wasn't when I initially started using 0.2, but now CreateOrUpdate always hits it whenever it attempts an Update.

@DirectXMan12
Contributor

@rohantmp can you post an example reproducer?

@rohantmp

rohantmp commented Aug 26, 2019

Sorry for not updating. I was accidentally running two controllers against the same resource: one locally and one as a deployment. This was causing a race.

@DirectXMan12
Contributor

ah, that would explain it :-)

@fanux

fanux commented Oct 23, 2019

Concurrently updating an object using the cached client seems to have a problem:

`2019-10-18T03:01:39.039Z ERROR controller-runtime.controller Reconciler error
 {"controller": "iprequest", "request": "default/vm1-iprequest", "error": 
" Update failure for IpPool xwcheng-ippool in IpRequest vm1-iprequest, err: 
Operation cannot be fulfilled on ippools.infra.genos.io \"xwcheng-ippool\": 
the object has been modified; please apply your changes to the latest version and try again"}`

When using a client without a cache, everything works fine.

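import (
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// GetClient returns a client that reads and writes directly against the
// API server, without going through a cache.
func GetClient(scheme *runtime.Scheme) (client.Client, error) {
	// GetConfigOrDie exits the process on failure, so no nil check is needed.
	config := ctrl.GetConfigOrDie()

	// Use a variable name that does not shadow the client package.
	c, err := client.New(config, client.Options{Scheme: scheme})
	if err != nil {
		return nil, err
	}
	return c, nil
}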

@DirectXMan12
Contributor

Right, you're going to have some cache lag so you'll get the old resource version with the cached client. That's expected.

@fanux

fanux commented Nov 6, 2019

@DirectXMan12 so I think it would be better if kubebuilder provided a REST read-write client... #609

That would make it convenient for us to update the CRD immediately. 😄
