
Trait apply should use update instead of patch #272

Open
hongchaodeng opened this issue Oct 28, 2020 · 27 comments · Fixed by #283
Labels
bug Something isn't working

Comments

@hongchaodeng
Member

Problem Statement

Currently, when oam-runtime applies traits, it does a MergePatch (code link1 and link2). Basically, it compares the trait to apply (A) with the existing trait object in cache (B), and computes the difference between A and B to create a patch to send to the APIServer.

This will lead to the following problem:

  1. Trait initialized as:
     auto: false
     rules: [a, b, c]
  2. Trait updated as:
     auto: true

In the above scenario, we expect the rules field to be removed, but when we apply it, the trait actually becomes:

auto: true
rules: [a, b, c]

Proposed Solution

To solve the problem, we should simply update the spec of the trait here instead of patching it.
As for field-safety concerns: traits are operational data that is managed only by the OAM platform, so updating is OK in this case. Workloads, on the contrary, can be changed by end users, so we need to think more carefully about how to update them.
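As a rough illustration of what "update instead of patch" could look like with a controller-runtime client (the function and variable names here are illustrative, not the actual oam-runtime code):

```go
package apply

import (
	"context"

	"k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// applyTrait replaces the existing trait with the desired one instead of
// merge-patching it, so fields dropped from the desired spec are actually
// removed on the server.
func applyTrait(ctx context.Context, c client.Client, desired *unstructured.Unstructured) error {
	existing := &unstructured.Unstructured{}
	existing.SetGroupVersionKind(desired.GroupVersionKind())
	key := types.NamespacedName{Namespace: desired.GetNamespace(), Name: desired.GetName()}

	if err := c.Get(ctx, key, existing); err != nil {
		if errors.IsNotFound(err) {
			return c.Create(ctx, desired)
		}
		return err
	}

	// Carry over the resourceVersion so the update does not hit a conflict,
	// then overwrite the whole object rather than computing a merge patch.
	desired.SetResourceVersion(existing.GetResourceVersion())
	return c.Update(ctx, desired)
}
```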

@captainroy-hy
Contributor

Maybe I can help. Could you please assign this issue to me?

@wonderflow
Member

@captainroy-hy sure, thanks!

@ryanzhang-oss
Collaborator

I wonder why patch can't do "remove"?

@wonderflow
Member

@ryanzhang-oss 'patch' will not remove fields that the patch does not mention

@sak0

sak0 commented Nov 3, 2020

use APIUpdatingApplicator instead of APIPatchingApplicator
https://github.com/crossplane/crossplane-runtime/blob/v0.10.0/pkg/resource/api.go#L178-L208

crossplane-runtime 0.9.0 -> 0.10.0
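A rough sketch of how this swap could look (assuming the crossplane-runtime v0.10.0 API linked above; the exact signatures should be double-checked against that version, and this is not a drop-in diff for the actual oam-runtime code):

```go
package apply

import (
	"context"

	"github.com/crossplane/crossplane-runtime/pkg/resource"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// applyTraitWithUpdate uses the updating applicator, which creates the object
// if it is absent and otherwise updates (replaces) it, instead of sending a
// merge patch that keeps stale fields around.
func applyTraitWithUpdate(ctx context.Context, c client.Client, trait *unstructured.Unstructured) error {
	applicator := resource.NewAPIUpdatingApplicator(c)
	return applicator.Apply(ctx, trait)
}
```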

@captainroy-hy
Contributor

use APIUpdatingApplicator instead of APIPatchingApplicator
https://github.com/crossplane/crossplane-runtime/blob/v0.10.0/pkg/resource/api.go#L178-L208

crossplane-runtime 0.9.0 -> 0.10.0

Thanks for your kind suggestion!

@hongchaodeng
Member Author

ref kubevela/kubevela#876

@ryanzhang-oss
Collaborator

Just to be clear, the real problem is not only removing the fields that are not present in the new CR, but also preserving the fields that are not controlled by the trait.

If that is the real goal, then I don't think anyone can solve this unless the trait declares which fields it owns.

@hongchaodeng
Member Author

@ryanzhang-oss Right.

Let me add another example:

# user applies
auto: false
rules: [a, b, c]

# controller adds `others`
auto: false
rules: [a, b, c]
others: "abc"

# user applies again
auto: true

# user wants to see
auto: true
others: "abc"

@hongchaodeng
Member Author

Per our discussion, this is a well-known two-way merge problem: https://medium.com/flant-com/3-way-merge-patches-helm-werf-beb7eccecdfe

The correct way to solve this problem is via a three-way merge, which is what kubectl apply does. Basically, we need to emulate kubectl apply in the Apply() function -- even though it is called Apply(), it does not actually behave like apply yet.

Nowadays k8s supports server-side apply, which drastically simplifies the client-side implementation. Here are some related materials I could find, but they only provide scattered examples:
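As a minimal sketch of the three-way merge idea itself (using the apimachinery helper; this is illustrative, not the eventual oam-runtime implementation):

```go
package apply

import (
	"k8s.io/apimachinery/pkg/util/jsonmergepatch"
)

// threeWayMergePatch computes a JSON merge patch from three states:
//   original - what we applied last time (the "last applied" state),
//   modified - what we want to apply now,
//   current  - what is currently stored in the cluster.
// Fields present in original but absent from modified become explicit nulls
// in the patch, so the API server removes them; fields added by others
// (present only in current) are left untouched.
func threeWayMergePatch(original, modified, current []byte) ([]byte, error) {
	return jsonmergepatch.CreateThreeWayJSONMergePatch(original, modified, current)
}
```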

@captainroy-hy
Contributor

Server-side apply can solve this problem as of k8s v1.19. Before 1.19, it does not remove fields in the case shown here

@hongchaodeng
Member Author

Before 1.19, it does not remove fields in the case shown here

I don't quite get what the comment means. Could you elaborate in more detail and provide an example?

@allenhaozi

@ryanzhang-oss Right.

Let me add another example:

# user applies
auto: false
rules: [a, b, c]

# controller adds `others`
auto: false
rules: [a, b, c]
others: "abc"

# user applies again
auto: true

# user wants to see
auto: true
others: "abc"

About this case:

Because the resource already exists, if the user wants to remove rules via a patch, the user should send the following request:

# user applies again
auto: true
rules: null

If we switch from patch to update:

# user updates again
auto: true
# result
auto: true

That is the semantics of update.

Whether to update or patch depends on the business logic of the traits.

Is there no uniform standard?
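A small sketch of the merge-patch semantics described above, using a controller-runtime client (the trait variable and the field placement under spec are illustrative): an explicit null is what removes a field, while an omitted field is left as-is.

```go
package apply

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// removeRulesWithMergePatch drops the rules field by sending it as null;
// a merge patch that simply omits rules would leave the stored value intact.
func removeRulesWithMergePatch(ctx context.Context, c client.Client, trait *unstructured.Unstructured) error {
	patch := []byte(`{"spec":{"auto":true,"rules":null}}`)
	return c.Patch(ctx, trait, client.RawPatch(types.MergePatchType, patch))
}
```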

@captainroy-hy
Contributor

Server-side apply this deployment:

apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

Then server-side apply the following with the same field manager:

apiVersion: apps/v1
kind: Deployment
...
spec:
  # replicas: 3 <===== unset
  template:
    spec:
      containers:
      - name: nginx
        # image: nginx:1.14.2 <===== unset

In api-server v1.19, we will get

apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 1 # <===== reset to default value
  template:
    spec:
      containers:
      - name: nginx
        # image: nginx:1.14.2 # <===== removed because it has no default value

But in api-server v1.16-v1.18, we will get

apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 3 # <===== unchanged
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2 # <===== unchanged

Related to this fix merged into k8s v1.19
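For reference, a short sketch of issuing server-side apply from a controller with controller-runtime (the "oam-runtime" field manager name is illustrative); on an API server >= v1.19 this gives the field-removal behaviour shown above:

```go
package apply

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// serverSideApply sends the desired object as an apply patch. With a fixed
// field manager, fields this manager stops sending are removed (on v1.19+),
// while fields owned by other managers are preserved.
func serverSideApply(ctx context.Context, c client.Client, desired *unstructured.Unstructured) error {
	return c.Patch(ctx, desired, client.Apply,
		client.FieldOwner("oam-runtime"), // illustrative field manager name
		client.ForceOwnership)
}
```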

@hongchaodeng
Member Author

Is there no uniform standard ?

@allenhaozi This is what a three-way merge patch handles, and it is how kubectl apply is implemented. We are going to align with upstream to fix the problem correctly.

@hongchaodeng
Member Author

@captainroy-hy
I see. If this is a well-known upstream issue, we don't need to worry about it. All we need to do is implement kubectl apply on the client side. On the server side, it is the users who choose which versions (and thus which behaviors) they want.

@ryanzhang-oss
Collaborator

@ryanzhang-oss Right.

Let me add another example:

# user applies
auto: false
rules: [a, b, c]

# controller adds `others`
auto: false
rules: [a, b, c]
others: "abc"

# user applies again
auto: true

# user wants to see
auto: true
others: "abc"

I don't think any patch method can solve this problem without what @allenhaozi described. The fact that the trait owns the "auto" and "rules" fields but not the "others" field has to be explicitly stated somewhere. This is not k8s related.

@hongchaodeng
Member Author

@ryanzhang-oss

I don't think any patch method can solve this problem without what @allenhaozi described.

That's a wrong assumption. I believe this is exactly what kubectl apply addresses.

@captainroy-hy
Contributor

Working on this in a PR: refactor the apply mechanism to compute a three-way diff on the client side, just like kubectl apply does under the hood; the last-applied state is stored in an annotation.

@hongchaodeng
Member Author

Some related issues w.r.t. the client-side implementation of kubectl apply:

Note that these are issues for kubectl since it is controlled by users directly. We can avoid and control the risk by isolating it from end users.

For now we can implement a three-way merge on the client side by using our own annotation key, app.oam.dev/last-applied.

Once k8s 1.19 is more widely adopted, we can switch the implementation to make use of server-side apply.
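A minimal sketch of keeping the last-applied state in that annotation (assuming the app.oam.dev/last-applied key mentioned above; this is illustrative, not the final implementation):

```go
package apply

import (
	"encoding/json"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

const lastAppliedAnnotation = "app.oam.dev/last-applied"

// setLastApplied records the desired object as JSON in its own annotation so
// the next reconcile can recover the "original" input of the three-way merge.
// In practice the annotation itself should be stripped before marshaling,
// the same way kubectl handles kubectl.kubernetes.io/last-applied-configuration.
func setLastApplied(obj *unstructured.Unstructured) error {
	data, err := json.Marshal(obj.Object)
	if err != nil {
		return err
	}
	annotations := obj.GetAnnotations()
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[lastAppliedAnnotation] = string(data)
	obj.SetAnnotations(annotations)
	return nil
}

// getLastApplied returns the previously recorded state from the live object,
// or nil if this is the first apply.
func getLastApplied(live *unstructured.Unstructured) []byte {
	if v, ok := live.GetAnnotations()[lastAppliedAnnotation]; ok {
		return []byte(v)
	}
	return nil
}
```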

@allenhaozi

I ran into a problem where restarting oam-addon-sage causes Pods to restart.
The Pods belong to the deployment controlled by ServerWorkload.

oam-addon-sage contains ServerWorkload, TaskWorkload, ManualScaler, ...

kubernetes-sigs/controller-runtime#1311

@hongchaodeng
Member Author

I ran into a problem where restarting oam-addon-sage causes Pods to restart.
The Pods belong to the deployment controlled by ServerWorkload.

@allenhaozi Could you clarify the problem in more detail and explain why it is related here, or maybe create another issue?

@allenhaozi

In simple words: two controllers control one resource, and given the implementation mechanism it's difficult to patch the resource.

The workload renders the deployment.

Restarting the controllers leads to the resource's Pods restarting.

For example:
ServerWorkload creates a deployment and sets replicas to 0.
ManualScalerTrait patches the replicas of this deployment from 0 to 3.

When the controllers restart, if the deployment already exists, we find the following events:

Events:
  Type    Reason             Age                  From                   Message
  ----    ------             ----                 ----                   -------
  Normal  ScalingReplicaSet  13s (x12 over 5h5m)  deployment-controller  Scaled down replica set simple-web-33-5d759bd4cf to 0
  Normal  ScalingReplicaSet  12s (x15 over 5h7m)  deployment-controller  Scaled up replica set simple-web-33-5d759bd4cf to 3

The problem is how the Workload and the Trait can work together to update a resource when we cannot patch the resource, so I mention it here.

@captainroy-hy
Contributor

@allenhaozi A possible workaround: maybe the ServerWorkload controller should express no opinion on .spec.replicas when creating/reconciling a Deployment, leaving .spec.replicas to ManualScalerTrait, just as the ContainerizedWorkload controller does (see the sketch below).

But of course, it's still a problem to deal with "field management by multiple controllers", as @ryanzhang-oss said in this issue.
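A sketch of that workaround (standard apps/v1 types; the surrounding controller code is assumed): the workload simply leaves .spec.replicas nil in the Deployment it renders, so it never fights the trait over that field.

```go
package workload

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// renderDeployment builds the Deployment for the workload without setting
// Replicas, so ManualScalerTrait (or the API server default) owns that field.
func renderDeployment(name, namespace, image string) *appsv1.Deployment {
	labels := map[string]string{"app": name}
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: appsv1.DeploymentSpec{
			// Replicas intentionally left nil: the workload expresses no opinion.
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{Name: name, Image: image}},
				},
			},
		},
	}
}
```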

@hongchaodeng
Member Author

Fixed in kubevela/kubevela#857
Any plan to port it to this repo too?

@Ananya-1106

Hello!
How can I help? I am new here but very curious to learn and explore this. Can anyone guide me on where I should start?

@wonderflow
Member

Hi @Ananya-1106, welcome! This repo is more like a library of the OAM implementation; we suggest you get started with KubeVela, which contains the oam-kubernetes-runtime lib.
