Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(controller): dehydrate workflow before deleting offloaded node status #6112

Merged
merged 1 commit into from
Jun 21, 2021

Conversation

copierrj
Copy link
Contributor

@copierrj copierrj commented Jun 9, 2021

Offloaded nodes are periodically garbage collected. The garbage collector first compiles a map of node status versions to keep (based on available k8s resources) and assumes that all other node statuses can be deleted. For workflows without a value for 'OffloadNodeStatusVersion' it is assumed that the nodes are not (or no longer) being offloaded.

Unfortunately this is not always correct: we observed (quite regularly) that node status information was deleted prematurely, destroying active workflows and/or making it impossible to review workflow results afterwards.

While investigating the issue we discovered that the workflows provided by the workflow lister are sometimes hydrated and because of that do not yet have a value for 'OffloadNodeStatusVersion'. However this doesn't necessarily mean that the nodes of that particular workflow are not actually offloaded and can be disregarded safely.

In order to resolve this issue we experimented with an Argo build where the garbage collector ensures workflow dehydration before deleting anything. We thoroughly tested this version and we couldn't reproduce the problem anymore. It looks like we did indeed find the culprit.

This PR suggests to implement this particular change in order to prevent losing any offloaded node information. We do not know however if this is actually the right approach. It could be that there is a root cause to be fixed somewhere else in the workflow controller. Because of the seemingly random nature of the problem it could perhaps be caused by some race condition?

Signed-off-by: Reijer Copier reijer.copier@kadaster.nl

Checklist:

@codecov
Copy link

codecov bot commented Jun 9, 2021

Codecov Report

Merging #6112 (2dc7cca) into master (510b4a8) will decrease coverage by 0.07%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6112      +/-   ##
==========================================
- Coverage   47.66%   47.58%   -0.08%     
==========================================
  Files         249      250       +1     
  Lines       15783    15806      +23     
==========================================
- Hits         7523     7522       -1     
- Misses       7322     7343      +21     
- Partials      938      941       +3     
Impacted Files Coverage Δ
...ersist/sqldb/explosive_offload_node_status_repo.go 0.00% <ø> (ø)
persist/sqldb/offload_node_status_repo.go 3.50% <0.00%> (-0.07%) ⬇️
workflow/controller/controller.go 20.06% <0.00%> (-0.56%) ⬇️
workflow/controller/indexes/uid_index.go 0.00% <0.00%> (ø)
cmd/argoexec/commands/emissary.go 48.43% <0.00%> (-1.57%) ⬇️
cmd/argo/commands/get.go 57.09% <0.00%> (ø)
workflow/controller/operator.go 71.11% <0.00%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 510b4a8...2dc7cca. Read the comment docs.

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we have a race condition here. I’ve the workflow is being reconciled at the same
time, and both pieces of code have a pointer to the same struct. We could use the KeyLock to prevent that, but that alone might not be enough, and I think dehydrating it might be needed too, but dehydration without locking might produce or their race conditions . Can I ask you to add the KeyLock too please? Also, can I ask how you test this?

@copierrj
Copy link
Contributor Author

copierrj commented Jun 10, 2021

I've reworked the PR and workflows are now properly locked and unlocked by the garbage collector. The overall behavior is still the same: the GC still receives hydrated workflows from the lister sometimes but by dehydrating those first everything seems to work just fine.

These changes where tested by submitting a few workflows (< 5) containing a great number of individual steps (100s) that just wait for a short while (60s). Afterwards the log, database and k8s state where closely studied to check for any anomaly.

@@ -534,6 +534,65 @@ func (wfc *WorkflowController) signalContainers(namespace string, podName string
return time.Duration(*pod.Spec.TerminationGracePeriodSeconds) * time.Second, nil
}

func (wfc *WorkflowController) deleteOffloadedNodes() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this func below workflowGarbageCollector so there are less risk of Git conflicts?

}
defer func() {
for _, wf := range workflows {
key, err := cache.MetaNamespaceKeyFunc(wf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key is already in scope, so no need to repeat here

}
wfc.workflowKeyLock.Lock(key)
}
defer func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will lock all the workflows, potentially locking the controller and preventing reconciliation - this is a show stopper - instead create a new func (e.g. named deleteOffloadedNodesForWorkflow()) and lock/defer at the start of the func

workflow/controller/controller.go Outdated Show resolved Hide resolved
@alexec alexec added this to the v3.0 milestone Jun 11, 2021
@alexec alexec changed the title fix: dehydrate workflow before deleting offloaded node status fix(controller): dehydrate workflow before deleting offloaded node status Jun 11, 2021
Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments about locking

@copierrj
Copy link
Contributor Author

I've made the requested changes.

@copierrj
Copy link
Contributor Author

I've reworked the PR like suggested. Some changes where required in deleteOffloadedNodesForWorkflow as it didn't take deleted workflows into account.

While testing I noticed that hydrated workflows are still encountered by the GC. Therefore dehydration is still necessary to prevent offloaded nodes from being removed prematurely.

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 small changes requested

if err != nil {
log.WithField("err", err).Error("Failed to list incomplete workflows")
continue
lenOldOffloads := 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove this diagnostics output - just more code that needs to be tested and. maintained.

log.WithFields(log.Fields{"len_wfs": len(oldRecords), "len_old_offloads": lenOldOffloads}).Info("Deleting old offloads that are not live")
for uid, versions := range oldRecords {
if err := wfc.deleteOffloadedNodesForWorkflow(uid, versions); err != nil {
log.WithError(err).WithField("uid", err).Error("Failed to delete old offloaded nodes")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mistake in this line - can you spot it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure: err -> uid

if err != nil {
return err
}
key, err := cache.MetaNamespaceKeyFunc(wf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor - just do namespace+"/"+name - simpler and no error check needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? Everywhere else in the controller cache.MetaNamespaceKeyFunc is being used to compute the key. Isn't duplicating the behavior of this function a maintenance hazard? What if cache.MetaNamespaceKeyFunc is changed in the future?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but I don't expect that to happen in reality. Too much code depends on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See cache.ExplicitKey

Signed-off-by: Reijer Copier <reijer.copier@kadaster.nl>
@alexec alexec merged commit 9fe8c10 into argoproj:master Jun 21, 2021
@sarabala1979 sarabala1979 mentioned this pull request Jun 21, 2021
88 tasks
@copierrj copierrj deleted the fix_gc branch June 22, 2021 06:46
@alexec alexec mentioned this pull request Jun 28, 2021
9 tasks
alexec pushed a commit that referenced this pull request Jun 28, 2021
…atus (#6112)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
uturunku1 pushed a commit to newrelic-forks/argo-workflows that referenced this pull request Jul 22, 2021
…atus (argoproj#6112)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>
alexec added a commit that referenced this pull request Jul 26, 2021
* Update events.md (#6119)

Trying to use the argo workflows events and I noticed that some crucial explanations are missing here. I would like to add:
- A simple WorkflowTemplate bound to the WorkflowEventBinding, to show what is triggered by the curl that send the event
- Some infos about the process that bind the event to the workflow template:
   - template creation
   - event binding apply
   - api call to trigger the workflow template creation
Plus: there is a little mistake in the selector:  metadata["x-argo"] instead of metadata["X-Argo-E2E"] I would like to correct it in order to avoid mistakes during the curl.

Hope this is appreciated! ;)

Denis

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add note on the requirements of resource templates. Fixes #5566 (#6125)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6127)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: alexec <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* add troubleshooting notes section for running-locally docs (#6132)

Co-authored-by: uturunku1 <“21225410+uturunku1@users.noreply.github.com”>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(executor): Check whether any errors within checkResourceState() are transient. Fixes #6118. (#6134)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* build: Remove PNS_PRIVILEGED=true (#6138)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Document the extraction of data from a k8s resource (#6102)

* Document the extraction of data from a k8s resource

* remove reference to lines of a file that can be outdated

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>

* Remove yaml snippet and only keep the link to the example

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* build image output to docker (#6128)

Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* chore: Update stress rig and docs. Fixes #6136 (#6141)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* chore: Upgrade Alibaba OSS to use more secure ListObjectsV2() (#6142)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: Allow setting workflow input parameters in UI. Fixes #4234 (#5319)

* fix: Allow setting workflow input parameters in UI. Fixes #4234

Signed-off-by: Kenny Trytek <kenneth.g.trytek@gmail.com>

* fix: Allow setting workflow input parameters in UI. Fixes #4234

 - Allow workflow input parameters as well as entrypoint parameters.

Signed-off-by: Kenny Trytek <kenneth.g.trytek@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): Performance improvement for Sprig. Fixes #6135 (#6140)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* update from v0.19.6 to v0.20.4 and indirect dependencies

Signed-off-by: uturunku1 <“21225410+uturunku1@users.noreply.github.com”>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* exec.GetAuthenticator takes two arguments in the k8s-client-go v0.20.4

Signed-off-by: uturunku1 <“21225410+uturunku1@users.noreply.github.com”>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* update makefile to use code-generator@v0.20.4

Signed-off-by: uturunku1 <“21225410+uturunku1@users.noreply.github.com”>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Fix release-notes.md

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Update Graviti's website link (#6148)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(ui): Fix-up local storage namespaces. Fixes #6109 (#6144)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(executor): Capture emissary main-logs. Fixes #6145 (#6146)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(ui): Fix event-flow scrolling. Fixes #6133 (#6147)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* test: Fix logging test (#6159)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(ui): Add checkbox to check all workflows in list. Fixes #6069 (#6158)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Use 'depends' instead of 'dependencies' in examples (#6166)

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(server): Allow redirect_uri to be automatically resolved when using sso (#6167)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): Allow retry on transient errors when validating workflow spec. Fixes #6163 (#6178)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): dehydrate workflow before deleting offloaded node status (#6112)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6160)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: alexec <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Remove RBAC for SSO from Roadmap (Already implemented) (#6174)

It looks like RBAC for SSO is already implemented by #4198 so hopefully it can be removed from the roadmap as it is also documented? https://argoproj.github.io/argo-workflows/argo-server-sso/#sso-rbac

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6187)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: alexec <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Fix changelog order for .0 tags (#6188)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): Wrong validate order when validate DAG task's argument (#6190)

Signed-off-by: BOOK <book78987book@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix rebase conflict

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* run go mod tidy

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* refactor: Remove the need for pod annotations to be mounted as a volume (#6022)

Signed-off-by: Antony Chazapis <chazapis@ics.forth.gr>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: ContainerSets do not have 'depends' (#6199)

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: Fix security issues related to file closing and paths (G307 & G304) (#6200)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add links to Python examples to description annotations (#6202)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs(executor): document k8s executor behaviour with program warnings (#6212)

* docs(executor): document k8s executor behaviour with program warnings

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>

* docs(executor): fix typo

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: Fix certain sibling tasks not connected to parent (#6193)

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(ui): Add copy to clipboard shortcut (#6217)

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6220)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: alexec <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add KarrotPay in USERS.md (#6221)

Signed-off-by: Byungjin Park <posquit0.bj@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* run go mod tidy

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add workflow-count-resourcequota.yaml example (#6225)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: Reduce argoexec image size (#6197)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(conttroller): Always set finishedAt dote. Fixes #6135 (#6139)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat: Add support for deletion delay when using PodGC (#6168)

Signed-off-by: Stefan Sedich <stefan.sedich@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: update bug report template (#6236)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6242)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: alexec <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(executor): emissary - make argoexec executable from non-root containers. Fixes #6238 (#6247)

Signed-off-by: Yuan Gong <gongyuan94@gmail.com>

Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat: Introduce when condition to retryStrategy (#6114)

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* ci: Add Go code security scanner via gosec. Fixes #6203 (#6232)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: fix end of files, new lines and remove multiple lines (#6240)

Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: add json destructuring example (#6250)


Signed-off-by: Michael Crenshaw <michael@crenshaw.dev>

Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(executor): Tolerate docker re-creating containers. Fixes #6244 (#6252)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(executor): emissary - make /var/run/argo files readable from non-root users. Fixes #6238 (#6304)

Signed-off-by: Yuan Gong <gongyuan94@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs(controller): add missing emissary executor (#6291)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: docs and hacks improvements (#6310)

Signed-off-by: Michael Crenshaw <michael@crenshaw.dev>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(cli): Only list needed fields. Fixes #6000 (#6298)

* fix(cli): Only list needed fields

Signed-off-by: Alex Collins <alex_collins@intuit.com>

* ok

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Fix typo (#6311)

Signed-off-by: Byungjin Park <posquit0.bj@gmail.com>

Co-authored-by: Saravanan Balasubramanian <33908564+sarabala1979@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* require sso redirect url to be an argo url (#6211)

Signed-off-by: Brandon Goode <brandon.goode@cox.com>

Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: code format (#6269)

- Add yaml rendering
- Add bash rendering

Co-authored-by: Simon Behar <simbeh7@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(controller): Store artifact repository in workflow status. Fixes #6255 (#6299)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: document using ingress with TLS enabled (#6324)

Signed-off-by: valorl <11498571+valorl@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: document how to access hyphenated steps in expression templates (#6318)

Signed-off-by: Michael Crenshaw <michael@crenshaw.dev>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(controller): Differentiate CronWorkflow submission vs invalid spec error metrics (#6309)

* feat(controller): Differentiate CronWorkflow submission vs invalid spec error metrics

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Address feedback

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* chore: deleted wft.yaml

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* ci: only run Snyk once a day on master

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): Not updating StoredWorkflowSpec when WFT changed during workflow running (#6342)

Signed-off-by: Saravanan Balasubramanian <sarabala1979@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(cli): v3.1 Argo Auth Token (#6344)

* fix(cli): v3.1 Argo Auth Token

Signed-off-by: Saravanan Balasubramanian <sarabala1979@gmail.com>

* update

Signed-off-by: Saravanan Balasubramanian <sarabala1979@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add Alibaba Group to USERS.md (#6353)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(crd): temp fix 34s timeout bug for k8s 1.20+ (#6350)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: updated CHANGELOG.md (#6348)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: sarabala1979 <sarabala1979@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs(users): Add WooliesX (#6358)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(cli): Overridding name/generateName when creating CronWorkflows if specified (#6308)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(controller): sortDAGTasks supports sort by field Depends (#6307)

Signed-off-by: BOOK <book78987book@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(fields): handle nexted fields when excluding (#6359)

Signed-off-by: AntoineDao <antoinedao1@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* feat(controller): Allow configurable host name label key when retrying different hosts (#6341)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* pull argo-events changes
update versions in go.mod and go.sum

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* run go mod tidy

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): allow workflow.duration to pass validator (#6376)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): fix retry on transient errors when validating workflow spec (#6370)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>

Co-authored-by: Saravanan Balasubramanian <33908564+sarabala1979@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: examples/ci.yaml indent (#6328)

Signed-off-by: kungho.back <kungho.back@naverlabs.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* chore: import grafana dashboard (#6365)

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(gcs): throw argo not found error if key not exist (#6393)

Signed-off-by: AntoineDao <antoinedao1@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* Revert "fix: examples/ci.yaml indent (#6328)"

This reverts commit 3f72fe5.

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix: Server crash when opening timeline tab for big workflows (#6369)

Signed-off-by: Alexander Matyushentsev <AMatyushentsev@gmail.com>

Co-authored-by: Saravanan Balasubramanian <33908564+sarabala1979@users.noreply.github.com>
Co-authored-by: Alex Collins <alexec@users.noreply.github.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add 4intelligence (#6400)

Signed-off-by: Thiago Gil <t.gil@4intelligence.com.br>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Add note on additional required permission for createBucketIfNotPresent for OSS driver (#6378)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): allow initial duration to be 0 instead of current_time-0 (#6389)


Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix(controller): Mark workflows wait for semaphore as pending. Fixes #6351 (#6356)

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* docs: Updating upgrading.md. Closes #6314

Signed-off-by: Alex Collins <alex_collins@intuit.com>
Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* not need to convert to unstructured.unstructured

I was getting this error controller_test.go: pkg/mod/k8s.io/client-go@v0.20.4/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: item[0]: can't assign or convert unstructured.Unstructured into v1alpha1.Workflow

Based on this comment, it seems like the conversion is not needed: kubernetes-sigs/controller-runtime#524 (comment)

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* run make pre-commit -B

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix potential file inclusion via variable lint error

there is a risk that an unintended file path will be specified. So uuse filepath.Clean() to clean up possible bad paths

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

* fix format issue

Signed-off-by: uturunku1 <luces.huayhuaca@gmail.com>

Co-authored-by: Denis Bellotti <denis.bellotti.android@gmail.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: alexec <alexec@users.noreply.github.com>
Co-authored-by: uturunku1 <“21225410+uturunku1@users.noreply.github.com”>
Co-authored-by: Christophe Blin <christophe.blin@free.fr>
Co-authored-by: meijin <859037421@qq.com>
Co-authored-by: kennytrytek <kenneth.g.trytek@gmail.com>
Co-authored-by: Caden <32856921+CadenOf@users.noreply.github.com>
Co-authored-by: Simon Behar <simbeh7@gmail.com>
Co-authored-by: Stefan Sedich <stefan.sedich@gmail.com>
Co-authored-by: Reijer Copier <copierrj@users.noreply.github.com>
Co-authored-by: Brandon High <highb@users.noreply.github.com>
Co-authored-by: BOOK <book78987book@gmail.com>
Co-authored-by: Antony Chazapis <chazapis@ics.forth.gr>
Co-authored-by: Tianchu Zhao <evantczhao@gmail.com>
Co-authored-by: Byungjin Park (Claud) <posquit0.bj@gmail.com>
Co-authored-by: Yuan (Bob) Gong <4957653+Bobgy@users.noreply.github.com>
Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com>
Co-authored-by: Michael Crenshaw <michael@crenshaw.dev>
Co-authored-by: Saravanan Balasubramanian <33908564+sarabala1979@users.noreply.github.com>
Co-authored-by: brgoode <86316314+brgoode@users.noreply.github.com>
Co-authored-by: Valér Orlovský <11498571+valorl@users.noreply.github.com>
Co-authored-by: Alex Collins <alex_collins@intuit.com>
Co-authored-by: sarabala1979 <sarabala1979@users.noreply.github.com>
Co-authored-by: Antoine Dao <antoinedao1@gmail.com>
Co-authored-by: KUNG HO BACK <bkh751@gmail.com>
Co-authored-by: Zadkiel <zadkiel.aharonian@gmail.com>
Co-authored-by: Alexander Matyushentsev <Alexander_Matyushentsev@intuit.com>
Co-authored-by: Thiago Bittencourt Gil <79285506+thiago4int@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants