Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix repository sync aborts after a signle repo failed #651

Conversation

jdoubleu
Copy link
Contributor

@jdoubleu jdoubleu commented Jan 1, 2022

This PR fixes the issue, where the whole repository sync process would be canceled if only one repository fails to sync.
It thereby partly fixes, or circumvents, #648 as described (see item 1).

In order to display an additional status/error message after syncing, I had to change the API response types/schemas to include an optional message string (see RepoList types both in the frontend and the server). I'm not 100% happy with how I named that type though.

@6543 6543 added the breaking will break existing installations if no manual action happens label Jan 2, 2022
@6543 6543 added this to the 0.15.0 milestone Jan 2, 2022
@6543
Copy link
Member

6543 commented Jan 2, 2022

NOTE: We have to look at the api go client too ... so we do not break some CLI client features
(I did not checked that jet)

@jdoubleu
Copy link
Contributor Author

jdoubleu commented Jan 2, 2022

Thanks for the info! I'll have a look myself.

Unfortunately, the tests didn't report anything (i.e. succeeded).

@6543
Copy link
Member

6543 commented Jan 2, 2022

@jdoubleu linting failed

@6543 6543 added the enhancement improve existing features label Jan 2, 2022
@jdoubleu jdoubleu force-pushed the fix/repo-sync-aborts-all-when-failed branch from 6af1d96 to 8f3d8b9 Compare January 2, 2022 15:47
Comment on lines +92 to +94
log.Debug().Msgf("could not fetch permission of repo '%s': %v", repo.FullName, err)
hasErrors = true
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this code solves a somehow "different" problem, I would suggest to just log the error as debug and ignore it otherwise. So we just keep these 3 lines of code for now.

Suggested change
log.Debug().Msgf("could not fetch permission of repo '%s': %v", repo.FullName, err)
hasErrors = true
continue
// TODO: report errors again after #648 got solved
log.Debug().Msgf("could not fetch permission of repo '%s': %v", repo.FullName, err)
continue

@6543
Copy link
Member

6543 commented Jan 2, 2022

I would go even future - sory that I think your current way to approach well need refactor once we do it properly ... :/

so I would propose:

  • only modify server/remote/gitlab/* code
  • on the sync determine repos that are in subgroups and just skip them

for later (of course you can start with now - but it's feature freeze and so It wont make it in upcomming release):
-> change how repos are idendifyed by refactoring the repo model and remote interface

@6543
Copy link
Member

6543 commented Jan 2, 2022

tldr: for a hotfix this is a lot of code that will be useless once done properly

@jdoubleu
Copy link
Contributor Author

jdoubleu commented Jan 3, 2022

on the sync determine repos that are in subgroups and just skip them

okay, I can do this in a separate PR and just close this one.

change how repos are idendifyed by refactoring the repo model and remote interface

yes, this definitely needs some thinking. I don't think I'm currently able to fix that, because I'm not familiar with the internal models/structure, yet.

@anbraten the reason I chose to always report an error is because I'd like to give the user feedback. At least display a generic error if something failed, otherwise nothing would happen, which is very confusing.

tldr: for a hotfix this is a lot of code that will be useless once done properly

I don't think so, since it is a mechanism to catch any error (e.g. timeouts or other permission errors) and report it back to the user. Sure, when #648 is fixed, there are no more errors for me, hopefully. What if the sync fails for someone else for whatever reason?

I'm not 100% confident with my changes, specifically introducing the RepoList response message which optionally includes a string. This feels somewhat unfitting for the API and is now a special return type compared to all other API endpoints. It would be nice to have a different channel to send messages back to the user somehow. Obviously this is a much greater undertaking; Just wanted to write down my thoughts.

@jdoubleu
Copy link
Contributor Author

jdoubleu commented Jan 3, 2022

I've pushed the fix in the API client to this branch.

It would be nice to be able to share types between the server and client and not have them in different files:

@anbraten
Copy link
Member

anbraten commented Jan 3, 2022

@jdoubleu I had a short look on the error with sub-group-repos and will create a PR to fix it.

the reason I chose to always report an error is because I'd like to give the user feedback. At least display a generic error if something failed, otherwise nothing would happen, which is very confusing.

We should find a solution for that. ATM we either throw an error or return some data. In this specific case I would throw an error if any of the repos can not be loaded, normally there shouldn't be an error with only some repos otherwise it is probably an error we need to fix like #648.

@6543
Copy link
Member

6543 commented Jan 3, 2022

If we do it the "rest api way" we should not change data struct but return different http status code with an error struct.

so either 2XX & data OR 4/5XX and error data

@6543
Copy link
Member

6543 commented Jan 3, 2022

I would propose to invent a generic error type:

type ErrorResponse struct {
  Error string // error name like "timeout" ...
  Message string // optional, describe things if posible
}

☝️ with can be extended by optional fields for specific needs if needed

status 500 has only a generic error anymore and only a message if in debug mode or auth user is admin -> it should only be reported back if the system got an error that is caused by issues only instance admins can fix (database is down, ...)

@6543
Copy link
Member

6543 commented Jan 3, 2022

It would be nice to be able to share types between the server and client and not have them in different files:

* https://github.com/woodpecker-ci/woodpecker/blob/master/woodpecker-go/woodpecker/types.go#L15

* https://github.com/woodpecker-ci/woodpecker/blob/master/server/model/repo.go#L26

We either share types with client and api OR api and database.
If we share structs from database with client it can not be used outside of this project

@6543
Copy link
Member

6543 commented Jan 3, 2022

tldf: your change to the api is nice for javascript ... just a add an attribute ... but not for api architecture

@jdoubleu
Copy link
Contributor Author

jdoubleu commented Jan 3, 2022

Thank you for the feedback!

In this specific case I would throw an error if any of the repos can not be loaded

My concern is that the sync process can be fragile, especially when you try to sync lots of repos (500+ in my case). While I didn't experience any errors besides the reported, I can totally imagine one (GitLab) API request to timeout or fail for whatever reason. If that happens too often, maybe due to network congestion, it prevents a user from syncing their repos at all.

If we share structs from database with client it can not be used outside of this project

I can understand that's a problem. Is it possible to employ some build steps for the client, which can output/copy the types to the client lib? On the other hand, that would make it more complex to build/use.

I would propose to invent a generic error type [..] status 500 has only a generic error anymore

So when there's an error, the API should return a 500 status code and the JSON, which includes a generic Error message and any data (e.g. repo list) in Message, if there is any? That doesn't feel RESTy either. I think to either return a successful response with data or an error message with a failed response is the best practice, isn't it?

@6543
Copy link
Member

6543 commented Jan 3, 2022

the example "timeout" should be handled by woodpecker and not reported back ... in this case "with sleep & retry's"

@6543
Copy link
Member

6543 commented Jan 3, 2022

Is it possible to employ some build steps for the client, which can output/copy the types to the client lib?

not a bad idea, we should do it similar to code format:

  • enforce it by CI via linting
  • add a make target to create it automatically

☝️ wana create an issue ;) - else I'll do

@anbraten
Copy link
Member

anbraten commented Jan 3, 2022

Regarding the shared-types discussion I would link to #292 as that would allow us to automatically create a go and js / ts api client with proper types.

I think to either return a successful response with data or an error message with a failed response is the best practice, isn't it?

Totally agree on that. It helps to keep code simple as well.

@jdoubleu
Copy link
Contributor Author

jdoubleu commented Jan 3, 2022

wana create an issue ;) - else I'll do

please, go ahead @6543

I was thinking on something similar to TypeScript which can produce a types.d.ts out of a code base, although I never used it.
How is the client lib published anyways? Are the binaries pushed to the go package repository or something? An additional build step makes it unusable if someone clones the repo or copies the source from the folder to use. To be fair, that should be very rare.

@6543
Copy link
Member

6543 commented Jan 3, 2022

@jdoubleu not done jet but it's done via git tagging

@6543
Copy link
Member

6543 commented Jan 3, 2022

tag woodpecker-go/woodpecker/v0.15.0 would export the client lib as v0.15.0 for the commit the tag refer to

example: https://gitea.com/gitea/go-sdk/tags

@6543 6543 mentioned this pull request Jan 3, 2022
2 tasks
@6543 6543 closed this in #652 Jan 3, 2022
6543 pushed a commit that referenced this pull request Jan 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking will break existing installations if no manual action happens enhancement improve existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants