-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Also sync DB branches on push if necessary #28361
Conversation
…ly but not visit UI after upgrading from v1.20 -> v1.21
@wolfogre done. |
modules/repository/branch.go
Outdated
// UpdateBranch updates the branch information in the database. If the branch exist, it will update latest commit of this branch information | ||
// If it doest not exist, insert a new record into database | ||
func UpdateBranch(ctx context.Context, repoID, pusherID int64, branchName string, commit *git.Commit) error { | ||
cnt, err := git_model.UpdateBranch(ctx, repoID, pusherID, branchName, commit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That function is completely wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I move the function to service layer and renamed function name to syncBranchToDB
. Maybe it fixed your problem?
modules/repository/branch.go
Outdated
// if user haven't visit UI but directly push to a branch after upgrading from 1.20 -> 1.21, | ||
// we cannot simply insert the branch but need to check we have branches or not | ||
hasBranch, err := db.Exist[git_model.Branch](ctx, git_model.FindBranchOptions{ | ||
RepoID: repoID, | ||
IsDeletedBranch: util.OptionalBoolFalse, | ||
}.ToConds()) | ||
if err != nil { | ||
return err | ||
} | ||
if !hasBranch { | ||
if _, err = SyncRepoBranches(ctx, repoID, pusherID); err != nil { | ||
return fmt.Errorf("repo_module.SyncRepoBranches %d:%s failed: %v", repoID, branchName, err) | ||
} | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a lot of trouble understanding this precondition: I don't think this is the correct behavior we are doing here.
Shouldn't the workflow be:
Push -> Add all branches added by the push if they don't exist already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your idea is OK to sync. But it will compare all branches every time. It will become a performance bottleneck for big repositories.
My method is for every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it's non-sync state. otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours.
I was unable to create a backport for 1.21. @lunny, please send one manually. 🍵
|
Fix go-gitea#28056 This PR will check whether the repo has zero branch when pushing a branch. If that, it means this repository hasn't been synced. The reason caused that is after user upgrade from v1.20 -> v1.21, he just push branches without visit the repository user interface. Because all repositories routers will check whether a branches sync is necessary but push has not such check. For every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it will be assumed as non-sync state. Otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours. For the implementation, we in fact will try to update the branch first, if updated success with affect records > 0, then all are done. Because that means the branch has been in the database. If no record is affected, that means the branch does not exist in database. So there are two possibilities. One is this is a new branch, then we just need to insert the record. Another is the branches haven't been synced, then we need to sync all the branches into database.
* giteaofficial/main: [skip ci] Updated licenses and gitignores Actually recover from a panic in cron task (go-gitea#28409) Fix missing check (go-gitea#28406) Also sync DB branches on push if necessary (go-gitea#28361) Remove stale since giteabot has similiar feature (go-gitea#28401) [skip ci] Updated translations via Crowdin
Fix #28056 Backport #28361 This PR will check whether the repo has zero branch when pushing a branch. If that, it means this repository hasn't been synced. The reason caused that is after user upgrade from v1.20 -> v1.21, he just push branches without visit the repository user interface. Because all repositories routers will check whether a branches sync is necessary but push has not such check. For every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it will be assumed as non-sync state. Otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours. For the implementation, we in fact will try to update the branch first, if updated success with affect records > 0, then all are done. Because that means the branch has been in the database. If no record is affected, that means the branch does not exist in database. So there are two possibilities. One is this is a new branch, then we just need to insert the record. Another is the branches haven't been synced, then we need to sync all the branches into database.
@lunny There is a pretty significant performance regression in this PR. Most integration tests creating a branch now take minutes to complete. You can see the effect of this in CI runtime. In a PR merged before this one, sqlite integration tests take 16min In this PR and after, sqlite integration tests take >30min |
I will investigate it. |
#28361 introduced `syncBranchToDB` in `CreateNewBranchFromCommit`. This PR will revert the change because it's unnecessary. Every push will already be checked by `syncBranchToDB`. This PR also created a test to ensure it's right.
go-gitea#28361 introduced `syncBranchToDB` in `CreateNewBranchFromCommit`. This PR will revert the change because it's unnecessary. Every push will already be checked by `syncBranchToDB`. This PR also created a test to ensure it's right.
Fix go-gitea#28056 This PR will check whether the repo has zero branch when pushing a branch. If that, it means this repository hasn't been synced. The reason caused that is after user upgrade from v1.20 -> v1.21, he just push branches without visit the repository user interface. Because all repositories routers will check whether a branches sync is necessary but push has not such check. For every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it will be assumed as non-sync state. Otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours. For the implementation, we in fact will try to update the branch first, if updated success with affect records > 0, then all are done. Because that means the branch has been in the database. If no record is affected, that means the branch does not exist in database. So there are two possibilities. One is this is a new branch, then we just need to insert the record. Another is the branches haven't been synced, then we need to sync all the branches into database.
go-gitea#28361 introduced `syncBranchToDB` in `CreateNewBranchFromCommit`. This PR will revert the change because it's unnecessary. Every push will already be checked by `syncBranchToDB`. This PR also created a test to ensure it's right.
Fix go-gitea#28056 This PR will check whether the repo has zero branch when pushing a branch. If that, it means this repository hasn't been synced. The reason caused that is after user upgrade from v1.20 -> v1.21, he just push branches without visit the repository user interface. Because all repositories routers will check whether a branches sync is necessary but push has not such check. For every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it will be assumed as non-sync state. Otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours. For the implementation, we in fact will try to update the branch first, if updated success with affect records > 0, then all are done. Because that means the branch has been in the database. If no record is affected, that means the branch does not exist in database. So there are two possibilities. One is this is a new branch, then we just need to insert the record. Another is the branches haven't been synced, then we need to sync all the branches into database.
go-gitea#28361 introduced `syncBranchToDB` in `CreateNewBranchFromCommit`. This PR will revert the change because it's unnecessary. Every push will already be checked by `syncBranchToDB`. This PR also created a test to ensure it's right.
Fix #28056
This PR will check whether the repo has zero branch when pushing a branch. If that, it means this repository hasn't been synced.
The reason caused that is after user upgrade from v1.20 -> v1.21, he just push branches without visit the repository user interface. Because all repositories routers will check whether a branches sync is necessary but push has not such check.
For every repository, it has two states, synced or not synced. If there is zero branch for a repository, then it will be assumed as non-sync state. Otherwise, it's synced state. So if we think it's synced, we just need to update branch/insert new branch. Otherwise do a full sync. So that, for every push, there will be almost no extra load added. It's high performance than yours.
For the implementation, we in fact will try to update the branch first, if updated success with affect records > 0, then all are done. Because that means the branch has been in the database. If no record is affected, that means the branch does not exist in database. So there are two possibilities. One is this is a new branch, then we just need to insert the record. Another is the branches haven't been synced, then we need to sync all the branches into database.