Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Merged PRs have their commits and files changed metadata 'zeroed'... sometimes. #10766

Closed
2 of 7 tasks
jamesorlakin opened this issue Mar 18, 2020 · 31 comments · Fixed by #10908
Closed
2 of 7 tasks
Labels
Milestone

Comments

@jamesorlakin
Copy link
Contributor

  • Gitea version (or commit ref): v1.11.3
  • Git version: 2.24.1
  • Operating system: Ubuntu 18.04 (used as a Docker host for Gitea)
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • Not relevant
  • Log gist: N/A

Description

As of recent (within the last month I'd estimate), many PRs have their commits and files changed counters zeroed out after merging. This happens most of the time but weirdly not always on my instance now. Deleting the branch afterwards doesn't make seem to make a difference if that makes a difference.

I'm not sure how a lot of the internal Git plumbing works for Gitea so I haven't managed to take a look at the code directly. I know that #10618 seemed to be somewhat relevant - and the linked PR example in there does indeed show the behaviour below! I'm confused as to why this is shown for me as both branches definitely still exist.

Screenshots

image

@lunny lunny added the type/bug label Mar 19, 2020
@lunny lunny added this to the 1.11.4 milestone Mar 19, 2020
@lunny
Copy link
Member

lunny commented Mar 19, 2020

It should be related with #10618. @zeripath

@zeripath
Copy link
Contributor

Nope. Merged PRs are displayed by a different set of code.

gitea/routers/repo/pull.go

Lines 303 to 328 in 7225453

// PrepareMergedViewPullInfo show meta information for a merged pull request view page
func PrepareMergedViewPullInfo(ctx *context.Context, issue *models.Issue) *git.CompareInfo {
pull := issue.PullRequest
setMergeTarget(ctx, pull)
ctx.Data["HasMerged"] = true
compareInfo, err := ctx.Repo.GitRepo.GetCompareInfo(ctx.Repo.Repository.RepoPath(),
pull.MergeBase, pull.GetGitRefName())
if err != nil {
if strings.Contains(err.Error(), "fatal: Not a valid object name") {
ctx.Data["IsPullRequestBroken"] = true
ctx.Data["BaseTarget"] = pull.BaseBranch
ctx.Data["NumCommits"] = 0
ctx.Data["NumFiles"] = 0
return nil
}
ctx.ServerError("GetCompareInfo", err)
return nil
}
ctx.Data["NumCommits"] = compareInfo.Commits.Len()
ctx.Data["NumFiles"] = compareInfo.NumFiles
return compareInfo
}

@zeripath
Copy link
Contributor

So the problem seems to be that the mergebase is incorrect - it's somehow ending up being the head!

@zeripath
Copy link
Contributor

zeripath commented Mar 19, 2020

This means that v128.go is incorrect.

The trouble is that:

pr.MergeBase, err = git.NewCommand("merge-base", "--", pr.MergedCommitID+"^", gitRefName).RunInDir(repoPath)

Appears to have selected the gitRefName if it's a parent of MergedCommitID instead of searching for the common root for the other parent.

@jamesorlakin
Copy link
Contributor Author

jamesorlakin commented Mar 19, 2020

Would you reckon this is a migration problem? This has been happening on new PRs straight after merging.

I tried running it in debug mode (to use breakpoints) with SQLite in VS Code on my Windows machine but hit issues around needing gcc setup properly. I may try and migrate the existing DB to MySQL before investigating further if I get time tomorrow 😃

@zeripath
Copy link
Contributor

It's probably the same issue if something recalculates the mergebase.

@zeripath
Copy link
Contributor

zeripath commented Mar 19, 2020

We need to calculate the merge base from the other parents of the mergecommitid (if there are multiple parents)

git rev-list --parents -n 1 MERGE_COMMIT_ID

Will give the commit id followed by the parents. Can get the merge base from those.

Damn sorry for this bug the original code seemed to work in testing not sure why it didn't work in practice.

@jamesorlakin
Copy link
Contributor Author

jamesorlakin commented Mar 20, 2020

EDIT: Ignore me. It was the debugger... Printing it out to the console was fine.

@jamesorlakin
Copy link
Contributor Author

Aah I think I've gotcha, so this bug will appear if the merge commit created by a PR has a parent that's also a merge commit? That would explain it not always appearing!

@zeripath
Copy link
Contributor

zeripath commented Mar 20, 2020

@jamesorlakin - ah I see how you're getting the problem of the randomly updating merge base.

We're back to the race between the checker and merger - which is mostly fixed in 1.12

--

Nope that can't happen because we check if it is an ancestor...

@zeripath
Copy link
Contributor

Are you sure you're definitely getting this after merging PRs? Could you check what the contenst of the version table is in your DB?

@jamesorlakin
Copy link
Contributor Author

jamesorlakin commented Mar 23, 2020

This instance (v1.11.3) is currently on DB version 118:
image

The MergeBase of the Pull model ends up being the commit that gitRefName points to:
image
image

It's very strange, and doesn't happen all the time. Some PRs are just fine after merging. I can't seem to figure out any consistent behaviour that would cause this - my theory about merge commits doesn't hold out, as we use a protected branch workflow so every commit the branch points to is a merge commit from a previous PR...

@zeripath
Copy link
Contributor

Have they been marked manually merged?

@jamesorlakin
Copy link
Contributor Author

...Marked as manually merged? We just use the big green Merge Pull Request button - no rebasing or anything fancy... if that helps at all!

@zeripath
Copy link
Contributor

In the database pull_request table the status column do the broken ones have status == 3.

There's a check goroutine that has almost zero locking in 1.11 - I've mostly fixed this in master (1.12) - that goes around checking if PRs have been merged. If the checker starts during the PR merging phase and runs at or around the same as the PR being merged - it can see the commit in the git repository and think that the commit has been merged manually. At which point I could imagine it might incorrectly update the MergeBase.

@jamesorlakin
Copy link
Contributor Author

Aah right, indeed there is a status == 3 on a recent 'bad' one.
image

These two PRs were created and merged today. 556 contained one commit from a branch into dev, and 557 was another PR to merge that into aws which contained a commit and the merge commit from 556. No branches have been deleted at this stage.

@zeripath
Copy link
Contributor

OK that means it is very likely to be the checker.

@jamesorlakin
Copy link
Contributor Author

jamesorlakin commented Mar 23, 2020

Aah okay, would you expect to see a lot of manually merged PRs? (what triggers it?)

For reference I have quite a few!
image

@zeripath
Copy link
Contributor

OK so I think I've worked it out:

for {
select {
case prID := <-pullRequestQueue.Queue():
log.Trace("TestPullRequests[%v]: processing test task", prID)
pullRequestQueue.Remove(prID)
id := com.StrTo(prID).MustInt64()
pr, err := models.GetPullRequestByID(id)
if err != nil {
log.Error("GetPullRequestByID[%s]: %v", prID, err)
continue
} else if pr.HasMerged {
continue
} else if manuallyMerged(pr) {
continue
} else if err = TestPatch(pr); err != nil {
log.Error("testPatch[%d]: %v", pr.ID, err)
pr.Status = models.PullRequestStatusError
if err := pr.UpdateCols("status"); err != nil {
log.Error("Unable to update status of pr %d: %v", pr.ID, err)
}
continue
}
checkAndUpdateStatus(pr)
case <-ctx.Done():
pullRequestQueue.Close()
log.Info("PID: %d Pull Request testing shutdown", os.Getpid())
return
}
}
}

This is the checker from release/v1.11. As I've said before there is a race in this code - it's possible to run this as the PR is being merged or just after it is merged.

Line 203:

} else if err = TestPatch(pr); err != nil {

Calls TestPatch here:

// TestPatch will test whether a simple patch will apply
func TestPatch(pr *models.PullRequest) error {
// Clone base repo.
tmpBasePath, err := createTemporaryRepo(pr)
if err != nil {
log.Error("CreateTemporaryPath: %v", err)
return err
}
defer func() {
if err := models.RemoveTemporaryPath(tmpBasePath); err != nil {
log.Error("Merge: RemoveTemporaryPath: %s", err)
}
}()
gitRepo, err := git.OpenRepository(tmpBasePath)
if err != nil {
return fmt.Errorf("OpenRepository: %v", err)
}
defer gitRepo.Close()
pr.MergeBase, err = git.NewCommand("merge-base", "--", "base", "tracking").RunInDir(tmpBasePath)
if err != nil {
var err2 error
pr.MergeBase, err2 = gitRepo.GetRefCommitID(git.BranchPrefix + "base")
if err2 != nil {
return fmt.Errorf("GetMergeBase: %v and can't find commit ID for base: %v", err, err2)
}
}
pr.MergeBase = strings.TrimSpace(pr.MergeBase)
tmpPatchFile, err := ioutil.TempFile("", "patch")
if err != nil {
log.Error("Unable to create temporary patch file! Error: %v", err)
return fmt.Errorf("Unable to create temporary patch file! Error: %v", err)
}
defer func() {
_ = os.Remove(tmpPatchFile.Name())
}()
if err := gitRepo.GetDiff(pr.MergeBase, "tracking", tmpPatchFile); err != nil {
tmpPatchFile.Close()
log.Error("Unable to get patch file from %s to %s in %s/%s Error: %v", pr.MergeBase, pr.HeadBranch, pr.BaseRepo.MustOwner().Name, pr.BaseRepo.Name, err)
return fmt.Errorf("Unable to get patch file from %s to %s in %s/%s Error: %v", pr.MergeBase, pr.HeadBranch, pr.BaseRepo.MustOwner().Name, pr.BaseRepo.Name, err)
}
stat, err := tmpPatchFile.Stat()
if err != nil {
tmpPatchFile.Close()
return fmt.Errorf("Unable to stat patch file: %v", err)
}
patchPath := tmpPatchFile.Name()
tmpPatchFile.Close()
if stat.Size() == 0 {
log.Debug("PullRequest[%d]: Patch is empty - ignoring", pr.ID)
pr.Status = models.PullRequestStatusMergeable
pr.ConflictedFiles = []string{}
return nil
}
log.Trace("PullRequest[%d].testPatch (patchPath): %s", pr.ID, patchPath)
pr.Status = models.PullRequestStatusChecking
_, err = git.NewCommand("read-tree", "base").RunInDir(tmpBasePath)
if err != nil {
return fmt.Errorf("git read-tree %s: %v", pr.BaseBranch, err)
}
prUnit, err := pr.BaseRepo.GetUnit(models.UnitTypePullRequests)
if err != nil {
return err
}
prConfig := prUnit.PullRequestsConfig()
args := []string{"apply", "--check", "--cached"}
if prConfig.IgnoreWhitespaceConflicts {
args = append(args, "--ignore-whitespace")
}
args = append(args, patchPath)
pr.ConflictedFiles = make([]string, 0, 5)
stderrReader, stderrWriter, err := os.Pipe()
if err != nil {
log.Error("Unable to open stderr pipe: %v", err)
return fmt.Errorf("Unable to open stderr pipe: %v", err)
}
defer func() {
_ = stderrReader.Close()
_ = stderrWriter.Close()
}()
conflict := false
err = git.NewCommand(args...).
RunInDirTimeoutEnvFullPipelineFunc(
nil, -1, tmpBasePath,
nil, stderrWriter, nil,
func(ctx context.Context, cancel context.CancelFunc) {
_ = stderrWriter.Close()
const prefix = "error: patch failed:"
const errorPrefix = "error: "
conflictMap := map[string]bool{}
scanner := bufio.NewScanner(stderrReader)
for scanner.Scan() {
line := scanner.Text()
if strings.HasPrefix(line, prefix) {
conflict = true
filepath := strings.TrimSpace(strings.Split(line[len(prefix):], ":")[0])
conflictMap[filepath] = true
} else if strings.HasPrefix(line, errorPrefix) {
conflict = true
for _, suffix := range patchErrorSuffices {
if strings.HasSuffix(line, suffix) {
filepath := strings.TrimSpace(strings.TrimSuffix(line[len(errorPrefix):], suffix))
if filepath != "" {
conflictMap[filepath] = true
}
break
}
}
}
// only list 10 conflicted files
if len(conflictMap) >= 10 {
break
}
}
if len(conflictMap) > 0 {
pr.ConflictedFiles = make([]string, 0, len(conflictMap))
for key := range conflictMap {
pr.ConflictedFiles = append(pr.ConflictedFiles, key)
}
}
_ = stderrReader.Close()
})
if err != nil {
if conflict {
pr.Status = models.PullRequestStatusConflict
log.Trace("Found %d files conflicted: %v", len(pr.ConflictedFiles), pr.ConflictedFiles)
return nil
}
return fmt.Errorf("git apply --check: %v", err)
}
pr.Status = models.PullRequestStatusMergeable
return nil
}

Which updates the pr.MergeBase - but as I say above this can end up being after the PR is merged.

Finally the checker calls:

checkAndUpdateStatus(pr)

// checkAndUpdateStatus checks if pull request is possible to leaving checking status,
// and set to be either conflict or mergeable.
func checkAndUpdateStatus(pr *models.PullRequest) {
// Status is not changed to conflict means mergeable.
if pr.Status == models.PullRequestStatusChecking {
pr.Status = models.PullRequestStatusMergeable
}
// Make sure there is no waiting test to process before leaving the checking status.
if !pullRequestQueue.Exist(pr.ID) {
if err := pr.UpdateCols("merge_base", "status", "conflicted_files"); err != nil {
log.Error("Update[%d]: %v", pr.ID, err)
}
}
}

which will update the PR without checking if it is merged or not.

The situation is similar on 1.12 but happens less frequently because we're attempting to prevent some of the raceyness.

I think the answer is to change from pr.UpdateCols to:

// UpdateColsIfNotMerged updates specific fields of a pull request if it has not been merged
func (pr *PullRequest) UpdateColsIfNotMerged(cols ...string) error {
	_, err := x.Where("id = ? AND has_merged = ?", pr.ID, false).Cols(cols...).Update(pr)
	return err
}

zeripath added a commit to zeripath/gitea that referenced this issue Mar 31, 2020
Fix go-gitea#10766

Signed-off-by: Andrew Thornton <art27@cantab.net>
zeripath added a commit to zeripath/gitea that referenced this issue Mar 31, 2020
Fix go-gitea#10766

Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath
Copy link
Contributor

would it be possible for any of you to test those PRs? I think that they should fix the problem.

@jamesorlakin
Copy link
Contributor Author

I can try and test the backport by building a new Docker image. I'll try and report back tonight or tomorrow if it's worked. 👍

lafriks added a commit that referenced this issue Apr 1, 2020
Fix #10766

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
lafriks pushed a commit that referenced this issue Apr 1, 2020
* Only update merge_base if not already merged

Fix #10766

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Prevent race in transfer pull request

* Update services/pull/pull.go
@jamesorlakin
Copy link
Contributor Author

7 PRs down and I've not seen a zeroed one yet. Thanks Zeripath!

@zeripath
Copy link
Contributor

zeripath commented Apr 1, 2020

Still sorry that I broke it. The broken PRs will be fixed when the migration in #10786 runs - but you might be able to backport the migration manually or fix the broken PRs in the db.

In case you're interested. The correct git command for finding the merge base is: (With $MERGE_COMMIT_ID and $PR_ID as appropriate)

git merge-base $(git rev-list --parents -n 1 $MERGE_COMMIT_ID) refs/pull/$PR_ID/head

@jamesorlakin
Copy link
Contributor Author

jamesorlakin commented Apr 2, 2020

There's nothing to be sorry about, we all make small mistakes here and there. I'm very appreciative of all the work done in an open source project - voluntary work is very respectable and I can see it being a bit of a thankless job at times. Where our team doesn't pay for Gitea (and I'm unfortunately not somebody with financial power to sponsor/donate) I can't expect a support team dedicated to fix things for me 🙂

That said I try and 'pay my debts' to the community that's helped me, so making PRs and generally improving Gitea is in my interest too. Win win!

Oh, and you've also taught me about the merge-base command and some of Gitea's pull logic. ❤️

@tanrui8765
Copy link

Hello friends, I found this problem seems like still happens in Gitea v1.11.4

My environment is:
- Gitea version: v1.11.4
- Git version: 2.23.0.windows.1
- Operating system: Windows Server 2008 R2 64bit
- Database: MySQL

A recent pull request:
image

An old pull request:
image

@6543
Copy link
Member

6543 commented Apr 17, 2020

you have to manualy fix them - a gitea doctor task for this will be in 1.11.5

@6543
Copy link
Member

6543 commented Apr 17, 2020

@tanrui8765 #10991
"Add non-default recalculate merge bases check/fixer to doctor"

@tanrui8765
Copy link

ok, got it, thanks~~ @6543

@tanrui8765
Copy link

tanrui8765 commented May 11, 2020

@6543

Hello friend, I deployed the gitea version V1.11.5, and tried to use gitea doctor task to check this problem, the results was OK, but the problem still exist on our gitea repo.

Could you please tell me what to do to manually fix these problems? I'm so confused...
Many Thanks

image

image

image

@6543
Copy link
Member

6543 commented May 11, 2020

./gitea doctor --all --fix @tanrui8765

@tanrui8765
Copy link

Thank you very much @6543, it worked~

@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants