Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

limit of 600 PR ? #10

Open
pborreli opened this issue Mar 2, 2017 · 11 comments
Open

limit of 600 PR ? #10

pborreli opened this issue Mar 2, 2017 · 11 comments

Comments

@pborreli
Copy link

pborreli commented Mar 2, 2017

https://showmyprs.com/user/pborreli indicates Total pull requests sent: 600

It should be ~1204 instead

And it says : Total contributed repositories: 459

It should be >800

@dmitshur
Copy link

dmitshur commented Mar 2, 2017

https://showmyprs.com/user/shurcooL says 722 PRs, so that means 600 cannot be an explicit limit. There may be another issue causing not all PRs to be displayed though.

Where did you get the 1204 number from?

@pborreli
Copy link
Author

pborreli commented Mar 2, 2017

from https://github.com/pulls

image

@dmitshur
Copy link

dmitshur commented Mar 2, 2017

Got it. For me, it lines up pretty well. GitHub shows 7 open + 718 closed PRs created, which adds up to 725. showmyprs gives me 722, so there is a discrepancy of 3 PRs that I can't explain.

image

@dmitshur
Copy link

dmitshur commented Mar 2, 2017

Here's an idea, are any of your PRs made in private repositories? showmyprs wouldn't be able to show those, since it uses public data only.

@dmitshur
Copy link

dmitshur commented Mar 2, 2017

Yeah, I think that's exactly it. If I select "Public repositories only" under Repository visibility" filter, then GitHub shows me exactly 722 PRs in total (3 are in private repos), which matches my showmyprs count perfectly.

image

Can you try the same and see what you get?

@pborreli
Copy link
Author

pborreli commented Mar 2, 2017

I have only 4 PR in private repos

image

@dmitshur
Copy link

dmitshur commented Mar 2, 2017

I see, then there's definitely some sort of issue in showmyprs.

I can confirm that https://github.com/search?q=type%3Apr+author%3Apborreli+is%3Apublic&type=Issues finds 1210 public PRs for your username:

image

I'll leave this to @karanjthakkar to investigate further.

It seems the issue should be somewhere in this code:

showmyprs.com/main.go

Lines 87 to 181 in a651f26

// Search options to override the default 30 and fetch max 100 per page
opt := &github.SearchOptions{
ListOptions: github.ListOptions{
PerPage: 100,
},
}
var allRepos []Repo
var allPrs []PullRequest
// Continuously fetch all PR's
for {
prs, resp, _ := client.Search.Issues(
// Search query to find PR's
fmt.Sprintf("type:pr author:%s is:public", username),
opt,
)
// Iterate over all closed pull requests to
// see which of them is merged and which one isn't
// Also for each PR we are going to fetch the actual
// repo's stats such as stars, pr's etc.
for _, githubPrObject := range prs.Issues {
parsedPrUrl := parseGithubPrUrl(*githubPrObject.HTMLURL)
// Get stats
repoUrl := strings.Join(
[]string{
"https://api.github.com",
parsedPrUrl.Owner,
parsedPrUrl.Repo,
},
"/",
)
// Cache repo stats and only make calls for new ones
var repo Repo
if _, isCached := CachedRepo[repoUrl]; isCached == false {
repoData, _, _ := client.Repositories.Get(parsedPrUrl.Owner, parsedPrUrl.Repo)
repo = Repo{
Stars: *repoData.StargazersCount,
Forks: *repoData.ForksCount,
Name: *repoData.FullName,
Url: *repoData.HTMLURL,
PullRequests: []PullRequest{},
}
CachedRepo[repoUrl] = repo
} else {
repo = CachedRepo[repoUrl]
}
// Get merged status
if *githubPrObject.State == "closed" {
isPrMerged, _, _ := client.PullRequests.IsMerged(
parsedPrUrl.Owner,
parsedPrUrl.Repo,
parsedPrUrl.Number,
)
if isPrMerged {
*githubPrObject.State = "merged"
}
}
pr := PullRequest{
Url: *githubPrObject.HTMLURL,
Title: *githubPrObject.Title,
State: *githubPrObject.State,
RepoUrl: repoUrl,
}
allPrs = append(allPrs, pr)
}
if resp.NextPage == 0 {
break
}
opt.ListOptions.Page = resp.NextPage
}
allRepos = groupByRepo(allPrs, CachedRepo)
currentTime := time.Now()
data := Response{
LastSyncedAt: currentTime.Unix(),
LastSyncedAtString: getTimeAgo(currentTime.Unix()),
LastSyncedAtStringVerbose: currentTime.Format("Jan 2, 2006 at 3:04pm (MST)"),
Username: username,
TotalRepos: len(allRepos),
TotalPrs: len(allPrs),
AllRepos: allRepos,
}

However, I'm not spotting any issues there just by looking at it. TotalPrs is set to len(allPrs), and allPrs should correctly contain all PRs (I don't see any reasons why it wouldn't).

It's also possible some network connectivity issue caused the for loop to break out early (because NextPage would be 0 if there was an error getting an actual response).

@pborreli
Copy link
Author

pborreli commented Mar 2, 2017

thanks @shurcooL for the help 💯👍

@karanjthakkar
Copy link
Owner

@pborreli Sorry for getting back to you so late about this. I have been travelling thee past few days.

  1. The probable reason why the first time you were shown a total of only 600, was because I'm using my personal access token for Github PR search and it has a limit of 30 for a window of 15 minutes with a max per page results of 100. So it is possible that when I started fetching list of PR's for your username, I had already consumed 24 searches for that window.

  2. Having refreshed your profile on my end, I found that it is now showing a total of 1000 PR's sent. This is a limit on Github's end. Since you have a total of 1200 something PR's I have to search 13 times with different page numbers. On the 11th page, Github says: Only the first 1000 search results are available

In case 1, I am planning to fix by giving the user an option to login with their account so that I can use their access token for it. This should fix the incomplete PR count issue.

In case 2, however, there is no way for me to bypass the 1000 search results hard limit set by Github. I will investigate more on whether there are other ways to fetch your PR's without querying the search API.

Sorry if I can't fix this sooner but I hope this gives you more clarity on what the issue is.

cc: @shurcooL Just FYI :) Thanks for chipping in before and helping investigate this with @pborreli 💯

@pborreli
Copy link
Author

pborreli commented Mar 6, 2017

wow thanks for the investigation,

I would log without problem is the scope is readonly and public,

Maybe the solution without the need of login would be the search the api looping on year ? like in this answer ?

@karanjthakkar
Copy link
Owner

@pborreli Thanks for searching for a workaround this issue. I think this is doable. I might need to refactor some logic of recursively fetching newer search pages. I'll add this to my list of todos. Alternatively, if you would like to take a shot at a PR for this, I'd be more than happy to help you out with that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants