Pervasive lag issue with label/milestone changes in issues and PRs #78
Comments
Looking into that. |
At first glance all seems OK.
I need to check them manually to see what is happening. |
First of them: kubernetes/kubernetes#52444 So the question is: |
This is the final issues list (links). |
The second one has v1.9 milestone and is closed now, but it was closed |
But the third one had v1.9 milestone that was later removed by bot: This may be the bug. |
This one is very interesting. |
The other (potential) issue can be:
Detecting removed labels is handled here |
In the first case I don't see any milestone removal in the GitHub UI, but the milestone was indeed removed; the database contains the full history:
|
And this is the case when Issue had SIG label, but it was removed before |
Not good. I see that when the k8s robot removes the milestone, the event is recorded with the milestone not yet removed. And the next event is one month later, and this is after |
I need to go really deep: I'll download and save this event's JSON and see what data I can get from GitHub, because the GitHub UI shows "k8s-merge-robot removed milestone v1.9", but the GHA database event is recorded with that milestone present, and the next event happens one month later (and that one has no milestone). |
The JSON does have the milestone in the "remove milestone" event. I'll try this approach now (in addition to the standard milestone detection, which detects a removed milestone, but only on the NEXT event, not on the removing event itself). |
The problem is that for every event that modifies the milestone - we only have current milestone, not the new
This is probably why it now shows 36.. struggling more. |
I will try a really crazy approach: finding milestones by always using the next event on the same issue (if present). |
Seems like this trick may work. |
I think this is OK now, see on the test server |
Prod also updated, I think this is very close to what we need, but due to special
Let me know what you think @jberkus |
Damn. Ok, that's pretty problematic. Have you filed a bug with Github? |
So, as I understand it, issues which were taken out of the milestone won't be removed from the count until another event happens to that issue? And the same with PRs, correct? In the future milestone automation will de-facto remove this issue, but it would be nice if github fixed it. |
Question: do other, manual changes to labels generate events? Or only comments/open/close? |
I've used a trick that uses the next event. If there is no next event, I fall back to the current event.
The problem is that we only have the "old milestone", while we should have two fields (both nullable): Anyway, my trick makes it work quite well atm imho. |
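The next-event trick described above can be sketched roughly as follows. This is only an illustration in Python: the `Event` shape, its field names, and the `effective_milestones` function are assumptions made for the example, not the actual DevStats schema or code.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Event(NamedTuple):
    # Hypothetical event shape; not the real DevStats/GHA schema.
    issue_id: int
    created_at: str           # ISO 8601 timestamp, sortable as text
    milestone: Optional[str]  # milestone snapshot carried by the event payload


def effective_milestones(events):
    """Map (issue_id, created_at) to the milestone read from the NEXT event
    on the same issue, falling back to the event's own snapshot when no
    later event exists."""
    by_issue = defaultdict(list)
    for ev in events:
        by_issue[ev.issue_id].append(ev)

    result = {}
    for issue_events in by_issue.values():
        issue_events.sort(key=lambda e: e.created_at)
        for i, ev in enumerate(issue_events):
            if i + 1 < len(issue_events):
                source = issue_events[i + 1]  # state after this event
            else:
                source = ev                   # fallback: current event
            result[(ev.issue_id, ev.created_at)] = source.milestone
    return result


events = [
    Event(1, "2017-10-01T00:00:00Z", "v1.9"),
    # The "removed milestone" event still carries the old milestone.
    Event(1, "2017-10-05T00:00:00Z", "v1.9"),
    Event(1, "2017-11-05T00:00:00Z", None),
]
milestones = effective_milestones(events)
```

In this sketch the removal becomes visible on the removing event itself, but only because a later event exists. An issue with no later event keeps its old milestone, which is the lag discussed in this thread.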
Right, what I'm saying is that the removal of the milestone doesn't, by itself, generate an event, correct? So my question is: does the addition or removal of labels generate an event on its own? BTW, checking issue burndown records, this means that issue counts are about 10-15% higher than history, and trail a day or two behind, which we'll want to note in the eventual documentation. |
There is no separate event like
This makes me wonder what if:
I'll check this tomorrow and report here. |
I will open myself (and will check if that is true tomorrow).
Seems like we have a problem here. |
@jberkus what do you think about this: I think a nice "workaround" would also be: k8s-*-bot creates an additional comment after changing/updating a milestone/label, something like "Note: milestone updated to abc", or "Note: label xyz removed" |
Changed from |
@jberkus any updates on this on the K8s side? @dankohn @jberkus what do you think about spending a few days researching a new data source: GitHub API (in addition to the already existing GHA & git)?
@dankohn can I investigate this task? |
Sure, but it seems easier to modify the mungebot to change labels in a way
that records events in a way we can deal with.
…--
Dan Kohn <dan@linuxfoundation.org>
Executive Director, Cloud Native Computing Foundation https://www.cncf.io
+1-415-233-1000 https://www.dankohn.com
On Wed, Mar 21, 2018 at 4:32 PM, Łukasz Gryglicki wrote:
@jberkus any updates on this on the K8s side?
@dankohn @jberkus what do you think about spending a few days researching a new data source: GitHub API (in addition to the already existing GHA & git)?
- I think I can write yet another data source that will periodically query GitHub using the API, just to get the current label state of issues/PRs (that would eliminate the need for another GHA event to happen after somebody adds a label from the GitHub UI).
- This would certainly work for *current* issues/PRs. I don't know if it is possible to query past state using the GitHub API; I think it is not possible, but I can double-check.
- This separate data source would have to run in a separate process, because it can block when we're out of GitHub API points.
- We're already querying the GitHub API to get new release tags (annotations/releases), but this uses very few GitHub API points. It runs every hour and uses just a few points out of the 5000 available.
- So the new "labels" state API calls should always happen after the annotations part, because they can potentially run out of API points.
- If we go this way, we may want to think again about the GitHub OAuth token that is used by DevStats. Currently it uses my private GitHub OAuth token.
@dankohn can I investigate this task?
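As a rough illustration of the ordering and budget concern in the list above, here is a small Python sketch. The function name, the idea of a fixed reserve for the annotations step, and the one-point-per-issue cost are all assumptions for the example; only the 5000-point hourly limit comes from the discussion.

```python
def issues_to_sync(remaining_points, reserve_points, pending_issues,
                   points_per_issue=1):
    """Return how many issues the label-sync step may poll in this run
    without dipping into the points reserved for the annotations step.

    All parameters are hypothetical knobs for this sketch.
    """
    if points_per_issue <= 0:
        raise ValueError("points_per_issue must be positive")
    # Points left over once the reserve is set aside (never negative).
    budget = max(0, remaining_points - reserve_points)
    return min(pending_issues, budget // points_per_issue)


# Plenty of points left: every pending issue can be synced.
full_run = issues_to_sync(5000, 100, 200)
# Nearly out of points: skip the sync rather than block the other step.
starved_run = issues_to_sync(50, 100, 200)
```

Running the sync last, and capping it like this, is one way to keep it from blocking the release-tag queries when points run low.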
|
Yes, I've already suggested that. |
mungebot is open source. They will accept pull requests if it doesn't slow
anyone there down. Please research that as well.
On Wed, Mar 21, 2018 at 4:43 PM, Łukasz Gryglicki wrote:
Yes, I've already suggested that.
But I think it won't be that easy to change the k8s process to help DevStats.
DevStats is the tool to help K8s, not the opposite :p
OK, I'll do the research then and see what I can do without touching the current k8s workflow.
|
OK will check this too. |
@lukaszgryglicki there's two issues with using the API:
Frankly, I think the best next step is to talk to someone at Github. |
We have a good relationship with GitHub and can ask for more API tokens.
But could we please investigate first whether a small change to Mungegithub
would provide all the data DevStats needs to avoid using the API. The API
will always be more brittle than GitHub Archives data.
Lukasz, can you state again the state that an issue can get into which is
unknowable? I'd like to understand if we could just write a munge plugin
that looks for that state and corrects it.
On Fri, Mar 23, 2018 at 10:53 PM, Josh Berkus wrote:
@lukaszgryglicki there's two issues with using the API:
1. Kubernetes is constantly running out of API "tokens", so anything that requires a lot of additional API calls is just out.
2. I checked API data, and in the API it's also true that issues/PRs that have only had labels or milestones changed do not show up as "updated" either. So we'd be in a position of polling all the issues/PRs in some way, which is a LOT of API calls.
Frankly, I think the best next step is to talk to someone at Github.
|
@dankohn it's not the technical difficulty, which is negligible. It's that any method which involves increasing github notification traffic just to support devstats is a total nonstarter. |
I agree, but I'm trying to understand if the problem occurs in regular
workflow or is a corner case.
|
I almost have a working solution. It is a quite fast and straightforward process: I've added the 'ghapi2db' tool to support that, and I only need to comment it. I can give you a working solution tomorrow, without touching mungegithub at all, and it will need about 200 API points/hour, which is a lot less than 5000. mungegithub often adds or removes labels as a last operation, just after creating a comment, so this situation happens often IMHO. |
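The thread does not show how `ghapi2db` works internally, so the following Python sketch is only one plausible shape for such a tool, under stated assumptions. It compares a snapshot of current issue state (as an API poll might return it) with the last state known from recorded events, and reports the issues that have drifted. The dictionary layout and the `find_stale_issues` name are hypothetical.

```python
def find_stale_issues(api_state, db_state):
    """Return sorted issue numbers whose labels or milestone in the API
    snapshot differ from the last state recorded from events.

    Both arguments map issue number -> {"milestone": str or None,
    "labels": list of str}. This layout is an assumption for the sketch.
    """
    stale = []
    for number, current in api_state.items():
        recorded = db_state.get(number)
        if recorded is None:
            continue  # never seen in events; nothing to compare against
        if (current["milestone"] != recorded["milestone"]
                or set(current["labels"]) != set(recorded["labels"])):
            stale.append(number)
    return sorted(stale)


api_state = {
    10: {"milestone": None, "labels": ["sig/node"]},    # milestone removed
    11: {"milestone": "v1.9", "labels": ["sig/api"]},   # unchanged
    12: {"milestone": "v1.9", "labels": []},            # label removed
}
db_state = {
    10: {"milestone": "v1.9", "labels": ["sig/node"]},
    11: {"milestone": "v1.9", "labels": ["sig/api"]},
    12: {"milestone": "v1.9", "labels": ["sig/cli"]},
}
stale = find_stale_issues(api_state, db_state)
```

Each issue flagged this way is one where a label or milestone change happened with no later event to reveal it, so a tool could then record the corrected state for it.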
Actually I've just connected |
Seems like all is working OK, so data quality will increase all the time, starting from yesterday. |
Wow, great work, @lukaszgryglicki |
And we don't need to touch mungebot. |
Now, I do think it's worth talking about having prow write a log of its actions that devstats can access. That would give us the data WITHOUT adding to the github notification burden. |
No problem for me anymore. I already have a tool that gets the info it needs. |
Final version is on the test server, here: https://k8s.cncftest.io/d/22/open-issues-prs-by-milestone?orgId=1
The same problem with detecting closed & reopened issues/PRs (and performance issues too) also happens for:
Now I'll work on the remaining dashboards - all on the test now. |
All problems described above are now fixed and gone. |
This looks good to me, I've thrown it in #devstats to see if I can get more eyeballs on it. OK if you want to wait for a day just so more people can look for obvious glitches. |
Currently we don't have any data newer than 2018-04-02 14:00 UTC, due to GitHub archives outage: https://github.com/cncf/devstats/issues/91 |
Outage fixed on the GHA side, DevStats has all the data again. |
No longer blocked, now just need to confirm that it works ok. |
I'm closing this, please reopen if you find lag/bug. |
Lukasz,
If you look here:
https://k8s.devstats.cncf.io/d/IIUa5kezk/open-issues-prs-by-milestone?orgId=1&from=1509407831268&to=1511830631269&var-sig_name=All&var-sig=all&var-milestone_name=v1.9&var-milestone=v1_9&var-repo_name=kubernetes%2Fkubernetes&var-repo=kubernetes_kubernetes&var-full_name=Kubernetes
... it says that as of Nov 27, we had 90ish open issues against 1.9. However, if you look at the burndown report, we actually had 28 issues open on that date.
What's the reason for the extremely different counts?