-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ACTION REQUIRED: workflow change for merging changes to nodejs/node #2598
Comments
👍 |
Do these merges add the committer line of whoever performed the merge? |
I was told they will, and they need to for us to use it. |
Yes, the information is taken from your full name and email address as set in Jenkins. I will update the wiki to reflect that. |
What about documentation-only changes? I believe we don't yet have documentation-related tests, do we? |
To ensure you committer info is set correctly, go to https://jenkins-iojs.nodesource.com/, login, click on your username on the top-right, then on the left click on 'Configure'. Double-check 'Full Name' and 'E-mail address'. |
Use node-accept-pull-request, and you can set NODES_SUBSET to |
Doesn't work for me. I am logged in and there is nothing like «Build with parameters». Can I have a screenshot? I remeber it being ok in |
No, it's definitely not there. |
@ChALkeR did you go to https://jenkins-iojs.nodesource.com/job/node-accept-pull-request/ first ? |
@orangemocha Yes, of course. |
@orangemocha also, it still points by default to the joyent org, not to nodejs. :) |
@Fishrock123 there are dropdowns, in node-accept-pull-request. Maybe you are looking at node-merge-commit? |
@Fishrock123 I will update the defaults from joyent/node in the 2-hour window between the move of the joyent/node repo and this workflow change 😄 |
@ChALkeR : do you see "Build with Parameters" for other jobs, like node-test-pull-request ? |
@orangemocha oops, you're right. :) |
The steps required for small corrections like modifying the commit message or squashing commits seem unwieldly. I feel like this is taking quite a bit flexibilty away. How about introducing an alternative CLI-only workflow like this:
One could do multiple target branches for different subsets of tests to run. A way of feedback to the user could be a bot that posts CI results on the issue through the GitHub API. As a die-hard CLI user, I sure would appreciate something like this. |
@ChALkeR : if you are still having problems with the missing "Build with Parameters", I would suggest trying the following in order:
|
@silverwind : I am definitely not against additional tooling to make things even easier. We can work together on this if you want, or we could even let you mess with Jenkins if you are interested. Having said that.. are the steps for modifying the commit message or squashing commits really more complicated under this workflow? For doing that, you'd still have to pull the changes locally and make your edits (I simply described a possible procedure in the wiki), and finally push a branch to GitHub (which in the pre-Jenkins way could be the final branch). The only additional steps are now a) filling the form to kick off node-accept-pull-request, and b) deleting the branch after the fact. The time it takes for filling the form should be compensated by the ease of filling in reviewers with the drop-downs. And we could add a checkbox to node-merge-commit to have it delete the temporary source branch after the merge. |
@orangemocha so just to be clear, I can already use this for nodejs/node, right now? |
@Fishrock123 : technically yes, but if anybody lands another pull request manually in the middle of your run, it will break it. |
But nothing fatal that another run can't fix? |
@orangemocha I'm a bit confused. Do you still have to start off What I like to see is an alternative entry point to (Also, I'm not really motivated to learn Jenkins internals, so I think I'll decline your offer 😉) |
@Fishrock123 : correct. You can try it if you want. |
@orangemocha https://jenkins-iojs.nodesource.com/job/node-test-commit-arm/380/ I think the Pi1s aren't depending on the other task correctly? |
@orangemocha On last Friday I landed two commits. You can find each of those original commits listed in their respective PRs. Later that day @Fishrock123 landed another patch using Jenkins. For some reason the two patches I had landed earlier were left stranded. This lead me to the assumption that Jenkins was responsible. |
Folks, thank you for your patience with this. The last couple days definitely didn't go as well as I had hoped. There were a lot of random test failures all over the place, preventing people from landing PRs. There have been so many unexpected ones, that it makes me wonder whether those failures could be caused by an underlying bug in node itself. We have been discussing this today within the build WG and the sane thing to do would seem to halt using the automated merge temporarily, until we can get things to a stable place. Hence I am re-opening this issue to elicit people's feedback. Right now we are operating under this plan:
/cc @nodejs/collaborators @nodejs/tsc |
I am worried we are going to have a high potential of starting to ignoring legitimate failures due to current hardware issues. A while ago (1-2 months?) we were having all green runs. What happened to that? |
In order to win some time, would it be possible to automatically cancel a merge job as soon as one of the test jobs fails ? |
@targos : yes, no harm in doing that if you don't need complete results from the run. |
That would be a tough question to answer precisely. The high level guess is that due the fast pace of development the quality is slowly drifting down (no offense to anyone, I hope). It does highlight the importance of having a system like this in place. The other side is that some of these can be due to faulty hardware or bad configuration of the slaves. We do check for that, and while this is true in some cases, there are still a majority of flaky failures that cannot be clearly attributed to issues with the slaves. |
We are already making those calls when we say "not related to this change" and try again (or land it manually). Once we make that decision and move forward with the PR, the flakiness is now in master. There is little harm in marking those tests as flaky, and at least we are opening GitHub issues to track the problems and follow up. This might be an argument against nodejs/build#182 though. |
Ok this is just incorrect. They were failing BEFORE THAT. And we cleaned it up besides what was still in our issue tracker. Some of these failures have only manifested in the last few days. I've been keeping an eye on the CI since the beginning of io.js.
Anything with failures relating to reset connections or port binding is a) relatively new, and b) appeared around some CI changes about 2 or so months ago. Also now the
We can also parse if the output is or isn't actually flaky. |
Also for clarification, I am quite sure we have had green runs for over a week, at a couple intervals, on master. (This was after the point where we were accidentally ignoring windows failures.) |
I doubt that the failures are related to the shift from iojs+any+pr+multi to node-test-pull-request. They differ in the arguments they take and the way they fetch things, but internally they are just calling the test runner in the same way. I have to agree that this started in the last few days. Before proceeding with opening this issue, we had several test runs with node-accept-pull-request where things were stably yellow. That means that there might be a real bug in node. If someone has time to start investigating those flaky tests locally, it would be helpful. |
OK, after chasing after failures for the last few days, I am convinced that we need to suspend this experiment. 😞 This level of flakiness in the tests is unprecedented, and I believe it started very recently (this week). While there are occasional failures due to misconfigured machines, those are easy to spot. The vast majority of failures seem due reasons outside of the Jenkins/CI realm, and seem to indicate a real problem with the state of the master branch. The recent libuv upgrade (a161594) is high on the list of suspects. What we can do is run some test jobs on recent commits and try to bisect it that way. If you have a chance to investigate some of those failures locally, that would be helpful too. We can resume this experiment after we have found the underlying cause of all this instability. I guess that at this point, we'll also take the time to fix a few other high priority issues with the CI itself (like making the build faster), and restart only when it looks really solid. Sorry for the headaches that this has caused. /cc @nodejs/collaborators @nodejs/tsc |
These flaky tests have been plaguing us a lot longer than a week, and the failures were almost never reproduceable locally, so I think we indeed should take a close look on the CI itself. Are other projects using Jenkins also experiencing this? Maybe we're doing something fundamentally wrong? |
Jenkins is just running a bunch of scripts on a variety of machines. Sure, it has its own flukes, but I don't see how it could be causing some of the issues that we have seen lately. Maybe to get a better shot at reproducing locally, we should try running make run-ci. The top failing platforms seems to be armv7-wheezy, RPis, and centos5. |
Wild guess, probably wrong, but hey, that's what the Internet is for, so here you go: Didn't we move a test or three from |
That's not to be excluded @Trott . I already started a few runs to try and bisect commits in CI, so that if there was one offending commit, we can find it. |
Also, I just checked PRs that landed in v0.12 recently. About 10 PRs landed in the last couple of weeks using node-accept-pull-request from the new Jenkins, using the same exact Jenkins infra, except they don't run on ARM. No records of people having to retry runs. This also seems to confirm that the problem is in the current master branch. |
Anything that deals with Related: I saw two unavailable ports after like 50 test suite runs on my OS X 10.10.5 machine. None on my remote Ubuntu 15 testing box. (It did has a weird fs |
@orangemocha what do you think about delaying the accept-pullrequest project? It looks like it is blocking us from making progress at the moment, and considering the upcoming release - it is very troublesome. We can always return to running the experiments with this at some later point, when things will be more stable. |
@indutny yes see above #2598 (comment) |
@orangemocha argh, right! sorry, I didn't get it. |
Closing for now. I will open a new issue when we have addressed the stability and performance concerns. |
Based on previous discussion, feedback, and the approval by the TSC (#2434), we are adopting a new workflow for merging pull request in nodejs/node, and making any changes to the code branches at nodejs/node in general. The new workflow needs to be adopted by all @nodejs/collaborators for all changes to be merged to nodejs/node, starting this coming Monday at 2pm UTC: http://www.timeanddate.com/worldclock/fixedtime.html?msg=Start+merging+changes+to+nodejs%2Fnode+with+Jenkins&iso=20150831T14&p1=1440
From that time on, please refrain from pushing changes to nodejs/node manually, and instead use the workflow documented at https://github.com/nodejs/node/wiki/Merging-pull-requests-with-Jenkins. I also added a wiki page specifically on how to deal with flaky tests.
The workflow for testing (but not landing) pull requests with Jenkins is unchanged, and now documented here: https://github.com/nodejs/node/wiki/Testing-pull-requests-with-Jenkins
I'll be monitoring Jenkins to make sure things keep working smoothly. Let me and @nodejs/jenkins-admins know if you encounter any issues with the CI infrastructure or to share your feedback. Thank you!
The text was updated successfully, but these errors were encountered: