Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Handling on multiple commits #7576

Closed
Daltz333 opened this issue Oct 19, 2020 · 10 comments
Closed

Improve Handling on multiple commits #7576

Daltz333 opened this issue Oct 19, 2020 · 10 comments
Labels
Accepted Accepted issue on our roadmap Needed: replication Bug replication is required Support Support question

Comments

@Daltz333
Copy link

Details

Expected Result

When merging multiple PRs in a short time-spam, ReadTheDocs often freaks out and gets stuck on the "triggered" page, refusing to build anything for a long period of time (at the time of writing, 35 minutes+)

image

Actual Result

Cleanly cancel the old builds, and build the new one without breaking.

@stsewd stsewd added the Feature New feature label Oct 19, 2020
@Daltz333
Copy link
Author

Additionally force-pushes massively break PRs, and they are incredibly common due to rebase merges. I'd consider this a bug and not a feature request.

@humitos
Copy link
Member

humitos commented Oct 20, 2020

I'm not sure exactly what happened in this case. We should probably need more debugging on this to find out a solution. However, I can tell that the build you linked (https://readthedocs.org/projects/frc-docs/builds/12136540/) has been "killed" because we didn't get back an answer regarding its state (#3312). This usually happens when the build fails for some reason that we weren't able to catch when it happened.

Cleanly cancel the old builds, and build the new one without breaking.

I think the issue was marked as a feature request because of this. We already attempt to cancel a running build and we found out that it wasn't simple, #7031

Unfortunately, I don't have a good answer/workaround to give at the moment.

@humitos humitos added Accepted Accepted issue on our roadmap Needed: replication Bug replication is required Support Support question and removed Feature New feature labels Oct 20, 2020
@humitos
Copy link
Member

humitos commented Jan 27, 2022

I think this problem was already solved when we implemented "duplicated build error" that are immediately canceled. Have you experimented this problem recently, @Daltz333 ?

@humitos humitos added the Needed: more information A reply from issue author is required label Jan 27, 2022
@Daltz333
Copy link
Author

Yes. Even just recently.

@no-response no-response bot removed the Needed: more information A reply from issue author is required label Jan 27, 2022
@Daltz333
Copy link
Author

Daltz333 commented Feb 2, 2022

I pushed multiple commits to multiple PRs. What tends to happen, still, is that one PR will get started, and the rest will be "queued" saying there is 4 already building (even if there is not).

In the below screenshot, you can see one build failed (it says due to inactivity, but that's a lie) and two are currently queued. There is 2 possible outcomes that will happen, eventually RTD will figure it out and things will be built again, or these two builds will fail to "inactivity".

It seems like the check that happens every 5 minutes to determine concurrent builds is invalid or incorrect in some way.

Here is a link to the below failed build, https://readthedocs.org/projects/frc-docs/builds/15963261/. I would like to say this is a rare occurrence, but realistically it is uncommon at best. I do think this is something to do with the logic behind detecting the concurrent builds not be correct (and thus causing builds to infinitely think things are being built when they are not)

image

@humitos
Copy link
Member

humitos commented Feb 3, 2022

I plan to come back to this issue in the near future and try to fix it. Your description is pretty good, but I'd like to have a programmatic way to reproduce it, so I can create a test case and execute it locally over and over again until I find exactly where is our problem.

Do you know the exact git commands that I would need to execute starting from an empty/new repository to reproduce this behavior? I think I'm probably asking too much to you, but since you understand better than myself this issue at this point I didn't want to miss the opportunity of asking this to you 😄

@Daltz333
Copy link
Author

Daltz333 commented Feb 3, 2022

I'll try to set up a minimal test repo and see if I can reliably trigger it.

@Daltz333
Copy link
Author

Daltz333 commented Feb 3, 2022

To be fair on RTD, our documentation project is probably one of the most complex ones out there. We hit 25 minutes of build time building HTML and PDF. We have multiple custom extensions and are adding more, so we are incredibly grateful for the RTD platform to provide this to us. We are working on giving back to the Sphinx and RTD ecosystem by developing several extensions that enhance documentation that are already used by hundreds of projects:

  • dynamic responsive images
  • turning sphinx websites into offline PWA (this may have a bit more intensive impact on RTD servers, but we are investigating multiple caching strategies to minimize this)
  • opengraph metadata
  • dynamic translation metadata for search engines
  • extensive rediraffe redirect extension
  • and a lot more.

So again, thank everyone at RTD for making this possible!

@humitos
Copy link
Member

humitos commented Mar 2, 2022

In #8850 we implemented a way to cancel a build. Besides, in #8866 we plan to cancel old builds automatically when a new push is done to the same branch.

With the ability to manually cancel a build I think this issue is partially solved and it will be fully solved once automatic canceling is implemented. Please, let me know if the current solution is enough for you.

@humitos humitos added the Needed: more information A reply from issue author is required label Mar 2, 2022
@Daltz333
Copy link
Author

Daltz333 commented Mar 2, 2022

That sounds good to me!

@no-response no-response bot removed the Needed: more information A reply from issue author is required label Mar 2, 2022
@humitos humitos closed this as completed Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Needed: replication Bug replication is required Support Support question
Projects
None yet
Development

No branches or pull requests

3 participants