Option to calculate the number of actions in execution phase #3582

Open
rahul-malik opened this issue Aug 18, 2017 · 17 comments
Labels
not stale, P3, team-Performance, type: feature request

Comments

@rahul-malik
Contributor

Description of the problem / feature request / question:

Feature Request: Add an option to calculate the number of actions that will be performed in the execution phase.

Problem: The progress indication from Bazel does not give you a sense of the current progress because the number of actions typically increases throughout the execution phase.

It would be helpful to know how many actions will be performed so we can develop a better progress UI/UX, one that is less confusing for developers who work on a code base built with Bazel but are not familiar with its internals.

If possible, provide a minimal example to reproduce the problem:

Build any large project.

Environment info

  • Operating System:
    macOS

  • Bazel version (output of bazel info release):
    0.5.3

Have you found anything relevant by searching the web?

(e.g. StackOverflow answers,
GitHub issues,
email threads on the bazel-discuss Google group)
No

Anything else, information or logs or outputs that would be helpful?

(If they are large, please upload as attachment or provide link).

@ittaiz
Member

ittaiz commented Aug 18, 2017 via email

@softprops

softprops commented Aug 18, 2017 via email

@iirina added the type: feature request and P3 labels Aug 18, 2017
@philwo
Member

philwo commented Sep 12, 2017

We could probably just change it to report a percentage, if that's less confusing? The "current actions / total actions" should be monotonically increasing already.

@rahul-malik
Contributor Author

We've tried converting it to a percentage, but it still seems strange. In my experience, the completed count nearly reaches the total before the total increases, which gives the false impression that the build is almost complete.
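
To make the arithmetic concrete, here is a minimal sketch of what a percentage looks like when the denominator keeps growing. The `(completed, total_known)` snapshots are made-up numbers for illustration, not output from a real Bazel build, and `percentages` is a hypothetical helper.

```python
def percentages(snapshots):
    """Turn (completed, total_known) pairs into whole-number percentages."""
    return [round(100 * done / total) for done, total in snapshots]


# Hypothetical progress readings: the known total doubles partway through.
SNAPSHOTS = [(3, 4), (4, 8), (6, 8), (8, 8)]

print(percentages(SNAPSHOTS))  # [75, 50, 75, 100]
```

Both counts only ever go up, yet the ratio drops from 75% to 50% when newly discovered actions are added to the total, so a percentage alone does not remove the confusion.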

@ittaiz
Member

ittaiz commented Sep 12, 2017 via email

@ulfjack
Contributor

ulfjack commented Sep 14, 2017

The problem is that Skyframe does not walk the action graph eagerly; it walks it lazily. The reason is performance: the action graph can be rather large, and walking it up front used to be a blocking operation (Bazel would just hang for some time). The downside is that all threads that walk the action graph block on the actions they execute, which delays discovery of the remaining actions. That's why the number keeps going up during the build.
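
As an illustration only (this is not Skyframe code), the effect can be reproduced with a toy, single-threaded model in which an action's inputs become known only when the action itself is visited. The graph, the action names, and the function `lazy_progress` are all hypothetical.

```python
# Hypothetical action graph: each action maps to the actions it depends on.
GRAPH = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["gen_a"],
    "lib_b": ["gen_b1", "gen_b2"],
    "gen_a": [],
    "gen_b1": [],
    "gen_b2": [],
}


def lazy_progress(graph, root):
    """Build `root` depth-first and record (completed, total_known) after each action."""
    discovered = {root}
    done = set()
    snapshots = []

    def build(action):
        # Dependencies only become known once their parent is visited.
        discovered.update(graph[action])
        for dep in graph[action]:
            if dep not in done:
                build(dep)
        # "Executing" the action happens here; the walk cannot continue meanwhile.
        done.add(action)
        snapshots.append((len(done), len(discovered)))

    build(root)
    return snapshots


print(lazy_progress(GRAPH, "app"))
# [(1, 4), (2, 4), (3, 6), (4, 6), (5, 6), (6, 6)]
```

The known total stays at 4 until the walk reaches `lib_b`, then jumps to 6, which is the same "moving denominator" the progress display shows on a real build.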

I'd be interested in seeing an attempt at concurrent action discovery, i.e., a thread that has the sole purpose of walking the action graph in parallel to the (many) threads that do execution. That would make the action count go up to the 'final' number more quickly. The question is whether it would add too much code complexity or have a significant performance impact.
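
A rough sketch of that idea, again as a toy model rather than anything resembling Bazel's implementation: one thread walks a hypothetical graph without executing anything, while a second thread executes actions and sleeps to simulate work. All names here (`GRAPH`, `build_with_walker`, `action_seconds`) are assumptions made for the example.

```python
import threading
import time

# Hypothetical action graph: each action maps to the actions it depends on.
GRAPH = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["gen_a"],
    "lib_b": ["gen_b1", "gen_b2"],
    "gen_a": [],
    "gen_b1": [],
    "gen_b2": [],
}


def build_with_walker(graph, root, action_seconds=0.01):
    """Run a discovery-only thread next to an executing thread; return progress snapshots."""
    lock = threading.Lock()
    discovered = {root}
    done = set()  # only the executing thread writes to this set
    snapshots = []

    def walker():
        # Discovery-only thread: visits every node but never executes anything.
        stack, visited = [root], set()
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            with lock:
                discovered.update(graph[node])
            stack.extend(graph[node])

    def execute(action):
        # Executing thread: still discovers lazily and blocks while "running".
        with lock:
            discovered.update(graph[action])
        for dep in graph[action]:
            if dep not in done:
                execute(dep)
        time.sleep(action_seconds)  # stand-in for real work
        with lock:
            done.add(action)
            snapshots.append((len(done), len(discovered)))

    threads = [
        threading.Thread(target=walker),
        threading.Thread(target=execute, args=(root,)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return snapshots


print(build_with_walker(GRAPH, "app"))
```

Because the walker never sleeps, it will usually have found every action before the first simulated action finishes, so the total tends to be stable from the first snapshot onward. The exact interleaving depends on thread scheduling, so no particular output is guaranteed beyond the final `(6, 6)`.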

Also, even if we do that there'll still be increases due to flaky test re-runs, which we most likely wouldn't count unless we have to do them, and we only know that when a test actually fails.

@pauldraper
Contributor

pauldraper commented Aug 25, 2018

Another data point: tensorflow/tensorflow#14294

The explanation makes complete sense, but the UI is unintuitive in the extreme. I'd rather have a single number with no indication of a proportion of progress than two numbers that strongly suggest a proportion of progress and are not.

I feel like I'm doing a Windows file transfer in 2003.

@mhsmith

mhsmith commented Sep 17, 2018

> I'd rather have a single number with no indication of a proportion of progress than two numbers that strongly suggest a proportion of progress and are not.

The second number is better than nothing: at least you have a lower bound on how many actions are left. This may allow you to decide to go and do something else rather than watching the build.

@ulfjack
Contributor

ulfjack commented Nov 19, 2018

I have a patch that may address this by making action execution not block skyframe threads - at least, that should make the number go up to the max more quickly. (Well, the patch only adds infrastructure to do so, but should be straightforward to extend.)

@m01

m01 commented Oct 30, 2019

Did the patch you mentioned (to speed up discovery of the max number of actions) make it in?

@ulfjack
Contributor

ulfjack commented Oct 31, 2019

There's no simple answer to that question, I'm afraid. It isn't a single patch but an extensive series of patches. While the new code can be enabled with a flag (--experimental_async_execution), it doesn't do anything in Bazel right now because neither local nor remote execution supports async execution, so it transparently falls back to legacy semantics. I know exactly what needs to be done, but I've had very little time to work on it for most of this year.

@jin added the team-Core label and removed the category: misc > misc label May 11, 2020
@meisterT added the untriaged label and removed the P3 label May 12, 2020
@janakdr added the P3 label and removed the untriaged label Aug 18, 2020
@elklein
Contributor

elklein commented Sep 24, 2020

@ulfjack just wondering if you had had any time to look at this this year. My organization is in the process of moving rather a lot of people over from cmake/ninja to bazel, and this is one of the frequent bits of feedback that we're getting, that the moving count is a pain point.

If you find that this isn't something you'll be able to get to, is it something that somebody else could take on? Even a community member like myself? I'm fairly competent at using bazel, and have made some stabs at hacking things about in the core bazel java code, but can make no great claims to being a master of the internals of bazel. All the same, I think this change would be greatly welcomed by a lot of people.

Thanks!

@sgowroji added the stale label Feb 17, 2023
@sgowroji
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen (or ping me to reopen) if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

@sgowroji closed this as not planned Feb 17, 2023
@fmeum
Collaborator

fmeum commented Feb 17, 2023

@sgowroji This would still be useful to have.

@sgowroji added the not stale label and removed the stale label Feb 17, 2023
@sgowroji reopened this Feb 17, 2023
@meisterT
Member

@coeuvre's work on potentially re-adding async execution might help here as well.

@fmeum
Collaborator

fmeum commented May 13, 2023

@coeuvre Do you still plan to add async execution back in some way? If not, it could make sense to explore alternatives to solve this problem.

@meisterT
Member

Yes, we are actively working on upgrading the embedded JDK to a modern JDK and will then work on adding async execution with Loom; see #6394 (comment).

@meisterT added the team-Performance label and removed the team-Core label Jun 13, 2024