FIX recursive_will_execute performance (simple ~300x performance increase) #2852
Related to #2666.
Here is a simpler version for quick patching after an update:
Edit: for best performance this also needs a memo={} created where the function is called; this PR now does that.
I switched to the version with minimal code changes so that any external calls to this function still work as expected.
notes
@@ -194,8 +194,12 @@ def recursive_execute(server, prompt, outputs, current_item, extra_data, execute
     return (True, None, None)

-def recursive_will_execute(prompt, outputs, current_item):
+def recursive_will_execute(prompt, outputs, current_item, memo={}):
This should be backwards compatible with any external callers of this function.
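One thing worth keeping in mind with a `memo={}` default: Python evaluates default arguments once, when the function is defined, so callers that omit the argument all share the same dict. The sketch below uses a hypothetical toy function (`dependency_count`, not part of ComfyUI) to show that the three-argument style of call keeps working, and that passing a fresh dict avoids reusing results from an earlier call.

```python
def dependency_count(graph, node, memo={}):
    """Toy example (hypothetical): count a node plus everything it depends on."""
    if node in memo:
        return memo[node]
    memo[node] = 1 + sum(dependency_count(graph, dep, memo) for dep in graph[node])
    return memo[node]

graph = {"a": [], "b": ["a"]}

# Old-style call without memo still works, so existing callers are not broken.
first = dependency_count(graph, "b")

# The graph changes, but the shared default dict still holds the old answer.
graph["b"] = []
stale = dependency_count(graph, "b")

# Passing a fresh dict gives a result computed from the current graph.
fresh = dependency_count(graph, "b", {})
```

Whether the shared default matters depends on how external callers use the function; passing an explicit dict sidesteps the question.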
@@ -377,7 +382,8 @@ def execute(self, prompt, prompt_id, extra_data={}, execute_outputs=[]):

 while len(to_execute) > 0:
     #always execute the output that depends on the least amount of unexecuted nodes first
-    to_execute = sorted(list(map(lambda a: (len(recursive_will_execute(prompt, self.outputs, a[-1])), a[-1]), to_execute)))
+    memo = {}
memo should be created outside the lambda so that it is reused for the entire sort.
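A minimal sketch of that idea, not the PR's exact code: `will_execute` is a hypothetical, simplified stand-in for `recursive_will_execute`, and the prompt layout (each node has an `inputs` dict whose links look like `[source_node_id, output_index]`) is an assumption made for illustration.

```python
def will_execute(prompt, outputs, node_id, memo):
    """List the not-yet-executed nodes that node_id needs, itself included.

    The list may contain duplicates when a node is reachable by several paths.
    """
    if node_id in memo:
        return memo[node_id]
    if node_id in outputs:  # already executed, nothing left to run
        return []
    pending = []
    for value in prompt[node_id]["inputs"].values():
        # Assumed link format: [source_node_id, output_index]
        if isinstance(value, list) and value[0] not in outputs:
            pending += will_execute(prompt, outputs, value[0], memo)
    memo[node_id] = pending + [node_id]
    return memo[node_id]

prompt = {
    "1": {"inputs": {}},
    "2": {"inputs": {"x": ["1", 0]}},
    "3": {"inputs": {"x": ["2", 0], "y": ["1", 0]}},
}
outputs = {}
to_execute = [(0, "3"), (0, "2")]

# One dict for the whole sort: every item being ranked reuses earlier results.
memo = {}
to_execute = sorted(
    (len(will_execute(prompt, outputs, item[-1], memo)), item[-1])
    for item in to_execute
)
```

If the dict were created inside the lambda instead, each item would start from an empty cache and shared upstream nodes would be walked again for every item. A cached result is only valid while `outputs` is unchanged, which is presumably why the dict is created anew on each pass of the loop.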
Sure! I'll create a minimal workflow that uses only basic nodes and has this problem. btw: I'm excited to see your PR, nested workflows/components are the killer feature (we just have to solve performance)
This is a workflow that demonstrates the problem. You can also easily change the complexity level with
It's nice to see this [finally] getting some attention. This PR doesn't quite solve the entire issue here, and still suffers from millions of wasted cycles and hundreds of seconds for complex workflows, especially on re-execution. The PR I sent months ago does, though it is a bit more complex; perhaps it need not be (#1503 for issue #1502). Without combing through the differences in the memoization itself, the broad differences are:
@ricklove Do you think you could do this (since outside PRs seem to fall on deaf ears)? Tens of thousands of folks have already been using my PR for months, as it's integrated from the outside in rgthree-comfy. I don't know if you intended/checked that it didn't break all those workflows, but it didn't, so thanks for that :)
…ease} (comfyanonymous#2852) * FIX recursive_will_execute performance * Minimize code changes * memo must be created outside lambda
This solves a major performance problem that makes large graphs impossible to use in ComfyUI. It also speeds up even medium size graphs that have a long chain of dependent nodes.
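To see why a long chain of dependent nodes is so costly without a cache, here is a toy sketch that counts recursive visits. The graph is hypothetical: every node takes two inputs from the node before it. The helper names (`build_chain`, `count_calls`) and the graph layout are assumptions, and the counts are for this toy graph only, not the PR's measurements.

```python
def build_chain(depth):
    """Toy graph: node i has two inputs, both linked to node i-1."""
    prompt = {"0": {"inputs": {}}}
    for i in range(1, depth + 1):
        prev = str(i - 1)
        prompt[str(i)] = {"inputs": {"a": [prev, 0], "b": [prev, 0]}}
    return prompt

def count_calls(prompt, node, memo=None):
    """Count how many function calls a dependency walk from node makes."""
    if memo is not None and node in memo:
        return 1  # cache hit: one call, no further recursion
    calls = 1
    for value in prompt[node]["inputs"].values():
        if isinstance(value, list):
            calls += count_calls(prompt, value[0], memo)
    if memo is not None:
        memo[node] = True
    return calls

chain = build_chain(15)
without_memo = count_calls(chain, "15")    # doubles with every extra level
with_memo = count_calls(chain, "15", {})   # grows linearly with depth
```

In this toy graph the uncached walk makes 2^(depth+1) - 1 calls while the cached walk makes 2 * depth + 1, so the gap widens quickly as chains get longer.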
Performance improvement example:
Measurement code (in execution.py):
functions: