Cache initialization between each run is very slow when having many calls to `folder_paths.get_filename_list` in `INPUT_TYPES` #4882

Trung0246 · 2024-09-11T10:36:14Z

Expected Behavior

Load time should be as fast as it was before #2666. The problematic lines are these:

Lines 474 to 476 in 9f4daca

    
            for cache in self.caches.all:
                cache.set_prompt(dynamic_prompt, prompt.keys(), is_changed_cache)
                cache.clean_unused()

Actual Behavior

Load time is very slow: as the log below shows, setting the prompt for the caches took over a minute (I added debug logging around it). I have not tested how the cost grows with workflow size; it may be exponential.

        with torch.inference_mode():
            dynamic_prompt = DynamicPrompt(prompt)
            is_changed_cache = IsChangedCache(dynamic_prompt, self.caches.outputs)
            logging.info("Setting prompt for cache")
            prompt_keys = prompt.keys()
            for cache in self.caches.all:
                cache.set_prompt(dynamic_prompt, prompt_keys, is_changed_cache)
                cache.clean_unused()
            logging.info("Setted prompt for cache")

Steps to Reproduce

Use a reasonably large workflow with more than 50 nodes. You can test with one from https://perilli.com/ai/comfyui/

Debug Logs

2024-09-11 03:30:28,124- root:522- INFO- got prompt
2024-09-11 03:30:28,133- root:540- INFO- Validating prompt
2024-09-11 03:30:28,134- root:542- INFO- Validated prompt
2024-09-11 03:30:28,171- root:127- INFO- Executing prompt e81c5045-d2e6-49dd-abc2-2a22406aca33
2024-09-11 03:30:28,172- root:468- INFO- Setting prompt for cache
2024-09-11 03:31:42,298- root:473- INFO- Setted prompt for cache
2024-09-11 03:32:48,690- root:106- INFO- model weight dtype torch.float16, manual cast: None
2024-09-11 03:32:48,926- root:115- INFO- model_type EPS
2024-09-11 03:32:49,855- root:271- INFO- Using xformers attention in VAE

Other

         14136356 function calls (9707273 primitive calls) in 65.405 seconds

   Ordered by: cumulative time
   List reduced from 221 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   65.405   65.405 {built-in method builtins.exec}
        1    0.000    0.000   65.404   65.404 <string>:1(<module>)
        2    0.000    0.000   65.404   32.702 caching.py:266(set_prompt)
        3    0.000    0.000   65.403   21.801 caching.py:143(set_prompt)
        2    0.000    0.000   65.402   32.701 caching.py:66(__init__)
        2    0.008    0.004   65.402   32.701 caching.py:75(add_keys)
      574    0.021    0.000   65.394    0.114 caching.py:85(get_node_signature)
    18416    0.206    0.000   59.277    0.003 caching.py:93(get_immediate_node_signature)
     7610    0.037    0.000   58.709    0.008 folder_paths.py:238(get_filename_list)
     7610    0.247    0.000   58.608    0.008 folder_paths.py:216(cached_filename_list_)
   116987   58.195    0.000   58.195    0.000 {built-in method nt.stat}
    94282    0.116    0.000   50.780    0.001 <frozen genericpath>:53(getmtime)
     4170    0.012    0.000   33.303    0.008 nodes.py:521(INPUT_TYPES)
     3304    0.016    0.000   25.226    0.008 nodes.py:604(INPUT_TYPES)
    22705    0.045    0.000    7.584    0.000 <frozen genericpath>:39(isdir)
1793074/574    1.903    0.000    5.950    0.010 caching.py:36(to_hashable)
435026/574    0.304    0.000    5.937    0.010 caching.py:44(<listcomp>)
137880/5716    0.287    0.000    4.651    0.001 caching.py:42(<listcomp>)
5125254/3109390    0.874    0.000    3.414    0.000 {built-in method builtins.isinstance}
  1007932    0.453    0.000    2.746    0.000 typing.py:1304(__instancecheck__)
  1007932    0.612    0.000    2.293    0.000 typing.py:1579(__subclasscheck__)
  1007932    0.960    0.000    1.475    0.000 {built-in method builtins.issubclass}
1008346/1007932    0.267    0.000    0.516    0.000 <frozen abc>:121(__subclasscheck__)
1008346/1007932    0.227    0.000    0.248    0.000 {built-in method _abc._abc_subclasscheck}
      574    0.001    0.000    0.143    0.000 caching.py:115(get_ordered_ancestry)
18416/574    0.062    0.000    0.143    0.000 caching.py:121(get_ordered_ancestry_internal)
    18416    0.009    0.000    0.138    0.000 execution.py:37(get)
   150866    0.092    0.000    0.132    0.000 graph_utils.py:1(is_link)
   330/80    0.001    0.000    0.129    0.002 utils.py:346(new_func)
       60    0.000    0.000    0.119    0.002 IPAdapterPlus.py:647(INPUT_TYPES)
       40    0.000    0.000    0.113    0.003 execution.py:135(_map_node_over_list)
       40    0.000    0.000    0.112    0.003 execution.py:149(process_inputs)
   174788    0.090    0.000    0.090    0.000 {built-in method builtins.sorted}
       21    0.000    0.000    0.081    0.004 nodes.py:2962(IS_CHANGED)
  28032/1    0.044    0.000    0.080    0.080 copy.py:128(deepcopy)
   2868/1    0.009    0.000    0.080    0.080 copy.py:227(_deepcopy_dict)
        8    0.000    0.000    0.063    0.008 load.py:102(INPUT_TYPES)
5438/1182    0.008    0.000    0.061    0.000 copy.py:201(_deepcopy_list)
        3    0.000    0.000    0.058    0.019 folder_paths.py:203(get_filename_list_)
        7    0.001    0.000    0.056    0.008 folder_paths.py:145(recursive_search)
        1    0.000    0.000    0.032    0.032 nodes.py:1374(IS_CHANGED)
    53/24    0.001    0.000    0.029    0.001 <frozen os>:345(_walk)
       18    0.023    0.001    0.023    0.001 {built-in method nt.scandir}
   112054    0.022    0.000    0.022    0.000 <frozen _collections_abc>:315(__subclasshook__)
    18804    0.021    0.000    0.021    0.000 {built-in method builtins.hasattr}
        1    0.019    0.019    0.019    0.019 {method 'read' of '_io.BufferedReader' objects}
   136222    0.018    0.000    0.018    0.000 {method 'append' of 'list' objects}
      287    0.003    0.000    0.018    0.000 <frozen ntpath>:741(relpath)
        8    0.000    0.000    0.017    0.002 nodes_upscale_model.py:18(INPUT_TYPES)
    37693    0.016    0.000    0.016    0.000 graph.py:30(has_node)

Recorded with cProfile wrapped around the problematic lines.
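For anyone who wants to reproduce a profile like the one above, here is a minimal sketch of wrapping a block of code in cProfile and printing the top 50 entries sorted by cumulative time. `profile_block` and `slow_section` are hypothetical names used only for illustration; in the real case the profiled callable would be the `set_prompt` / `clean_unused` loop. The original profile appears to have been taken through `exec` on a string, so this is an equivalent approach rather than the exact code used.

```python
import cProfile
import io
import pstats


def profile_block(func, *args, limit=50, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the first `limit` rows.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


def slow_section():
    # Hypothetical stand-in for the cache.set_prompt / cache.clean_unused loop.
    return sum(i * i for i in range(10_000))


result, report = profile_block(slow_section)
print(report)
```

Sorting by cumulative time is what makes the `nt.stat` and `getmtime` rows stand out, since they sit near the top despite each call being cheap.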


Trung0246 · 2024-09-11T10:36:43Z

Maybe @guill would be interested in this.

Trung0246 · 2024-09-11T11:41:17Z

After looking around a bit, it looks like LoraLoader and CheckpointLoaderSimple are the culprits, since they call folder_paths.get_filename_list repeatedly during cache key initialization. One simple fix could be to cache the output and drop the cache only when the user explicitly clicks Refresh in the web UI. But it looks like a cache is already there; I guess the calls to isdir and getmtime used to validate it are not cached.
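The idea of caching the listing until an explicit refresh could be sketched roughly as follows. This is not ComfyUI's `folder_paths` implementation; `FilenameListCache`, `get`, and `invalidate` are hypothetical names, and the sketch assumes a flat folder with no recursion. The point is only that repeated lookups do no filesystem work (no `stat`, `isdir`, or `getmtime` calls) until the cache is invalidated.

```python
import os
import tempfile


class FilenameListCache:
    """Hypothetical cache: scans a folder once and reuses the result until invalidated."""

    def __init__(self):
        self._lists = {}
        self.scans = 0  # number of real filesystem scans, kept for illustration

    def get(self, folder):
        if folder not in self._lists:
            self.scans += 1
            self._lists[folder] = sorted(
                entry.name for entry in os.scandir(folder) if entry.is_file()
            )
        # Cache hit: no stat/getmtime calls are made here.
        return self._lists[folder]

    def invalidate(self):
        # Would be called when the user clicks Refresh in the web UI.
        self._lists.clear()


def demo():
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "a.safetensors"), "w").close()
        cache = FilenameListCache()
        for _ in range(1000):
            first = cache.get(folder)  # only the first call touches the disk
        open(os.path.join(folder, "b.safetensors"), "w").close()
        stale = cache.get(folder)  # still the old listing
        cache.invalidate()
        fresh = cache.get(folder)  # rescanned after the explicit refresh
        return first, stale, fresh, cache.scans


first, stale, fresh, scans = demo()
print(first, stale, fresh, scans)
```

The trade-off is that files added outside the UI stay invisible until a refresh, which is what the mtime checks were presumably there to avoid.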

guill · 2024-09-11T14:52:50Z

Nice find and thanks for the ping! I'll try to get this fixed by this weekend.

Trung0246 · 2024-10-13T00:07:58Z

This pull solved this issue for me: #4969

LiChangyi · 2024-11-06T09:17:27Z

This pull solved this issue: #4969

@Trung0246 I have updated to this commit, but the problem still exists: clean_unused takes more than 4 minutes. I called the PromptExecutor.execute method directly and did not go through an HTTP request.

Trung0246 added the "Potential Bug" label (User is reporting a bug. This should be tested.) Sep 11, 2024

Trung0246 changed the title ~~Cache initialization between each run is very slow for large workflow.~~ Cache initialization between each run is very slow when having many calls to folder_paths.get_filename_list in INPUT_TYPES Sep 11, 2024

Trung0246 mentioned this issue Sep 11, 2024

Reduce repeated calls of get_immediate_node_signature in cache #4871

Merged

JettHu mentioned this issue Sep 14, 2024

Reduce repeated calls of INPUT_TYPES in cache #4922

Merged

Trung0246 closed this as completed Oct 13, 2024

Trung0246 reopened this Nov 6, 2024
