
@gen_cluster has become 1s slower; CI takes 50% longer #6632

Closed
crusaderky opened this issue Jun 26, 2022 · 4 comments · Fixed by #6633
@crusaderky

Since #6603, a test decorated with @gen_cluster:

@gen_cluster()
async def test1(s, a, b):
    pass

has increased in runtime on my host from 220ms to 1220ms.

With the same PR, the CI test suite runtimes (ci1) have increased:
Ubuntu: 19m23s -> 34m18s
Windows: 23m36s -> 48m55s
MacOSX: 32m15s -> 46m39s

CC @graingert

@graingert commented Jun 27, 2022

The issue is that the scheduler.close() call now always takes at least 1s to finish:

await asyncio.gather(*(end_worker(w) for w in workers))
await s.close() # wait until scheduler stops completely
s.stop()

The workers are already closed, so there should be no comm handlers remaining:
# TODO: Deal with exceptions
await self._ongoing_background_tasks.stop(timeout=1)
# TODO: Deal with exceptions
await self._ongoing_comm_handlers.stop(timeout=1)

However, looking at the tasks in self._ongoing_background_tasks._ongoing_tasks:
async def remove_worker_from_events():
    # If the worker isn't registered anymore after the delay, remove from events
    if address not in self.workers and address in self.events:
        del self.events[address]

cleanup_delay = parse_timedelta(
    dask.config.get("distributed.scheduler.events-cleanup-delay")
)
self._ongoing_background_tasks.call_later(
    cleanup_delay, remove_worker_from_events
)
logger.debug("Removed worker %s", ws)

There is a pair of tasks waiting to call remove_worker_from_events; these get a 1 second grace period to stop and are then cancelled.

I think there should not be a 1s grace period for background tasks in Server.close, and remove_worker_from_events should be called immediately when a worker reports that it is closing.

@hendrikmakait

I'd be open to dropping the grace period and directly cancelling tasks. Are there any tasks for which we'd actually prefer a graceful shutdown? In that case, we could adjust the signature of call_soon to include a graceful=False keyword that would be stored together with the task itself, and we could then wait only for the graceful shutdown of those tasks (likely with a shorter grace period).
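A purely hypothetical sketch of what that could look like; the class name, the graceful keyword, and the grace_period parameter are invented for illustration and are not distributed's existing API. Non-graceful tasks are cancelled immediately on stop, and only tasks registered with graceful=True get a short window to finish.

import asyncio


class AsyncTaskGroupSketch:
    """Hypothetical shape for the graceful=... idea; not distributed's actual API."""

    def __init__(self):
        self._tasks = {}  # task -> whether it should get a graceful shutdown

    def call_soon(self, afunc, *args, graceful=False):
        task = asyncio.create_task(afunc(*args))
        self._tasks[task] = graceful
        task.add_done_callback(self._tasks.pop)

    async def stop(self, grace_period=0.5):
        # Cancel non-graceful tasks right away.
        graceful_tasks = []
        for task, graceful in list(self._tasks.items()):
            if graceful:
                graceful_tasks.append(task)
            else:
                task.cancel()
        # Give only the graceful tasks a (short) window to finish on their own.
        if graceful_tasks:
            await asyncio.wait(graceful_tasks, timeout=grace_period)
            for task in graceful_tasks:
                task.cancel()
        # Reap everything; return_exceptions swallows the CancelledErrors.
        await asyncio.gather(*self._tasks, return_exceptions=True)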

@fjetter commented Jun 27, 2022

remove_worker_from_events should be called immediately when a worker reports that it is closing.

  1. We need to remove workers from events eventually; otherwise, depending on the deployment, this can cause a memory leak. I introduced this ~2 years ago because we were using always-on clusters that we scaled up and down based on demand, so over time this events dict became huge. The solution back then was to introduce a timeout.
  2. The removal should be delayed so that the data stays available for a while for debugging. The one-hour delay was chosen fairly arbitrarily.

If this causes big problems we can drop it. Some information about workers joining and leaving is also stored in the all events topic, but the worker-specific topic is more granular.

Is there a problem with just canceling this task?
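For reference, the delay is the distributed.scheduler.events-cleanup-delay configuration value quoted in the snippet above (the one-hour figure mentioned in the previous comment is its default), so a deployment or test that churns workers quickly could shorten it with the standard dask config API. A small sketch:

import dask

# Inspect the cleanup delay discussed above.
print(dask.config.get("distributed.scheduler.events-cleanup-delay"))

# Shorten it, e.g. for a deployment that scales workers up and down frequently;
# remove_worker_from_events would then fire ~10s after a worker leaves.
with dask.config.set({"distributed.scheduler.events-cleanup-delay": "10s"}):
    ...  # start the Scheduler / run the workload inside this context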

@fjetter commented Jun 27, 2022

I agree that we don't necessarily want a grace period when we're closing. I'm fine with cancelling stuff right away.

fjetter self-assigned this Jun 30, 2022