pyyaks.logger with multiprocessing

Javier Gonzalez edited this page Aug 6, 2021 · 2 revisions

For the Python logging fans 🙄... here is another case for always using the spawn start method with multiprocessing. I'm writing it down so we do not forget, since I spent some time tracking this down.

In the script below, I call a function through multiprocessing.Pool. Within that function, I log some messages to a logger named 'logger', which is initialized in the parent process using pyyaks.logger. I would normally expect one of two outcomes:

  • the logger in the child process is initialized as in the parent process. In principle, this should be the case if the start method is fork (the default on Linux when this was written).
  • the logger in the child process is not initialized. This is the behavior if the start method is spawn.
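Which of the two cases applies depends on the start method in effect, so it can help to check it before debugging further. This is a minimal sketch using only the standard library; describe_start_methods is a hypothetical helper name, not part of pyyaks.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return the start method in effect and those this platform supports."""
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == '__main__':
    current, available = describe_start_methods()
    print(f'current: {current}; available: {available}')
```

The reported default depends on the platform and the Python version, which is one more reason to set the start method explicitly.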

Instead, on Linux using fork, I get something inconsistent.

This is the script:

from multiprocessing import Pool, set_start_method
import logging

import pyyaks.logger


def process():
    # Runs in the child process.
    logging.getLogger('logger').debug('before')
    logging.debug('')  # try commenting me out!
    logging.getLogger('logger').debug('after')


def main():
    set_start_method('spawn')  # try commenting me out!
    # The logger is configured in the parent process only.
    logger = pyyaks.logger.get_logger(name='logger', level='DEBUG')
    logger.debug('Starting')
    with Pool(processes=1) as pool:
        pool.apply(process)


if __name__ == '__main__':
    main()

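This is the output from the script as written, with the spawn start method. This is OK: the logger in the child process is not initialized, so only the parent's message appears: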

Starting

This is the output on Linux if I comment out the set_start_method('spawn') line (note how the message logged after the call to logging.debug is printed twice, in two different formats):

Starting
before
after
DEBUG: logger:after

And this is the output if I also comment out the logging.debug('') line:

Starting
before
after
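The duplicated line is consistent with how the standard library behaves: the module-level logging.debug() calls logging.basicConfig() when the root logger has no handlers, which attaches a stderr handler to the root logger, and later records from 'logger' then propagate to that handler as well. The sketch below reproduces this in a single process with the standard library only. The handler setup is a stand-in for whatever pyyaks.logger configures (an assumption, not its actual code), and demo is a hypothetical helper name. Turning off propagation is shown as one possible mitigation, not as something the original script does.

```python
import contextlib
import io
import logging


def demo(propagate=True):
    """Log 'before', call logging.debug(''), then log 'after'.

    Returns (text seen by the logger's own handler,
             text seen by the implicitly created root handler).
    """
    root = logging.getLogger()
    # Start from a root logger without handlers, as in a fresh process.
    for old in list(root.handlers):
        root.removeHandler(old)

    # Stand-in for the pyyaks.logger setup (assumption): one handler
    # with a bare message format attached to the named logger.
    own_out = io.StringIO()
    logger = logging.getLogger('logger')
    logger.setLevel(logging.DEBUG)
    logger.propagate = propagate
    handler = logging.StreamHandler(own_out)
    handler.setFormatter(logging.Formatter('%(message)s'))
    logger.addHandler(handler)

    root_out = io.StringIO()
    try:
        # basicConfig() binds its handler to sys.stderr at creation time,
        # so redirecting stderr here captures what the root handler prints.
        with contextlib.redirect_stderr(root_out):
            logger.debug('before')
            logging.debug('')  # implicitly calls logging.basicConfig()
            logger.debug('after')
    finally:
        # Undo the changes made to the named logger.
        logger.removeHandler(handler)
        logger.propagate = True
    return own_out.getvalue(), root_out.getvalue()


if __name__ == '__main__':
    print(demo(propagate=True))
    print(demo(propagate=False))
```

With propagate=True the root handler receives a second copy of 'after' in the default basicConfig format. With propagate=False it receives nothing, while the logger's own handler still sees both messages.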