Pickling error of config function #441

Closed · SumNeuron opened this issue Mar 11, 2019 · 12 comments

@SumNeuron

In a previous issue I requested support for the multiprocessing library to help launch multiple experiments.

import gc
import multiprocessing
import resource

# this works
def run_ex(print_usage=False, *args, **kwargs):
    # assumes kwargs is a strict subset of your config
    ex.run(config_updates=kwargs)
    if print_usage:
        usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print('resources used: {}'.format(usage))
    gc.collect()

# processes defaults to 1 because, when using a GPU backend, TF will likely crash with more
def pool_run(*args, processes=1, print_usage=False, **kwargs):
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.apply(run_ex, (print_usage, *args), kwargs)
    gc.collect()
    return results

pool_run(print_usage=True, **options)

If I make the following slight change:

# now I pass the experiment function so that these functions can be re-used

def run_ex(ex, print_usage=False, *args, **kwargs):
    ex.run(config_updates=kwargs)
    if print_usage:
        usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print('resources used: {}'.format(usage))
    gc.collect()

def pool_run(ex, *args, processes=1, print_usage=False, **kwargs):
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.apply(run_ex, (ex, print_usage, *args), kwargs)
    gc.collect()
    return results

pool_run(ex, print_usage=True, **options)  # raises PicklingError

The error I get is:


PicklingError                             Traceback (most recent call last)
<ipython-input-5-2bab0a1545e1> in <module>

---> 14 pool_run(ex, processes=1, print_usage=True, **opts)


~/Projects/my/module/file.py in pool_run(ex, processes, print_usage, *args, **kwargs)
     10 def pool_run(ex, *args, processes=1, print_usage=False, **kwargs):
     11     with multiprocessing.Pool(processes=processes) as pool:
---> 12         results = pool.apply(run_ex, (ex, print_usage, *args), kwargs)
     13     gc.collect()
     14     return results

/anaconda3/lib/python3.6/multiprocessing/pool.py in apply(self, func, args, kwds)
    257         '''
    258         assert self._state == RUN
--> 259         return self.apply_async(func, args, kwds).get()
    260 
    261     def map(self, func, iterable, chunksize=None):

/anaconda3/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
    642             return self._value
    643         else:
--> 644             raise self._value
    645 
    646     def _set(self, i, obj):

/anaconda3/lib/python3.6/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
    422                         break
    423                     try:
--> 424                         put(task)
    425                     except Exception as e:
    426                         job, idx = task[:2]

/anaconda3/lib/python3.6/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(_ForkingPickler.dumps(obj))
    207 
    208     def recv_bytes(self, maxlength=None):

/anaconda3/lib/python3.6/multiprocessing/reduction.py in dumps(cls, obj, protocol)
     49     def dumps(cls, obj, protocol=None):
     50         buf = io.BytesIO()
---> 51         cls(buf, protocol).dump(obj)
     52         return buf.getbuffer()
     53 

PicklingError: Can't pickle <function config at 0x1c35b18ea0>: it's not the same object as my.module.experiment.config

@JarnoRFB (Collaborator)

Could you provide a minimal, complete script of the non-working example, including your ex setup? I thought I had successfully forked processes while passing the experiment object...

@SumNeuron (Author)

@JarnoRFB any ideas?

@Qwlouse (Collaborator) commented Mar 21, 2019

This problem is probably related to the fact that wrapt functions cannot be pickled. Since Sacred uses wrapt internally for captured functions, that makes an experiment (or even the captured functions themselves) unpicklable.
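
For illustration, a minimal sketch of the same failure mode, independent of Sacred: pickle serializes a plain function as a reference to its module and qualified name, then checks that the name still points to the same object, so a decorator that rebinds the module-level name breaks that check.

import pickle

def wrap(func):
    # stand-in for what a wrapping decorator does
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def config():
    return {}

original = config      # keep a reference to the undecorated function
config = wrap(config)  # rebind the name, as a decorator would

try:
    pickle.dumps(original)
except pickle.PicklingError as e:
    print(e)  # Can't pickle <function config ...>: it's not the same object as __main__.config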

@SumNeuron (Author)

@Qwlouse OK... do you have a suggestion for another approach for pool_run, then? That functionality is something I would like to keep separate from writing the experiment itself, so having to have ex in scope is a bit of an annoyance.

@SumNeuron (Author)

@Qwlouse or could Sacred avoid using wrapt functions? I want to combine Ray Tune with Sacred, and that requires a picklable function...

@flukeskywalker

Sacred works very well with Ray Tune. You can import the experiment inside the trainable function, apply the config updates provided by Tune, and then run the experiment.

@SumNeuron (Author)

@flukeskywalker can you provide an example / colab?

@flukeskywalker

Here's an example adapted from code I've used successfully, assuming train.py is a regular Sacred experiment file in the same directory that constructs an experiment ex (a hypothetical sketch of such a file follows the example). I don't use the Tune reporter since reporting is done through Sacred.

Note: the sleep line is a hack I use to prevent multiple experiments from being assigned the same ID. According to the Sacred devs this issue should no longer exist, but I still ran into it, so I use this simple fix.

import ray
import ray.tune as tune
from sacred.observers import MongoObserver


def train(config, reporter):
    import time, random
    # sleep hack: stagger starts so parallel runs don't get assigned the same ID
    time.sleep(random.uniform(0.0, 10.0))
    # import the experiment inside the trainable so the Sacred Experiment object
    # itself never has to be pickled and shipped to the worker
    from train import ex
    ex.observers.append(MongoObserver.create(db_name='my_db'))
    config['verbose'] = False
    ex.run(config_updates=config)
    result = ex.current_run.result
    print(f'Type of result is {type(result)}')
    reporter(result=result, done=True)


if __name__ == '__main__':
    ray.init(num_cpus=64, num_gpus=0)
    tune.register_trainable("train_func", train)
    tune.run_experiments({
        'my_experiment': {
            'run': 'train_func',
            'stop': {'result': 1000},
            'config': {
                'n_layers': tune.grid_search([5, 6]),
                'batch_size': tune.grid_search([256, 1024, 2048]),
            },
            'resources_per_trial': {"cpu": 1, "gpu": 0},
            'num_samples': 10,
        }
    })
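
A hypothetical minimal train.py that this driver assumes could look like the sketch below; the config names mirror the Tune search space above, and the body is a placeholder rather than real training code.

# train.py -- hypothetical minimal Sacred experiment file (illustrative placeholder)
from sacred import Experiment

ex = Experiment('my_experiment')


@ex.config
def config():
    n_layers = 5        # overridden by tune.grid_search in the driver
    batch_size = 256    # overridden by tune.grid_search in the driver
    verbose = True      # the driver sets this to False before running


@ex.automain
def run(n_layers, batch_size, verbose):
    # placeholder "training"; the return value becomes ex.current_run.result
    if verbose:
        print(f'training with {n_layers} layers, batch size {batch_size}')
    return n_layers * batch_size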

@SumNeuron (Author)

@flukeskywalker I made a simple project based on your guidance and it works! Thanks :) However, it only works if everything lives at the same module level.

Here is a more "complete" project setup, and currently Tune isn't running:

https://gitlab.com/SumNeuron/extune

Please take a look.

@flukeskywalker

Sorry, I don't have time to look into the code these days. My guess is that this is related to whether the module is on the Python path for each Ray actor, but that's all I can say at this point (a rough sketch of one way to test that guess follows below).

Personally, I don't organize code the way you do in extune. I usually keep the Sacred experiment and the Tune script(s) in the project's base directory. This is simple and avoids another level of organization. Of course, you may prefer things differently.
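
As a rough sketch of how to test the Python-path guess, one could make the package importable inside the trainable before importing the experiment; the PROJECT_ROOT value and the extune.experiment module path below are placeholders for whatever the actual layout is.

import os
import sys

# placeholder for the project's location on disk; adjust to your layout
PROJECT_ROOT = os.environ.get('PROJECT_ROOT', '/path/to/extune')

def train(config, reporter):
    # make the package importable in this Ray worker before importing the experiment
    if PROJECT_ROOT not in sys.path:
        sys.path.insert(0, PROJECT_ROOT)
    from extune.experiment import ex  # hypothetical module path; adjust to the real package
    ex.run(config_updates=config)
    reporter(result=ex.current_run.result, done=True)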

stale bot commented May 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
