MongoObserver possible race condition for multiple runs #452
Comments
I have faced this exact issue, but in my case it really was an ID overlap issue. To confirm, you can try the hack I usually use, demonstrated here: #441 (comment). Basically, I insert a random delay between the creation of Sacred experiments to make sure that there are no overlapping experiment IDs. I have not faced the issue in question since I started using this trick. The delay range may need to be bigger if you are launching a lot more experiments. For a few hundred experiments, a delay range of 0 to 60 seconds works fine for me.
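The random-delay workaround described above could be sketched roughly as follows. This is only an illustration and not the code from the linked comment: `staggered_start` and its parameters are hypothetical names, and the 0 to 60 second range is simply the one quoted above.

```python
import random
import time


def staggered_start(run_experiment, max_delay_s=60.0, sleep=time.sleep, rng=random):
    """Wait a random time in [0, max_delay_s] seconds, then call run_experiment().

    Hypothetical helper: run_experiment is any zero-argument callable, for
    example a function that calls ex.run() on a Sacred Experiment.
    """
    delay = rng.uniform(0.0, max_delay_s)
    sleep(delay)  # spreads experiment creation out across parallel jobs
    return delay, run_experiment()
```

The `sleep` and `rng` arguments exist only so the delay can be replaced or checked in a dry run; in normal use the defaults are enough.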
Hi, thanks for the suggestion. I think the overlapping display comes from Omniboard, which appears to display random plots when […]
For me, Mongo ID collisions disappeared after the merge of #254, without resorting to such hacks. Maybe you could try disabling them and reporting a new bug if you ever see ID collisions again.
@F-Barto: I doubt that that's the issue. The metrics don't have much to do with the […]. You can directly inspect the metrics for the problematic runs using pymongo. If you still see the same issue, the problem is not coming from Omniboard, and it is likely an ID issue (make sure that you're using the latest Sacred). If your problem is indeed coming from Omniboard, it is probably best to open an issue in the corresponding repo.
@vnmabus: Yes, that's been on my TODO list, and I'll check soon.
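Inspecting a run's metrics directly might look like the sketch below. It assumes the collection names `runs` and `metrics`, and that each metrics document carries `run_id`, `name`, and `values` fields; check these against your own database. The function only needs objects exposing `find_one` and `find`, such as pymongo collections, and `summarize_run` is a hypothetical name.

```python
def summarize_run(runs, metrics, run_id):
    """Return a small summary of one run and its metrics documents.

    runs and metrics are collection-like objects (e.g. pymongo collections).
    The field names used here are assumptions about the stored schema.
    """
    run = runs.find_one({"_id": run_id})
    if run is None:
        return None
    metric_docs = list(metrics.find({"run_id": run_id}))
    return {
        "run_id": run_id,
        # True when run['info'] is missing or an empty dict
        "info_is_empty": not run.get("info"),
        # (metric name, number of logged values); duplicates stay visible
        "metrics": [(d["name"], len(d.get("values", []))) for d in metric_docs],
    }
```

With pymongo this could be called as `summarize_run(db.runs, db.metrics, 42)`, where `db = pymongo.MongoClient()["sacred"]` and both the database name and the run ID are placeholders. A run whose `info_is_empty` is true while `metrics` is non-empty matches the symptom reported in this issue.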
Okay, finally found it. The person having the issue was using pymongo, not Omniboard, to delete the runs, and did not delete the corresponding documents in the metrics collection. Hence the ID overlap in the metrics at some point. Still, the fact that run['info'] is empty when the metrics documents already exist is weird. Thanks for your help, all.
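Given that root cause (runs deleted by hand while their metrics documents stayed behind), it may be worth checking a database for orphaned metrics. A minimal sketch, assuming metrics documents reference their run through a `run_id` field; `find_orphaned_metrics` is a hypothetical helper that works on plain lists of documents:

```python
def find_orphaned_metrics(run_docs, metric_docs):
    """Return the metrics documents whose run_id matches no existing run."""
    live_ids = {run["_id"] for run in run_docs}
    return [m for m in metric_docs if m.get("run_id") not in live_ids]
```

With pymongo, the inputs could come from `db.runs.find({}, {"_id": 1})` and `db.metrics.find()`. When deleting a run by hand, calling `db.metrics.delete_many({"run_id": run_id})` alongside `db.runs.delete_one({"_id": run_id})` should avoid leaving such documents behind, under the same schema assumption.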
Hi and thanks for your awesome work,
It seems MongoObserver has some race conditions when logging metrics.
Context:
The issue:
We ran ~100 experiments (same code, just different hyperparameters for the architecture). When looking at the results in Omniboard, some experiments seem to have overlapping metrics plots, while others are OK.
When digging a bit with pymongo, it appears that the runs with weird plots have an empty run['info'] dict, while the ones with 'OK plots' have their info dicts populated.
Possibly related issues:
#309
#345
#317