imdb instance sampling is non-deterministic #1352

Closed
yifanmai opened this issue Feb 8, 2023 · 3 comments
Labels: bug (Something isn't working), p1 (Priority 1, required for release), scenarios

Comments

Collaborator

yifanmai commented Feb 8, 2023

The instance order depends on the order of the files returned by os.listdir() in the imdb scenario, and that order is arbitrary. This bug already existed in v0.1.0.

This bug is subtle: because scenario downloads are cached, running multiple times against the same download folder usually produces the same order (the order seems to be OS-dependent). Deleting the scenario download folder and re-running, however, changes the order.

The long-term fix is to use sorted(os.listdir()) instead.
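For illustration, here is a minimal sketch of that fix. The helper name list_files_sorted and the temporary directory are hypothetical and only serve the demonstration; this is not the actual scenario code.

```python
import os
import tempfile


def list_files_sorted(directory):
    """Return the entries of `directory` in a deterministic order.

    Hypothetical helper for illustration. os.listdir() makes no
    ordering guarantee, so the result is sorted explicitly.
    """
    return sorted(os.listdir(directory))


# Demonstration on a throwaway directory (not the real download folder).
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.txt", "a.txt", "c.txt"]:
        with open(os.path.join(tmp, name), "w") as f:
            f.write("example")
    names = list_files_sorted(tmp)

print(names)  # ['a.txt', 'b.txt', 'c.txt']
```

Note that sorted() on strings is lexicographic, so "10.txt" sorts before "2.txt". That is still deterministic, which is all the instance sampling needs.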

yifanmai added the bug, p1, and scenarios labels Feb 8, 2023
yifanmai self-assigned this Feb 8, 2023
Collaborator Author

yifanmai commented Feb 9, 2023

The code scenario has the same problem. We should audit the other scenarios as well.
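One way such an audit could be started is sketched below, assuming the scenarios are Python source files. The function names find_unsorted_listdir and _is_listdir are hypothetical. The check is a heuristic: it only flags os.listdir(...) calls that are not passed directly to sorted(...). It would miss `from os import listdir`, and it does not look at other calls that return entries in arbitrary order, such as glob.glob, os.scandir, or Path.iterdir.

```python
import ast


def _is_listdir(node):
    """True if `node` is a call of the form os.listdir(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "listdir"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "os"
    )


def find_unsorted_listdir(source):
    """Return line numbers of os.listdir() calls not wrapped in sorted()."""
    tree = ast.parse(source)
    wrapped = set()
    # First pass: remember listdir calls that are direct arguments of sorted().
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "sorted"
        ):
            for arg in node.args:
                if _is_listdir(arg):
                    wrapped.add(id(arg))
    # Second pass: report every other listdir call.
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if _is_listdir(node) and id(node) not in wrapped
    )


# Made-up source text for illustration, not taken from any real scenario.
example = '''import os
a = os.listdir("x")
b = sorted(os.listdir("y"))
for f in os.listdir("z"):
    pass
'''
flagged = find_unsorted_listdir(example)
print(flagged)  # [2, 4]
```

To audit a source tree, the function could be applied to the text of each .py file and the flagged lines reviewed by hand.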

Collaborator Author

The ice scenario has the same problem.

Collaborator Author

yifanmai commented Apr 8, 2023

Unfortunately, we recently migrated the scenarios between disks, which shuffled the os.listdir() order, so we no longer have access to the original order. I might be able to reconstruct it from the published evaluations, but it will be tricky.
