wrong test acc because redundant data in ddp mode #4732

Closed
xiadingZ opened this issue Nov 18, 2020 · 4 comments
Labels
duplicate, feature, help wanted

Comments

@xiadingZ

If I have 499 videos in my test set, DDP mode loads 512 videos for testing, perhaps by duplicating videos to match the batch size. This produces an incorrect test accuracy. For now I have to save each video's predictions and calculate the accuracy on my own. Is there any way to solve this problem?

xiadingZ added the feature and help wanted labels Nov 18, 2020
@Borda
Member

Borda commented Nov 18, 2020

@xiadingZ mind sharing a code example or a Colab? How do you know that your test loaded more videos, and what is your batch size?

@xiadingZ
Author

xiadingZ commented Nov 18, 2020

I wrote a test_flist to load the video data; it has only 499 lines of video info. In test_epoch_end, I write every video's prediction score to disk along with its video id. When I load these predictions back, there are 512 videos, and some of them share the same id. My batch size is 8, and I use DDP mode on multiple GPUs.

This is my test dataloader:

        dataset = VideoDataset(self.hparams, mode='val', transform=transform)
        return DataLoader(dataset, batch_size=self.batch_size,
                          num_workers=self.num_workers, pin_memory=True)
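
The manual workaround described above (saving predictions with their video id and computing accuracy afterwards) can be sketched roughly as follows. This is only an illustration: the record layout `(video_id, predicted_label, true_label)` and the function name `dedup_accuracy` are assumptions, not taken from the code in this issue.

```python
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int, int]  # (video_id, predicted_label, true_label), hypothetical layout


def dedup_accuracy(records: Iterable[Record]) -> float:
    """Accuracy over unique video ids, keeping the first prediction seen per id."""
    first_seen: Dict[str, Tuple[int, int]] = {}
    for video_id, pred, label in records:
        # setdefault ignores later duplicates of the same video id
        first_seen.setdefault(video_id, (pred, label))
    if not first_seen:
        raise ValueError("no predictions to score")
    correct = sum(1 for pred, label in first_seen.values() if pred == label)
    return correct / len(first_seen)


# Toy data: "v1" appears twice, as a padded duplicate would.
records = [("v1", 0, 0), ("v2", 1, 0), ("v3", 1, 1), ("v1", 0, 0)]

naive = sum(pred == label for _, pred, label in records) / len(records)
print(naive)                    # counts the duplicate of v1 twice
print(dedup_accuracy(records))  # scores each video id once
```

Keeping the first prediction per id is an arbitrary choice; since a duplicated sample goes through the same model, either copy should give the same score.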

@SkafteNicki
Member

Duplicate of #2398
The short answer is that DistributedSampler adds extra samples so that the load is even across all processes. This causes a slight bias in the metric value. Currently, you would need to run testing on a single GPU until we support uneven inputs in DDP (#3325).
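
To make the padding concrete, here is a simplified pure-Python model of how a distributed sampler can even out the load: it rounds the dataset up to a multiple of the process count by wrapping around to the first indices, then deals indices out round-robin. It is a sketch of the idea for the non-shuffled case, not the library's source, and the replica count of 4 is an illustrative assumption rather than the setup from this issue.

```python
import math
from typing import List


def shard_indices(dataset_len: int, num_replicas: int, rank: int) -> List[int]:
    """Indices one process would see (simplified model, no shuffling).

    Assumes dataset_len >= num_replicas so the wrap-around slice is long enough.
    """
    num_samples = math.ceil(dataset_len / num_replicas)  # samples per process
    total_size = num_samples * num_replicas              # padded total
    indices = list(range(dataset_len))
    indices += indices[: total_size - dataset_len]       # pad by repeating the head
    return indices[rank:total_size:num_replicas]         # round-robin split


shards = [shard_indices(499, 4, rank) for rank in range(4)]
total_seen = sum(len(shard) for shard in shards)
unique_seen = len({i for shard in shards for i in shard})
print(total_seen, unique_seen)  # more samples processed than unique samples exist
```

Every process gets the same number of samples, which is what keeps the processes in step, at the cost of a few repeated samples in the aggregated metric.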

Borda added the duplicate label and removed the information needed label Nov 18, 2020
@Borda
Member

Borda commented Nov 18, 2020

Closing in favor of #2398, so please continue the thread there 🐰

@Borda Borda closed this as completed Nov 18, 2020