
Multi-scale feature extraction for image retrieval? #90

Closed
woctezuma opened this issue Jul 16, 2021 · 4 comments


woctezuma commented Jul 16, 2021

Following #72, I have noticed that there is some code for multi-scale feature extraction:

dino/eval_knn.py

Lines 101 to 104 in ba9edd1

```python
if multiscale:
    feats = utils.multi_scale(samples, model)
else:
    feats = model(samples).clone()
```

where three scales are used: 1, 1/sqrt(2), and 1/2.

dino/utils.py

Lines 795 to 809 in ba9edd1

```python
def multi_scale(samples, model):
    v = None
    for s in [1, 1 / 2 ** (1 / 2), 1 / 2]:  # we use 3 different scales
        if s == 1:
            inp = samples.clone()
        else:
            inp = nn.functional.interpolate(samples, scale_factor=s, mode='bilinear', align_corners=False)
        feats = model(inp).clone()
        if v is None:
            v = feats
        else:
            v += feats
    v /= 3
    v /= v.norm()
    return v
```
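As a quick self-contained check of what this averaging does, here is a minimal sketch. The `torch` import and the toy "model" (global average pooling standing in for a DINO backbone) are illustrative assumptions, not part of the repo:

```python
# Minimal sketch of the multi-scale feature averaging above.
# The toy "model" (global average pooling) is an assumption standing in
# for the real DINO backbone.
import torch
import torch.nn as nn

def multi_scale(samples, model):
    v = None
    for s in [1, 1 / 2 ** (1 / 2), 1 / 2]:  # three scales: 1, 1/sqrt(2), 1/2
        if s == 1:
            inp = samples.clone()
        else:
            inp = nn.functional.interpolate(
                samples, scale_factor=s, mode='bilinear', align_corners=False)
        feats = model(inp).clone()
        v = feats if v is None else v + feats
    v /= 3
    v /= v.norm()  # L2-normalize the averaged descriptor
    return v

# Toy feature extractor: average over H and W, giving (batch, channels) features.
model = lambda x: x.mean(dim=(2, 3))

samples = torch.randn(2, 3, 64, 64)
feats = multi_scale(samples, model)
print(feats.shape)  # torch.Size([2, 3])
```

Note that the final `v /= v.norm()` normalizes the whole tensor, so the averaged descriptor ends up with unit Frobenius norm.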

It looks like this piece of code was committed for image retrieval rather than k-NN, even though it appears in the k-NN evaluation script.

However, I don't see it mentioned in the paper. Does it lead to better results for image retrieval?

Contributor

mathildecaron31 commented Jul 19, 2021

Hi @woctezuma

We have not tried multi-scale feature extraction for k-NN.
From my understanding, though, it is standard practice in image retrieval and indeed leads to better results.

See https://github.com/filipradenovic/cnnimageretrieval-pytorch/blob/master/cirtorch/networks/imageretrievalnet.py#L309-L324 and https://github.com/filipradenovic/cnnimageretrieval-pytorch/blob/c5368dfbbfe0286f536e374a4a35ff89578ef2e5/cirtorch/examples/test.py#L53-L55 for example.
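For context, a multi-scale descriptor is consumed exactly like a single-scale one: since it is L2-normalized, retrieval reduces to a dot product (cosine similarity) and a sort. A minimal sketch, where the descriptor dimension (384) and the random data are placeholders rather than anything from the linked repos:

```python
# Hedged sketch: ranking database images against a query using
# L2-normalized descriptors. Dimensions and data are illustrative only.
import torch

torch.manual_seed(0)
db = torch.randn(100, 384)               # 100 database descriptors (dim assumed)
db = db / db.norm(dim=1, keepdim=True)   # L2-normalize each row
q = torch.randn(384)
q = q / q.norm()

scores = db @ q                          # cosine similarities
ranking = scores.argsort(descending=True)
print(ranking[:5])                       # indices of the top-5 matches
```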

Author

woctezuma commented Jul 19, 2021

Thank you for your answer.

To be clear, are the image-retrieval results shown in the paper obtained with a single scale or with multiple scales?
Maybe it is mentioned in the paper, but I could not find it with Ctrl+F.

If multi-scale was used in the paper, you might want to set the default value to True here:

```python
parser.add_argument('--multiscale', default=False, type=utils.bool_flag)
```
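For readers unfamiliar with `utils.bool_flag`: it is an argparse type converter that turns strings like `"true"`/`"false"` into booleans. A hedged sketch of what such a helper typically looks like (details may differ from the actual dino/utils.py implementation):

```python
# Sketch of an argparse boolean-flag converter; the accepted spellings
# here are an assumption, not necessarily those of dino/utils.py.
import argparse

TRUTHY = {"on", "true", "1"}
FALSY = {"off", "false", "0"}

def bool_flag(s):
    """Parse a boolean argument from the command line."""
    if s.lower() in TRUTHY:
        return True
    if s.lower() in FALSY:
        return False
    raise argparse.ArgumentTypeError("invalid value for a boolean flag")

parser = argparse.ArgumentParser()
parser.add_argument('--multiscale', default=False, type=bool_flag)
args = parser.parse_args(['--multiscale', 'true'])
print(args.multiscale)  # True
```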

Contributor

mathildecaron31 commented Jul 19, 2021

Paris is with multi-scale, Oxford is without.
https://github.com/facebookresearch/dino#evaluation-image-retrieval-on-revisited-oxford-and-paris

Yes, it is true that I did not provide many implementation details for this image-retrieval benchmark in the paper, but hopefully the implementation in this repo provides all the details users need to replicate the published numbers.

Author

woctezuma commented Jul 19, 2021

Thank you. I had not noticed that the information was present in the command-line examples!
