Twitter Issue: "triplet loss is flawed" #31

Open

nbstrong opened this issue May 28, 2019 · 2 comments

Comments

nbstrong commented May 28, 2019

https://twitter.com/alfcnz/status/1133372277876068352

Unfortunately that triplet loss is flawed. The most offending negative sample has zero gradient. That power of 2 should be a power of ½.
I feel bad so many people still use it. 😕 https://t.co/M3daSGzlMK

— Alfredo Canziani (@alfcnz) May 28, 2019

There's some discussion going on in the replies as well, but if there is a real issue with the loss it should be addressed here.
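
The tweet's point can be checked directly. Below is a minimal PyTorch sketch (illustrative, not code from this repo) comparing the gradient on a "most offending" negative, i.e. one that almost coincides with the anchor, under the two formulations:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
anchor = torch.randn(8)
positive = torch.randn(8)
# The most offending negative: it nearly coincides with the anchor.
negative = (anchor + 1e-4 * torch.randn(8)).requires_grad_()
margin = 1.0

# Squared-distance hinge: max(0, d(a,p)^2 - d(a,n)^2 + margin)
loss_sq = F.relu((anchor - positive).pow(2).sum()
                 - (anchor - negative).pow(2).sum() + margin)
grad_sq, = torch.autograd.grad(loss_sq, negative)

# Plain-distance hinge: max(0, d(a,p) - d(a,n) + margin)
loss_d = F.relu((anchor - positive).norm()
                - (anchor - negative).norm() + margin)
grad_d, = torch.autograd.grad(loss_d, negative)

print(grad_sq.norm())  # tiny (~1e-3): the push on this negative vanishes
print(grad_d.norm())   # 1.0: unit-norm push no matter how close it is
```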

@adambielski
Owner

Yes, I'm aware; I commented on the thread as well.
The implementation is technically correct: it follows the loss formulation from the papers.
But if we look at the gradients, it can indeed be problematic and suboptimal.
Even though this formulation seems to work in practice in many cases, users should be aware of the potential issue. I'll add a clarification and some loss alternatives.
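
For reference, here is a sketch of what such alternatives could look like (illustrative names and signatures, not this repo's actual API): the hinge on plain Euclidean distances, and the soft-margin variant from Hermans et al., "In Defense of the Triplet Loss for Person Re-Identification" (2017), which swaps the hinge for a softplus so the gradient never becomes exactly zero:

```python
import torch
import torch.nn.functional as F

def triplet_hinge_euclidean(anchor, positive, negative, margin=1.0, eps=1e-8):
    # Plain (unsquared) distances; eps keeps the sqrt gradient finite at 0.
    d_ap = ((anchor - positive).pow(2).sum(dim=1) + eps).sqrt()
    d_an = ((anchor - negative).pow(2).sum(dim=1) + eps).sqrt()
    return F.relu(d_ap - d_an + margin).mean()

def triplet_soft_margin(anchor, positive, negative, eps=1e-8):
    # log(1 + exp(d_ap - d_an)): no hard cutoff, so even "easy" triplets
    # keep a small gradient, and close negatives are still pushed away.
    d_ap = ((anchor - positive).pow(2).sum(dim=1) + eps).sqrt()
    d_an = ((anchor - negative).pow(2).sum(dim=1) + eps).sqrt()
    return F.softplus(d_ap - d_an).mean()
```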

jonkoi commented Oct 7, 2019

Hi,

From what I understood from the Twitter discussion, the power of ½ creates a stronger push (gradient) on negatives when they are close to the anchor. Is that correct?

Moreover, what's the point of the margin when, from what I understand, it is zeroed out in the gradient calculation?

Thanks
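
A short derivation makes both points concrete (a sketch, with $a$, $p$, $n$ denoting the anchor, positive, and negative embeddings, and $d_{ap} = \|a-p\|$, $d_{an} = \|a-n\|$). With squared distances, the push on the negative shrinks linearly as it approaches the anchor:

$$\frac{\partial}{\partial n}\left(-\|a-n\|^2\right) = 2\,(a-n), \qquad \left\|2\,(a-n)\right\| = 2\,d_{an} \to 0 \ \text{as}\ n \to a,$$

whereas with the plain distance (the "power of ½" of the squared distance) the push is a unit vector, the same no matter how close the negative is:

$$\frac{\partial}{\partial n}\left(-\|a-n\|\right) = \frac{a-n}{\|a-n\|}.$$

So yes: the ½ power gives hard (close) negatives a gradient that does not vanish. As for the margin: $m$ indeed disappears from the gradient of $\max(0,\; d_{ap} - d_{an} + m)$ wherever the hinge is active, but it controls *which* triplets are active at all. Only triplets with $d_{an} < d_{ap} + m$ contribute a nonzero loss and gradient, so the margin decides how far the negative must be pushed beyond the positive before training stops caring about that triplet.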
