
Knowledge distillation #95

Answered by leondgarse
jcsm89 asked this question in Q&A

  • The current implementation works for distilling either from a model trained on a larger dataset or from one trained on a similar dataset; it is not limited to one case.
  • The reason for distilling from the embedding layer is that most of the provided high-accuracy models, like those from insightface, do not include the output classifier layer. Yes, traditional distillation aims at minimizing the difference at the output classifier layer, but the current method just works; I myself didn't dig deep into which one is better either... A rough sketch of embedding-level distillation is shown below.
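
For illustration, here is a minimal sketch of what distilling at the embedding layer can look like in Keras. The `teacher`/`student` model names, the cosine-distance loss, and the `EmbeddingDistiller` wrapper are assumptions made for this example, not the repo's exact implementation; both models are assumed to output embeddings of the same dimension.

```python
import tensorflow as tf
from tensorflow import keras


def embedding_distill_loss(teacher_emb, student_emb):
    # Cosine distance between L2-normalized embeddings:
    # 0 when perfectly aligned, 2 when opposite.
    teacher_emb = tf.math.l2_normalize(teacher_emb, axis=-1)
    student_emb = tf.math.l2_normalize(student_emb, axis=-1)
    return 1.0 - tf.reduce_sum(teacher_emb * student_emb, axis=-1)


class EmbeddingDistiller(keras.Model):
    """Hypothetical wrapper: trains `student` to match a frozen `teacher`'s embeddings."""

    def __init__(self, teacher, student):
        super().__init__()
        self.teacher, self.student = teacher, student
        self.teacher.trainable = False  # the teacher only provides targets

    def train_step(self, data):
        # No classifier labels are needed: the teacher embedding is the target.
        images = data[0] if isinstance(data, (list, tuple)) else data
        teacher_emb = self.teacher(images, training=False)
        with tf.GradientTape() as tape:
            student_emb = self.student(images, training=True)
            loss = tf.reduce_mean(embedding_distill_loss(teacher_emb, student_emb))
        grads = tape.gradient(loss, self.student.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.student.trainable_variables))
        return {"distill_loss": loss}


# Usage sketch:
# distiller = EmbeddingDistiller(teacher, student)
# distiller.compile(optimizer=keras.optimizers.Adam(1e-3))
# distiller.fit(image_dataset, epochs=10)
```

By contrast, traditional classifier-level distillation would minimize a temperature-softened KL divergence between teacher and student logits, which is only possible when the teacher's classifier head is available.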

Replies: 1 comment

Answer selected by jcsm89