batch_labels = np.zeros_like(batch_token_ids[:, :1]): batch_labels are all 0, so shouldn't the similarity matrix y_pred between batch_token_ids and the dropout version of batch_token_ids be 1? #3

Open
lonngxiang opened this issue Jun 4, 2021 · 4 comments

Comments

@lonngxiang

No description provided.

@bojone
Owner

bojone commented Jul 22, 2021

When did I ever use batch_labels?

@lonngxiang
Author

> When did I ever use batch_labels?

It is in the corpus sample generation code.
image

@bojone
Owner

bojone commented Jul 23, 2021

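My question is where batch_labels is used, not where it is created. Creating it does not mean it is used; it may just be a placeholder.

Since you have not found where batch_labels is used, how can you so confidently assert "shouldn't the similarity matrix y_pred be 1"?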
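
For reference, a minimal sketch of what the line quoted in the issue title builds. The batch contents and sizes below are made up for illustration; only the `np.zeros_like(batch_token_ids[:, :1])` expression comes from the thread. It yields one zero per sequence and nothing more, which is consistent with it being a placeholder.

```python
import numpy as np

# Hypothetical batch: 4 sequences of 6 token ids each (values are made up).
batch_token_ids = np.arange(24).reshape(4, 6)

# The expression quoted in the issue title: slice the first column,
# then build an all-zero array with the same shape and dtype.
batch_labels = np.zeros_like(batch_token_ids[:, :1])

print(batch_labels.shape)    # (4, 1)
print(batch_labels.ravel())  # [0 0 0 0]
```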

@JaylenLau

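batch_labels, as data labels, would normally be used in the loss function. In this program, however, the y_true inside simcse_loss is generated from y_pred, so the data labels are in fact never used to compute the loss. That is also why the author describes this in his blog as unsupervised semantic similarity. If the data labels were used, it would be supervised.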
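
The point above can be sketched in NumPy. This is not the repository's simcse_loss, only an illustration under stated assumptions: `y_pred` is taken to hold sentence embeddings, adjacent rows (0 and 1, 2 and 3, ...) are assumed to be two dropout passes of the same sentence, and the name `simcse_loss_sketch` and the `scale` value are hypothetical. The targets are derived purely from the row count of `y_pred`, so whatever is passed as `y_true` has no effect on the result.

```python
import numpy as np

def simcse_loss_sketch(y_true, y_pred, scale=20.0):
    """Illustrative SimCSE-style loss; y_true is accepted but never read."""
    n = y_pred.shape[0]
    idxs = np.arange(n)
    # Assumed layout: row i is paired with its neighbour (0<->1, 2<->3, ...).
    partner = idxs + 1 - idxs % 2 * 2
    # Cosine similarity between every pair of rows.
    z = y_pred / np.linalg.norm(y_pred, axis=1, keepdims=True)
    sim = z @ z.T
    # Mask the diagonal so a row cannot be matched with itself.
    np.fill_diagonal(sim, -1e12)
    logits = sim * scale
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against the partner index of each row.
    return float(-log_probs[idxs, partner].mean())

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))  # made-up embeddings for 2 sentences x 2 passes

loss_zeros = simcse_loss_sketch(np.zeros((4, 1)), emb)
loss_ones = simcse_loss_sketch(np.ones((4, 1)), emb)
print(loss_zeros == loss_ones)  # True
```

Under these assumptions, an all-zero batch_labels says nothing about what the similarities should be: the training signal comes only from which rows of `y_pred` are paired.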

3 participants