Replies: 2 comments 1 reply
-
Qwen is a decoder-only language model (that is how people usually describe it, though it is not very rigorous: architecturally it is really the encoder part of the original Transformer paper, just trained auto-regressively, i.e. with a causal mask so self-attention can only see tokens before the current position, never those after it). The hidden states just before the final output layer can be treated as embeddings, but they may not be well suited for embedding-style tasks. There are some papers worth referencing here; for example, OpenAI has one that starts from a language model (GPT-3/Codex) and continues training it with contrastive learning to obtain an embedding model: https://cdn.openai.com/papers/Text_and_Code_Embeddings_by_Contrastive_Pre_Training.pdf Happy to discuss!
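For anyone who wants to inspect those hidden states directly, here is a minimal sketch (not Qwen's official embedding recipe) using Hugging Face transformers. The checkpoint name is an assumption; substitute whichever Qwen variant you actually use:

```python
# Minimal sketch: pull the last hidden layer out of a causal LM and treat it
# as token-level embeddings. Checkpoint name below is a hypothetical choice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-7B"  # assumption: any causal-LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
model.eval()

inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    # output_hidden_states=True exposes every layer's hidden states;
    # hidden_states[-1] is the layer right before the LM head.
    outputs = model(**inputs, output_hidden_states=True)

last_hidden = outputs.hidden_states[-1]  # shape: (batch, seq_len, hidden_dim)
```

As the comment above notes, these raw hidden states are usable but typically weaker than vectors from a model fine-tuned with a contrastive objective.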
0 replies
-
Logically, given an input sentence, you could get a sentence embedding by locating the sentence's end-of-sequence marker (e.g. eos) and using the hidden state at that position as the vector for the whole sentence. But concretely, how do you extract the embedding at the eos position? (A sketch of one approach follows below.)
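One straightforward way, sketched here with Hugging Face transformers (the checkpoint name is an assumption): append the eos token yourself, since many causal-LM tokenizers do not add it automatically, run a forward pass with output_hidden_states=True, and index the last hidden layer at the final position.

```python
# Sketch: sentence vector = hidden state of the eos position.
# Checkpoint name is a hypothetical choice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-7B"  # assumption
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
model.eval()

def eos_embedding(text: str) -> torch.Tensor:
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    # Many causal-LM tokenizers do NOT append eos automatically, so add it here.
    eos = torch.tensor([[tokenizer.eos_token_id]])
    ids = torch.cat([ids, eos], dim=1)
    with torch.no_grad():
        out = model(input_ids=ids, output_hidden_states=True)
    # Hidden state at the final (eos) position = candidate sentence vector.
    return out.hidden_states[-1][0, -1]
```

For batched, padded inputs you would instead use the attention mask to locate the last real token of each sequence rather than blindly taking index -1.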
1 reply
-
I want to use the model to generate embeddings and then compute the similarity between two pieces of text.
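A hedged end-to-end sketch of that pipeline, again with a hypothetical Qwen checkpoint: here mean pooling over the last hidden layer (a common alternative to the eos pooling discussed above) produces each embedding, and cosine similarity scores the pair.

```python
# Sketch: embed two texts with mean pooling over the last hidden layer,
# then compare them with cosine similarity. Checkpoint name is an assumption.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-7B"  # assumption
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
model.eval()

def embed(text: str) -> torch.Tensor:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, output_hidden_states=True)
    hidden = out.hidden_states[-1]              # (1, seq_len, dim)
    mask = enc["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
    # Average over real (non-padding) tokens only.
    return (hidden * mask).sum(1) / mask.sum(1)

sim = F.cosine_similarity(embed("The cat sits on the mat."),
                          embed("A cat is resting on a rug.")).item()
print(f"cosine similarity: {sim:.4f}")
```

As the first comment points out, scores from a raw causal LM pooled this way are usually noisier than those from a dedicated embedding model trained contrastively.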