When I tried to train with flash-attn, I got the error RuntimeError: cu_seqlens_q must have shape (batch_size + 1). After investigating: in transformers/modeling_qwen2_audio.py, the attention_mask passed to Qwen2AudioEncoderLayer has shape (batch, 1, tgt_len, src_len), while transformers/modeling_flash_attention_utils.py expects an attention_mask of shape (batch_size, seq_len). This causes a dimension mismatch when computing seq_lens: a mask yielding batch_size + 1 cumulative lengths is needed, but instead the result has shape (batch_size, 1, tgt_len + 1).
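To illustrate the shape mismatch described above, here is a minimal sketch of how cumulative sequence lengths are typically derived from a 2-D padding mask for flash-attn's varlen path. The function name and pure-Python representation are illustrative, not the actual library code; the real utility operates on tensors.

```python
def cu_seqlens_from_mask(attention_mask):
    """attention_mask: nested lists of 0/1 padding flags, shape (batch_size, seq_len).
    Returns cumulative sequence lengths, which must have batch_size + 1 entries."""
    cu = [0]
    for row in attention_mask:
        # each row contributes one cumulative offset: total valid tokens so far
        cu.append(cu[-1] + sum(row))
    return cu

# 2-D mask (batch_size=2, seq_len=4): two sequences of length 3 and 2
mask_2d = [[1, 1, 1, 0],
           [1, 1, 0, 0]]
print(cu_seqlens_from_mask(mask_2d))  # [0, 3, 5] -> batch_size + 1 = 3 entries
```

If a 4-D mask of shape (batch, 1, tgt_len, src_len) is fed into this kind of reduction instead, the leading dimensions are wrong and the resulting cu_seqlens no longer has length batch_size + 1, which is consistent with the RuntimeError reported above.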
+1
Hi, have you managed to solve this issue?
I don't quite understand why qwen2audio requires a specific audio attention mask. Looking at the Whisper source code, this mask is actually unused there.