RuntimeError: _scaled_dot_product_attention: Explicit attn_mask should not be set when is_causal=True #453

@magengyu123

I downloaded the latest code and model as-is, and running eval_model.py as a test fails with the following error:

......
File "E:\anaconda3\envs\minimind\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\anaconda3\envs\minimind\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "e:\git\minimind-v\minimind\model\model_minimind.py", line 180, in forward
output = F.scaled_dot_product_attention(xq, xk, xv, attn_mask=attn_mask, dropout_p=dropout_p, is_causal=True)
RuntimeError: _scaled_dot_product_attention: Explicit attn_mask should not be set when is_causal=True

The line responsible is the one below. Changing is_causal to False makes it run normally, but I don't know whether that will affect training.
output = F.scaled_dot_product_attention(xq, xk, xv, attn_mask=attn_mask, dropout_p=dropout_p, is_causal=True)
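
One possible way to keep causal masking while still passing an explicit mask is to fold the causal pattern into attn_mask and call with is_causal=False. Simply flipping the flag would drop causal masking whenever attn_mask does not already encode it. The sketch below is not the project's actual fix: the helper name causal_sdpa is hypothetical, and it assumes attn_mask is a boolean tensor (True means "may attend") that broadcasts to (batch, heads, q_len, k_len).

```python
import torch
import torch.nn.functional as F


def causal_sdpa(xq, xk, xv, attn_mask=None, dropout_p=0.0):
    """Hypothetical helper: causal attention that also accepts an explicit mask.

    Assumption: attn_mask is boolean, True = "may attend", and broadcasts
    to (batch, heads, q_len, k_len).
    """
    if attn_mask is None:
        # No explicit mask, so PyTorch can build the causal mask itself.
        return F.scaled_dot_product_attention(
            xq, xk, xv, dropout_p=dropout_p, is_causal=True
        )
    q_len, k_len = xq.size(-2), xk.size(-2)
    # Lower-triangular (top-left aligned) causal pattern.
    causal = torch.ones(q_len, k_len, dtype=torch.bool, device=xq.device).tril()
    # A position is visible only if both masks allow it.
    combined = attn_mask & causal
    return F.scaled_dot_product_attention(
        xq, xk, xv, attn_mask=combined, dropout_p=dropout_p, is_causal=False
    )
```

With an all-True attn_mask this should match the plain is_causal=True call, which is a quick way to check that the merged mask preserves causal behavior.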
