
[PIR-Auto-Parallel] [cherry-pick] refactor refined recompute pass in PIR mode #70703

Closed

Conversation

waliwali777
Contributor

PR Category

Auto Parallel

PR Types

Performance

Description

CP from: #70064, #70521

This PR builds on the recompute pass (#69681) to implement refined recompute, which excludes selected ops inside a recompute layer from recomputation. The switch is enabled in code as follows:

import paddle.distributed as dist

strategy = dist.Strategy()
strategy._recompute.enable = True
# Each pattern selects main_ops to exclude from recomputation.
strategy._recompute.refined_ops_patterns = [
    {
        "main_ops": ["matmul"],   # ops excluded from recompute
        "num": -1,                # -1 excludes every matched main_op
        "pre_ops": ["multiply"],  # ops that must precede main_ops
        "suf_ops": [],            # ops that must follow main_ops
    }
]
...
model = dist.to_static(model, dist_loader, criterion, optimizer, strategy=strategy)

Within each layer segment, the pattern pre_ops + main_ops + suf_ops is matched against the topological order of the computation graph. pre_ops and suf_ops only assist in locating main_ops; among the matched main_ops, the first num are not recomputed in the backward pass, and num = -1 means all matched main_ops skip recomputation. A minimal sketch of this matching logic is shown below.
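For illustration only, here is a minimal sketch of the matching logic on a flat list of op names; the helper name match_refined_pattern and the list-based graph representation are hypothetical, not the actual pass code:

def match_refined_pattern(segment_ops, pattern):
    """Return indices of main_ops to exclude from recompute in one segment.

    segment_ops: op names in topological order, e.g. ["multiply", "matmul"].
    pattern: dict with "pre_ops", "main_ops", "suf_ops", and "num".
    """
    seq = pattern["pre_ops"] + pattern["main_ops"] + pattern["suf_ops"]
    matched = []
    for i in range(len(segment_ops) - len(seq) + 1):
        if segment_ops[i : i + len(seq)] == seq:
            # Only the main_ops portion of the matched window skips recompute.
            start = i + len(pattern["pre_ops"])
            matched.extend(range(start, start + len(pattern["main_ops"])))
    num = pattern["num"]
    # num == -1 means every matched main_op is excluded.
    return matched if num == -1 else matched[:num]

ops = ["multiply", "matmul", "add", "multiply", "matmul"]
pattern = {"main_ops": ["matmul"], "num": -1, "pre_ops": ["multiply"], "suf_ops": []}
print(match_refined_pattern(ops, pattern))  # -> [1, 4]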

The pass also performs an assert check on the number of segments: if recompute is enabled (strategy._recompute.enable = True) but the model code never wraps a layer with recompute(layer), the recompute pass raises an error.
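As a sketch of what the check expects, assuming the usual paddle.distributed.fleet.utils.recompute wrapper is what creates a recompute segment here (the model below is a made-up example, not from this PR):

import paddle
from paddle.distributed.fleet.utils import recompute

class Block(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.linear = paddle.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

class Model(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.block = Block()

    def forward(self, x):
        # Wrapping the layer call in recompute() is what produces the
        # segment the pass counts; without any such call the assert fires.
        return recompute(self.block, x)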

Other notes:
Added a refined recompute test in PaddleNLP: PaddlePaddle/PaddleNLP#9679
This implementation partially references the refined recompute implementation under the old IR: #58533

PCard-88114

paddle-bot (bot) commented Jan 8, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@waliwali777 waliwali777 closed this Jan 8, 2025
@waliwali777 waliwali777 reopened this Jan 8, 2025
@waliwali777 waliwali777 closed this Jan 8, 2025