Commit
## Background

When the number of ranks is large, the master compiles the task nodes of every rank:

- compilation is sequential, so it is slow;
- the plan is large (possibly more than 2 GB), and the total amount of data to send can reach hundreds of GB, so transfer is too slow.

Each rank therefore has to compile its own execution plan independently.

## Evaluation data

- Simulated n-card data parallelism: Oneflow-Inc/OneTeam#1679 (comment)
- Measured results: Oneflow-Inc/OneTeam#1944

## Summary of the implementation approach

Oneflow-Inc/OneTeam#1791

---------

Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: ZZK <359521840@qq.com>
Co-authored-by: guo-ran <360112263@qq.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: Ping Zhu <58718936+reygu@users.noreply.github.com>
Co-authored-by: Wang Yi <53533850+marigoold@users.noreply.github.com>
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: Yao Chi <later@usopp.net>
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
Co-authored-by: Luyang <flowingsun007@163.com>
Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: Shiyuan Shangguan <shiyuan@oneflow.org>
Co-authored-by: yuhao <72971170+howin98@users.noreply.github.com>
Co-authored-by: jackalcooper <jackalcooper@gmail.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Zhimin Yang <76760002+small1945@users.noreply.github.com>
Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: Dongche Zhang <zhang2000dc@gmail.com>
Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: Peihong Liu <mosout@qq.com>
Co-authored-by: Yipeng Li <jamesonli1313@gmail.com>
Co-authored-by: wyg1997 <wangyinggang@foxmail.com>
Co-authored-by: Liang Depeng <liangdepeng@gmail.com>
Co-authored-by: Yu OuYang <xuanjiuye@gmail.com>
Co-authored-by: WangYi <buaawangyi03@gmail.com>
Co-authored-by: rejoicesyc <47683675+rejoicesyc@users.noreply.github.com>
Co-authored-by: songyicheng <int.rejoice@gmail.com>
Co-authored-by: QI JUN <qijun1994@hotmail.com>
Co-authored-by: zhaoyongke <zhaoyongke@yeah.net>
Co-authored-by: JiaKui Hu <hjk1938927583@163.com>
Co-authored-by: cheng cheng <472491134@qq.com>
1 parent f72ebf6, commit ae52678
Showing 37 changed files with 849 additions and 243 deletions.
@@ -0,0 +1,64 @@
/*
Copyright 2020 The OneFlow Authors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
#include "oneflow/core/job/compile_mode.h"
#include "oneflow/core/common/env_var/env_var.h"
#include "oneflow/core/common/util.h"
#include "oneflow/core/common/container_util.h"

namespace oneflow {

namespace {

struct CompileModeName final : public CompileModeVisitor<CompileModeName> {
  static std::string VisitNaive() { return "naive"; }
  static std::string VisitRankPerProcess() { return "rank_per_process"; }
  static std::string VisitInValid() { return "invalid"; }
};

std::unordered_map<std::string, CompileMode> Name2CompileMode() {
  std::unordered_map<std::string, CompileMode> name2compile_mode;
  for (int i = static_cast<int>(CompileMode::kInvalid) + 1;
       i != static_cast<int>(CompileMode::kEnd); ++i) {
    CompileMode compile_mode = static_cast<CompileMode>(i);
    CHECK(name2compile_mode.emplace(CompileModeName::Visit(compile_mode), compile_mode).second);
  }
  return name2compile_mode;
}

std::string GetValidCompileModeNames() {
  std::stringstream ss;
  for (int i = static_cast<int>(CompileMode::kInvalid) + 1;
       i != static_cast<int>(CompileMode::kEnd); ++i) {
    if (i > static_cast<int>(CompileMode::kInvalid) + 1) { ss << ", "; }
    CompileMode compile_mode = static_cast<CompileMode>(i);
    ss << CompileModeName::Visit(compile_mode);
  }
  return ss.str();
}

}  // namespace

Maybe<CompileMode> CurrentCompileMode() {
  static thread_local CompileMode mode =
      JUST_MSG(MapAt(Name2CompileMode(), ThreadLocalEnvString<ONEFLOW_LAZY_COMPILE_MODE>()),
               std::stringstream()
                   << "ONEFLOW_LAZY_COMPILE_MODE(value: "
                   << ThreadLocalEnvString<ONEFLOW_LAZY_COMPILE_MODE>()
                   << ") is invalid. valid options: \"" << GetValidCompileModeNames() << "\"");
  return mode;
}

}  // namespace oneflow
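Review note: the name lookup above can be tried outside OneFlow with a minimal sketch like the one below. It is not the code from this commit. `ModeName`, `BuildNameTable`, and `ParseCompileMode` are hypothetical stand-ins, the enum is copied locally, and a plain `bool` result replaces the `Maybe<>` error path, so treat it as an illustration of the sentinel-bounded enum walk only.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Local copy of the enum from the diff; kInvalid and kEnd are sentinels.
enum class CompileMode { kInvalid = 0, kNaive, kRankPerProcess, kEnd };

// Hypothetical helper returning the same names as CompileModeName in the diff.
inline std::string ModeName(CompileMode mode) {
  switch (mode) {
    case CompileMode::kNaive: return "naive";
    case CompileMode::kRankPerProcess: return "rank_per_process";
    default: return "invalid";
  }
}

// Builds the name -> mode table by walking every value strictly between
// the two sentinels, the same loop shape as Name2CompileMode().
inline std::unordered_map<std::string, CompileMode> BuildNameTable() {
  std::unordered_map<std::string, CompileMode> table;
  for (int i = static_cast<int>(CompileMode::kInvalid) + 1;
       i != static_cast<int>(CompileMode::kEnd); ++i) {
    const CompileMode mode = static_cast<CompileMode>(i);
    const bool inserted = table.emplace(ModeName(mode), mode).second;
    assert(inserted);  // each mode needs a unique name
    (void)inserted;
  }
  return table;
}

// Hypothetical parser: returns false for an unknown name (assumption: the
// real code reports this through Maybe<> with an error message instead).
inline bool ParseCompileMode(const std::string& name, CompileMode* mode) {
  static const std::unordered_map<std::string, CompileMode> table = BuildNameTable();
  const auto it = table.find(name);
  if (it == table.end()) { return false; }
  *mode = it->second;
  return true;
}
```

Because the table is built from the enum range, a mode added between `kInvalid` and `kEnd` is picked up by the loop automatically; only the name mapping has to be extended. Note that `"invalid"` itself is never a valid option, since `kInvalid` is excluded from the walk.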
@@ -0,0 +1,50 @@
/*
Copyright 2020 The OneFlow Authors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
#ifndef ONEFLOW_CORE_JOB_COMPILE_MODE_H_
#define ONEFLOW_CORE_JOB_COMPILE_MODE_H_

#include "oneflow/core/common/maybe.h"

namespace oneflow {

enum class CompileMode {
  kInvalid = 0,  // make sure kInvalid is the first CompileMode
  kNaive,
  kRankPerProcess,
  kEnd,  // make sure kEnd is the last CompileMode
};

template<typename DerivedT>
struct CompileModeVisitor {
  template<typename... Args>
  static auto Visit(CompileMode compile_mode, Args&&... args) {
    switch (compile_mode) {
      case CompileMode::kNaive: return DerivedT::VisitNaive(std::forward<Args>(args)...);
      case CompileMode::kRankPerProcess:
        return DerivedT::VisitRankPerProcess(std::forward<Args>(args)...);
      default: {
        LOG(FATAL) << "invalid compile mode";
        return DerivedT::VisitInValid(std::forward<Args>(args)...);
      }
    }
  }
};

Maybe<CompileMode> CurrentCompileMode();

}  // namespace oneflow

#endif  // ONEFLOW_CORE_JOB_COMPILE_MODE_H_
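Review note: `CompileModeVisitor` is a static (CRTP-style) dispatcher, so new per-mode behavior is added by writing another derived struct with the three `Visit*` functions. The sketch below is a standalone approximation, not code from this commit. `WhoCompiles` is a hypothetical visitor, the `LOG(FATAL)` in the default branch is dropped so the fallback can be observed, and the description strings assume, based only on the commit message, that `naive` is the master-compiles-everything path.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Local copy of the enum from the diff.
enum class CompileMode { kInvalid = 0, kNaive, kRankPerProcess, kEnd };

// Simplified copy of the dispatcher: the default branch returns
// VisitInValid directly instead of aborting via LOG(FATAL).
template<typename DerivedT>
struct CompileModeVisitor {
  template<typename... Args>
  static auto Visit(CompileMode compile_mode, Args&&... args) {
    switch (compile_mode) {
      case CompileMode::kNaive: return DerivedT::VisitNaive(std::forward<Args>(args)...);
      case CompileMode::kRankPerProcess:
        return DerivedT::VisitRankPerProcess(std::forward<Args>(args)...);
      default: return DerivedT::VisitInValid(std::forward<Args>(args)...);
    }
  }
};

// Hypothetical visitor: extra arguments passed to Visit are forwarded to
// the selected Visit* function. All three must return the same type so
// that the deduced return type of Visit is consistent.
struct WhoCompiles final : public CompileModeVisitor<WhoCompiles> {
  static std::string VisitNaive(int rank) {
    // Assumption: "naive" denotes the original master-compiles-all behavior.
    return "master compiles the plan for rank " + std::to_string(rank);
  }
  static std::string VisitRankPerProcess(int rank) {
    return "rank " + std::to_string(rank) + " compiles its own plan";
  }
  static std::string VisitInValid(int /*rank*/) { return "invalid compile mode"; }
};
```

A call such as `WhoCompiles::Visit(CompileMode::kRankPerProcess, 3)` resolves at compile time to `WhoCompiles::VisitRankPerProcess(3)`, with no virtual dispatch involved. The sketch needs C++14 or later for the deduced `auto` return type.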