APIs
```python
import torch.nn as nn

from pipegoose.nn.expert_parallel import ExpertParallel, ExpertLoss
# ParallelContext import path assumed from pipegoose's distributed module
from pipegoose.distributed import ParallelContext

parallel_context = ParallelContext.from_torch(expert_parallel_size=8)

mlp = CustomExpert()
router = CustomRouter()
noise_policy = CustomNoisePolicy()
loss_func = nn.CrossEntropyLoss()  # fixed: PyTorch has no nn.CrossEntropy

model = ExpertParallel(
    model,
    expert=mlp,
    router=router,
    noise_policy=noise_policy,
    enable_tensor_parallelism=True,
    parallel_context=parallel_context,
).parallelize()
loss_func = ExpertLoss(loss_func, aux_weight=0.1)
```
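`CustomExpert`, `CustomRouter`, and `CustomNoisePolicy` above are user-supplied modules. A minimal sketch of what they could look like (the layer sizes, the single-linear gate, and the noise scale are illustrative assumptions, not pipegoose interfaces):

```python
import torch
import torch.nn as nn

class CustomExpert(nn.Module):
    """A single feed-forward expert (the block that gets replicated per expert)."""
    def __init__(self, d_model=768, d_ff=3072):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)

class CustomRouter(nn.Module):
    """Maps each token to a vector of logits over experts (the gate)."""
    def __init__(self, d_model=768, num_experts=8):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts, bias=False)

    def forward(self, x):
        return self.gate(x)  # (batch, seq, num_experts) routing logits

class CustomNoisePolicy:
    """Adds exploration noise to the router logits during training."""
    def __call__(self, logits):
        return logits + torch.randn_like(logits) * 0.1
```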
TODOs
- Top-1, Top-2 router (see the routing sketch after this list)
- `ExpertParallel` (turn a 🤗 `transformers` model into a MoE automatically)
- Does the expert embedding need to be multiplied by its corresponding router probability?
- Make `ExpertParallel` work with data parallelism
- Optionally apply tensor parallelism to an expert layer
- Make `ExpertParallel` work with pipeline parallelism
- Make `ExpertParallel` work with ZeRO-1
- Loss function (including aux and z losses; see the loss sketch after this list)
- Move inputs to the target expert's device
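The top-1/top-2 router item and the router-probability question are easiest to pin down with a concrete sketch. Below is a minimal illustration of top-k routing in the Switch Transformer/GShard style: the router picks k experts per token, and each selected expert's output is weighted by its (re-normalized) router probability, which is also how the gate receives gradients. Function names and tensor shapes here are assumptions for illustration, not pipegoose APIs.

```python
import torch
import torch.nn.functional as F

def top_k_route(router_logits, k=2):
    """Select the top-k experts per token and return their normalized probabilities.

    router_logits: (num_tokens, num_experts)
    Returns (top_k_probs, top_k_indices), each of shape (num_tokens, k).
    """
    probs = F.softmax(router_logits, dim=-1)
    top_k_probs, top_k_indices = probs.topk(k, dim=-1)
    # Re-normalize so the selected experts' weights sum to 1 per token.
    top_k_probs = top_k_probs / top_k_probs.sum(dim=-1, keepdim=True)
    return top_k_probs, top_k_indices

def combine_expert_outputs(expert_outputs, top_k_probs):
    """Weight each selected expert's output by its router probability and sum.

    expert_outputs: (num_tokens, k, d_model)
    top_k_probs:    (num_tokens, k)
    """
    return (expert_outputs * top_k_probs.unsqueeze(-1)).sum(dim=1)
```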
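For the loss-function item, here is a hedged sketch of the two standard MoE regularizers the TODO refers to: the Switch-Transformer-style load-balancing auxiliary loss and the ST-MoE router z-loss. The helper names and shapes are assumptions; how `ExpertLoss` actually combines them with `aux_weight` is up to the implementation.

```python
import torch
import torch.nn.functional as F

def load_balancing_aux_loss(router_logits, expert_indices, num_experts):
    """Switch-Transformer-style auxiliary loss that encourages balanced expert usage.

    router_logits:  (num_tokens, num_experts)
    expert_indices: (num_tokens,) top-1 expert assignment per token
    """
    probs = F.softmax(router_logits, dim=-1)
    # Fraction of tokens dispatched to each expert.
    dispatch_fraction = F.one_hot(expert_indices, num_experts).float().mean(dim=0)
    # Mean router probability assigned to each expert.
    prob_fraction = probs.mean(dim=0)
    return num_experts * torch.sum(dispatch_fraction * prob_fraction)

def router_z_loss(router_logits):
    """ST-MoE z-loss: penalizes large router logits to keep the gate numerically stable."""
    return torch.logsumexp(router_logits, dim=-1).pow(2).mean()

# total_loss = task_loss + aux_weight * aux_loss + z_weight * z_loss
```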
Engineering Reading
MoE Reading