
[CodeGen]Support matmul codegen and runtime in mlir jit #10283

Merged
howin98 merged 13 commits into master from support-matmul-in-mlir-jit on May 30, 2023


@howin98 howin98 commented May 24, 2023

This PR adds support for running the matmul op in the MLIR JIT. It also fixes some of the robustness issues in the codegen stages that follow the GPU dialect.

@howin98 howin98 enabled auto-merge (squash) May 24, 2023 03:22
@github-actions

CI failed when running job: Build cpu. PR label automerge has been removed

howin98 and others added 2 commits May 24, 2023 15:21
@github-actions

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@howin98 howin98 force-pushed the support-matmul-in-mlir-jit branch from 14ec1ee to dd2b306 Compare May 30, 2023 02:06
@github-actions

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@github-actions

CI failed when running job: Build cpu. PR label automerge has been removed

@github-actions

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/10283/

@github-actions

Speed stats:
GPU Name: NVIDIA GeForce RTX 3090 

❌ OneFlow resnet50 time: 43.7ms (= 4374.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.3ms (= 5729.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.31 (= 57.3ms / 43.7ms)

OneFlow resnet50 time: 26.2ms (= 2624.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 37.3ms (= 3727.7ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.42 (= 37.3ms / 26.2ms)

OneFlow resnet50 time: 19.5ms (= 3894.0ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 34.8ms (= 6960.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.79 (= 34.8ms / 19.5ms)

OneFlow resnet50 time: 19.2ms (= 3840.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.6ms (= 6316.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.64 (= 31.6ms / 19.2ms)

OneFlow resnet50 time: 17.7ms (= 3530.4ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.3ms (= 5869.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.66 (= 29.3ms / 17.7ms)

OneFlow swin dataloader time: 0.200s (= 39.942s / 200, num_workers=1)
PyTorch swin dataloader time: 0.130s (= 25.953s / 200, num_workers=1)
Relative speed: 0.650 (= 0.130s / 0.200s)

OneFlow swin dataloader time: 0.057s (= 11.316s / 200, num_workers=4)
PyTorch swin dataloader time: 0.032s (= 6.470s / 200, num_workers=4)
Relative speed: 0.572 (= 0.032s / 0.057s)

OneFlow swin dataloader time: 0.031s (= 6.117s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.338s / 200, num_workers=8)
Relative speed: 0.546 (= 0.017s / 0.031s)

❌ OneFlow resnet50 time: 48.5ms (= 4853.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 64.9ms (= 6489.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 64.9ms / 48.5ms)

OneFlow resnet50 time: 35.6ms (= 3559.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 50.7ms (= 5073.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.43 (= 50.7ms / 35.6ms)

OneFlow resnet50 time: 28.4ms (= 5670.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.3ms (= 8060.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.42 (= 40.3ms / 28.4ms)

OneFlow resnet50 time: 25.2ms (= 5035.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.7ms (= 7738.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 38.7ms / 25.2ms)

OneFlow resnet50 time: 24.0ms (= 4800.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.3ms (= 7253.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.51 (= 36.3ms / 24.0ms)
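
For readers checking the numbers above: the benchmark script itself is not shown in this thread, so the following is only a sketch of the arithmetic the report appears to use. The function names `per_iter` and `relative_speed` are hypothetical. Each per-iteration time is the total divided by the iteration count, and the relative speed is the PyTorch time divided by the OneFlow time, so values above 1 mean OneFlow was faster. The ratio seems to be taken before rounding, which would explain why the `[4, 3, 224, 224]` case reports 1.79 although 34.8 / 19.5 rounds to 1.78.

```python
def per_iter(total: float, iters: int) -> float:
    """Average time per iteration, in the same unit as `total`."""
    return total / iters


def relative_speed(pytorch_total: float, oneflow_total: float, iters: int) -> float:
    """PyTorch time over OneFlow time, using unrounded per-iteration values.

    Assumes both frameworks ran the same number of iterations,
    as in every pair of lines in the report above.
    """
    return per_iter(pytorch_total, iters) / per_iter(oneflow_total, iters)


# Totals copied from the report (resnet50, input_shape=[4, 3, 224, 224]).
oneflow_ms = per_iter(3894.0, 200)
pytorch_ms = per_iter(6960.3, 200)
speed = relative_speed(6960.3, 3894.0, 200)

print(f"OneFlow: {oneflow_ms:.1f}ms, PyTorch: {pytorch_ms:.1f}ms, relative speed: {speed:.2f}")
```

The same calculation applies to the dataloader lines, where the totals are in seconds and the ratio falls below 1.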

@howin98 howin98 merged commit 4ac3692 into master May 30, 2023
@howin98 howin98 deleted the support-matmul-in-mlir-jit branch May 30, 2023 11:38
4 participants