[AutoScheduler] Bug fix for layout rewrite CI error in i386 #6830
Problem
The compute gives a wrong result after layout rewrite, and this only occurs in the i386 CI.

Current Status
After trying many different tests, I think I have finally found the reason. Only the i386 CI uses llvm-4 to build TVM. This test sets the random seed to a fixed number so that AutoScheduler always generates the same schedule. The same schedule in the i386 CI with llvm-8: https://ci.tlcpack.ai/blue/rest/organizations/jenkins/pipelines/tvm/branches/PR-6830/runs/7/nodes/253/steps/331/log/?start=0

The lowered result of TVM is exactly the same, so I think the only possible cause is some bug in llvm codegen with llvm-4. To fully confirm this, we may need to compare the LLVM IR produced by the two versions. I'm trying llvm-4 in my local environment to see whether the bug can be reproduced.
This problem can be reproduced in my local environment with the ci-i386 docker image. It seems that floating-point operations in a 32-bit environment tend to be less accurate than in a 64-bit one? I've run more tests on different llvm versions: codegen results from higher llvm versions can still hit the accuracy problem, but less often. In an x86_64 environment, all llvm versions worked well even with atol and rtol set to 1e-7. For now, the better fix is probably still to set larger atol and rtol values.
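To illustrate why loosening the tolerances helps, here is a minimal sketch using NumPy only. It is not the test from this PR: the arrays, the simulated relative error of about 1e-5, and the looser value 1e-4 are all assumptions chosen for demonstration, and `passes_allclose` is a hypothetical helper.

```python
import numpy as np


def passes_allclose(actual, expected, rtol, atol):
    """Return True if assert_allclose accepts the pair at these tolerances."""
    try:
        np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
        return True
    except AssertionError:
        return False


# Hypothetical reference output of a float32 compute.
expected = np.array([1.0, 2.0, 3.0], dtype="float32")

# Simulated result from another platform: the same values with a small
# relative error (about 1e-5, an assumed magnitude for illustration).
actual = expected * np.float32(1.0 + 1e-5)

# The strict tolerances mentioned above reject the small deviation.
strict_ok = passes_allclose(actual, expected, rtol=1e-7, atol=1e-7)

# Looser tolerances (1e-4 is an illustrative value) accept it.
loose_ok = passes_allclose(actual, expected, rtol=1e-4, atol=1e-4)

print("strict:", strict_ok, "loose:", loose_ok)
```

`assert_allclose` checks `|actual - expected| <= atol + rtol * |expected|` elementwise, so raising both values widens the accepted band for small and large outputs alike.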
Thanks @jcf94 for the timely fix and in-depth analysis.