executors/BF: optimize BF using LLVM #1091
Can one of the admins verify this patch?
Codecov Report
Base: 81.04% // Head: 84.17% // Increases project coverage by +3.13%

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1091      +/-   ##
==========================================
+ Coverage   81.04%   84.17%   +3.13%
==========================================
  Files         137      137
  Lines        4821     4922     +101
==========================================
+ Hits         3907     4143     +236
+ Misses        914      779     -135
ok to test
e77693f to 1638784
Prod tests. Methodology:
My notes:
I didn't see any issues with verdict change, execution time, or memory use.
3948e75 to b4a7710
LGTM, thanks for working on this!
There are a few parts to this BF optimizing compiler: simple loop pass, translation into LLVM, opt, llc -O2, ld.

Simple loop pass (Python)
Optimizes simple loops. opt can optimize these loops too, but it takes a long time to do so. If this step is skipped, the BF program 1162081 takes 67s on opt and gets compiler TLE (limit is 10s).

Also checks for unmatched brackets. Later steps don't need to check for unmatched brackets.
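A minimal sketch of what such a pass could look like, not the code from this PR. It assumes a "simple loop" means a loop whose body contains only `+ - < >`, returns the pointer to where it started, and decrements the loop cell by exactly 1 per iteration; the PR does not spell out its own definition. The names `check_brackets`, `analyze_simple_loop`, `optimize`, and the `("mul", ...)` / `("raw", ...)` op tuples are hypothetical.

```python
def check_brackets(src: str) -> None:
    """Raise ValueError on the first unmatched '[' or ']'."""
    open_positions = []
    for i, ch in enumerate(src):
        if ch == "[":
            open_positions.append(i)
        elif ch == "]":
            if not open_positions:
                raise ValueError(f"unmatched ']' at offset {i}")
            open_positions.pop()
    if open_positions:
        raise ValueError(f"unmatched '[' at offset {open_positions[-1]}")


def analyze_simple_loop(body: str):
    """Return {offset: factor} if `body` is a simple loop, else None.

    Assumed definition: only + - < > in the body, net pointer movement
    of zero, and the loop cell (offset 0) changes by exactly -1 per pass.
    """
    if any(ch not in "+-<>" for ch in body):
        return None
    offset, deltas = 0, {}
    for ch in body:
        if ch == ">":
            offset += 1
        elif ch == "<":
            offset -= 1
        else:
            deltas[offset] = deltas.get(offset, 0) + (1 if ch == "+" else -1)
    if offset != 0 or deltas.get(0, 0) != -1:
        return None
    del deltas[0]
    return {k: v for k, v in deltas.items() if v != 0}


def optimize(src: str):
    """Return a list of ops; each simple loop becomes ('mul', {offset: factor}).

    ('mul', effect) stands for: cell[p + offset] += cell[p] * factor for each
    entry, then cell[p] = 0. Everything else is passed through as ('raw', ch).
    """
    check_brackets(src)  # after this, every '[' has a later ']'
    ops, i = [], 0
    while i < len(src):
        if src[i] == "[":
            end = src.find("]", i)
            effect = analyze_simple_loop(src[i + 1:end])
            if effect is not None:
                ops.append(("mul", effect))
                i = end + 1
                continue
        ops.append(("raw", src[i]))
        i += 1
    return ops
```

With these assumptions, `optimize("[->+>++<<]")` returns `[("mul", {1: 1, 2: 2})]`, and the clear loop `[-]` becomes `("mul", {})`, i.e. just "set the current cell to 0". Doing the bracket check up front is what lets `optimize` use a plain `find` for the closing bracket.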
Translation into LLVM
A few notable things:
- mmap rounds up to the nearest 4096, so I put memory: 4 in every test.yml.
- %p, and is maintained using getelementptr and phi.
- *char_unlocked, because for large I/O, it's much faster than *char.
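To illustrate the pointer handling mentioned above, here is a rough sketch of a translator that keeps the data pointer as an SSA value, advances it with `getelementptr`, and merges it at each loop header with a `phi`. It is not the PR's translator: the function name `bf_to_llvm`, the `@bf(ptr %tape)` signature, and the value and label names are made up for illustration, the tape is passed in as a parameter instead of being mapped with mmap, and the IR uses opaque pointers (`ptr`), which assumes a reasonably recent LLVM. Brackets are assumed to be already validated.

```python
def bf_to_llvm(src: str) -> str:
    """Translate BF source to LLVM IR text (illustrative sketch only)."""
    lines = [
        "declare i32 @putchar_unlocked(i32)",
        "declare i32 @getchar_unlocked()",
        "",
        "define void @bf(ptr %tape) {",
        "entry:",
    ]
    counter = 0

    def fresh(prefix: str) -> str:
        # Hand out unique SSA names such as %p8 or %v12.
        nonlocal counter
        counter += 1
        return f"%{prefix}{counter}"

    cur, block = "%tape", "entry"  # current data pointer and basic block
    loops = []  # stack of (loop id, phi line index, entry pointer, entry block)

    for ch in src:
        if ch in "<>":
            step = 1 if ch == ">" else -1
            nxt = fresh("p")
            lines.append(f"  {nxt} = getelementptr i8, ptr {cur}, i64 {step}")
            cur = nxt
        elif ch in "+-":
            delta = 1 if ch == "+" else -1
            old, new = fresh("v"), fresh("v")
            lines.append(f"  {old} = load i8, ptr {cur}")
            lines.append(f"  {new} = add i8 {old}, {delta}")
            lines.append(f"  store i8 {new}, ptr {cur}")
        elif ch == ".":
            val, wide = fresh("v"), fresh("w")
            lines.append(f"  {val} = load i8, ptr {cur}")
            lines.append(f"  {wide} = zext i8 {val} to i32")
            lines.append(f"  call i32 @putchar_unlocked(i32 {wide})")
        elif ch == ",":
            got, low = fresh("r"), fresh("v")
            lines.append(f"  {got} = call i32 @getchar_unlocked()")
            lines.append(f"  {low} = trunc i32 {got} to i8")
            lines.append(f"  store i8 {low}, ptr {cur}")
        elif ch == "[":
            counter += 1
            k = counter
            lines.append(f"  br label %head{k}")
            lines.append(f"head{k}:")
            # The phi needs the pointer at the end of the body, which is only
            # known at the matching ']', so leave a slot and patch it later.
            loops.append((k, len(lines), cur, block))
            lines.append("")
            val, cond = fresh("v"), fresh("c")
            lines.append(f"  {val} = load i8, ptr %loop{k}")
            lines.append(f"  {cond} = icmp eq i8 {val}, 0")
            lines.append(f"  br i1 {cond}, label %exit{k}, label %body{k}")
            lines.append(f"body{k}:")
            cur, block = f"%loop{k}", f"body{k}"
        elif ch == "]":
            k, phi_at, entry_ptr, entry_block = loops.pop()
            lines[phi_at] = (
                f"  %loop{k} = phi ptr [ {entry_ptr}, %{entry_block} ], "
                f"[ {cur}, %{block} ]"
            )
            lines.append(f"  br label %head{k}")
            lines.append(f"exit{k}:")
            cur, block = f"%loop{k}", f"exit{k}"
        # Any other character is a BF comment and is ignored.

    lines.append("  ret void")
    lines.append("}")
    return "\n".join(lines)
```

For example, `print(bf_to_llvm("+[->+<]."))` emits one `phi` at the loop header whose two incoming values are the pointer before the loop and the pointer at the end of the loop body. Keeping the pointer in SSA form, rather than in a stack slot, means no separate pass is needed to promote it to a register.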
opt
Optimizes LLVM using the passes in OPT_PASSES. The performance gain is very tiny (~5%), and I suspect it's because llc -O2 does its own optimizations. I didn't use opt -O2 for 2 reasons: possible performance regressions in the future, and similar performance to OPT_PASSES.

llc -O2, ld
These are the LLC executor's defaults. I didn't use llc -O0, because in every test, O0 produced slower programs than O2.

Benchmarks
These benchmarks were done using my slow laptop. For each number, I ran 5-9 trials and took the median. The unit is milliseconds.
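For reference, taking the median of several trials can be sketched as below. This is not the script used for these numbers; `median_ms` and its `run` callable are hypothetical, and in practice `run` would launch the compiled BF program.

```python
import statistics
import time


def median_ms(run, trials=5):
    """Call `run` `trials` times and return the median wall time in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Placeholder workload standing in for one run of a benchmark program.
    print(f"{median_ms(lambda: sum(range(100_000)), trials=5):.2f} ms")
```

The median is less sensitive than the mean to a single slow outlier, which matters on a noisy machine: for the samples 11, 12, 13, 15, 40 the median is 13, while the mean is 18.2.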
Prod currently uses gcc -O0. The goal to beat is gcc -O2 + putchar_unlocked.

[Benchmark tables: timing values not preserved. Surviving column headers: opt step, gcc -O0, gcc -O2, gcc -O2 + putchar_unlocked, clang -O2, clang -O2 + putchar_unlocked.]