executors/BF: optimize BF using LLVM #1091

Merged
merged 1 commit into from
Jan 1, 2023

int-y1
Contributor

@int-y1 int-y1 commented Dec 27, 2022

There are a few parts to this BF optimizing compiler: simple loop pass, translation into LLVM, opt, llc -O2, ld.

Simple loop pass (Python)

Optimizes simple loops. opt can optimize these loops too, but it is much slower at it: if this step is skipped, opt alone takes 67s on the BF program 1162081, which exceeds the 10s compile time limit (compiler TLE).

Also checks for unmatched brackets, so later steps don't need to.
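A minimal sketch of what such a pass could look like. The definition of "simple loop" here is an assumption, not taken from the PR: a loop body that uses only `+ - < >`, returns the pointer to where it started, and decrements the starting cell by exactly 1 per iteration. The function names `check_brackets` and `simple_loop` are hypothetical.

```python
from collections import defaultdict


def check_brackets(src: str) -> None:
    """Raise ValueError if '[' and ']' are not balanced."""
    depth = 0
    for i, ch in enumerate(src):
        if ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
            if depth < 0:
                raise ValueError(f'unmatched ] at position {i}')
    if depth:
        raise ValueError('unmatched [')


def simple_loop(body: str):
    """Return {offset: multiplier} if `body` is a simple loop, else None.

    Assumed definition: the body contains only + - < >, has zero net
    pointer movement, and changes the starting cell by exactly -1.
    """
    if any(ch not in '+-<>' for ch in body):
        return None
    offset = 0
    deltas = defaultdict(int)
    for ch in body:
        if ch == '>':
            offset += 1
        elif ch == '<':
            offset -= 1
        elif ch == '+':
            deltas[offset] += 1
        else:
            deltas[offset] -= 1
    if offset != 0 or deltas[0] != -1:
        return None
    # Each remaining entry means: cell[p + offset] += multiplier * cell[p].
    # The starting cell itself ends at 0.
    return {k: v for k, v in deltas.items() if k != 0 and v != 0}
```

Under this definition, `[->+>++<<]` collapses to two multiply-adds followed by clearing the current cell, so the compiler never has to emit the loop itself.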

Translation into LLVM

A few notable things:

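  • mmap rounds the mapping size up to a multiple of the page size (4096 bytes), so I put memory: 4 in every test.yml.
  • The tape pointer is an SSA value with a name like %p, and is maintained using getelementptr and phi.
  • I used getchar_unlocked and putchar_unlocked (*char_unlocked) because, for large I/O, they are much faster than getchar and putchar.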
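To illustrate the getelementptr and phi point, here is a rough sketch of how a translator might thread the tape pointer through loops. It is not the PR's code. It assumes opaque pointers (`ptr`, the default from LLVM 15; older releases would write `i8*`), assumes a `%tape` value defined elsewhere, handles only `+ - < > [ ]`, and emits a block-body fragment without the surrounding `define` or a final `ret`. The function name `translate` and the label and value naming scheme are hypothetical.

```python
def translate(src: str) -> str:
    """Emit LLVM IR basic blocks for the BF commands + - < > [ ].

    Assumes brackets are already known to be balanced.
    """
    out = ['entry:']
    counter = 0
    cur_block = 'entry'  # label of the basic block currently being filled
    p = '%tape'          # SSA name holding the current tape pointer
    stack = []           # one entry per open loop

    def fresh(prefix):
        nonlocal counter
        counter += 1
        return f'{prefix}{counter}'

    for ch in src:
        if ch in '<>':
            new_p = '%' + fresh('p')
            step = 1 if ch == '>' else -1
            out.append(f'  {new_p} = getelementptr i8, ptr {p}, i64 {step}')
            p = new_p
        elif ch in '+-':
            old, new = '%' + fresh('v'), '%' + fresh('v')
            step = 1 if ch == '+' else -1
            out.append(f'  {old} = load i8, ptr {p}')
            out.append(f'  {new} = add i8 {old}, {step}')
            out.append(f'  store i8 {new}, ptr {p}')
        elif ch == '[':
            head, body, done = fresh('head'), fresh('body'), fresh('done')
            phi, val, cond = '%' + fresh('p'), '%' + fresh('v'), '%' + fresh('z')
            out.append(f'  br label %{head}')
            out.append(f'{head}:')
            phi_index = len(out)
            out.append('')  # phi placeholder, filled in at the matching ]
            out.append(f'  {val} = load i8, ptr {phi}')
            out.append(f'  {cond} = icmp eq i8 {val}, 0')
            out.append(f'  br i1 {cond}, label %{done}, label %{body}')
            out.append(f'{body}:')
            stack.append((head, done, phi, phi_index, p, cur_block))
            p, cur_block = phi, body
        elif ch == ']':
            head, done, phi, phi_index, p_in, pred = stack.pop()
            out.append(f'  br label %{head}')
            # The loop head is reached from before the loop and from the
            # end of the body, so the pointer there is a phi of both values.
            out[phi_index] = (
                f'  {phi} = phi ptr [ {p_in}, %{pred} ], [ {p}, %{cur_block} ]'
            )
            out.append(f'{done}:')
            p, cur_block = phi, done
    return '\n'.join(out)
```

Because the loop exits from its head block, the pointer after `]` is the phi value itself, which is why `p` is reset to `phi` when the loop closes.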

opt

Optimizes the LLVM IR using the passes in OPT_PASSES. The performance gain is small (~5%), and I suspect that's because llc -O2 does its own optimizations. I didn't use opt -O2 for 2 reasons: it could introduce performance regressions in the future, and it performs about the same as OPT_PASSES.

llc -O2, ld

These are the LLC executor's defaults. I didn't use llc -O0, because in every test, O0 produced slower programs than O2.
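As a rough sketch of the opt and llc steps, the helpers below only build argument lists and do not run anything. The pass list is a placeholder made of real LLVM pass names, not the PR's actual OPT_PASSES, which this thread does not show. The helper names are hypothetical, and the ld step is left out because its arguments depend on the LLC executor's configuration.

```python
# Placeholder pass list for illustration only; the real OPT_PASSES differs.
EXAMPLE_PASSES = ['mem2reg', 'instcombine', 'simplifycfg']


def opt_command(src_ll, dst_ll, passes=EXAMPLE_PASSES):
    """Build an opt invocation that runs an explicit pass list, not -O2."""
    return ['opt', '-S', '-passes=' + ','.join(passes), src_ll, '-o', dst_ll]


def llc_command(src_ll, dst_obj):
    """Build an llc invocation that compiles IR to an object file at -O2."""
    return ['llc', '-O2', '-filetype=obj', src_ll, '-o', dst_obj]
```

Either list can be passed to `subprocess.run`. Naming the passes explicitly keeps the opt step independent of whatever `-O2` expands to in a given LLVM release.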

Benchmarks

These benchmarks were run on my slow laptop. Each number is the median of 5-9 trials, in milliseconds.
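The median-of-trials measurement can be sketched as follows. This is an illustration of the idea, not the script actually used for these numbers, and the name `median_ms` is hypothetical.

```python
import statistics
import time


def median_ms(run, trials=5):
    """Call `run` `trials` times and return the median wall time in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

The median is less sensitive than the mean to a single slow trial, which matters on a noisy machine.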

Prod currently uses gcc -O0. The goal to beat is gcc -O2 + putchar_unlocked.

| Program | Method of compilation | opt step (ms) | Total compilation (ms) | Runtime (ms) |
|---|---|---|---|---|
| mandelbrot.b (11K inst, 6K output) | gcc -O0 | - | 841 | 28746 |
| mandelbrot.b | gcc -O2 | - | 2480 | 1648 |
| mandelbrot.b | clang -O2 | - | 4925 | 1790 |
| mandelbrot.b | this PR | 706 | 1237 | 1627 |
| 1162081 (46K inst, 16M output) | gcc -O0 | - | 5371 | 30913 |
| 1162081 | gcc -O2 | - | 12579 | 1231 |
| 1162081 | gcc -O2 + putchar_unlocked | - | 23962 | 1010 |
| 1162081 | clang -O2 | - | >120000 💀 | ? |
| 1162081 | this PR | 1433 | 2892 | 1132 |
| 230309 (240 inst, 16M output) | gcc -O0 | - | 325 | 1135 |
| 230309 | gcc -O2 | - | 315 | 297 |
| 230309 | gcc -O2 + putchar_unlocked | - | 352 | 195 |
| 230309 | clang -O2 + putchar_unlocked | - | 304 | 211 |
| 230309 | this PR | 83 | 184 | 206 |
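For a quick read of the runtime column, the speedup of this PR over the current prod setting (gcc -O0) can be computed directly from the numbers above. The dictionary and function names are hypothetical. The values are copied from the benchmark rows.

```python
# Runtime in ms, copied from the benchmark rows: (gcc -O0, this PR).
RUNTIME_MS = {
    'mandelbrot.b': (28746, 1627),
    '1162081': (30913, 1132),
    '230309': (1135, 206),
}


def speedup(old_ms, new_ms):
    """Return how many times faster the new runtime is than the old one."""
    return old_ms / new_ms


for name, (old, new) in RUNTIME_MS.items():
    print(f'{name}: {speedup(old, new):.1f}x faster than gcc -O0')
```

This gives roughly 17.7x, 27.3x, and 5.5x for the three programs. The smallest gain is on 230309, which has only 240 instructions but 16M of output.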

@dmoj-build
Collaborator

Can one of the admins verify this patch?

@codecov-commenter

codecov-commenter commented Dec 27, 2022

Codecov Report

Base: 81.04% // Head: 84.17% // Increases project coverage by +3.13% 🎉

Coverage data is based on head (1e57fb4) compared to base (10ff96a).
Patch coverage: 99.23% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1091      +/-   ##
==========================================
+ Coverage   81.04%   84.17%   +3.13%     
==========================================
  Files         137      137              
  Lines        4821     4922     +101     
==========================================
+ Hits         3907     4143     +236     
+ Misses        914      779     -135     
| Impacted Files | Coverage Δ | |
|---|---|---|
| dmoj/executors/BF.py | 99.25% <99.23%> (+5.14%) | ⬆️ |
| dmoj/result.py | 83.11% <0.00%> (-1.30%) | ⬇️ |
| dmoj/judge.py | 54.54% <0.00%> (+1.21%) | ⬆️ |
| dmoj/executors/java_executor.py | 84.84% <0.00%> (+2.02%) | ⬆️ |
| dmoj/cptbox/compiler_isolate.py | 55.31% <0.00%> (+6.38%) | ⬆️ |
| dmoj/cptbox/tracer.py | 76.32% <0.00%> (+15.90%) | ⬆️ |
| dmoj/cptbox/handlers.py | 100.00% <0.00%> (+26.31%) | ⬆️ |
| dmoj/cptbox/isolate.py | 89.63% <0.00%> (+38.41%) | ⬆️ |
| dmoj/control.py | 100.00% <0.00%> (+64.70%) | ⬆️ |

@kiritofeng
Member

ok to test

@int-y1 int-y1 force-pushed the bf-llvm branch 10 times, most recently from e77693f to 1638784 on December 31, 2022 05:37
@int-y1 int-y1 marked this pull request as ready for review December 31, 2022 05:38
@int-y1
Contributor Author

int-y1 commented Dec 31, 2022

Prod tests. Methodology:

  • Rejudge everything with LLVM to get new_* columns.
  • Rejudge everything with gcc -O0 to get old_* columns. This ensures the old data is recent.

My notes:

  • Compiler TLE is so minor that I don't consider it an issue.
  • There are lots of TLE->AC, which is good. Some took longer to RTE/TLE, but that's fine. Some ACs took longer (e.g. 2968660, 1359681), but there's evidence that it's noise.
  • Memory use was roughly the same, because both executors use libc.

I didn't see any issues with verdict change, execution time, or memory use.
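A comparison like this can be summarized by counting verdict transitions between the old_* and new_* columns. The sketch below uses made-up rows purely for illustration; it is not the prod data, and the names `ROWS` and `verdict_changes` are hypothetical.

```python
from collections import Counter

# Made-up rows for illustration: (submission id, old verdict, new verdict).
ROWS = [
    (1, 'TLE', 'AC'),
    (2, 'AC', 'AC'),
    (3, 'TLE', 'AC'),
    (4, 'RTE', 'RTE'),
]


def verdict_changes(rows):
    """Count (old, new) verdict pairs for submissions whose verdict changed."""
    return Counter((old, new) for _, old, new in rows if old != new)
```

Any pair other than TLE->AC in the resulting counter, such as AC->TLE, would be the first thing to inspect by hand.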

dmoj/executors/BF.py: 3 review threads (outdated, resolved)
@int-y1 int-y1 force-pushed the bf-llvm branch 3 times, most recently from 3948e75 to b4a7710 on January 1, 2023 03:03
Member

@Xyene Xyene left a comment

LGTM, thanks for working on this!

@Xyene Xyene merged commit 9ad0035 into DMOJ:master Jan 1, 2023
@int-y1 int-y1 deleted the bf-llvm branch January 1, 2023 06:47
6 participants