Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not build LLVM bitcast instructions for opaque pointer casts #13171

Conversation

HertzDevil
Copy link
Contributor

@HertzDevil HertzDevil commented Mar 9, 2023

Part of #12743. When opaque pointers are available, casts between pointers (in the same address space) are always no-ops in LLVM, but this alone doesn't prevent Crystal from computing the pointee type. This PR does that by not evaluating the type argument at all for LLVM 15+. Note that all uses of bitcast in Crystal are for pointer casts.

@HertzDevil
Copy link
Contributor Author

HertzDevil commented Mar 9, 2023

Moment of truth:

$ git checkout c6e3116df9a44a871740318ebcf486f986b79dad
$ make clean crystal clean_cache release=1 && touch src/compiler/crystal.cr && make stats=1
Parse:                             00:00:00.000214084 (   0.77MB)
Semantic (top level):              00:00:00.345461000 ( 173.16MB)
Semantic (new):                    00:00:00.001670375 ( 173.16MB)
Semantic (type declarations):      00:00:00.030641833 ( 173.16MB)
Semantic (abstract def check):     00:00:00.012177750 ( 189.16MB)
Semantic (restrictions augmenter): 00:00:00.011101791 ( 205.16MB)
Semantic (ivars initializers):     00:00:04.578064625 (1813.06MB)
Semantic (cvars initializers):     00:00:00.004923458 (1813.06MB)
Semantic (main):                   00:00:05.646283959 (2197.06MB)
Semantic (cleanup):                00:00:00.000617875 (2197.06MB)
Semantic (recursive struct check): 00:00:00.001288625 (2197.06MB)
Codegen (crystal):                 00:00:03.563826917 (2669.06MB)
Codegen (bc+obj):                  00:00:06.565988417 (2669.06MB)
Codegen (linking):                 00:00:00.878433167 (2669.06MB)
dsymutil:                          00:00:00.558394875 (2669.06MB)

Macro runs:
 - /Users/quinton/crystal/crystal/src/ecr/process.cr: 00:00:04.653824125

Codegen (bc+obj):
 - no previous .o files were reused
  • LLVM 14.0.1, this PR
$ git checkout cfd9d60de08a0c5988816402f7722fb44355bbb6
$ make clean crystal clean_cache release=1 && touch src/compiler/crystal.cr && make stats=1
Parse:                             00:00:00.000189375 (   0.77MB)
Semantic (top level):              00:00:00.341806000 ( 172.08MB)
Semantic (new):                    00:00:00.001625917 ( 172.08MB)
Semantic (type declarations):      00:00:00.029480709 ( 172.08MB)
Semantic (abstract def check):     00:00:00.012977834 ( 204.08MB)
Semantic (restrictions augmenter): 00:00:00.009416000 ( 204.08MB)
Semantic (ivars initializers):     00:00:04.530023875 (1828.00MB)
Semantic (cvars initializers):     00:00:00.005073917 (1828.00MB)
Semantic (main):                   00:00:06.559616208 (2140.00MB)
Semantic (cleanup):                00:00:00.000502583 (2140.00MB)
Semantic (recursive struct check): 00:00:00.001180125 (2140.00MB)
Codegen (crystal):                 00:00:02.610722417 (2732.02MB)
Codegen (bc+obj):                  00:00:06.334841000 (2732.02MB)
Codegen (linking):                 00:00:01.092296084 (2732.02MB)
dsymutil:                          00:00:00.696052500 (2732.02MB)

Macro runs:
 - /Users/quinton/crystal/crystal/src/ecr/process.cr: 00:00:04.650499750

Codegen (bc+obj):
 - no previous .o files were reused
  • LLVM 15.0.7, this PR
$ git checkout cfd9d60de08a0c5988816402f7722fb44355bbb6
$ make clean crystal clean_cache release=1 && touch src/compiler/crystal.cr && make stats=1
Parse:                             00:00:00.000174458 (   0.77MB)
Semantic (top level):              00:00:00.419430167 ( 173.17MB)
Semantic (new):                    00:00:00.001808792 ( 173.17MB)
Semantic (type declarations):      00:00:00.030975167 ( 173.17MB)
Semantic (abstract def check):     00:00:00.012816666 ( 189.17MB)
Semantic (restrictions augmenter): 00:00:00.010163750 ( 205.17MB)
Semantic (ivars initializers):     00:00:06.030664375 (1813.08MB)
Semantic (cvars initializers):     00:00:00.004775667 (1813.08MB)
Semantic (main):                   00:00:06.327635500 (2205.08MB)
Semantic (cleanup):                00:00:00.000621167 (2205.08MB)
Semantic (recursive struct check): 00:00:00.001355583 (2205.08MB)
Codegen (crystal):                 00:00:02.735493083 (2685.09MB)
Codegen (bc+obj):                  00:00:05.499450291 (2685.09MB)
Codegen (linking):                 00:00:00.426940250 (2685.09MB)
dsymutil:                          00:00:00.593159500 (2685.09MB)

Macro runs:
 - /Users/quinton/crystal/crystal/src/ecr/process.cr: 00:00:03.145058791

Codegen (bc+obj):
 - no previous .o files were reused

Maximum memory load is not as bad as I thought, despite the addition of Crystal::CodeGenVisitor#@fun_types.

Codegen (crystal) actually takes less time now, regardless of whether LLVM supports opaque pointers. Improvements in Codegen (bc+obj) are solely due to LLVM and beyond Crystal's direct control; they are not very conclusive as I am not building the compiler in release mode for the second generation. On the other hand, quite a lot more time is spent on the semantic phase, which I can't quite relate to the LLVM changes (yet the times for macro runs have reduced by almost the same duration).

I will be doing a few more runs of make clean crystal clean_cache release=1 && bin/crystal run --stats src/compiler/crystal.cr to see if there is imminent need for further optimizations, i.e. precomputing call types where they are known in advance instead of caching them.

@HertzDevil
Copy link
Contributor Author

HertzDevil commented Mar 9, 2023

Here are the stats of make clean crystal release=1, followed by 10 runs of make clean_cache && bin/crystal run --stats src/compiler/crystal.cr:

LLVM 14.0.1 (c6e3116) LLVM 14.0.1 (cfd9d60) LLVM 15.0.7 (cfd9d60)
Parse 0.0000s (±21.8%) 0.0000s (±17.3%) 0.0000s (±13.9%)
Semantic (top level) 0.4296s (±3.6%) 0.4528s (±9.0%) 0.4658s (±4.7%)
Semantic (new) 0.0018s (±2.6%) 0.0018s (±12.7%) 0.0019s (±17.5%)
Semantic (type declarations) 0.0308s (±3.2%) 0.0310s (±4.6%) 0.0329s (±16.9%)
Semantic (abstract def check) 0.0127s (±1.7%) 0.0131s (±3.0%) 0.0129s (±2.3%)
Semantic (restrictions augmenter) 0.0101s (±4.4%) 0.0101s (±8.9%) 0.0098s (±6.9%)
Semantic (ivars initializers) 4.5505s (±0.8%) 4.5354s (±0.7%) 6.0346s (±2.6%)
Semantic (cvars initializers) 0.0051s (±3.4%) 0.0053s (±9.4%) 0.0050s (±2.0%)
Semantic (main) 6.3904s (±1.9%) 6.3909s (±2.8%) 5.5233s (±3.2%)
Semantic (cleanup) 0.0006s (±4.2%) 0.0006s (±2.4%) 0.0006s (±2.7%)
Semantic (recursive struct check) 0.0012s (±3.6%) 0.0012s (±2.8%) 0.0012s (±8.6%)
Codegen (crystal) 2.6454s (±9.7%) 2.7394s (±3.5%) 2.8363s (±3.8%)
Codegen (bc+obj) 5.9099s (±2.8%) 6.0044s (±2.3%) 5.1995s (±0.8%)
Codegen (linking) 0.7582s (±1.5%) 0.7576s (±1.4%) 0.3392s (±19.1%)
dsymutil 0.3473s (±2.0%) 0.3486s (±1.6%) 0.3940s (±2.9%)
Macro runs 4.6345s (±1.4%) 4.6609s (±1.9%) 3.0254s (±3.3%)

Plus 1 run in release mode (only the stages beyond Crystal's control are shown because the other times are unaffected):

LLVM 14.0.1 (c6e3116) LLVM 14.0.1 (cfd9d60) LLVM 15.0.7 (cfd9d60)
Codegen (bc+obj) 247.8567s 251.4313s 184.3743
Codegen (linking) 0.4536s 0.4714s 0.3295s
dsymutil 0.6281s 0.6342s 0.5236s

@HertzDevil HertzDevil merged commit 2df696b into crystal-lang:feature/llvm-opaque-pointers Mar 9, 2023
@HertzDevil HertzDevil deleted the refactor/llvm-pointer-bitcast branch March 9, 2023 21:49
@HertzDevil HertzDevil mentioned this pull request Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants