Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler: initialize right-away constants in a separate function #10334

Merged
merged 1 commit into from
Jan 30, 2021

Conversation

asterite
Copy link
Member

Fixes the slowness in the compiler.

Alternative to #10330

The idea is to still eagerly initialize the constants that can be initialized like that, but do that in a separate function. It seems like that LLVM has an easier time optimizing things.

The code for doing that was already there in the compiler: it's used for all other constants, but doing that in a wrapper that makes sure they are only initialized once.

Does it really work, and is it really fast?

Please help me try it out. Here's what you can do:

  • Run crystal build src/ecr/process.cr --release -s with the existing 0.36.0 compiler, or by compiling a compiler from master. Check how long the "bc+obj|" phase takes. For me this takes about 30 seconds.
  • Compile a compiler with the code in this PR. Then run bin/crystal build src/ecr/process.cr --release -s. It should be faster. For me this takes about 8 seconds.
  • Also try the previous point with Revert optimization of constants and class vars #10330 . It should take about the same time as before (8 seconds for me).

Then with a compiler built from this PR make sure this benchmark is still fast.

You can also compare the times with a 0.35.1 compiler. For me that "bc+obj" phase takes about 6 seconds, not 8. I think LLVM 11 is a bit slower in this aspect.

@asterite asterite added kind:regression Something that used to correctly work but no longer works performance topic:compiler labels Jan 29, 2021
@jwoertink
Copy link
Contributor

Here's what I get:

Crystal 0.35.1
Codegen (bc+obj):                  00:00:09.678091387 (  92.28MB)
Crystal 0.36.0
Codegen (bc+obj):                  00:01:38.148248913 (  89.86MB)
Crystal this PR
Codegen (bc+obj):                  00:00:10.685466711 (  89.86MB)
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'

For the last one, here's the version it showed

❯ ./bin/crystal -v
Using compiled compiler at .build/crystal
Crystal 1.0.0-dev [e1a0da726] (2021-01-29)

LLVM: 11.0.0
Default target: x86_64-apple-macosx

@asterite
Copy link
Member Author

@jwoertink Thanks! What's that ld warning? Is that a problem? First time I see it.

@jwoertink
Copy link
Contributor

Oh, and here's that benchmark with those 3 versions:

Crystal 0.35.1 (took 14s to build)
❯ ./kostya
17185365
00:00:01.240112000
Crystal 0.36.0 (took 1m36s to build)
❯ ./kostya
17191068
00:00:00.382423000
Crystal this PR (took 11s to build)
❯ ./kostya
17187241
00:00:00.397169000

@jwoertink
Copy link
Contributor

@asterite I've never seen that ld warning before. It only seems to happen when I compile from this PR. I ran make clean crystal when I built the compiler. Maybe I missed a step?

@asterite
Copy link
Member Author

It's also very strange that the error mentions a directory with Crystal 0.34.0 in it

@asterite
Copy link
Member Author

@jwoertink thank you for trying these! Of course you can also use a compiler from this PR with whatever program was slow to compile for you, and see if it's fixed.

@jwoertink
Copy link
Contributor

jwoertink commented Jan 29, 2021

It's also very strange

It is

❯ ls -la /usr/local/Cellar/crystal/
total 0
drwxr-xr-x    3 jeremywoertink  staff    96 Jan 27 08:38 .
drwxrwxr-x  156 jeremywoertink  admin  4992 Jan 27 08:43 ..
drwxr-xr-x   14 jeremywoertink  staff   448 Jan 27 08:38 0.36.0

I definitely don't have 0.34 on here 😂

@jwoertink
Copy link
Contributor

Sweet! I ran the Lucky specs using this PR, and it took 1m30s. Interesting thing was that ld: warning showed up 5 times. I wonder if there's 5 different run macros and each has to be compiled separately?

lucky on  jaw/crystal0.36 [!?] via 🔮 v0.35.1
❯ /Users/jeremywoertink/Development/crystal/lang/bin/crystal spec
Using compiled compiler at /Users/jeremywoertink/Development/crystal/lang/.build/crystal
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'
ld: warning: directory not found for option '-L/usr/local/Cellar/crystal/0.34.0/embedded/lib'

..........................................

@asterite
Copy link
Member Author

If you pass -s I think it tells you which macro runs were compiled

@jwoertink
Copy link
Contributor

Ah, yup.

- /Users/jeremywoertink/Development/crystal/lucky/lib/teeplate/src/lib/file_tree/macros/directory.cr: reused previous compilation (00:00:00.038240130)
 - /Users/jeremywoertink/Development/crystal/lucky/src/run_macros/generate_asset_helpers.cr: reused previous compilation (00:00:00.039536256)
 - /Users/jeremywoertink/Development/crystal/lucky/src/run_macros/infer_route.cr: reused previous compilation (00:00:00.040292692)
 - /Users/jeremywoertink/Development/crystal/lucky/lib/avram/src/run_macros/infer_table_name.cr: reused previous compilation (00:00:00.041060772)
 - /Users/jeremywoertink/Development/crystal/lang/src/ecr/process.cr: reused previous compilation (00:00:00.045328081)

Not a huge deal I guess. 1min is WAY better than the 14min I was seeing 😂

@asterite
Copy link
Member Author

Let's wait for someone else to see if they also get those warnings.

@bcardiff
Copy link
Member

I checked before/after this PR against llvm@10 and bc+obj when from 27s to 7s. The released 0.36.0 tar.gz llvm@6 is even 12s.

I'm happy with this. I'm not sure if @mattrberry can verify this before 0.36.1, that would be awesome.

@bcardiff
Copy link
Member

@jwoertink what does bin/crystal env (and crystal env) shows on your environment?

@jwoertink
Copy link
Contributor

@bcardiff

❯ bin/crystal env
Using compiled compiler at .build/crystal
CRYSTAL_CACHE_DIR=/Users/jeremywoertink/.cache/crystal
CRYSTAL_PATH=lib:/Users/jeremywoertink/Development/crystal/lang/src
CRYSTAL_VERSION=1.0.0-dev
CRYSTAL_LIBRARY_PATH=/usr/local/Cellar/crystal/0.34.0/embedded/lib
CRYSTAL_OPTS=''


❯ crystal env
CRYSTAL_CACHE_DIR=/Users/jeremywoertink/.cache/crystal
CRYSTAL_PATH=lib:/Users/jeremywoertink/Downloads/crystal-0.35.1-1/src
CRYSTAL_VERSION=0.35.1
CRYSTAL_LIBRARY_PATH=/Users/jeremywoertink/Downloads/crystal-0.35.1-1/embedded/lib
CRYSTAL_OPTS=''

Maybe I have something left over from a while ago?

@bcardiff
Copy link
Member

Yes, it seems that you probably have a CRYSTAL_CONFIG_LIBRARY_PATH env set that is setting the CRYSTAL_LIBRARY_PATH in the built compiler

@asterite
Copy link
Member Author

I don't think mattrberry has ever compiled the compiler before, but I'm pretty sure it will work fine and fast

@mattrberry
Copy link
Contributor

I've built the compiler on a few occasions, including when I tested your changes to the constants prior to 0.36.0. I'm happy to try again in a few hours. CryBoy appears to fail compilation on 0.36.0 still due to some later changes that went into 0.36.0, so I still need to work those out. I'll update this thread this evening when I get a chance to look into it a little further

@mattrberry
Copy link
Contributor

Testing on CryBoy:

Compiler versions tested:

  • 0.35.1
  • 0.36.0
  • opt/const-and-class-var

For each of these versions, I tested 3 ROMs 3 times. Across all of those runs, both 0.36.0 and this branch seemed to have similar performance, both of which improved fps over 0.35.1 by 20-25% depending on the rom.

In terms of performance that I've noticed, this branch seems to be identical or close enough to identical to 0.36.0. If this improves upon the compilation speed of 0.36.0, I think this is a win-win :)

Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test-alpine are failing sporadically. I did some runs localle to validate. This is GTG 🚀

@asterite asterite merged commit 68e7152 into master Jan 30, 2021
@asterite asterite deleted the opt/const-and-class-var branch January 30, 2021 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:regression Something that used to correctly work but no longer works performance topic:compiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants