Separate Tokens and Identifiers from other Nodes #33431

Merged
merged 3 commits into from
Feb 12, 2020

Swatinem
Contributor

I have been profiling tsc quite extensively recently, with the following super huge project:

Files:           4745
Lines:         713103
Nodes:        2395069
Identifiers:   832041
Symbols:      1011676
Types:         309045
Memory used: 1061792K

Analyzing a heap snapshot, I noticed that a large portion of the memory usage is due to Node, which is 160 bytes. (~370M worth)

Looking at the code, I noticed that TS already has the code in place to split Token and Identifier from Node, so that is what I did. Token is now 88 bytes, and Identifier is 104 bytes.
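
As a rough illustration of the idea (not the actual TypeScript source), the sketch below gives tokens and identifiers their own constructors so that they only carry the fields they need. The class names, the `createNode` helper, and the field lists are hypothetical and heavily simplified.

```typescript
// Hypothetical, heavily simplified node kinds (the real compiler has many more).
const Kind = { Token: 0, Identifier: 1, Other: 2 } as const;
type Kind = (typeof Kind)[keyof typeof Kind];

// General-purpose node: carries fields that only some node kinds ever use.
class GeneralNode {
    kind: Kind;
    pos: number;
    end: number;
    flags: number;
    parent: unknown;
    symbol: unknown;
    locals: unknown;
    nextContainer: unknown;
    constructor(kind: Kind, pos: number, end: number) {
        this.kind = kind;
        this.pos = pos;
        this.end = end;
        this.flags = 0;
        this.parent = undefined;
        this.symbol = undefined;
        this.locals = undefined;
        this.nextContainer = undefined;
    }
}

// Token: only the fields a purely syntactic token needs (illustrative list).
class TokenNode {
    kind: Kind;
    pos: number;
    end: number;
    flags: number;
    parent: unknown;
    constructor(kind: Kind, pos: number, end: number) {
        this.kind = kind;
        this.pos = pos;
        this.end = end;
        this.flags = 0;
        this.parent = undefined;
    }
}

// Identifier: a token plus its text (illustrative list).
class IdentifierNode {
    kind: Kind;
    pos: number;
    end: number;
    flags: number;
    parent: unknown;
    escapedText: string;
    constructor(kind: Kind, pos: number, end: number) {
        this.kind = kind;
        this.pos = pos;
        this.end = end;
        this.flags = 0;
        this.parent = undefined;
        this.escapedText = "";
    }
}

type AnyNode = GeneralNode | TokenNode | IdentifierNode;

// Picks the narrowest constructor for the requested kind.
function createNode(kind: Kind, pos: number, end: number): AnyNode {
    switch (kind) {
        case Kind.Token: return new TokenNode(kind, pos, end);
        case Kind.Identifier: return new IdentifierNode(kind, pos, end);
        default: return new GeneralNode(kind, pos, end);
    }
}
```

Fewer own properties per object is what makes the per-instance size drop; the exact byte counts depend on the engine and are not something this sketch can reproduce.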

In total, this gives before stats of:

Memory used: 1061792K
I/O read:       0.29s
I/O write:      0.00s
Parse time:     5.92s
Bind time:      1.94s
Check time:    21.23s
Emit time:      0.00s
Total time:    29.09s

and after stats of:

Memory used: 993297K
I/O read:      0.25s
I/O write:     0.00s
Parse time:    5.73s
Bind time:     2.18s
Check time:   21.16s
Emit time:     0.00s
Total time:   29.07s

In total, that is a ~6% reduction in memory used from such a simple change, without any significant change in runtime.
Thanks to using node --expose-gc, those numbers are really stable!
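
For reference, the arithmetic behind the figures quoted above can be checked with a few lines. The numbers are copied from the stats in this description; the variable names are just for illustration.

```typescript
// Figures taken from the diagnostics quoted above.
const nodeCount = 2395069;      // "Nodes"
const bytesPerNode = 160;       // size of a Node in the heap snapshot
const memoryBeforeK = 1061792;  // "Memory used" before the change
const memoryAfterK = 993297;    // "Memory used" after the change

// Memory attributable to Node objects, in MiB (roughly the "~370M" mentioned).
const nodeMiB = (nodeCount * bytesPerNode) / (1024 * 1024);

// Relative reduction in total memory used, in percent (the "~6%").
const reductionPercent = ((memoryBeforeK - memoryAfterK) / memoryBeforeK) * 100;

console.log(nodeMiB.toFixed(1), reductionPercent.toFixed(2));
```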

But I am not done yet. Digging deeper into the code and instrumenting things a bit, I noticed that literals (strings, numbers, etc.) are also of type Token. Unlike purely syntactic tokens, they participate in type-checking and get additional properties attached, which might spill out of the object's allocated storage and slow down property accesses.
I suspect that splitting this up further into syntax-only Tokens and Literal types would even lead to improved performance.

So I would really appreciate some help building further on this idea:

  • Is a change such as this appreciated?
  • Does it really make sense to further split off Literal nodes?
  • Does someone have more insight into which properties belong on which node type? I mostly went with a bit of instrumentation, and a lot of experimentation :-D
  • Is it a good idea to use --expose-gc to get more stable numbers? How about using it inside of TS's own perf infrastructure?
  • Speaking of perf infrastructure… Would someone kindly trigger a perf test for this? I would really like to validate the effectiveness of this change using the official testsuite.

Use this together with `node --expose-gc` to get more stable results
With some further optimization to their properties, this shrinks the
objects by quite a bit.
@jack-williams
Collaborator

@typescript-bot perf test this

@typescript-bot
Collaborator

typescript-bot commented Sep 14, 2019

Heya @jack-williams, I've started to run the perf test suite on this PR at 391a3c8. You can monitor the build here. It should now contribute to this PR's status checks.

Update: The results are in!

@jack-williams
Collaborator

So I know essentially zero about JS runtimes, but is one consideration here that certain runtimes only support inline-caches for one shape, while v8 supports up to four? I'm not sure that a possible regression introduced by this change would be observed testing with node.

I've been quite keen to learn more about the details around the perf design of the compiler, so any insights here would be a welcomed learning experience (for me).

@Swatinem
Contributor Author

I’m no expert either. Just tried to apply some things I learned from talk recordings, and digging through the heap profiler.

But pretty much all engines agree that it's best to initialize all properties, at least always in the same order, to avoid shape changes. TS is notoriously bad at this, since it just tacks additional properties onto Node and friends whenever it wants to…
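
A minimal sketch of the difference, with hypothetical names (`makeLazyNode`, `EagerNode`) that do not come from the TS codebase. Hidden classes (shapes) cannot be observed directly from JavaScript, so the example only shows the property-order pattern that engines are generally said to key shapes on.

```typescript
// Pattern to avoid: properties are attached later, and their order depends on
// which code path ran, so objects of the "same" type end up with different layouts.
function makeLazyNode(kind: number, withSymbol: boolean): Record<string, unknown> {
    const node: Record<string, unknown> = { kind };
    if (withSymbol) node.symbol = "sym";
    node.flags = 0;
    return node;
}

// Preferred pattern: every property exists from construction, always in the
// same order, even when its value is not known yet.
class EagerNode {
    kind: number;
    flags: number;
    symbol: string | undefined;
    constructor(kind: number) {
        this.kind = kind;
        this.flags = 0;
        this.symbol = undefined;
    }
}

const lazyA = Object.keys(makeLazyNode(1, true)).join(",");   // "kind,symbol,flags"
const lazyB = Object.keys(makeLazyNode(1, false)).join(",");  // "kind,flags"
const eagerA = Object.keys(new EagerNode(1)).join(",");       // "kind,flags,symbol"
const eagerB = Object.keys(new EagerNode(2)).join(",");       // "kind,flags,symbol"
```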

I also do see a ton of stuff when running with node --trace-deopt related to wrong map (= shape change), which reminds me, I didn’t compare that output before/after my changes.

Also interesting to actually run the compiler with --no-opt, which basically disables the JIT, and compare the different phases. I did that once and noticed that while the Parse step had a speedup of ~6x with JIT, Check only had a speedup of ~2-3x, which means something is throwing off the optimizer. Or at least that is my guess.

All of these things are really badly documented though, and it's pretty much guesswork to figure out what all that means :-(

And regarding your note that this would only be observed when testing with node: I would argue running tsc on node is the >90% use case, so IMO it's a good idea to optimize for that :-)

@typescript-bot
Collaborator

@jack-williams
The results of the perf run you requested are in!

Here they are:

Comparison Report - master..33431

Metric master 33431 Delta Best Worst
Angular - node (v12.1.0, x64)
Memory used 331,205k (± 0.06%) 305,883k (± 0.02%) -25,322k (- 7.65%) 305,748k 306,028k
Parse Time 1.56s (± 0.44%) 1.55s (± 0.48%) -0.00s (- 0.19%) 1.54s 1.57s
Bind Time 0.78s (± 0.00%) 0.78s (± 0.61%) -0.00s (- 0.38%) 0.77s 0.79s
Check Time 4.26s (± 0.58%) 4.27s (± 0.47%) +0.01s (+ 0.14%) 4.22s 4.31s
Emit Time 5.25s (± 1.18%) 5.25s (± 0.73%) -0.00s (- 0.06%) 5.19s 5.34s
Total Time 11.85s (± 0.68%) 11.85s (± 0.31%) -0.00s (- 0.01%) 11.77s 11.94s
Monaco - node (v12.1.0, x64)
Memory used 345,970k (± 0.01%) 315,848k (± 0.02%) -30,122k (- 8.71%) 315,685k 315,981k
Parse Time 1.23s (± 0.54%) 1.21s (± 0.53%) -0.02s (- 1.22%) 1.20s 1.23s
Bind Time 0.67s (± 0.71%) 0.67s (± 1.00%) -0.01s (- 0.89%) 0.65s 0.68s
Check Time 4.25s (± 0.39%) 4.27s (± 0.43%) +0.03s (+ 0.59%) 4.23s 4.32s
Emit Time 2.83s (± 0.68%) 2.84s (± 0.56%) +0.01s (+ 0.39%) 2.80s 2.87s
Total Time 8.97s (± 0.21%) 8.98s (± 0.34%) +0.01s (+ 0.17%) 8.94s 9.07s
TFS - node (v12.1.0, x64)
Memory used 301,424k (± 0.01%) 276,017k (± 0.02%) -25,407k (- 8.43%) 275,910k 276,116k
Parse Time 0.95s (± 0.58%) 0.95s (± 0.49%) -0.00s (- 0.21%) 0.94s 0.96s
Bind Time 0.62s (± 1.21%) 0.62s (± 1.18%) -0.01s (- 0.96%) 0.61s 0.64s
Check Time 3.85s (± 0.40%) 3.82s (± 0.51%) -0.02s (- 0.62%) 3.79s 3.88s
Emit Time 2.99s (± 1.53%) 2.94s (± 1.01%) -0.05s (- 1.67%) 2.86s 3.03s
Total Time 8.40s (± 0.64%) 8.33s (± 0.57%) -0.08s (- 0.93%) 8.23s 8.49s
Angular - node (v8.9.0, x64)
Memory used 350,106k (± 0.02%) 324,384k (± 0.02%) -25,723k (- 7.35%) 324,262k 324,487k
Parse Time 2.09s (± 0.49%) 2.09s (± 0.23%) -0.00s (- 0.19%) 2.08s 2.10s
Bind Time 0.83s (± 0.78%) 0.82s (± 0.45%) -0.01s (- 0.84%) 0.82s 0.83s
Check Time 5.11s (± 0.59%) 5.14s (± 0.31%) +0.03s (+ 0.51%) 5.10s 5.16s
Emit Time 5.99s (± 1.00%) 5.82s (± 0.96%) -0.17s (- 2.92%) 5.65s 5.94s
Total Time 14.03s (± 0.45%) 13.87s (± 0.45%) -0.16s (- 1.12%) 13.71s 14.01s
Monaco - node (v8.9.0, x64)
Memory used 363,734k (± 0.02%) 333,320k (± 0.01%) -30,414k (- 8.36%) 333,256k 333,413k
Parse Time 1.56s (± 0.57%) 1.54s (± 0.64%) -0.02s (- 1.41%) 1.52s 1.56s
Bind Time 0.89s (± 0.63%) 0.83s (± 0.67%) -0.06s (- 6.76%) 0.82s 0.84s
Check Time 5.08s (± 1.43%) 5.12s (± 0.46%) +0.04s (+ 0.87%) 5.06s 5.17s
Emit Time 3.20s (± 4.18%) 3.37s (± 0.50%) +0.17s (+ 5.31%) 3.34s 3.41s
Total Time 10.73s (± 0.72%) 10.86s (± 0.35%) +0.13s (+ 1.25%) 10.79s 10.96s
TFS - node (v8.9.0, x64)
Memory used 317,699k (± 0.01%) 292,059k (± 0.01%) -25,640k (- 8.07%) 291,983k 292,181k
Parse Time 1.26s (± 0.79%) 1.25s (± 0.48%) -0.01s (- 1.11%) 1.23s 1.26s
Bind Time 0.68s (± 5.26%) 0.66s (± 0.61%) -0.02s (- 3.36%) 0.65s 0.67s
Check Time 4.45s (± 1.03%) 4.60s (± 2.24%) +0.15s (+ 3.28%) 4.45s 4.77s
Emit Time 3.08s (± 0.48%) 3.09s (± 3.16%) +0.01s (+ 0.39%) 2.95s 3.29s
Total Time 9.48s (± 0.35%) 9.60s (± 0.50%) +0.12s (+ 1.29%) 9.42s 9.67s
Angular - node (v8.9.0, x86)
Memory used 198,198k (± 0.02%) 184,973k (± 0.02%) -13,225k (- 6.67%) 184,860k 185,108k
Parse Time 2.03s (± 0.66%) 2.02s (± 0.57%) -0.00s (- 0.15%) 2.00s 2.05s
Bind Time 0.95s (± 0.36%) 0.95s (± 0.55%) +0.00s (+ 0.21%) 0.94s 0.96s
Check Time 4.65s (± 0.49%) 4.64s (± 0.65%) -0.01s (- 0.17%) 4.58s 4.73s
Emit Time 5.64s (± 0.61%) 5.71s (± 0.94%) +0.07s (+ 1.26%) 5.59s 5.81s
Total Time 13.27s (± 0.35%) 13.33s (± 0.53%) +0.06s (+ 0.45%) 13.16s 13.46s
Monaco - node (v8.9.0, x86)
Memory used 203,231k (± 0.01%) 187,805k (± 0.03%) -15,426k (- 7.59%) 187,727k 187,924k
Parse Time 1.62s (± 0.85%) 1.58s (± 0.52%) -0.04s (- 2.41%) 1.56s 1.60s
Bind Time 0.72s (± 0.82%) 0.71s (± 0.67%) -0.01s (- 1.52%) 0.70s 0.72s
Check Time 4.86s (± 0.38%) 4.98s (± 2.08%) +0.12s (+ 2.55%) 4.81s 5.17s
Emit Time 3.16s (± 0.53%) 2.97s (± 4.74%) -0.19s (- 5.92%) 2.72s 3.18s
Total Time 10.36s (± 0.27%) 10.24s (± 0.56%) -0.12s (- 1.12%) 10.13s 10.37s
TFS - node (v8.9.0, x86)
Memory used 178,553k (± 0.01%) 165,550k (± 0.02%) -13,002k (- 7.28%) 165,459k 165,610k
Parse Time 1.31s (± 0.45%) 1.30s (± 0.69%) -0.02s (- 1.37%) 1.28s 1.32s
Bind Time 0.64s (± 0.81%) 0.63s (± 1.02%) -0.01s (- 1.10%) 0.62s 0.65s
Check Time 4.29s (± 0.61%) 4.31s (± 0.78%) +0.02s (+ 0.44%) 4.25s 4.43s
Emit Time 2.87s (± 1.54%) 2.83s (± 0.99%) -0.04s (- 1.33%) 2.75s 2.89s
Total Time 9.11s (± 0.67%) 9.06s (± 0.61%) -0.05s (- 0.49%) 8.97s 9.24s
Angular - node (v9.0.0, x64)
Memory used 349,750k (± 0.02%) 323,896k (± 0.01%) -25,854k (- 7.39%) 323,823k 324,027k
Parse Time 1.82s (± 0.44%) 1.79s (± 1.52%) -0.03s (- 1.60%) 1.71s 1.82s
Bind Time 0.78s (± 0.64%) 0.79s (± 3.14%) +0.01s (+ 1.42%) 0.76s 0.86s
Check Time 4.87s (± 0.56%) 4.84s (± 1.63%) -0.02s (- 0.49%) 4.62s 4.94s
Emit Time 5.78s (± 1.22%) 5.70s (± 2.22%) -0.09s (- 1.47%) 5.37s 6.01s
Total Time 13.24s (± 0.68%) 13.12s (± 0.73%) -0.12s (- 0.94%) 12.75s 13.24s
Monaco - node (v9.0.0, x64)
Memory used 363,360k (± 0.01%) 332,960k (± 0.01%) -30,400k (- 8.37%) 332,847k 333,021k
Parse Time 1.31s (± 0.44%) 1.29s (± 0.59%) -0.02s (- 1.37%) 1.28s 1.32s
Bind Time 0.84s (± 0.44%) 0.69s (± 0.96%) -0.15s (-18.34%) 0.68s 0.70s
Check Time 4.87s (± 0.33%) 5.05s (± 0.50%) +0.18s (+ 3.64%) 4.99s 5.11s
Emit Time 3.36s (± 0.54%) 3.32s (± 0.65%) -0.04s (- 1.04%) 3.26s 3.38s
Total Time 10.38s (± 0.33%) 10.36s (± 0.44%) -0.03s (- 0.27%) 10.23s 10.48s
TFS - node (v9.0.0, x64)
Memory used 317,508k (± 0.01%) 291,773k (± 0.01%) -25,735k (- 8.11%) 291,731k 291,857k
Parse Time 1.04s (± 0.59%) 1.03s (± 0.80%) -0.01s (- 0.48%) 1.02s 1.05s
Bind Time 0.62s (± 0.84%) 0.61s (± 0.95%) -0.01s (- 0.97%) 0.60s 0.63s
Check Time 4.38s (± 0.43%) 4.40s (± 0.64%) +0.01s (+ 0.32%) 4.33s 4.45s
Emit Time 3.17s (± 1.80%) 3.15s (± 1.96%) -0.02s (- 0.72%) 2.96s 3.22s
Total Time 9.21s (± 0.76%) 9.20s (± 0.86%) -0.02s (- 0.17%) 8.92s 9.31s
Angular - node (v9.0.0, x86)
Memory used 198,395k (± 0.03%) 184,772k (± 0.02%) -13,624k (- 6.87%) 184,676k 184,848k
Parse Time 1.73s (± 0.45%) 1.73s (± 0.44%) -0.00s (- 0.17%) 1.71s 1.75s
Bind Time 0.89s (± 0.73%) 0.90s (± 0.77%) +0.00s (+ 0.34%) 0.88s 0.91s
Check Time 4.34s (± 0.45%) 4.32s (± 0.57%) -0.01s (- 0.32%) 4.28s 4.40s
Emit Time 5.49s (± 0.59%) 5.46s (± 0.49%) -0.03s (- 0.51%) 5.40s 5.51s
Total Time 12.44s (± 0.32%) 12.40s (± 0.26%) -0.04s (- 0.35%) 12.33s 12.46s
Monaco - node (v9.0.0, x86)
Memory used 203,301k (± 0.02%) 187,743k (± 0.02%) -15,558k (- 7.65%) 187,639k 187,838k
Parse Time 1.34s (± 0.45%) 1.33s (± 0.62%) -0.02s (- 1.34%) 1.31s 1.35s
Bind Time 0.65s (± 1.03%) 0.63s (± 0.75%) -0.01s (- 2.16%) 0.62s 0.64s
Check Time 4.67s (± 0.63%) 4.66s (± 0.40%) -0.01s (- 0.24%) 4.63s 4.71s
Emit Time 3.07s (± 0.46%) 3.06s (± 0.53%) -0.00s (- 0.16%) 3.04s 3.11s
Total Time 9.73s (± 0.26%) 9.69s (± 0.20%) -0.05s (- 0.47%) 9.64s 9.74s
TFS - node (v9.0.0, x86)
Memory used 178,605k (± 0.01%) 165,449k (± 0.02%) -13,156k (- 7.37%) 165,404k 165,552k
Parse Time 1.06s (± 0.70%) 1.06s (± 1.03%) -0.00s (- 0.09%) 1.05s 1.09s
Bind Time 0.57s (± 0.86%) 0.58s (± 0.86%) +0.00s (+ 0.35%) 0.57s 0.59s
Check Time 4.13s (± 0.58%) 4.17s (± 0.60%) +0.05s (+ 1.16%) 4.13s 4.24s
Emit Time 2.80s (± 1.06%) 2.77s (± 0.81%) -0.02s (- 0.89%) 2.70s 2.81s
Total Time 8.56s (± 0.56%) 8.58s (± 0.49%) +0.02s (+ 0.22%) 8.51s 8.67s
System
Machine Name: ts-ci-ubuntu
Platform: linux 4.4.0-161-generic
Architecture: x64
Available Memory: 16 GB
Available Memory: 10 GB
CPUs: 4 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
Hosts
  • node (v12.1.0, x64)
  • node (v8.9.0, x64)
  • node (v8.9.0, x86)
  • node (v9.0.0, x64)
  • node (v9.0.0, x86)
Scenarios
  • Angular - node (v12.1.0, x64)
  • Angular - node (v8.9.0, x64)
  • Angular - node (v8.9.0, x86)
  • Angular - node (v9.0.0, x64)
  • Angular - node (v9.0.0, x86)
  • Monaco - node (v12.1.0, x64)
  • Monaco - node (v8.9.0, x64)
  • Monaco - node (v8.9.0, x86)
  • Monaco - node (v9.0.0, x64)
  • Monaco - node (v9.0.0, x86)
  • TFS - node (v12.1.0, x64)
  • TFS - node (v8.9.0, x64)
  • TFS - node (v8.9.0, x86)
  • TFS - node (v9.0.0, x64)
  • TFS - node (v9.0.0, x86)
Benchmark Name Iterations
Current 33431 10
Baseline master 10

@Swatinem
Contributor Author

wow, those results are really encouraging, 6-8% wins on memory usage. total time seems to be ±1%, so not really conclusive.

@jack-williams
Collaborator

jack-williams commented Sep 14, 2019

What is the call to gc for? Will that mislead the results when comparing against master?

@Swatinem
Contributor Author

Swatinem commented Sep 14, 2019

> What is the call to gc?

That triggers a manual gc when run with node --expose-gc. The purpose is to get more stable numbers from the process.memoryUsage() below; otherwise you have high variance due to random gc. That’s why I was asking if the perf infra already uses that, or whether it maybe should.
So the call has no effect when the testing infra is not running node with the --expose-gc flag.
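
A minimal sketch of that guard, assuming a Node.js host. The helper name `measureHeapUsed` and the local `GcHost` type are hypothetical (the type is declared locally only so the snippet does not depend on `@types/node`); this is not the code from the PR.

```typescript
// Local view of the globals we need. `gc` is only defined when node is
// started with --expose-gc; `process` is only present when running on node.
type GcHost = {
    gc?: () => void;
    process?: { memoryUsage(): { heapUsed: number } };
};
const host = globalThis as unknown as GcHost;

// Returns the used heap in bytes, after forcing a collection when possible.
function measureHeapUsed(): number | undefined {
    if (typeof host.gc === "function") {
        host.gc(); // no-op path when --expose-gc is not set: gc is undefined
    }
    return host.process ? host.process.memoryUsage().heapUsed : undefined;
}

const heapUsed = measureHeapUsed();
```

Run with `node --expose-gc` to get the collected figure; without the flag the function still works, it just reports whatever the heap happens to contain at that moment.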

@jack-williams
Collaborator

I'd consider opening an issue with the questions and discussion points you list. It would probably get more visibility by those on the team, and also for other users of the tracker.

@Swatinem
Contributor Author

> I'd consider opening an issue with the questions and discussion points you list. It would probably get more visibility by those on the team, and also for other users of the tracker.

Sure!
I opened #33432 for discussion about Node types, and #33433 for discussion about using gc().

@AnyhowStep
Contributor

AnyhowStep commented Sep 14, 2019

Does anyone know what the "Lines" diagnostic is for? Is it the number of lines emitted?

My source files are about 30k lines total.

But when I build, "Lines" diagnostic says 5,266,173 and takes 300s (5 minutes)


@Swatinem

How many LOC do your source files have? How simple or complex are the types in your "super huge project"?


For reference, when building my composite project, my total stats are,

Source LOC  :     30,368
Files       :     74,285
Lines       :  5,266,173
Nodes       : 26,671,299
Identifiers :  7,810,628
Symbols     : 13,598,594
Types       :  5,053,944
Memory Used : 31,723,111K
Check time  :        218.41s
Emit time   :         69.38s
Total time  :        300.57s

I'm very interested in knowing if your changes will improve the memory usage without negatively affecting check times for me.

The total stats are huge because it's building 42 sub projects. Those stats are about a few weeks old. I am at 52 sub projects now =/


The perf tests don't show much of a change in check times; +3.64% at worst.
But I feel like those test "normal" usage and I tend to fall into "unusual" usage.

If we pack this, I can try running it on my project and report back with the results.

@Swatinem
Contributor Author

This might be a bit off-topic for this PR, but maybe something is going wrong with your dependencies. Take a look at tsc --listFiles. When you talk about sub-projects, make sure their dependencies have the correct version and are correctly deduped using something like yarn workspaces. Yarn has some nasty bugs around that and will fail to dedupe semver compatible packages and you might end up with multiple copies of it.
Also, some dependencies are really badly structured. Yesterday I found out that aws-sdk alone is pulling in 300kLOC, which alone causes ~250M memory usage when checking our project. See aws/aws-sdk-js#2846
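
One way to act on the `tsc --listFiles` suggestion is to group the printed paths by package and look for packages loaded from more than one `node_modules` location. The sketch below assumes the output has been captured as one path per line; the function name `findDuplicatePackages` is made up for illustration.

```typescript
// Groups paths printed by `tsc --listFiles` by npm package name and returns
// the packages that are loaded from more than one node_modules location.
function findDuplicatePackages(listFilesOutput: string): Map<string, string[]> {
    const marker = "/node_modules/";
    const rootsByName = new Map<string, Set<string>>();
    for (const rawLine of listFilesOutput.split(/\r?\n/)) {
        const line = rawLine.trim().replace(/\\/g, "/"); // normalize Windows paths
        const idx = line.lastIndexOf(marker);
        if (idx === -1) continue; // file is not inside node_modules
        const parts = line.slice(idx + marker.length).split("/");
        const first = parts[0] ?? "";
        // Scoped packages look like @scope/name and span two path segments.
        const name = first.charAt(0) === "@" ? `${first}/${parts[1] ?? ""}` : first;
        const root = line.slice(0, idx + marker.length) + name;
        const roots = rootsByName.get(name) ?? new Set<string>();
        roots.add(root);
        rootsByName.set(name, roots);
    }
    const duplicates = new Map<string, string[]>();
    rootsByName.forEach((roots, name) => {
        if (roots.size > 1) duplicates.set(name, Array.from(roots).sort());
    });
    return duplicates;
}

// Example input with made-up paths: lodash is installed in two places.
const sampleOutput = [
    "/repo/node_modules/lodash/index.d.ts",
    "/repo/packages/a/node_modules/lodash/index.d.ts",
    "/repo/node_modules/@types/node/index.d.ts",
    "/repo/src/main.ts",
].join("\n");
const duplicatePackages = findDuplicatePackages(sampleOutput);
```

This only detects multiple install locations; it says nothing about whether the copies are different versions, so the package.json of each reported location still needs a look.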

Oh, and is your tsc srsly using 32G of ram? How is that even possible?

@AnyhowStep
Contributor

AnyhowStep commented Sep 15, 2019

I'm not using any fancy tooling. Just regular tsc with a composite project set up. So, unless tsc itself is bugged, I should be good in that area.

I'll try out --listFiles when I get home.

It's using 32G total for all 42 sub projects. Not in one compilation =P it's basically creating 42 ts programs, I think.

All 42 sub projects use the same node_modules folder. So, not too many worries about dependencies duplicating all over the place. (I know it's still possible to have duplicates like using two different major versions of the same package indirectly)


@jack-williams can we have this packed? =x

@sheetalkamat
Member

@RyanCavanaugh @rbuckton I had done the initial change for LS only as part of #9529, as part of decreasing memory usage for Salsa. We had then decided to only do this for LS at that time; I don't remember the reason for that, though.

Swatinem changed the title from "WIP: Separate Tokens and Identifiers from other Nodes" to "Separate Tokens and Identifiers from other Nodes" on Sep 25, 2019
@Swatinem
Contributor Author

Swatinem commented Dec 6, 2019

Well, this has conflicts now, which should be trivial to fix. I would suggest just landing a very minimal version first, which only copy-pastes the Node constructor for all the already defined types.

I would be willing to further work on this to add shortcuts, remove unused props, maybe better organize the different types, etc. if someone wants to mentor me.

However, I’m a bit… well… disappointed that this has been sitting idle for almost 3 months now, so I don’t really want to waste my time if this isn’t going anywhere… 😒

@sheetalkamat @rbuckton @RyanCavanaugh any guidance?

sandersn added the "For Backlog Bug" label (PRs that fix a backlog bug) on Feb 1, 2020
@rbuckton
Member

@Swatinem I apologize for not looking into this sooner. The memory wins are appreciated, and the few cases of perf losses are either within the margin of error or targeting NodeJS 8.x, which is now outside of Node's LTS maintenance schedule. If you are willing to resolve the conflicts, I can do a thorough review in the next few days.

@Swatinem
Contributor Author

Awesome! I’m gone now for an extended weekend, will rebase monday evening. :-)

Review comment on src/tsc/tsc.ts (outdated, resolved)
@rbuckton
Member

The change in utilities.ts is a fairly trivial conflict to resolve, and the change in tsc.ts can just be dropped per my comment above. I can easily resolve the conflicts myself.

@Swatinem
Contributor Author

> I can easily resolve the conflicts myself.

go ahead ;-)

@Swatinem
Contributor Author

Oh, while you are at it…

There is also an allocator for SourceFile pre-defined already. Make sure to also turn that one into its own constructor function. My experiments also have shown that that can further shrink Node.

@rbuckton
Member

@Swatinem

> There is also an allocator for SourceFile pre-defined already.

I'll look into that one in a separate PR, we also have one for PrivateIdentifier now as well.

@rbuckton
Member

Hmm. The GitHub resolve conflicts web UI seems to have messed with the linebreaks in utilities.ts and tsc.ts. I'll have to fix that manually.

rbuckton merged commit cf6b641 into microsoft:master on Feb 12, 2020
@rbuckton
Member

@Swatinem: Thanks! I've merged these changes.
