Skip to content
This repository has been archived by the owner on Jan 2, 2021. It is now read-only.

Use object code for Template Haskell #836

Merged
merged 25 commits into from
Oct 4, 2020
Merged

Conversation

wz1000
Copy link
Collaborator

@wz1000 wz1000 commented Sep 27, 2020

Rework of mpickering#43

Should improve performance and also functions as a workaround for #614

We now generate object code along with the interface file. Object code is only generated for a module if

  1. The module uses TH
  2. Any parents of the module use TH

For this to work, we require the complete module graph. Earlier, tests were failing because TargetFile's were not included in the set of known files. I fixed this, but now tests are failing because the modules don't define a main function. Any idea why this is happening @pepeiborra? See my last commit.

FIxes #614
Fixes #107

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 27, 2020

OK, I figured out the issue (LinkBinary triggers that, LinkInMemory fixes it). Not sure about the performance impact of this change...

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 27, 2020

ghc-lib :(

I have no idea how to locally build this configuration, and worse still, no tests run so I have no way to know if my changes work.

@pepeiborra
Copy link
Collaborator

pepeiborra commented Sep 27, 2020

It would be cool to see the performance improvements of this change, but for that we need a benchmark example that uses TH (and doesn't have too many dependencies).

EDIT: see #838, now you only need to find a suitable example (the hard part)

Copy link
Collaborator

@pepeiborra pepeiborra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI benchmarks look rather useless. Can you run cabal bench locally and share the results?

src/Development/IDE/Core/Compile.hs Show resolved Hide resolved

generateObjectCode :: HscEnv -> TcModuleResult -> IO (IdeResult Linkable)
generateObjectCode hscEnv tmr = do
(compile_diags, Just (_, guts, _)) <- compileModule (RunSimplifier True) hscEnv tmr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid compiling the module here? I think the answer is not, because GenerateCore is not used anymore

fmap (either (, Nothing) (second Just)) $
evalGhcEnv packageState $
catchSrcErrors "compile" $ do
setupEnv (deps ++ [(tmrModSummary tmr, tmrModInfo tmr)])

setupEnv [(tmrModSummary tmr, tmrModInfo tmr)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised that we don't need to setup the env with the dependencies

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked for every usage of this and made sure it is only called with GhcSessionDeps. In that case I don't think we need to call setupEnv

setPriority priorityGenerateCore
packageState <- hscEnv <$> use_ GhcSession file
liftIO $ compileModule runSimplifier packageState [(tmrModSummary x, tmrModInfo x) | x <- tms] tm
liftIO $ compileModule runSimplifier packageState tm
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised that this can work with GhcSession and doesn't need GhcSessionDeps

src/Development/IDE/Core/Rules.hs Outdated Show resolved Hide resolved
src/Development/IDE/Core/Rules.hs Show resolved Hide resolved
@michaelpj
Copy link
Contributor

Any parents of the module use TH

Possibly dumb question, but: we need object code for a module's dependencies if the module uses TH. This PR achieves that by checking a module's dependants to see if they'll eventually use TH. But couldn't we just set up rules for computing the object code and only request it if we need it?

i.e.

  • Module A uses TH, uses B, so we request object code for B
  • Module B uses C, so we request the object code for C in order to produce that for B
  • etc.

Isn't this the joy of building our IDE on a build system?

I haven't understood the code in detail, but it looks like you're sort of doing that already. So what stops this becoming entirely demand-driven?

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 28, 2020

But couldn't we just set up rules for computing the object code and only request it if we need it?

Good question. This is indeed what we were doing before, with the GenerateBytecode rule.

To compile a module, we need the actual typechecked AST that ghc computes after typechecking. However, to avoid excessive memory usage, we only keep typechecked ASTs for files that the user has open in their editor. For all other modules, we compute a ModIface(which we can use to typecheck dependent modules) and keep it in memory¸ throwing away the actual AST.

This is no good for when we need to compile object/byte code for a module. Previously, we were calling the Typecheck rule on such modules all over again. This meant that for projects making extensive use of TH, we were keeping typechecked ASTs in memory for all modules, even if they weren't open in the editor. Furthermore, sometimes the GetModIface and Typecheck would run typechecking for the same module concurrently, leading to the bug described in #614 (which is likely due to an underlying race condition in GHC).

Now, we figure out if we are going to need the object code before we throw away the typechecked AST, restoring the invariant that we only keep typechecked ASTs in memory for files the user has opened. Furthermore, we avoid duplicating work caused by needing to typecheck the same file twice.

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 28, 2020

There are a bunch of options here we can expose to the user -

  1. Bytecode or object code?
  2. Run simplifier or not? (affects IDE performace differently depending on the complexity of the TH)
  3. Run desugarer for non-TH projects? If yes, desugar everything or only files of interest?

For 1) and 2), I imagine we want to set a default based on some benchmarks.

@michaelpj
Copy link
Contributor

Interesting, this is potentially a usecase for Core Interface Sections, something we've been working on at IOHK. The idea being that you dump the post-desugaring Core into the interface file, and ideally you should then be able to resume compilation from that step. So you'd be able to do the Typecheck->ObjectCode step later without redoing work. (Speculative, obviously)

@pepeiborra
Copy link
Collaborator

pepeiborra commented Sep 28, 2020

There are a bunch of options here we can expose to the user -

  1. Bytecode or object code?
  2. Run simplifier or not? (affects IDE performace differently depending on the complexity of the TH)
  3. Run desugarer for non-TH projects? If yes, desugar everything or only files of interest?

For 1) and 2), I imagine we want to set a default based on some benchmarks.

My views:

  1. Since currently bytecode cannot be serialised, object code is a much better fit.
  2. Never run the simplifier, until we have a direct application or user request, then make it configurable.
  3. Run the desugarer always, but at a lower priority than typechecking, and only on the files of interest.

@alanz
Copy link
Collaborator

alanz commented Sep 28, 2020

I thought bytecode was serialised for the external interpreter?

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 1, 2020

I think I have found the root cause of the bug that #741 fixed.

We call GHC.typecheckedModule, which is meant for interactive use. In particular, it calls mkBootModDetailsTc to generate a ModDetails

-- compiler/GHC/Iface/Tidy.hs
-- This is Plan A: make a small type env when typechecking only,
-- or when compiling a hs-boot file, or simply when not using -O
--
-- We don't look at the bindings at all -- there aren't any
-- for hs-boot files

If we want to generate code, we need to use tidyProgram to generate the ModDetails, which requires the module to be desugared.

For now, I've changed IDE.typecheckModule to always desugar, as well as removing all usages of 'interactive' ghc API functions. If someone can think of a nice way to run the desugarer at a lower priority when we don't need to compile code, please let me know.

Also, @mpickering says we always need to run the simplifier, as discussed in #410. We do set -O0, which should mitigate the slowdown somewhat.

@ndmitchell
Copy link
Collaborator

The way to run it at a lower priority is to do:

typeCheck = rule for type check

desugar = do
    use typeCheck ...
    reschedule 0
    do desugaring

main = depend on desugar

That says start with desugaring, but once you have type checked, put yourself to the back of the queue behind all other type checks.

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 1, 2020

@ndmitchell what happens if a rule with high priority uses a rule with low priority?

@ndmitchell
Copy link
Collaborator

The rule with high priority will wait for the low priority one to finish - you get priority inversion - so its generally not a good thing to do (although its not that harmful, unless you have a lot of normal priority stuff waiting around)

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 1, 2020

I'm a bit confused about the meaning of priorities. We have the following in ghcide,

newtype Priority = Priority Double

setPriority :: Priority -> Action ()
setPriority (Priority p) = reschedule p

priorityTypeCheck :: Priority
priorityTypeCheck = Priority 0

priorityGenerateCore :: Priority
priorityGenerateCore = Priority (-1)

priorityFilesOfInterest :: Priority
priorityFilesOfInterest = Priority (-2)

Which one of these will be given preference by the scheduler?

Reading the shake source for reschedule only adds to my confusion...

reschedule x = do
    (wait, _) <- actionAlwaysRequeuePriority (PoolDeprioritize $ negate x) $ pure ()

Too many negations for me to follow 😕

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 1, 2020

Getting some strange failures when I add desugaring to kick ¯\_(ツ)_/¯

  cpp
    cpp-error:                                                                                            FAIL (1.95s)
      test/src/Development/IDE/Test.hs:44:
      Could not find (DsError,(2,1),"error: unterminated",Nothing) in List [Diagnostic {_range = Range {_start = Position {_line = 2, _character = -1}, _end = Position {_line = 2, _character = -1}}, _severity = Just DsError, _code = Nothing, _source = Just "CPP", _message = " error: unterminated #ifdef\n    3 | #ifdef FOO\n      | \n", _tags = Nothing, _relatedInformation = Nothing}]
    cpp-ghcide:                                                                                           OK (1.24s)
  diagnostics
    fix syntax error:                                                                                     FAIL (0.64s)
      test/src/Development/IDE/Test.hs:115:
      Incorrect number of diagnostics for Uri {getUri = "file:///tmp/extra-dir-37097490727748/Testing.hs"}, expected [] but got List [Diagnostic {_range = Range {_start = Position {_line = 0, _character = 15}, _end = Position {_line = 0, _character = 19}}, _severity = Just DsError, _code = Nothing, _source = Just "parser", _message = "parse error on input \8216wher\8217", _tags = Nothing, _relatedInformation = Nothing},Diagnostic {_range = Range {_start = Position {_line = 0, _character = 15}, _end = Position {_line = 0, _character = 19}}, _severity = Just DsError, _code = Nothing, _source = Just "parser", _message = "parse error on input \8216wher\8217", _tags = Nothing, _relatedInformation = Nothing}]
    introduce syntax error:                                                                               FAIL (1.18s)
      test/src/Development/IDE/Test.hs:78:
      Got unexpected diagnostics for Uri {getUri = "file:///tmp/extra-dir-37097490727749/Testing.hs"} got List [Diagnostic {_range = Range {_start = Position {_line = 0, _character = 15}, _end = Position {_line = 0, _character = 21}}, _severity = Just DsError, _code = Nothing, _source = Just "parser", _message = "parse error on input \8216wherre\8217", _tags = Nothing, _relatedInformation = Nothing},Diagnostic {_range = Range {_start = Position {_line = 0, _character = 15}, _end = Position {_line = 0, _character = 21}}, _severity = Just DsError, _code = Nothing, _source = Just "parser", _message = "parse error on input \8216wherre\8217", _tags = Nothing, _relatedInformation = Nothing}]

First one is particularly spooky. Position {_line = 2, _character = -1}} ?

Otherwise, I think this is ready after a re-review.

@pepeiborra
Copy link
Collaborator

This looks fantastic, great job @wz1000, hopefully fixing all the TH bugs in the process.

I will find the time to re-review tomorrow evening. In the meantime:

  • could you clarify the shape of the Shake graph after these changes? Ideally we put desugaring as a separate step after typechecking, so that we can get diagnostics as early as possible.
  • have you done any benchmarks? What's the plan to test on a TH example?

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 1, 2020

The changes to the shake graph are as follows:

  1. Typecheck doesn't write out (or generate) any iface files, it just does whats stated on the tin
  2. GetModIface/GetModIfaceFromDisk is where all the magic happens.
    • We use NeedsObjectCode to figure out if we need to compile the module
    • If we do, we use_ GenerateCore for FOIs, or call generateCoreDefinition for non-FOIs
    • If need object code and generated core, we use that to generate a full Iface file, along with object code
    • Otherwise, we just generate a 'simple' iface file by calling 'mkIfaceTc`
    • In the end, we figure out we need to write the iface file on disk, and do so if required
  3. kick now calls GenerateCore to produce warnings, and GetHieAST to write out hie files.

Benchmarks and fixing the remaining tests are next on my agenda.

I'm still conflicted about whether we need to keep the result of GenerateCore in the graph, or if we could just throw it away.

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 2, 2020

I'm using the following configuration for benchmarking:

example:
    name: haskell-lsp-types
    version: 0.22.0.0
    # # path: path/to/example
    module: src/Language/Haskell/LSP/Types/Lens.hs

Here are the results:

version    name                         success   samples   startup              setup                userTime               delayedTime             totalTime             maxResidency   allocatedBytes
upstream   hover                        True      100       10.614806232000001   0.0                  0.10257583399999999    3.340193600000001e-2    0.14050068000000002   319777184      6115153688
HEAD       hover                        True      100       7.7185963520000005   0.0                  0.45777121499999973    3.5779573e-2            0.497748911           164670560      2716464040
upstream   edit                         True      100       9.299547219          0.0                  27.364207991           0.20843280399999992     27.578036549          339209832      47839423136
HEAD       edit                         True      100       8.063393366          0.0                  38.008773957999985     0.6213196139999998      38.635416729          166595120      68167264816
upstream   getDefinition                True      100       9.299885888          0.0                  6.3038403e-2           2.0231722999999993e-2   8.6907673e-2          320671832      6090475128
HEAD       getDefinition                True      100       8.01779254           0.0                  7.801578100000002e-2   0.39756615800000006     0.479599582           164391728      2692337384
upstream   hover after edit             True      100       9.569584439          0.0                  0.649604651            27.137278584000008      27.792731983000003    333945152      55002822280
HEAD       hover after edit             True      100       7.8474618430000005   0.0                  1.2949229160000006     76.95365204099998       78.255706743          175195272      135863207032
upstream   completions after edit       True      100       9.292151514          0.0                  6.870972618000005      21.154938894000004      28.032305479          339928928      69540079968
HEAD       completions after edit       True      100       8.354598959          0.0                  7.296952549000002      22.032134922999994      29.337312232000002    177797184      65961116000
upstream   code actions                 True      100       9.418781461          0.7522214460000001   0.6891327480000001     4.1814197000000004e-2   0.737375199           365674056      8408618680
HEAD       code actions                 True      100       8.478195708000001    0.8031139460000001   0.9741454130000003     0.28154403900000013     1.262175481           222930536      5155011264
upstream   code actions after edit      True      100       9.205042200000001    0.0                  65.97669554199999      3.9817547000000016e-2   66.02095223           388161080      126965911240
HEAD       code actions after edit      True      100       8.362657242000001    0.0                  73.54964010300002      0.4446636240000001      74.000585133          245196664      116454301472
upstream   documentSymbols after edit   True      100       9.106493854          0.0                  0.4217413190000001     0.5491246350000001      0.977217133           328066504      8863760416
HEAD       documentSymbols after edit   True      100       10.102078910000001   0.0                  0.4835813629999999     1.193969614000001       1.6880487540000002    165829184      6147453776

User time and total time increase, max residency is halved.

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 2, 2020

Here are the results if we don't desugar and just typecheck

version    name                         success   samples   startup              setup                userTime               delayedTime             totalTime              maxResidency   allocatedBytes
upstream   hover                        True      100       9.775739957          0.0                  7.509792200000004e-2   2.3456665999999994e-2   0.10212122800000001    319672744      6128870352
HEAD       hover                        True      100       7.677350291000001    0.0                  0.5480348690000003     3.4458624000000014e-2   0.586015334            164726944      2717224448
upstream   edit                         True      100       9.43701669           0.0                  24.708787200000003     0.217431362             24.931171044000003     339036160      47838467520
HEAD       edit                         True      100       7.486315764          0.0                  22.994777094999996     0.576966195             23.576577878000002     188922200      40749324328
upstream   getDefinition                True      100       9.275651766000001    0.0                  6.240487899999998e-2   2.4159567e-2            9.006775900000001e-2   320183152      6105467816
HEAD       getDefinition                True      100       7.625304047          0.0                  0.2696263159999998     0.5086219970000002      0.7818169930000001     164392840      2695024224
upstream   hover after edit             True      100       9.439057042          0.0                  0.6495210729999996     27.40231568300001       28.057617792000002     334082656      54990381576
HEAD       hover after edit             True      100       7.530357511          0.0                  1.1434777970000005     76.56428582400001       77.714699726           187706984      135592405328
upstream   completions after edit       True      100       9.397739779          0.0                  6.843254851000001      21.314822434000003      28.165777491           339736568      69531959720
HEAD       completions after edit       True      100       7.456776028          0.0                  6.937418449            19.876863895            26.821768431000002     190212280      62544143512
upstream   code actions                 True      100       9.308324401          0.780531033          0.7470925310000002     4.502434800000001e-2    0.797485121            365567584      8405471256
HEAD       code actions                 True      100       7.5375515470000005   0.8169353610000001   0.8686975460000003     0.6868279529999997      1.5637661770000002     246016776      4879625856
upstream   code actions after edit      True      100       9.41876682           0.0                  66.48259642200003      3.841946e-2             66.525442083           387453680      126967822672
HEAD       code actions after edit      True      100       7.496058034000001    0.0                  63.494944305000004     0.6717071160000001      64.170956139           257168424      115286460936
upstream   documentSymbols after edit   True      100       9.233068117          0.0                  0.4202038720000001     0.52859087              0.956152438            328096856      8845712320
HEAD       documentSymbols after edit   True      100       7.518854782000001    0.0                  0.666159482            1.035005161             1.7089357550000002     180192304      5353000216
diff --git a/src/Development/IDE/Core/OfInterest.hs b/src/Development/IDE/Core/OfInterest.hs
index 1caacfc3..dda25030 100644
--- a/src/Development/IDE/Core/OfInterest.hs
+++ b/src/Development/IDE/Core/OfInterest.hs
@@ -94,10 +94,10 @@ kick = mkDelayedAction "kick" Debug $ do
     liftIO $ progressUpdate KickStarted
 
     -- Update the exports map for the project
-    results <- uses GenerateCore $ HashMap.keys files
+    results <- uses TypeCheck $ HashMap.keys files
     ShakeExtras{exportsMap} <- getShakeExtras
     let mguts = catMaybes results
-        !exportsMap' = createExportsMapMg mguts
+        !exportsMap' = createExportsMapTc $ map tmrTypechecked mguts
     liftIO $ modifyVar_ exportsMap $ evaluate . (exportsMap' <>)

@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 2, 2020

If we use bytecode instead of object code, and don't desugar in kick, here is what we get:

version    name                         success   samples   startup              setup                userTime               delayedTime             totalTime            maxResidency   allocatedBytes
upstream   hover                        True      100       10.891087254         0.0                  9.209635400000003e-2   2.7845442000000005e-2   0.123939107          320352552      6135043000
HEAD       hover                        True      100       10.904560363         0.0                  0.280567656            3.521885500000002e-2    0.319338823          282604048      6249242712
upstream   edit                         True      100       9.811893251          0.0                  28.136112453000003     0.24779284699999996     28.389346223         340329488      47839749088
HEAD       edit                         True      100       10.118380068         0.0                  245.20731385400006     0.28579294200000005     245.50228024600003   489770408      440044434672
upstream   getDefinition                True      100       9.386153363          0.0                  6.171676200000002e-2   2.0830596999999996e-2   8.5869031e-2         320234360      6101097840
HEAD       getDefinition                True      100       10.123143692000001   0.0                  6.514175700000001e-2   0.23917315700000008     0.308031795          283381232      6222840680
upstream   hover after edit             True      100       9.385658361          0.0                  0.6469818239999999     32.112947690999995      32.766236413         331231416      54991535400
HEAD       hover after edit             True      100       10.045937113         0.0                  2.628170223999999      256.7142117480001       259.348206441        486343256      486450681840
upstream   completions after edit       True      100       9.385118779          0.0                  6.7748024220000005     21.02937944699999       27.811350462         339541104      69533543008
HEAD       completions after edit       True      100       9.122527803          0.0                  11.919567483           220.11206598000004      232.039704228        482652976      462159548352
upstream   code actions                 True      100       9.204064802000001    0.7599791300000001   0.8867297640000003     5.1324132e-2            0.944999629          365385928      8410147512
HEAD       code actions                 True      100       9.287688051          2.4146111510000003   1.0743402820000005     0.13389134              1.2140506320000002   426299520      12438797472
upstream   code actions after edit      True      100       9.170937589000001    0.0                  65.652404632           3.9814251000000016e-2   65.696594452         387002776      127032999936
HEAD       code actions after edit      True      100       9.925779058          0.0                  268.79041496599984     0.12485944300000001     268.920168937        515844856      519647459384
upstream   documentSymbols after edit   True      100       8.971234719          0.0                  0.4269711559999999     0.5216603530000001      0.9547841890000001   328411872      8856234008
HEAD       documentSymbols after edit   True      100       11.055146743         0.0                  0.879312115            3.0882921830000005      3.974355355          384377232      15598340128

Something is clearly going wrong somewhere, or we would expect similar performance to upstream

@maralorn
Copy link
Contributor

maralorn commented Oct 4, 2020

Thank you so much! This is great!

jneira added a commit to jneira/haskell-language-server that referenced this pull request Oct 5, 2020
pepeiborra pushed a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
…l/ghcide#836)

* Use object code for TH

* Set target location for TargetFiles

* Fix tests

* hlint

* fix build on 8.10

* fix ghc-lib

* address review comments

* hlint

* better error handling if module headers don't parse

* Always desugar, don't call interactive API functions

* deprioritize desugar when not TH, fix iface handling

* write hie file on save

* more tweaks

* fix tests

* disable desugarer warnings

* use ModGuts for exports map

* don't desugar

* use bytecode

* make HiFileStable early-cutoff

* restore object code

* re-enable desugar

* review comments

* Don't use ModIface for DocMap

* fix docs for the current module

* mark test as broken on windows
pepeiborra pushed a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
…l/ghcide#836)

* Use object code for TH

* Set target location for TargetFiles

* Fix tests

* hlint

* fix build on 8.10

* fix ghc-lib

* address review comments

* hlint

* better error handling if module headers don't parse

* Always desugar, don't call interactive API functions

* deprioritize desugar when not TH, fix iface handling

* write hie file on save

* more tweaks

* fix tests

* disable desugarer warnings

* use ModGuts for exports map

* don't desugar

* use bytecode

* make HiFileStable early-cutoff

* restore object code

* re-enable desugar

* review comments

* Don't use ModIface for DocMap

* fix docs for the current module

* mark test as broken on windows
pepeiborra pushed a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
…l/ghcide#836)

* Use object code for TH

* Set target location for TargetFiles

* Fix tests

* hlint

* fix build on 8.10

* fix ghc-lib

* address review comments

* hlint

* better error handling if module headers don't parse

* Always desugar, don't call interactive API functions

* deprioritize desugar when not TH, fix iface handling

* write hie file on save

* more tweaks

* fix tests

* disable desugarer warnings

* use ModGuts for exports map

* don't desugar

* use bytecode

* make HiFileStable early-cutoff

* restore object code

* re-enable desugar

* review comments

* Don't use ModIface for DocMap

* fix docs for the current module

* mark test as broken on windows
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
7 participants