-
Notifications
You must be signed in to change notification settings - Fork 397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memoization function #1489
Memoization function #1489
Conversation
src/memoization.ml
Outdated
|
||
type 'a input_spec = { | ||
serialize : 'a -> ser_input; | ||
print : 'a -> string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using print : 'a -> Sexp.t
seems better here, so that we leave the formatting to the caller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently I'm only using serialize
, but this should probably be changed to also include the "human readable" form. I'm wondering if it is better to store this as a lazy function (which would prevent the input from being garbage collected) or as the printed input (which would require computation of print
for every function).
9fd28cf
to
6683fcb
Compare
src/build_system.ml
Outdated
| Some file -> (* build *) build_file_spec file >>| ignore) | ||
) |> Fiber.with_error_handler ~on_error | ||
|
||
let _prepare_file t path = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dead code, let's delete it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code may be nice to register as computation at a later stage, even if it isn't used currently in the code. It essentially builds a files dependencies without building the file itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Could you add a comment? Otherwise, future readers will wonder why it's here.
src/build_system.ml
Outdated
@@ -448,6 +385,12 @@ type t = | |||
; hook : hook -> unit | |||
; (* Package files are part of *) | |||
packages : Package.Name.t Path.Table.t | |||
; load_dir_cache : Path.Set.t Memoize.t | |||
(* memoized functions *) | |||
; prepare_rule_def : (Internal_rule.t * Static_deps.t * (unit, Action.t) Build.t, Action.t * Deps.t) CRef.comp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about putting Static_deps.t
inside Internal_rule.t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now fixed this, but it does mean that rule.static_deps (which is defined as lazy) is evaluated twice, once in prepare_rule
and once in build_rule
. Otherwise one would need to memoize this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, evaluating it twice doesn't seem good. We should keep it separate then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to introduce a Once
module, which similarly to lazy only evaluates something when Once.get
is called, but that also caches the result so Once.get
only causes the expression to be evaluated a single time.
8526b2e
to
734fcdf
Compare
@rudihorn Once this is ready to be merged, could you remove the WIP tag? I'd like to give this a read when it's ready. To make sure I understand the basics. |
e42021a
to
46361b2
Compare
I did a bit more work on this PR and I now get very similar timings between the base and tip of this PR. For null builds of the platform, there is now only a 0.1s slowdown, which is likely due to the new cycle detection algorithm. This is fine given that the one in master is broken. We can now rebase and merge this PR. |
6058c84
to
9251fb7
Compare
@rudihorn there is an issue with the signature, could you update the email in @rgrinberg @emillon could you have another look as I changed a couple of things (and @rudihorn as well if you want). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I have a couple minor comments but I look forward to having this merged 👍
9251fb7
to
7f8b3ac
Compare
Bootstrap seems to be broken now. |
@diml feel free to merge after fixing bootstrap. I'd still like to study some parts of this PR, but I don't think we should delay merging. Rebasing these pr's over and over is a waste of time. |
Ok, following the slack conversation, I'll wait for 1.6 and then merge. |
5bc4698
to
3c675b8
Compare
Introduce a general system for memoizing functions. The core of Dune is refactored on top of this system. The new system will allow to share not only the result of external commands between builds but also the result of internal computations. The system automatically keeps tracks of the effects performed by computations in order to detect when it is safe to reuse previous results. The intents of this change is to make incremental builds faster, especially in polling mode. The new system is more powerful as well and will allow us to use the result of commands for computing rules. This patch also re-implement cycle detection as we realized that the previous algorithm was broken and the problem showed up more often in the new system. The new algorithm was taken from the following paper: Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Robert E. Tarjan. 2015. A New Approach to Incremental Cycle Detection and Related Problems. ACM Trans. Algorithms 12, 2, Article 14 (December 2015), 22 pages. DOI: https://doi.org/10.1145/2756553 *) Signed-off-by: Rudi Horn <dyn-git@rudi-horn.de> Signed-off-by: Jeremie Dimino <jeremie@dimino.org>
CHANGES: - Second step of the deprecation of jbuilder: the `jbuilder` binary now emits a warning on every startup and both `jbuilder` and `dune` emit warnings when encountering `jbuild` files (ocaml/dune#1752, @diml) - Change the layout of build artifacts inside _build. The new layout enables optimizations that depend on the presence of `.cmx` files of private modules (ocaml/dune#1676, @bobot) - Fix merlin handling of private module visibility (ocaml/dune#1653 @bobot) - unstable-fmt: use boxes to wrap some lists (ocaml/dune#1608, fix ocaml/dune#1153, @emillon, thanks to @rgrinberg) - skip directories when looking up programs in the PATH (ocaml/dune#1628, fixes ocaml/dune#1616, @diml) - Use `lsof` on macOS to implement `--stats` (ocaml/dune#1636, fixes ocaml/dune#1634, @xclerc) - Generate `dune-package` files for every package. These files are installed and read instead of `META` files whenever they are available (ocaml/dune#1329, @rgrinberg) - Fix preprocessing for libraries with `(include_subdirs ..)` (ocaml/dune#1624, fix ocaml/dune#1626, @nojb, @rgrinberg) - Do not generate targets for archive that don't match the `modes` field. (ocaml/dune#1632, fix ocaml/dune#1617, @rgrinberg) - When executing actions, open files lazily and close them as soon as possible in order to reduce the maximum number of file descriptors opened by Dune (ocaml/dune#1635, ocaml/dune#1643, fixes ocaml/dune#1633, @jonludlam, @rgrinberg, @diml) - Reimplement the core of Dune using a new generic memoization system (ocaml/dune#1489, @rudihorn, @diml) - Replace the broken cycle detection algorithm by a state of the art one from [this paper](https://doi.org/10.1145/2756553) (ocaml/dune#1489, @rudihorn) - Get the correct environment node for multi project workspaces (ocaml/dune#1648, @rgrinberg) - Add `dune compute` to call internal memoized functions (ocaml/dune#1528, @rudihorn, @diml) - Add `--trace-file` option to trace dune internals (ocaml/dune#1639, fix ocaml/dune#1180, @emillon) - Add `--no-print-directory` (borrowed from GNU make) to suppress `Entering directory` messages. (ocaml/dune#1668, @dra27) - Remove `--stats` and track fd usage in `--trace-file` (ocaml/dune#1667, @emillon) - Add virtual libraries feature and enable it by default (ocaml/dune#1430 fixes ocaml/dune#921, @rgrinberg) - Fix handling of Control+C in watch mode (ocaml/dune#1678, fixes ocaml/dune#1671, @diml) - Look for jsoo runtime in the same dir as the `js_of_ocaml` binary when the ocamlfind package is not available (ocaml/dune#1467, @nojb) - Make the `seq` package available for OCaml >= 4.07 (ocaml/dune#1714, @rgrinberg) - Add locations to error messages where a rule fails to generate targets and rules that require files outside the build/source directory. (ocaml/dune#1708, fixes ocaml/dune#848, @rgrinberg) - Let `Configurator` handle `sizeof` (in addition to negative numbers). (ocaml/dune#1726, fixes ocaml/dune#1723, @Chris00) - Fix an issue causing menhir generated parsers to fail to build in some cases. The fix is to systematically use `-short-paths` when calling `ocamlc -i` (ocaml/dune#1743, fix ocaml/dune#1504, @diml) - Never raise when printing located errors. The code that would print the location excerpts was prone to raising. (ocaml/dune#1744, fix ocaml/dune#1736, @rgrinberg) - Add a `dune upgrade` command for upgrading jbuilder projects to Dune (ocaml/dune#1749, @diml) - When automatically creating a `dune-project` file, insert the detected name in it (ocaml/dune#1749, @diml) - Add `(implicit_transitive_deps <bool>)` mode to dune projects. When this mode is turned off, transitive dependencies are not accessible. Only listed dependencies are directly accessible. (ocaml/dune#1734, ocaml/dune#430, @rgrinberg, @hnrgrgr) - Add `toplevel` stanza. This stanza is used to define toplevels with libraries already preloaded. (ocaml/dune#1713, @rgrinberg) - Generate `.merlin` files that account for normal preprocessors defined using a subset of the `action` language. (ocaml/dune#1768, @rgrinberg) - Emit `(orig_src_dir <path>)` metadata in `dune-package` for dune packages built with `--store-orig-source-dir` command line flag (also controlled by `DUNE_STORE_ORIG_SOURCE_DIR` env variable). This is later used to generate `.merlin` with `S`-directives pointed to original source locations and thus allowing merlin to see those. (ocaml/dune#1750, @andreypopp) - Improve the behavior of `dune promote` when the files to be promoted have been deleted. (ocaml/dune#1775, fixes ocaml/dune#1772, @diml) - unstable-fmt: preserve comments (ocaml/dune#1766, @emillon) - Pass flags correctly when using `staged_pps` (ocaml/dune#1779, fixes ocaml/dune#1774, @diml) - Fix an issue with the use of `(mode promote)` in the menhir stanza. It was previously causing intermediate *mock* files to be promoted (ocaml/dune#1783, fixes ocaml/dune#1781, @diml) - unstable-fmt: ignore files using OCaml syntax (ocaml/dune#1784, @emillon) - Configurator: Add `which` function to replace the `which` command line utility in a cross platform way. (ocaml/dune#1773, fixes ocaml/dune#1705, @Chris00) - Make configurator append paths to `$PKG_CONFIG_PATH` on macOS. Previously it was prepending paths and thus `$PKG_CONFIG_PATH` set by users could have been overridden by homebrew installed libraries (ocaml/dune#1785, @andreypopp) - Disallow c/cxx sources that share an object file in the same stubs archive. This means that `foo.c` and `foo.cpp` can no longer exist in the same library. (ocaml/dune#1788, @rgrinberg) - Forbid use of `%{targets}` (or `${@}` in jbuild files) inside preprocessing actions (ocaml/dune#1812, fixes ocaml/dune#1811, @diml) - Add `DUNE_PROFILE` environment variable to easily set the profile. (ocaml/dune#1806, @rgrinberg) - Deprecate the undocumented `(no_keep_locs)` field. It was only necessary until virtual libraries were supported (ocaml/dune#1822, fix ocaml/dune#1816, @diml) - Rename `unstable-fmt` to `format-dune-file` and remove its `--inplace` option. (ocaml/dune#1821, @emillon). - Autoformatting: `(using fmt 1.1)` will also format dune files (ocaml/dune#1821, @emillon). - Autoformatting: record dependencies on `.ocamlformat-ignore` files (ocaml/dune#1824, fixes ocaml/dune#1793, @emillon)
This is referencing the computation model specified in #1280. The goal has been to implement a memoization function which takes a computation of the form
'a -> 'b Fiber.t
and returns a computation'a -> 'b Fiber.t
which additionally performs:Currently the memoization function has been implemented and is able to perform these three functions. An example of how to use the memoization functions is shown in
test/unit-tests/memoize.mlt
. Below is an example of how one could use the system to implement a Fibonacci sequence which is automatically cached between computations.There has been a little bit of progress to change
build_system.ml
so that the actual build process is a little more abstracted to make it easier to drop in thememoization
function and remove the current caching etc. This bit is still a little bit of a mess however, and in essence it just breaks existing caching (for now).Feedback welcome.