Releases: ORNL/curifactory

v0.18.0

09 Oct 21:45

Added

  • An ImageReporter for adding any generated and saved images into the output report.
  • A LatexTableReporter for adding a LaTeX string version of a dataframe to the report.

v0.17.1

07 Dec 13:14

Added

  • Link to the output log in generated reports.

Changed

  • --print-params output is now conditioned on --verbose: whether a hash is
    specified directly or the flag is used by itself, the _DRY_REPS are included
    when --verbose is specified and omitted otherwise.
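
As a rough illustration of that behavior (not curifactory's actual code), the sketch below looks up a parameter set by hash prefix and drops `_DRY_REPS` unless a verbose flag is set. The registry layout, the example hashes, and the `find_params` helper are all assumptions made for this example; the real params_registry.json may be structured differently.

```python
# Hypothetical registry contents; the real params_registry.json layout may differ.
REGISTRY = {
    "3f9a1c": {"name": "baseline", "lr": 0.01, "_DRY_REPS": {"lr": "0.01"}},
    "b27e40": {"name": "wide", "lr": 0.1, "_DRY_REPS": {"lr": "0.1"}},
}


def find_params(registry, hash_prefix, verbose=False):
    """Return (full_hash, params) for the single hash starting with hash_prefix."""
    matches = [h for h in registry if h.startswith(hash_prefix)]
    if len(matches) != 1:
        raise KeyError(f"{len(matches)} hashes match prefix {hash_prefix!r}")
    full_hash = matches[0]
    params = dict(registry[full_hash])  # shallow copy, so the registry is not mutated
    if not verbose:
        # Without verbose output, leave the dry representations out.
        params.pop("_DRY_REPS", None)
    return full_hash, params
```

Calling `find_params(REGISTRY, "3f")` would return the `baseline` entry without `_DRY_REPS`, while passing `verbose=True` keeps it.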

Fixed

  • Excessive "no run info" warnings from caching when running an experiment
    notebook.
  • run_experiment incorrectly handling a param_files of None.

v0.17.0

16 Nov 21:09

Added

  • Templating/keyword formatting for cacher path overrides. This allows cacher
    path overrides (at the expense of those paths no longer being tracked
    automatically) to specify paths outside of the cache folder, include
    parameters directly in the filename, etc.
  • PathRef cacher, a special type of cacher that only passes around a path and
    short-circuits directly based on that path's existence, rather than handling
    saving/loading itself. (This is in contrast to the FileReferenceCacher,
    which saves a file containing the path.)
  • --hashes debugging flag. When specified, it prints out the hash and name of
    each parameter set passed into an experiment and then exits.
  • --print-params debugging flag. When specified, it prints out the full string
    representation of each parameter set passed into an experiment, or, if at
    least the first few characters of a hash are specified, it prints out the
    corresponding parameter set hash from the params_registry.json. Note that
    both this and the --hashes flag are temporary debugging tools until the CLI
    gets broken out into subcommands, where they may become part of a separate
    command.
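
To make the first two items above more concrete, here is a minimal sketch of the two ideas involved: filling a path template from parameter values, and skipping work when the resulting path already exists. It uses only the Python standard library and does not call curifactory. The template syntax shown (`str.format` fields) and the helper names `resolve_path`, `run_or_skip`, and `write_model` are assumptions for illustration, not the library's API.

```python
import os
import tempfile


def resolve_path(template, params):
    """Fill a filename template with parameter values via str.format."""
    return template.format(**params)


def run_or_skip(path, compute):
    """Run compute(path) only when nothing exists at path yet.

    Returns True if compute ran, False if the existing path short-circuited it.
    """
    if os.path.exists(path):
        return False
    compute(path)
    return True


def write_model(path):
    # Stand-in for a stage that writes its own output file.
    with open(path, "w") as f:
        f.write("weights")


def demo():
    params = {"name": "baseline", "layers": 3}
    filename = resolve_path("model_{name}_{layers}.txt", params)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, filename)
        first = run_or_skip(path, write_model)   # file is missing, so it runs
        second = run_or_skip(path, write_model)  # file exists, so it is skipped
    return filename, first, second
```

In this sketch the stage itself owns the writing of the file, and only the path's existence decides whether it runs again, which is the general shape of the short-circuiting described for PathRef.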

Fixed

  • --notebook managers not using modified experiment cache paths.
  • Manager maps are now disabled after a run_experiment call, so managers used in
    live contexts (e.g. notebooks) can continue to run stages after the experiment
    has completed.
  • Experiments generating multiple reports instead of generating one report and
    linking/copying the folders as necessary.

Removed

  • Old ExperimentArgs references and associated deprecation warnings.

v0.16.1

23 Oct 21:05

Fixed

  • Accidental singleton cacher objects in stage decorators causing all DAG-mode
    reproduction artifacts to always show as the artifacts from the first record.

v0.16.0

16 Oct 21:12

Added

  • Optional dependency curifactory[h5] (pytables, for h5 pandas cacher) to setup.
  • Ability to configure whether non-curifactory logs are silenced with
    --all-loggers flag.

Changed

  • Repr for Lazy objects, so OutputSignatureErrors don't just list pointer addresses.
  • Procedures initialized without an artifact manager don't auto-create one.
    Instead, the procedure.run() function now optionally takes a manager and
    records list.

Fixed

  • Lazy instance cached from previous run not displaying correct preview in detailed report map.
  • Experiment run spewing out a command error if running from a non-git repo. (A single-line
    warning is now displayed instead.)
  • Raising InputSignatureError for potentially unrelated TypeErrors raised within stages.
  • Completer parsing for experiments and parameters on MacOS.
  • generate_report() calls inside an experiment run() breaking in map mode.
  • Fallback package report CSS not being used if report path has no style.css.

v0.15.1

09 Aug 19:34

Added

  • Hash dry representation output to params registry, to help debug hashing.

Fixed

  • Spacing issue around parameter set list in generated notebook.
  • Extra metadata not grabbed in save_metadata if metadata had already been collected.

v0.15.0

01 Aug 18:08

The args -> params naming convention change will eventually cause breaking changes (currently, args references should just trigger a deprecation warning). See the migration guide for details on how to remove them: https://ornl.github.io/curifactory/latest/migration.html

Added

  • PandasCacher as a more generalized variant of PandasCsvCacher and
    PandasJsonCacher, supporting much more of the IO types pandas supports.

Changed

  • args.ExperimentArgs to params.ExperimentParameters (former still exists with deprecation warning.)
  • Record.args to Record.params (former still exists with deprecation warning.)
  • Organization in examples directory.

Fixed

  • None extension for cacher not correctly handled in get_path.
  • Generated experiment notebook not referencing the correct cache path for artifacts on store full runs.
  • set_logging_prefix incorrectly handling global logging scope (which can lead
    to recursion errors.)

v0.14.2

24 Jul 15:05

Fixed

  • DAG mapping incorrectly handling stages with missing inputs in state.

v0.14.1

21 Jul 20:07
Compare
Choose a tag to compare

Fixed

  • DAG never adding a stage with no outputs to the execution list.

v0.14.0

19 Jul 15:00

DAG-based execution of stages is finally here!

Note that there are breaking API changes in this release; please see the migration guide.

Added

  • DAG representation of the experiment, which is created and analyzed during the experiment
    mapping phase. The DAG is used to more intelligently determine which stages need to
    execute, based on which outputs are ever actually needed for the final experiment
    outputs (leaf nodes).
  • --map CLI flag, which runs the mapping phase of the experiment and then exits, printing
    out the experiment DAG and showing which artifacts it found in cache and the run name
    that generated them.
  • inputs to aggregate stage decorator. This acts similarly to inputs on a regular
    stage, except these input artifacts are searched for in the list of records the aggregate
    is running across, rather than the aggregate's own record. It is also not a requirement
    that the requested artifact exist in every passed record (though a warning is raised
    for any record where it doesn't exist). Similar to stage, each input needs to have a
    corresponding argument (with the same name as in the string) in the function definition.
    The artifacts for each input will be passed as a dictionary, where the keys are the
    records they come from and the values are the artifacts. Note that while you can technically
    have None as the inputs and still access each record's state, in order for the DAG
    to compute properly, you must specify each needed state artifact in the inputs (or use the
    --no-dag flag listed below).
  • stage_cachers list to record. At the beginning of every stage, this will contain
    references to the initialized cachers for that stage, which can be used to get
    output path information.
  • -n CLI flag shorthand for --names
  • --params CLI flag long form of -p
  • RawJupyterNotebookCacher, which takes a list of cells of raw strings of python code and
    stores them as a notebook. This is useful for exporting an interactive analysis with each
    experiment run.
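
The DAG item above describes deciding which stages to run by working back from the leaf outputs. The following is a simplified, standalone sketch of that general idea, not curifactory's implementation: the `Stage` tuple, the `stages_to_execute` function, and the rule that a stage is skipped only when all of its outputs are cached are assumptions chosen to keep the example small.

```python
from collections import namedtuple

# Hypothetical stage description: a name plus the artifact names it reads and writes.
Stage = namedtuple("Stage", ["name", "inputs", "outputs"])


def stages_to_execute(stages, cached):
    """Return the names of stages that must run, in the order given by `stages`.

    stages: list of Stage, already in a valid execution order.
    cached: set of artifact names that are available in cache.
    """
    producer = {out: s for s in stages for out in s.outputs}
    consumed = {inp for s in stages for inp in s.inputs}
    # Leaf stages: none of their outputs are consumed by another stage.
    leaves = [s for s in stages if not any(o in consumed for o in s.outputs)]

    needed = set()
    pending = list(leaves)
    while pending:
        stage = pending.pop()
        if stage.name in needed:
            continue
        # Skip a stage only if it has outputs and every one of them is cached.
        if stage.outputs and all(o in cached for o in stage.outputs):
            continue
        needed.add(stage.name)
        # Running this stage requires its inputs, so visit their producers too.
        for inp in stage.inputs:
            if inp in producer:
                pending.append(producer[inp])
    return [s.name for s in stages if s.name in needed]


PIPELINE = [
    Stage("load", [], ["data"]),
    Stage("train", ["data"], ["model"]),
    Stage("test", ["model"], ["metrics"]),
    Stage("plot", ["data"], ["figure"]),
]
```

With `cached={"model"}`, `stages_to_execute(PIPELINE, {"model"})` selects `load`, `test`, and `plot`: `train` is skipped because its only output is cached, while `load` still runs because `plot` needs `data`. A plain stage-by-stage check, by contrast, only looks at each stage's own outputs in sequence rather than at what the leaves require.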

Changed

  • --no-map CLI flag to --no-dag, which disables both the mapping phase and the
    DAG analysis/DAG-based execution determination. This returns curifactory to its
    regular stage-by-stage cache short-circuit determination.
    NOTE: if any weird bugs are encountered, or if inputs isn't set on
    aggregate stages, it's advisable to use this flag.
  • --parallel-mode flag to --parallel-safe

Fixed

  • Record copy not also containing a copy of the state artifact representations.
  • Wrong progress bar updating if multiple records/args had the same hash.