Support LLVM Flang + add entropy-maximizing filter to speed convergence

@rouson rouson released this 21 Jul 06:00
· 57 commits to main since this release
7d564b8

This release adds support for a fourth compiler, alongside the GNU, NAG, and Intel compilers, and introduces a feature that speeds convergence. Specifically, this release:

  1. Passes all tests with the LLVM Flang (flang-new) compiler.
  2. Adds new options to the cloud-microphysics/app/train-cloud-microphysics.f90 program:
    • --bins filters the training data to maximize the Shannon entropy by selecting only one data point per bin in a five-dimensional phase space,
    • --report controls the frequency of writing JSON files to reduce file I/O costs.
  3. Eliminates several warning messages from the NAG compiler (nagfor).
  4. Switches a dependency from Sourcery to Julienne to eliminate the requirement for coarray feature support.
  5. Adds the GELU activation function.
  6. Speeds up the calculation of the data needed to construct histograms.
  7. Adds a new cloud-microphysics/train.sh script to manage the training process.
  8. Adds the ability to terminate a training run based on a cost-function tolerance rather than a fixed number of epochs.
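
The idea behind the --bins filter can be sketched as follows: keeping at most one data point per bin of a multidimensional histogram flattens the histogram, which maximizes the Shannon entropy of the retained sample. This is an illustrative Python sketch, not the project's Fortran implementation; the function name, bin counts, and data are assumptions:

```python
# Illustrative sketch of one-point-per-bin selection (assumed names, not the
# actual train-cloud-microphysics implementation).
import random

def entropy_maximizing_filter(points, lows, highs, n_bins):
    """Keep one point per occupied bin of a d-dimensional phase space."""
    chosen = {}
    for p in points:
        # Map each coordinate to a bin index, clamping onto the last bin.
        key = tuple(
            min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
            for x, lo, hi in zip(p, lows, highs)
        )
        chosen.setdefault(key, p)  # retain only the first point seen per bin
    return list(chosen.values())

# Example in a five-dimensional phase space with 2 bins per dimension,
# so at most 2**5 = 32 points survive the filter.
random.seed(0)
data = [tuple(random.random() for _ in range(5)) for _ in range(1000)]
filtered = entropy_maximizing_filter(data, lows=[0.0] * 5, highs=[1.0] * 5, n_bins=2)
```

Because every occupied bin contributes exactly one point, the filtered set approximates a uniform (maximum-entropy) distribution over the occupied bins while shrinking the training set.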
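
For reference, the GELU (Gaussian Error Linear Unit) activation added in this release is defined as GELU(x) = x·Φ(x), where Φ is the standard normal CDF. A minimal Python sketch of the exact erf-based form (illustrative only; the library implements its activations in Fortran):

```python
import math

def gelu(x):
    """Gaussian Error Linear Unit: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# GELU is smooth, near zero for large negative inputs, and near the
# identity for large positive inputs.
print(gelu(0.0))  # 0.0
```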

What's Changed

  • Remove second, unneeded and no longer supported build of gcc by @ktras in #150
  • build: update to sourcery 4.8.1 by @rouson in #151
  • doc(README): add instructions for auto offload by @rouson in #152
  • Work around ifx automatic-offloading bugs by @rouson in #145
  • Add bug workarounds for gfortran-14 associate-stmt bug by @ktras in #155
  • Switching from Sourcery to Julienne by @rouson in #154
  • Update fpm manifest with tag for v1.0 of dependency julienne by @ktras in #157
  • Support compilation with LLVM Flang by @ktras in #159
  • Update cloud-microphysics compiler and dependencies by @rouson in #160
  • Add GELU activation function by @rouson in #161
  • Feature: Faster histogram construction when the number of bins exceeds 80 by @rouson in #162
  • Read & perform inference on networks for which the hidden-layer width varies across layers by @rouson in #166
  • Fix/Feature(JSON): disambiguate tensor_range objects and allow flexible line-positioning of objects by @rouson in #165
  • Feature: Support JSON output for networks with varying-width hidden layers by @rouson in #167
  • Feature: filter training data for maximal information entropy via flat multidimensional output-tensor histograms by @rouson in #169
  • Features: maximize information entropy and variable reporting interval. by @rouson in #170
  • build: add single-file compile script by @rouson in #171
  • Add ubuntu to CI by @ktras in #156
  • Feature: add training script in cloud-microphysics/train.sh by @rouson in #172
  • feat(train.sh): graceful exits by @rouson in #173
  • refac(train): rm redundant array allocations by @rouson in #174
  • feat(cloud-micro): write 1st/last cost, fewer JSON by @rouson in #175
  • feat(train.sh): add outer loop for refinement by @rouson in #176
  • feat(cloud-micro): terminate on cost-tolerance by @rouson in #177
  • Concurrent loop through each mini-batch during training by @rouson in #178
  • test(adam): reset iterations so all tests pass with flang-new by @rouson in #179
  • doc(README): add flags to optimize builds by @rouson in #180
  • fix(single-source): mv script outside fpm's purview by @rouson in #182
  • doc(README): optimize ifx builds by @rouson in #181
  • Eliminate compiler warnings by @rouson in #183
  • fix(single-file-source): respect file extension case by @rouson in #184

Full Changelog: 0.11.1...0.12.0