Flux v0.15.0
Highlights
This release includes two breaking changes:
- The recurrent layers have been thoroughly revised. See below and read the documentation for details.
- Flux now defines and exports its own `gradient` function. Consequently, using `gradient` in an unqualified manner (e.g., after `using Flux, Zygote`) could result in an ambiguity error.
The most significant updates and deprecations are as follows:
- Recurrent layers have undergone a complete redesign in PR 2500 (a usage sketch follows this list).
  - `RNNCell`, `LSTMCell`, and `GRUCell` are now exported and provide functionality for single time-step processing: `rnncell(x_t, h_t) -> h_{t+1}`.
  - `RNN`, `LSTM`, and `GRU` no longer store the hidden state internally; it has to be explicitly passed to the layer. Moreover, they now process entire sequences at once, rather than one element at a time: `rnn(x, h) -> h′`.
  - The `Recur` wrapper has been deprecated and removed.
  - The `reset!` function has also been removed; state management is now entirely up to the user.
- The `Flux.Optimise` module has been deprecated in favor of the Optimisers.jl package. Flux now re-exports the optimisers from Optimisers.jl. Most users will be unaffected by this change. The module is still available for now, but will be removed in a future release.
- Most Flux layers will re-use memory via `NNlib.bias_act!`, when possible.
- Further support for Enzyme.jl, via methods of `Flux.gradient(loss, Duplicated(model))`. Flux now owns & exports `gradient` and `withgradient`, but without `Duplicated` this still defaults to calling Zygote.jl.
- `Flux.params` has been deprecated. Use Zygote's explicit differentiation instead, `gradient(m -> loss(m, x, y), model)`, or use `Flux.trainables(model)` to get the trainable parameters (see the gradient sketch after this list).
- Flux now requires Functors.jl v0.5. This new release of Functors assumes all types to be functors by default. Therefore, applying `Flux.@layer` or `Functors.@functor` to a type is no longer strictly necessary for Flux's models. However, it is still recommended to use `@layer Model` for additional functionality like pretty printing.
- `@layer Model` now behaves the same as `@layer :expand Model`, which means that the model is expanded into its sublayers (if there are any) when printed. To force compact printing, use `@layer :noexpand Model`.
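A minimal sketch of the redesigned recurrence API described above. Only the call signatures `rnncell(x_t, h_t) -> h_{t+1}` and `rnn(x, h) -> h′` come from these notes; the sizes, dummy data, and the feature × time × batch layout are illustrative assumptions.

```julia
using Flux

# Illustrative sizes: input features, hidden size, sequence length, batch size.
d_in, d_out, len, batch = 2, 4, 6, 3

# Exported cells process a single time step: rnncell(x_t, h_t) -> h_{t+1}.
cell = RNNCell(d_in => d_out)
x_t = rand(Float32, d_in, batch)      # one time step of (dummy) input
h0  = zeros(Float32, d_out, batch)    # the caller now owns the hidden state
h1  = cell(x_t, h0)
h2  = cell(rand(Float32, d_in, batch), h1)

# RNN/LSTM/GRU process a whole sequence at once, with no internal state
# and no reset!: rnn(x, h) -> h′.
rnn = RNN(d_in => d_out)
x   = rand(Float32, d_in, len, batch)  # feature × time × batch layout (assumed)
h′  = rnn(x, h0)
```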
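A sketch of the explicit-gradient style that replaces `Flux.params`. The model, data, and loss are illustrative; only the `Flux.gradient`, `Flux.trainables`, and `Duplicated` usage mirrors the notes.

```julia
using Flux

# Illustrative model and data.
model = Dense(3 => 2)
x = rand(Float32, 3, 8)
y = rand(Float32, 2, 8)
loss(m, x, y) = Flux.mse(m(x), y)

# Explicit differentiation in place of Flux.params; defaults to Zygote.jl.
grads = Flux.gradient(m -> loss(m, x, y), model)

# A flat list of the trainable parameter arrays, if one is needed.
ps = Flux.trainables(model)

# With Enzyme.jl loaded, wrapping the model selects Enzyme instead of Zygote:
#   using Enzyme
#   grads = Flux.gradient(m -> loss(m, x, y), Duplicated(model))
```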
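And a small sketch of the new `@layer` printing default; the struct and field names here are made up for illustration.

```julia
using Flux

# A tiny container layer (illustrative).
struct Sandwich{A,B}
    inner::A
    outer::B
end
(s::Sandwich)(x) = s.outer(s.inner(x))

Flux.@layer Sandwich              # now equivalent to `Flux.@layer :expand Sandwich`
# Flux.@layer :noexpand Sandwich  # use this instead to force compact printing

Sandwich(Dense(2 => 3, relu), Dense(3 => 1))  # prints expanded, showing the sublayers
```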
Merged pull requests:
- Use `NNlib.bias_act!` (#2327) (@mcabbott)
- Allow `Parallel(+, f)(x, y, z)` to work like broadcasting, and enable `Chain(identity, Parallel(+, f))(x, y, z)` (#2393) (@mcabbott)
- Epsilon change in normalise for stability (#2421) (@billera)
- Add more `Duplicated` methods for Enzyme.jl support (#2471) (@mcabbott)
- Export Optimisers and remove params and Optimise from tests (#2495) (@CarloLucibello)
- RNNs redesign (#2500) (@CarloLucibello)
- Adjust docs & `Flux.@functor` for Functors.jl v0.5, plus misc. depwarns (#2509) (@mcabbott)
- GPU docs (#2510) (@mcabbott)
- CompatHelper: bump compat for Optimisers to 0.4, (keep existing compat) (#2520) (@github-actions[bot])
- Distinct init for kernel and recurrent (#2522) (@MartinuzziFrancesco)
- Functors v0.5 + tighter version bounds (#2525) (@CarloLucibello)
- deprecation of params and Optimise (continued) (#2526) (@CarloLucibello)
- Bump codecov/codecov-action from 4 to 5 (#2527) (@dependabot[bot])
- updates for Functors v0.5 (#2528) (@CarloLucibello)
- fix comment (#2529) (@oscardssmith)
- set expand option as default for `@layer` (#2532) (@CarloLucibello)
- misc stuff for v0.15 release (#2534) (@CarloLucibello)
- Tweak quickstart.md (#2536) (@mcabbott)
- Remove usage of global variables in linear and logistic regression tutorial training functions (#2537) (@christiangnrd)
- Fix linear regression example (#2538) (@christiangnrd)
- Update gpu.md (#2539) (@AdamWysokinski)
Closed issues:
- RNN layer to skip certain time steps (like `Masking` layer in keras) (#644)
- Backprop through time (#648)
- Initial state in RNNs should not be learnable by default (#807)
- Bad recurrent layers training performance (#980)
- flip function assumes the input sequence is a Vector or List, it can be Matrix as well. (#1042)
- Regression in package load time (#1155)
- Recurrent layers can't use Zeros() as bias (#1279)
- Flux.destructure doesn't preserve RNN state (#1329)
- RNN design for efficient CUDNN usage (#1365)
- Strange result with gradient (#1547)
- Call of Flux.stack results in StackOverfloxError for approx. 6000 sequence elements of a model output of a LSTM (#1585)
- Gradient dimension mismatch error when training rnns (#1891)
- Deprecate Flux.Optimisers and implicit parameters in favour of Optimisers.jl and explicit parameters (#1986)
- Pull request #2007 causes Flux.params() calls to not get cached (#2040)
- gradient of `Flux.normalise` return NaN when `std` is zero (#2096)
- explicit differentiation for RNN gives wrong results (#2185)
- Make RNNs blocked (and maybe fixing gradients along the way) (#2258)
- Should everything be a functor by default? (#2269)
- Flux new explicit API does not work but old implicit API works for a simple RNN (#2341)
- Adding Simple Recurrent Unit as a recurrent layer (#2408)
- deprecate Flux.params (#2413)
- Implementation of `AdamW` differs from PyTorch (#2433)
- `gpu` should warn if cuDNN is not installed (#2440)
- device movement behavior inconsistent (#2513)
- mark as public any non-exported but documented interface (#2518)
- broken image in the quickstart (#2530)
- Consider making the `:expand` option the default in `@layer` (#2531)
- `Flux.params` is broken (#2533)