On some recent Flux experiences #2171

mcabbott · 2023-02-01T15:57:09Z

Apparently @owainkenwayucl was trying out Flux, and @giordano was helping him out.

https://twitter.com/owainkenway/status/1620771011863121921

https://github.com/owainkenwayucl/JuliaML/blob/main/Fashion/simple.jl

Edit: now at https://www.youtube.com/watch?v=Yd1JkPljpbY

It's useful to see what problems newcomers run into. Especially people new to Julia.

Scope issues like global a_sum_ = a_sum_ + 1 are weird. Flux's tutorials tend to define many functions to put everything in local scope... maybe too many... but for this common use, perhaps Flux ought to have an accuracy function instead of every tutorial rolling its own?
Wrong array dimensions give pretty confusing errors. Perhaps Flux layers should catch more of them, instead of waiting for * etc. Some made-up examples (but examples from the wild might be different):

julia> Conv((3,3),3=>4)(rand(10,10,3))  # reasonable to try, maybe it should just work
ERROR: DimensionMismatch: Rank of x and w must match! (3 vs. 4)

julia> Conv((3,3),3=>4)(rand(10,10,2,1))  # error could print out which Conv layer, and size of input, etc.
ERROR: DimensionMismatch: Input channels must match! (2 vs. 3)

julia> Dense(2,3)(rand(4))  # error could be from Dense, there is no matrix called A in user code
ERROR: DimensionMismatch: second dimension of A, 2, does not match length of x, 4

Perhaps we can encourage more use of outputsize for understanding array sizes. There could be some way to get an overview of the whole model, like this: https://flax.readthedocs.io/en/latest/getting_started.html#view-model-layers . The default show doesn't know the input size, at present, so it can't tell you all of this. One idea would be to give Chain a mutable field in which to store the most recent input size?
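To make the Dense example above concrete, here is a minimal plain-Julia sketch of a layer that checks its input size itself. MiniDense is a hypothetical stand-in, not Flux's Dense, and the wording of the message is an assumption; the point is only that the error names the layer and the full input size.

```julia
# MiniDense is a hypothetical stand-in for a dense layer, not Flux's Dense.
struct MiniDense{M<:AbstractMatrix,V<:AbstractVector}
    weight::M
    bias::V
end

# Convenience constructor: random weights, zero bias.
MiniDense(nin::Integer, nout::Integer) =
    MiniDense(randn(Float32, nout, nin), zeros(Float32, nout))

function (l::MiniDense)(x::AbstractVecOrMat)
    nout, nin = size(l.weight)
    if size(x, 1) != nin
        # Name the layer and report the full input size, instead of
        # letting `*` complain about a matrix the user never wrote.
        throw(DimensionMismatch(
            "MiniDense($nin => $nout) expects size(x, 1) == $nin, " *
            "but got an input of size $(size(x))"))
    end
    return l.weight * x .+ l.bias
end

layer = MiniDense(2, 3)
size(layer(rand(Float32, 2, 5)))   # (3, 5)

try
    layer(rand(Float32, 4))        # wrong length: 4 instead of 2
catch err
    showerror(stdout, err)         # prints the friendlier message
    println()
end
```

The check costs one comparison per call, and it fires before any arithmetic, so the stack trace points at the layer rather than at matrix multiplication.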

skyleaworlder · 2023-02-04T18:06:27Z

It seems that storing the input size is necessary. Chain itself is a good place for it, on two counts: clear responsibilities and an uncomplicated design.

In Flax, all operators push their input and output sizes to a list (call_info_stack) held by a global thread-local object (_DynamicContext). When tabulate is called, Flax fetches the last record from call_info_stack and formats its properties. This is another option.

Flax's approach introduces the complexity of thread management, but it makes sampling and profiling work more convenient.
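The recording approach described above can be sketched in plain Julia. This is not Flax's implementation: CallInfo, CALL_INFO_STACK, recorded, and this tabulate are hypothetical names that only mirror the description, a plain global vector stands in for the thread-local object, and the table prints every record rather than only the last one.

```julia
# Hypothetical record of one layer call.
struct CallInfo
    layer::String
    insize::Tuple
    outsize::Tuple
end

# Implicit global state. A plain global vector is used here for brevity;
# it is neither thread-safe nor thread-local.
const CALL_INFO_STACK = CallInfo[]

# Wrap any callable so that each call records its input and output sizes.
function recorded(name::AbstractString, f)
    return function (x)
        y = f(x)
        push!(CALL_INFO_STACK, CallInfo(name, size(x), size(y)))
        return y
    end
end

# Print everything recorded so far, in call order.
function tabulate(io::IO=stdout)
    for info in CALL_INFO_STACK
        println(io, rpad(info.layer, 12), info.insize, " -> ", info.outsize)
    end
end

flatten = recorded("flatten", x -> reshape(x, :, size(x, ndims(x))))
total = recorded("sum", x -> sum(x; dims=1))

total(flatten(ones(Float32, 4, 4, 1, 2)))
tabulate()
```

Note that what tabulate shows depends on which calls happened earlier, which is the kind of implicit state the design has to manage.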

ToucheSir · 2023-02-04T18:20:24Z

I would like to avoid all the implicit state Flax uses if possible. It makes reasoning about errors more difficult and introduces a temporal dimension to input size handling that I'm not sure we want to deal with. If you have layers in pre-shape-inference and post-shape-inference states, what happens when you pass a new input shape at each stage? How do you know which stage a layer or a composite of layers is in? What happens when you compose pre- and post-shape-inference layers? And so on.

mcabbott · 2023-02-04T18:53:45Z

What both show and outputsize presently avoid is any need to understand how Chain/Parallel/PairwiseFusion/etc work. There is no layer API to support them. The ways I thought about writing a tabulate-like function need something like a re-implementation of such layers, in order to simultaneously print their contents and trace sizes into their constituents. Could that be avoided?

One quick idea would be to duplicate Nil as NilPrint, and then have @layer always define something like (l::Dense)(x::AbstractArray{NilPrint}) = (println(l, " ... ", size(x)); @invoke l(x::AbstractArray)). Then some trace(model, size) would print the sizes in execution order, without thinking about Functors/children at all.
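A minimal sketch of that dispatch trick, in plain Julia without Flux. Everything here is an assumption for illustration: NilPrint is a bare marker type (not a copy of Flux's Nil, and it defines no arithmetic), ToyFlatten and ToySwap are toy layers that only rearrange data, and the loop stands in for whatever @layer might generate.

```julia
# NilPrint is a hypothetical marker element type; it carries no value.
struct NilPrint <: Real end

# Two toy layers (not Flux layers) that only rearrange data,
# so they work for any element type.
struct ToyFlatten end
(l::ToyFlatten)(x::AbstractArray) = reshape(x, :, size(x, ndims(x)))

struct ToySwap end
(l::ToySwap)(x::AbstractArray) = permutedims(x, (2, 1))

# A more specific method per layer: print the layer and input size,
# then call the general method via `invoke`.
for T in (ToyFlatten, ToySwap)
    @eval function (l::$T)(x::AbstractArray{NilPrint})
        println(l, " ... ", size(x))
        return invoke(l, Tuple{AbstractArray}, x)
    end
end

# Run the model on an array of markers and report the final size.
trace(model, sz::Tuple) = size(model(fill(NilPrint(), sz)))

# The model is an ordinary closure; `trace` never inspects its structure.
model = x -> ToySwap()(ToyFlatten()(x))
trace(model, (28, 28, 1, 2))   # returns (2, 784)
```

Sizes are printed in execution order as a side effect of dispatch, so no container type needs to be understood or re-implemented. Layers that do arithmetic would additionally need the marker type to support it, as Nil does.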

mcabbott · 2023-04-25T01:31:39Z

Not the same experience, but see the Stack Overflow question below for how weird the model zoo's loss and accuracy accumulation functions look. It would be nice to fix that somehow.

https://stackoverflow.com/questions/75921783/accuracy-and-gradient-update-not-within-the-same-training-loop
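For reference, a shared accuracy helper of the kind proposed earlier in the thread might look roughly like this. It is a plain-Julia sketch, not an existing Flux function: the name accuracy, both signatures, and the convention that labels are 1-based class indices are assumptions. Keeping the counters inside a function also sidesteps the global-scope problem.

```julia
# Hypothetical helper: fraction of columns of `scores` whose largest entry
# sits at the row given by the matching 1-based class index in `labels`.
function accuracy(scores::AbstractMatrix, labels::AbstractVector{<:Integer})
    size(scores, 2) == length(labels) ||
        throw(DimensionMismatch(
            "got $(size(scores, 2)) columns but $(length(labels)) labels"))
    correct = 0                      # local, so no `global` is needed
    for (col, label) in zip(eachcol(scores), labels)
        correct += (argmax(col) == label)
    end
    return correct / length(labels)
end

# Accumulate over batches; `model` is any callable returning a score matrix,
# and `batches` is any iterable of (input, labels) pairs.
function accuracy(model, batches)
    correct, total = 0.0, 0
    for (x, y) in batches
        correct += accuracy(model(x), y) * length(y)
        total += length(y)
    end
    return correct / total
end

scores = [0.9 0.2 0.1;
          0.1 0.8 0.9]
accuracy(scores, [1, 2, 1])   # 2 of the 3 columns match their label
```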

mcabbott mentioned this issue Feb 3, 2023: Add friendly size check #2176 (Merged)

skyleaworlder mentioned this issue Feb 5, 2023: Propose accuracy functions #2181 (Draft, 3 tasks)

mcabbott added the documentation label Feb 16, 2023
