On some recent Flux experiences #2171
Comments
It seems that storing the input size is necessary. Chain itself is a good place for it on both counts: clear responsibilities and uncomplicated design. In Flax, all operators push their input & output sizes to a list. Flax's way introduces the complexity of thread management, but it makes sampling and profiling work more convenient.
I would like to avoid all the implicit state Flax uses if possible. It makes reasoning about errors more difficult and introduces a temporal dimension to input size handling that I'm not sure we want to deal with. If you have layers pre-shape inferred and post-shape inferred, what happens when you pass a new input shape at each stage? How do you know which stage a layer or a composite of layers is in? What happens when you compose pre- and post-shape inference layers? etc.
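To make the stateless alternative concrete, here is a minimal sketch in plain Julia. It is only an illustration: `trace_sizes` is a made-up helper, not a Flux function, and the "layers" are ordinary closures standing in for real ones. Sizes are computed on demand from a sample input and nothing is stored on the model, so there is no question of which stage a layer is in.

```julia
# Hypothetical helper (not part of Flux): collect the array size after each
# layer by running a sample input through, without storing any state.
function trace_sizes(layers, x)
    sizes = Tuple[size(x)]          # first entry is the input size
    for layer in layers
        x = layer(x)
        push!(sizes, size(x))
    end
    return sizes
end

# Stand-ins for real layers: closures over plain weight matrices.
W1 = randn(Float32, 5, 10)
W2 = randn(Float32, 2, 5)
layers = (x -> W1 * x, x -> tanh.(x), x -> W2 * x)

sizes = trace_sizes(layers, zeros(Float32, 10, 3))
```

The cost is that the caller has to supply an input (or at least its size) every time an overview is wanted.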
What both One quick idea would be to duplicate
Not the same experience, but see here for how weird the model zoo's loss & accuracy accumulation functions look. Would be nice to fix that somehow.
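For illustration, a shared helper might look roughly like the sketch below. This is an assumption about what such a function could be, not an existing Flux API: the name `accuracy`, the argument order, and the convention that labels are integer class indices are all made up here. Because the counters live inside a function, no `global` annotations are needed.

```julia
# Hypothetical helper, not an existing Flux function.
# `model` maps an input batch to a classes×batch matrix of scores;
# `data` iterates (x, y) pairs, with `y` a vector of integer class indices.
function accuracy(model, data)
    correct = 0
    total = 0
    for (x, y) in data
        scores = model(x)
        # Predicted class = row index of the largest score in each column.
        predicted = [argmax(col) for col in eachcol(scores)]
        correct += count(predicted .== y)
        total += length(y)
    end
    return correct / total
end

# Toy check with an identity "model" whose inputs are already scores.
toy_model = identity
batch1 = (Float32[1 0; 0 1], [1, 2])   # both columns classified correctly
batch2 = (Float32[1 1; 0 0], [1, 2])   # second column predicted as 1, label is 2
acc = accuracy(toy_model, [batch1, batch2])
```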
Apparently @owainkenwayucl was trying out Flux, and @giordano was helping him out.
https://twitter.com/owainkenway/status/1620771011863121921
https://github.com/owainkenwayucl/JuliaML/blob/main/Fashion/simple.jl
Edit: now at https://www.youtube.com/watch?v=Yd1JkPljpbY
It's useful to see what problems newcomers run into. Especially people new to Julia.
Scope issues like `global a_sum_ = a_sum_ + 1` are weird. Flux's tutorials tend to define many functions to put everything in local scope... maybe too many... but for this common use, perhaps Flux ought to have an `accuracy` function instead of every tutorial rolling its own?
Wrong array dimensions give pretty confusing errors. Perhaps Flux layers should catch more of them, instead of waiting for `*` etc. Some made-up examples (but examples from the wild might be different):
`outputsize` for understanding array sizes. There could be some way to get an overview of the whole model, like this: https://flax.readthedocs.io/en/latest/getting_started.html#view-model-layers . The default `show` doesn't know the input size, at present, so it can't tell you all of this. One idea would be to give Chain a mutable field in which to store the most recent input size?
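A rough sketch of the mutable-field idea, in plain Julia so it runs without Flux. Everything here is hypothetical: `SizedChain`, `sizedchain`, and the `last_input_size` field are illustrative names, not Flux's `Chain` or any planned design. The wrapper records the size of whatever it was last called on, and `show` reports it.

```julia
# Hypothetical wrapper, not Flux's Chain: remembers the size of the latest input.
mutable struct SizedChain{T<:Tuple}
    layers::T
    last_input_size::Union{Nothing, Tuple}
end

# Convenience constructor: no input has been seen yet.
sizedchain(layers...) = SizedChain(layers, nothing)

function (c::SizedChain)(x)
    c.last_input_size = size(x)   # side effect: record the most recent input size
    for layer in c.layers
        x = layer(x)
    end
    return x
end

function Base.show(io::IO, c::SizedChain)
    print(io, "SizedChain with ", length(c.layers), " layers, last input size: ",
          something(c.last_input_size, "unknown"))
end

# Stand-in layers: closures over a plain weight matrix.
W = randn(Float32, 4, 10)
m = sizedchain(x -> W * x, x -> tanh.(x))
before = repr(m)
y = m(zeros(Float32, 10, 3))
after = repr(m)
```

Note that the stored size only reflects the most recent call, so the printed summary changes whenever the model is applied to a differently shaped input.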