Recurrent difficulties #12

Open
DrChainsaw opened this issue Feb 13, 2020 · 0 comments

@DrChainsaw

Flux feeds recurrent layers a sequence (one array per time step), while ONNX wants a single 3D input.

The current workaround is to:

  1. Specify inputs to recurrent layers as 3D
  2. Reshape/flatten the time dimension into the batch dimension when a Dense layer receives 3D input
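To make workaround 2 concrete, here is a rough sketch of the shape bookkeeping. It is plain Python with nested lists standing in for arrays, not Flux or ONNX code. The names `dense` and `dense_over_time` are hypothetical, and the `[seq_length][batch][features]` layout is an assumption chosen to mirror the default ONNX recurrent input layout.

```python
# Illustrative sketch only: nested lists stand in for arrays.
# Assumed layout: x[t][b] is the feature vector at time step t for batch item b.

def dense(rows, weight, bias):
    """Apply y = W x + b to every row; weight has one row per output feature."""
    return [
        [sum(w * v for w, v in zip(w_row, row)) + b
         for w_row, b in zip(weight, bias)]
        for row in rows
    ]

def dense_over_time(x, weight, bias):
    """Fold the time axis into the batch axis, apply dense once, unfold again."""
    seq_len = len(x)
    batch = len(x[0])
    # (seq_len, batch, features) -> (seq_len * batch, features)
    flat = [row for step in x for row in step]
    out = dense(flat, weight, bias)
    # (seq_len * batch, out_features) -> (seq_len, batch, out_features)
    return [out[t * batch:(t + 1) * batch] for t in range(seq_len)]

# 3 time steps, batch of 2, 2 features in, 3 features out.
x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
weight = [[1, 0], [0, 1], [1, 1]]
bias = [0, 0, 1]

flattened = dense_over_time(x, weight, bias)
per_step = [dense(step, weight, bias) for step in x]
print(flattened == per_step)
```

Because a dense layer treats every row independently, flattening gives the same numbers as applying it step by step. The awkward part is elsewhere: something has to decide where the reshapes go and which ones to ignore.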

Workaround 1 is not so good, since the general API is to specify input shapes in the Flux format.
Workaround 2 is ok-ish, but it is not really an honest representation of what is going on in Flux, and it creates a fair bit of look-ahead-a-few-layers-and-figure-out-if-a-few-ops-should-be-ignored wonderfulness.

Also, Dense -> Recurrent does not work iirc.

Perhaps something can be done with sequences, although I dread trying to figure out during deserialization whether something is wrapped in a for loop over the elements of a sequence, or whatever else one needs to do to feed a sequence into GEMM.

Perhaps the easiest way out is to just give in and use a 3D->sequence wrapper around recurrent layers, but I can't imagine this being good for performance.
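A minimal sketch of what such a wrapper could look like, again in plain Python rather than Flux. `ToyRecurrent` and `SequenceWrapper` are hypothetical names, the toy cell just keeps a running sum per batch item, and the `[seq_length][batch][features]` layout is an assumption.

```python
# Illustrative sketch only: a toy stateful layer plus a 3D -> sequence wrapper.

class ToyRecurrent:
    """Stand-in for a stateful recurrent layer that sees one time step per call."""

    def __init__(self):
        self.state = None

    def __call__(self, step):
        # step has shape (batch, features); keep one scalar state per batch item.
        if self.state is None:
            self.state = [0.0] * len(step)
        self.state = [h + sum(row) for h, row in zip(self.state, step)]
        return [[h] for h in self.state]  # shape (batch, 1)


class SequenceWrapper:
    """Accepts a 3D input, feeds it to the wrapped layer step by step, restacks."""

    def __init__(self, layer):
        self.layer = layer

    def __call__(self, x):
        # x has shape (seq_len, batch, features); the loop is the "sequence" part.
        return [self.layer(step) for step in x]


wrapped = SequenceWrapper(ToyRecurrent())
x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # 2 time steps, batch of 2, 2 features
y = wrapped(x)
print(y)  # [[[3.0], [7.0]], [[14.0], [22.0]]]
```

The wrapper keeps the graph honest about 3D inputs and outputs, at the price of splitting and restacking on every call. That is presumably where the performance worry comes from.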
