RFC: What do you think about TRAX? #1478
Hello Lukasz,

Current problem #1 - TF 1.0
Current problem #2 - Estimators
TRAX - What I like
TRAX - What I dislike

I think in the long term T2T will not benefit from multiple branches that need to be maintained separately, especially if one of them still relies heavily on TF 1.0 and graph mode while the other relies on a quite small framework like JAX. Thoughts?
I've used Autograd in the past for some research and found it really inefficient; I ended up walking away from it to use PyTorch, which gave huge speed-ups. As a more concrete example, at one point the backward pass of Autograd consumed 20 GB of memory, which was eventually fixed with a PR. I mention the above to support @f-lng's comment that adding another framework to T2T would make maintaining T2T harder, since bugs stemming from the underlying framework would inevitably arise. OTOH, I think it's important to note that part of T2T's mission is to make deep learning more accessible. I'm not familiar enough with TF 2.0 to know whether migrating to it would help that goal. What are your thoughts on that @lukaszkaiser / @f-lng?
@etragas-fathom I do think TF 2.0 would serve the mission of making it more accessible, because model execution / code flow is simplified with the eager programming style and model creation is simplified with the Keras API. I might be biased here, as I only touch TF when I cannot get around it (e.g. for T2T), and have always found Keras to be a pretty good alternative, especially if you are doing engineering rather than research. I think a proper TF 2.0-based T2T would be perfect, but if TRAX is the way to go, it might be a better approach to simply drop the TF backend altogether and focus on the TRAX version.
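To make the "model creation is simplified with the Keras API" point concrete, here is a minimal sketch of the TF 2.x / Keras style (the layer sizes are arbitrary, just for illustration): the model is defined declaratively and, with eager execution on by default, can be called like a plain function with no session or graph plumbing.

```python
import tensorflow as tf

# A minimal Keras model; in TF 2.x this runs eagerly by default,
# so the model can be called directly on a tensor.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

out = model(tf.ones((2, 8)))  # eager call, no tf.Session needed
print(out.shape)
```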
The problem with TF 2.0, at least for now, is that when you want speed (using @tf.function or the functional Keras mode) you're back in TF 1.0 graph-mode land, with shape bugs, large stack traces and all, and it feels as hard to debug as TF 1.0 or harder. With JAX the speed problem of Autograd is gone (I'm training a Transformer LM as fast as T2T and a ResNet just a little slower). But other bugs may resurface with more use; we'll need to see, I guess. Please keep adding comments so we know what to look out for!
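The speed claim above rests on JAX's core idiom: write a plain Python function and get a compiled gradient by composing `jit` and `grad`. A minimal sketch (the linear-regression loss here is a hypothetical stand-in, not anything from TRAX):

```python
import jax
import jax.numpy as jnp

# A plain Python loss function over a toy linear model.
def loss(w, x, y):
    pred = jnp.dot(x, w)
    return jnp.mean((pred - y) ** 2)

# grad gives the derivative w.r.t. the first argument; jit compiles
# the whole thing to XLA on first call, then reuses the compiled code.
grad_fn = jax.jit(jax.grad(loss))

w = jnp.zeros(3)
x = jnp.ones((4, 3))
y = jnp.ones(4)
g = grad_fn(w, x, y)
print(g)  # gradient of the mean squared error w.r.t. w
```

Unlike `@tf.function`, the function stays debuggable: drop the `jit` wrapper and the same code runs eagerly, step-by-step.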
Well, if you guys say TF 2.0 is not a good fit for T2T (yet?), then I guess TRAX is a good alternative, as the code looks very clean and the rewrite is easy to follow :-) By the way, is there already a beam-search decoder implemented and documented? I would love to give it a try.
Perhaps not a hugely insightful comment, but it's a shame that this makes installing recent versions of t2t non-trivial under Windows (see #1507).
I would like TRAX to be an independent repo/package. I don't want to install TensorFlow if possible, but T2T depends on it.
As of 6c7c601 it seems that TRAX has been moved to its own repo, although it's not clear which one.
Could you tell me the advantage of JAX over PyTorch? Does JAX provide any tooling that makes Transformer and Reformer faster, other than JAX probably having better TPU support?
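One answer often given to this question (independent of TPU support) is JAX's composable function transforms: `grad`, `jit`, and `vmap` compose freely, so things like per-example gradients fall out of a one-liner instead of a Python loop. A toy sketch with a hypothetical scalar function, just to illustrate the composition:

```python
import jax
import jax.numpy as jnp

# A toy scalar function; f'(x) = cos(x) * x + sin(x).
def f(x):
    return jnp.sin(x) * x

# grad(f) is the derivative of f; vmap maps it over a batch,
# vectorizing automatically instead of looping in Python.
batched_grad = jax.vmap(jax.grad(f))

xs = jnp.array([0.0, 1.0, 2.0])
grads = batched_grad(xs)
print(grads)  # derivative of f evaluated at each element of xs
```

PyTorch computes gradients through its autograd tape, but (at the time of this thread) did not expose this kind of transform composition as a first-class API.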
@lukaszkaiser Here is an example of a PyTorch U-Net, which is readable and beautiful. I am no researcher or engineer, I am just nobody, but I am able to build a GAN. I am also able to build a self-supervised learning model to color my favorite Japanese anime pictures without GANs' instability problems. But I can only do it in PyTorch because it is so easy, readable, and beautiful. It is so friendly to open-source people. https://gist.github.com/Hsankesara/e3b064ff47d538052e059084b8d4df9f#file-unet-py
@lukaszkaiser Trax takes an extremely long time to import, which makes it very uncomfortable to debug.
We're thinking how to make the next T2T much better. One thing that came up is using JAX and gin config and we've prototyped TRAX:
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/trax
If you're interested, please take a look. Run one of the examples, try to change something, train on your data, make your own model, tweak things. If you had trouble doing things in T2T before, let us know if that looks like it'd help!
TRAX is very early: it lacks features and has bugs; we know that, and we'll be correcting the small things as we go. But we'd love to hear about higher-level things that may be easier to address at this stage, before the design stabilizes.
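For readers who haven't seen gin config before: it lets you bind function parameters by name in a plain text file instead of threading flags through code. A rough, hypothetical fragment (these binding names are illustrative, not TRAX's actual config keys):

```
# Hypothetical gin bindings: each line sets a keyword argument of a
# @gin.configurable-decorated function or class by name.
train.learning_rate = 0.001
train.batch_size = 64
train.optimizer = @adam
```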