
Enable XLA Compatibility for Pretraining BERT with Keras NLP on TensorFlow GPU #1661

Closed
jacob-talroo opened this issue Jun 5, 2024 · 4 comments


@jacob-talroo

Describe the bug
Training BERT with Keras NLP is significantly slower because keras.layers.Embedding is not XLA-compatible by default on the TensorFlow GPU backend. This is similar to an issue reported against Keras at keras-team/keras#19809.

To Reproduce
You can reproduce this issue by following the steps in this Colab Notebook: Link to Notebook

Expected behavior
I expect BERT training using Keras NLP on TensorFlow GPU with XLA to be optimized for performance, similar to native TensorFlow implementations.

Additional context
The lack of XLA compatibility hurts training speed and efficiency on GPU, which is crucial for training scalability and practical use in production environments.
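For reference, a minimal sketch of how XLA compilation is requested on the TensorFlow backend via jit_compile=True in compile(). The toy model and data below are illustrative stand-ins for the BERT backbone and pretraining data, not the code from the notebook:

```python
import numpy as np
import tensorflow as tf

# Toy model standing in for the BERT backbone; jit_compile=True asks
# Keras to compile the train/predict steps with XLA.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=100, output_dim=8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse", jit_compile=True)

# Illustrative data: batch of 4 sequences of 16 token ids.
x = np.random.randint(0, 100, size=(4, 16))
y = np.random.rand(4, 1).astype("float32")
history = model.fit(x, y, epochs=1, verbose=0)
```

On GPU, it is this XLA-compiled path through the Embedding layer that hits the slowdown described above.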

Would you like to help us fix it?
Yes, I am willing to contribute to resolving this issue by testing and suggesting implementations that ensure XLA compatibility.

@mattdangerw
Member

I think we would ideally want to solve this at the Keras level, not in KerasNLP.

I played around with always using the one hot approach under a distribution strategy. keras-team/keras@master...mattdangerw:embedding-fix

I think this could work, but I am not sure we would want to do it when XLA is off by default on the TF backend. So the first thing might be to look at enabling XLA with tf.distribute.
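To illustrate the one hot approach, a minimal NumPy sketch (sizes arbitrary): an embedding lookup is a gather, whose sparse gradient is, as I understand it, what causes trouble under XLA/distribution in TF; the same lookup can be rewritten as a one-hot matrix multiply, which lowers to a dense matmul that XLA handles well:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 10, 4
table = rng.standard_normal((vocab_size, embed_dim))
ids = np.array([3, 1, 7])

# Default path: a gather (what Embedding does today).
gathered = table[ids]

# One-hot path: the same lookup expressed as a dense matmul.
one_hot = np.eye(vocab_size)[ids]   # shape (3, vocab_size)
matmul = one_hot @ table            # shape (3, embed_dim)

assert np.allclose(gathered, matmul)
```

The trade-off is the extra (batch, vocab_size) one-hot tensor and matmul FLOPs, which is presumably why it would only be enabled under a distribution strategy rather than unconditionally.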

@mattdangerw
Member

Is there a reason training on the JAX backend doesn't work for your use case? It is likely faster, and everything is XLA-compatible since XLA is the only option on JAX.

@jacob-talroo
Author

We have switched to the JAX backend. If there is no desire to close the performance gap between Keras 2 and Keras 3 on the TensorFlow backend, we can close this one out.
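For anyone else landing here, a minimal sketch of the backend switch: Keras 3 reads KERAS_BACKEND at import time, so it must be set before keras is first imported anywhere in the process (the commented imports show the assumed surrounding code):

```python
import os

# Keras 3 selects its backend when keras is first imported, so this
# must run before any `import keras` in the process.
os.environ["KERAS_BACKEND"] = "jax"

# import keras      # now runs on JAX; all ops go through XLA
# import keras_nlp  # KerasNLP follows the same backend
```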


This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
