Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
🧹 Cleanup and fixes for TGI (#96)
* feat(Jetstream Pt): set torch default dtype to bfloat16

  This is the dtype that should be used on TPUs.

* fix(warmup): make warmup work for smallest prefill size

  Also, add timing checks in warmup test.

* chore(pytest): ignore common warnings

* chore(docker): add jetstream installation step to TGI image

* feat(tgi): update TGI version

  - TGI version updated from 2.0.3 to 0ff6ff60ada291840beed63d8bf458d6f9606f7f, which is essentially 2.3.0 plus a few fixes to get the v2 proto interface working again.
  - This update was done because debug logs were otherwise not working. That is a problem whenever we need to debug something in TGI, and so far the only solution was a hack that forced the debug logging back into the server. This is now fixed, and logs are fine with the Jetstream Pt generator.
  - Obviously there was a drawback 😖 Logs on threads spawned by the Pytorch/XLA generator now looked wrong and always appeared, even when debug was off. This has been fixed, but the workaround is not very nice (I set an env var). I think the multithread generator is going to go away soon anyway, so this should not be a big deal.
  - The new TGI version is built using Python 3.11, while so far optimum-tpu has used Python 3.10, because that is the version the Pytorch/XLA front page says should be used. So the image has been updated with python3.11 support and the required transformers installation for now, which is the easier option since they run in separate processes.
  - The TGI build process has changed a bit, so the dockerfile has been changed accordingly. The text-generation-router-v2 binary is renamed to text-generation-router because that is what the launcher expects.

* chore(install): install torch for cpu

  This extra step will make images leaner, as it avoids pulling in the GPU dependencies.
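
The commit message says the thread-logging problem was worked around with an environment variable but does not name it or show the code. The following is only a rough sketch of that general idea, not the change made in this commit: the variable name `EXAMPLE_GENERATOR_DEBUG` and all helper names are hypothetical, chosen for illustration.

```python
import logging
import os
import threading

# Hypothetical variable name; the commit does not say which one it uses.
DEBUG_ENV_VAR = "EXAMPLE_GENERATOR_DEBUG"


def level_from_env(environ=os.environ):
    """Return DEBUG only when the (hypothetical) variable is set to "1"."""
    return logging.DEBUG if environ.get(DEBUG_ENV_VAR) == "1" else logging.INFO


def make_worker_logger(name, environ=os.environ):
    """Build a logger whose level is taken from the environment mapping."""
    logger = logging.getLogger(name)
    logger.setLevel(level_from_env(environ))
    return logger


def run_worker(logger, records):
    """Log from a spawned thread and record whether debug output was enabled."""
    def target():
        records.append(logger.isEnabledFor(logging.DEBUG))
        logger.debug("worker started")

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
```

Reading the level from the environment rather than from in-process state is one way to make separately spawned workers agree on verbosity; whether the commit does exactly this is an assumption.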
1 parent 01d3a42, commit f5ad698
Showing 11 changed files with 84 additions and 20 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# This is not a complete list of dependencies, but it allows to install torch without CUDA support | ||
--index-url https://download.pytorch.org/whl/cpu | ||
torch==2.4.0 |
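
To make the effect of the three added lines concrete, here is a minimal sketch that reads the same text and separates the pip option from the pinned package. The function name `parse_requirements` and the constant `EXAMPLE` are hypothetical, and the parser is deliberately simplified: it only handles the `--option value` and `name==version` forms that appear in this diff, not the full requirements-file syntax.

```python
import re

# The three lines added by the diff above.
EXAMPLE = """\
# This is not a complete list of dependencies, but it allows to install torch without CUDA support
--index-url https://download.pytorch.org/whl/cpu
torch==2.4.0
"""

# A comment starts at the beginning of a line or after whitespace.
COMMENT = re.compile(r"(^|\s+)#.*$")


def parse_requirements(text):
    """Split requirements text into pip options and exact version pins."""
    options, pins = {}, {}
    for raw in text.splitlines():
        line = COMMENT.sub("", raw).strip()
        if not line:
            continue  # blank line or comment-only line
        if line.startswith("--"):
            key, _, value = line.partition(" ")
            options[key] = value.strip()
        elif "==" in line:
            name, _, version = line.partition("==")
            pins[name.strip()] = version.strip()
    return options, pins


options, pins = parse_requirements(EXAMPLE)
```

Because `--index-url` points at the PyTorch CPU wheel index, installing from this file with `pip install -r` resolves `torch==2.4.0` from that index, which is how the commit avoids the GPU dependencies.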