🧹 Cleanup and fixes for TGI #96

Merged 6 commits on Sep 27, 2024

tengomucho
Collaborator

What does this PR do?

Many changes to prepare a proper TGI image compatible with Jetstream Pytorch.
In particular:

  • Set the torch default dtype to bfloat16 (that's what is mostly used on TPUs anyway).
  • Make warmup prepare the smallest size from the bucket (it was previously skipped).
  • Ignore common warnings in pytest to avoid cluttering test output.
  • Add a Jetstream installation step to the TGI image.
  • Update the TGI image (refer to the commit message for details).
  • Install torch for CPU to avoid installing the default GPU version when possible.

bfloat16 is what should be used on TPUs.
The warmup test also gains timing checks.
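
As a rough illustration of the warmup change, here is a minimal sketch of a warmup loop that covers every bucket size, smallest included, plus a small timing helper of the kind a warmup test could use. All names (`bucket_sizes`, `warmup`, `timed`) and the power-of-two bucketing are assumptions made for the example; they are not taken from the optimum-tpu code.

```python
import time


def bucket_sizes(max_size: int, min_size: int = 1) -> list[int]:
    """Return power-of-two bucket sizes from min_size up to max_size (inclusive).

    Hypothetical bucketing scheme, used only for illustration.
    """
    sizes = []
    size = min_size
    while size < max_size:
        sizes.append(size)
        size *= 2
    sizes.append(max_size)
    return sizes


def warmup(max_size: int, prepare) -> list[int]:
    """Call prepare(size) once per bucket, starting with the smallest one."""
    prepared = []
    for size in bucket_sizes(max_size):
        prepare(size)
        prepared.append(size)
    return prepared


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seen = []
    prepared, elapsed = timed(warmup, 8, seen.append)
    print(f"prepared buckets {prepared} in {elapsed:.6f}s")
```

A test could call `timed` around a request made after warmup and compare the elapsed time against a threshold chosen for the target hardware.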
- TGI version updated from 2.0.3 to
  0ff6ff60ada291840beed63d8bf458d6f9606f7f, which is essentially 2.3.0
  plus a few fixes to get the v2 proto interface working again.
- This update was done because debug logs were otherwise not working.
  That is a problem when we need to debug something in TGI, and so
  far the only solution was a hack that forced debug logging back into
  the server. This is now fixed, and logs work fine with the Jetstream Pt
  generator.
- Obviously there was a drawback 😖 Logs on threads spawned by the
  Pytorch/XLA generator now looked weird and always appeared, even
  when debug was off. This has been fixed, but the workaround is not
  very nice (I set an env var). I think the multithread generator is
  going to go away soon anyway, so this should not be a big deal.
- The new TGI version is built using Python 3.11, while so far
  optimum-tpu has used Python 3.10, because that is the version the
  Pytorch/XLA front page says should be used. So the image has been
  updated with python3.11 support and the required transformers
  installation for now, because it's easier since they run on separate
  processes.
- The TGI build process has changed a bit, so the Dockerfile has been
  changed accordingly. text-generation-router-v2 is renamed to
  text-generation-router because that is what the launcher expects.
Installing torch for CPU is an extra step that makes images leaner, since it
avoids pulling in the GPU dependencies.
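
The PR does not show how the env var workaround for thread logs is implemented. As a hedged sketch of the general idea, a worker thread can pick its log level from an environment variable so that debug output stays off unless explicitly enabled. The variable name `EXAMPLE_DEBUG` and the functions below are hypothetical, not the ones used in optimum-tpu or TGI.

```python
import logging
import os
import threading

# Hypothetical variable name, chosen for this example only.
DEBUG_ENV_VAR = "EXAMPLE_DEBUG"


def level_from_env(environ=None) -> int:
    """Return logging.DEBUG when the env var is set to "1", else logging.INFO."""
    environ = os.environ if environ is None else environ
    return logging.DEBUG if environ.get(DEBUG_ENV_VAR) == "1" else logging.INFO


def worker(results: list, environ=None) -> None:
    """Configure the logger inside the spawned thread, then record its state."""
    logger = logging.getLogger("example.worker")
    logger.setLevel(level_from_env(environ))
    # Whether this line is actually printed also depends on the handlers
    # configured by the application.
    logger.debug("debug message from worker thread")
    results.append(logger.isEnabledFor(logging.DEBUG))


def run_worker(environ=None) -> bool:
    """Run the worker in a thread and report whether debug logging was enabled."""
    results = []
    thread = threading.Thread(target=worker, args=(results, environ))
    thread.start()
    thread.join()
    return results[0]


if __name__ == "__main__":
    print("debug enabled:", run_worker())
```

Passing a plain dict as `environ` keeps the sketch easy to test without touching the real process environment.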
@tengomucho tengomucho changed the title Quick fixes for jetstream 🧹 Cleanup and fixes for TGI Sep 25, 2024
@tengomucho tengomucho marked this pull request as ready for review September 25, 2024 14:18
Member

@mfuntowicz mfuntowicz left a comment

LFG! Let's go!

@@ -8,7 +8,7 @@ RUN tar -C /tgi -xf /tgi/sources.tar.gz --strip-components=1

# Build cargo components (adapted from TGI original Dockerfile)
# Note that the build image is aligned on the same Linux version as the base image (Debian bookworm/ Ubuntu 22.04)
-FROM lukemathwalker/cargo-chef:latest-rust-1.77-bookworm AS chef
+FROM lukemathwalker/cargo-chef:latest-rust-1.79-bookworm AS chef
Member

What about using 1.81 (latest)?

Collaborator Author

Umh I used the latest version that was on the TGI docker image.

COPY --from=tgi /tgi/launcher launcher
-RUN cargo build --release --workspace --exclude benchmark
+RUN cargo build --profile release-opt

# Python base image
FROM ubuntu:22.04 AS base
Member

What about ubuntu:24.04, or does that break stuff?

Collaborator Author

All TPU packages and images are so far tuned for this version. I might update that at some point if I have time, but I don't see it as a priority for now.

@tengomucho tengomucho merged commit f5ad698 into main Sep 27, 2024
3 checks passed
@tengomucho tengomucho deleted the quick-fixes-for-jetstream branch September 27, 2024 08:37