[Neo] Refactor Neo TRT-LLM partition script #2166

Merged: 7 commits into deepjavalibrary:master on Jul 17, 2024

Conversation

ethnzhng (Contributor)

Description

  1. Refactor structure of script to be more in line with sm_neo_neuron_partition.py
  2. Remove code related to deprecated Neo compiler_flags
  3. Move generation of serving.properties file post-compilation from tensorrt_llm_toolkit to here
  4. Add support for HF dataset cache env var
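
For readers unfamiliar with items 3 and 4, here is a minimal sketch of what they could look like, not the code from this PR. The helper names `write_serving_properties` and `resolve_dataset_cache` are hypothetical, the property keys are only illustrative, and the use of `HF_DATASETS_CACHE` (the variable the Hugging Face `datasets` library reads) is an assumption about which env var the script supports.

```python
import os
import tempfile
from pathlib import Path


def write_serving_properties(output_dir, properties):
    """Write key=value pairs to <output_dir>/serving.properties and return the path.

    Hypothetical helper: the real script may build the file differently.
    """
    path = Path(output_dir) / "serving.properties"
    lines = [f"{key}={value}" for key, value in properties.items()]
    path.write_text("\n".join(lines) + "\n")
    return path


def resolve_dataset_cache(env=os.environ, default=None):
    """Return the dataset cache directory from the environment, if set.

    Assumption: the script honors HF_DATASETS_CACHE; the PR text only says
    "HF dataset cache env var" without naming it.
    """
    return env.get("HF_DATASETS_CACHE", default)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as compiled_dir:
        # Illustrative values only; they are not taken from the PR.
        props = {
            "engine": "MPI",
            "option.tensor_parallel_degree": 4,
        }
        props_path = write_serving_properties(compiled_dir, props)
        print(props_path.read_text())
        print("dataset cache:", resolve_dataset_cache(default="(not set)"))
```

Writing the file from the partition script, after compilation finishes, keeps the output directory self-describing without relying on the toolkit to emit it.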

ethnzhng requested review from zachgk, frankfliu and a team as code owners on July 11, 2024 23:27
ethnzhng changed the title from "Refactor Neo TRT-LLM partition script" to "[Neo] Refactor Neo TRT-LLM partition script" on Jul 17, 2024
ethnzhng merged commit 9b72600 into deepjavalibrary:master on Jul 17, 2024
9 checks passed
ethnzhng deleted the neo-trtllm-refactor branch on August 2, 2024 05:17
3 participants