Commit
* feat(models): Add models.json blocks for Granite Code 3b and 8b

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat: Initial model params for granite code 3b

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(model config): Fix model configs for Granite Code

  * Use the right tokenizer_file name
  * Use the right transformer_params_key based on the file name in model_params
  * Use the updated name to indicate HF tokenizers

  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat(granite): Add model params for granite-code-8b

  Something isn't quite working with this model yet, but the config should be
  accurate at this point.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(deps): Add tokenizers to the deps explicitly

  It was being pulled in implicitly via lm_eval -> transformers, but it's
  better to have it explicit since we use it directly.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat(tokenizer): Add basic support for jinja2 template rendering for HF
  tokenizers (a minimal rendering sketch follows this list)

  This is a much simplified version of the corresponding logic in
  transformers. I opted for this so that the full transformers dependency is
  not added here.

  CITE: https://github.com/huggingface/transformers/blob/main/src/transformers/tokenization_utils_base.py#L1522

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(chat): Add HFTokenizerChatFormatter and use it for HF tokenizers

  This allows the jinja2 templates for HF tokenizers to be applied without
  hard-coding the formatter logic. It will likely need to be duplicated in
  the embedded code version of chat.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(deps): Add jinja2 as an explicit dep

  It was being pulled in implicitly via flask and lm_eval -> transformers,
  but it's better to have it explicit.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat(log): Add env-based LOG_LEVEL config to CLI (sketch below)

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat(log): Add better logging in model and generate

  In generate, there were a number of commented-out log lines. These are safe
  to leave in as long as lazy string interpolation is used (see the
  lazy-interpolation example below).

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat(generate): Make prepending BOS model-configurable, and disable it for
  Granite Code models (sketch below)

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(chat): Refactor chat template logic to encapsulate all formatting in
  classes (see the formatter class sketch after this list)

  The formatted strings may not be perfectly 1:1 with the previous
  implementation, but they should be in line with the official model
  guidelines:

  * https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
  * https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-2

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(chat): Fix small formatting bugs in llama3 chat formatter

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* test: Add initial unit tests for chat formatters (an illustrative test case
  follows this list)

  There's no formal execution framework for pytest yet, but these were
  helpful in ensuring that the formatting was working correctly! To run them,
  install pytest and run `pytest tests/`.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(logging): Disable logging in generate unless set in the env

  There is an incompatibility between logging and torch._dynamo, so this
  disables logging unless the developer asks for it explicitly.

  NOTE: The TC team has stated that they have holistic logging on the
  roadmap, so this is a short-term solution pending a more robust approach.

  REF: https://github.com/pytorch/torchchat/actions/runs/11963066986/job/33493237302#step:14:3599

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Remove trailing \n from llama3 <|eot_id|>

  The documentation is inconsistent on whether or not there should be a \n
  after <|eot_id|>, but this maintains consistency with previous formatting.

  Branch: GraniteCodeSupport
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Jack-Khuu <jack.khuu.7@gmail.com>
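As a reference for the jinja2 template rendering commit above, here is a minimal sketch of rendering an HF-style chat template without pulling in the full transformers dependency. The `CHAT_TEMPLATE` string and the `render_chat_template` helper are illustrative assumptions, not the actual torchchat API (transformers itself uses a sandboxed jinja2 environment; see the CITE link above).

```python
from jinja2 import Environment, StrictUndefined

# An example template in the style HF tokenizers ship in tokenizer_config.json;
# the template text here is made up for illustration.
CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "<|{{ message['role'] }}|>\n{{ message['content'] }}\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|assistant|>\n{% endif %}"
)

def render_chat_template(template: str, messages: list[dict],
                         add_generation_prompt: bool = True) -> str:
    # StrictUndefined surfaces template/variable mismatches instead of
    # silently rendering empty strings.
    env = Environment(undefined=StrictUndefined)
    return env.from_string(template).render(
        messages=messages, add_generation_prompt=add_generation_prompt
    )

print(render_chat_template(
    CHAT_TEMPLATE,
    [{"role": "user", "content": "Write hello world in Python."}],
))
```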
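The formatter refactor can be pictured roughly as follows. Only the `HFTokenizerChatFormatter` name comes from the commits above; the base class and the `encode_dialog_prompt` interface are assumptions, and the hard-coded strings follow Meta's published llama3 prompt format rather than the exact torchchat implementation.

```python
from abc import ABC, abstractmethod

from jinja2 import Environment, StrictUndefined

class ChatFormatter(ABC):
    """Base interface; the method name is an assumed stand-in."""
    @abstractmethod
    def encode_dialog_prompt(self, dialog: list[dict]) -> str: ...

class Llama3ChatFormatter(ChatFormatter):
    # Follows the meta-llama-3 prompt format linked above.
    def encode_dialog_prompt(self, dialog: list[dict]) -> str:
        prompt = "<|begin_of_text|>"
        for msg in dialog:
            prompt += (
                f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
                f"{msg['content'].strip()}<|eot_id|>"  # no trailing \n, per the fix above
            )
        # Cue the model to respond as the assistant.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
        return prompt

class HFTokenizerChatFormatter(ChatFormatter):
    """Delegates formatting to the tokenizer's jinja2 chat template,
    so no per-model formatter logic needs to be hard-coded."""
    def __init__(self, chat_template: str):
        env = Environment(undefined=StrictUndefined)
        self._template = env.from_string(chat_template)

    def encode_dialog_prompt(self, dialog: list[dict]) -> str:
        return self._template.render(messages=dialog, add_generation_prompt=True)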
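A minimal sketch of the env-based LOG_LEVEL configuration, using only the standard library; the WARNING default and the lack of validation here are assumptions, and the real CLI wiring may differ.

```python
import logging
import os

# Stay quiet (and out of torch._dynamo's way) unless the developer opts in,
# e.g. `LOG_LEVEL=DEBUG python3 torchchat.py generate ...`
logging.basicConfig(level=os.environ.get("LOG_LEVEL", "WARNING").upper())
logger = logging.getLogger(__name__)
```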
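Why the previously commented-out log lines are "safe to leave in" with lazy string interpolation: with %-style arguments, logging defers string formatting until it knows the record will actually be emitted. The variables below are placeholders for illustration.

```python
import logging

logger = logging.getLogger(__name__)
num_tokens, elapsed = 128, 0.42  # placeholder values

# Lazy %-style interpolation: the message string is only built if the DEBUG
# level is enabled, so this line is essentially free when logging is off.
logger.debug("generated %d tokens in %.2fs", num_tokens, elapsed)

# An f-string, by contrast, pays the formatting cost on every call,
# even when the message is discarded:
logger.debug(f"generated {num_tokens} tokens in {elapsed:.2f}s")
```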
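One way the model-configurable BOS prepend could look; the function and attribute names are hypothetical, with the Granite Code behavior shown as the disabled case.

```python
def encode_prompt(tokenizer, text: str, prepend_bos: bool) -> list[int]:
    """Encode `text`, optionally prepending the BOS token.

    `prepend_bos` would come from the model's config; Granite Code
    models would set it to False.
    """
    tokens = tokenizer.encode(text)
    if prepend_bos:
        tokens = [tokenizer.bos_id()] + tokens
    return tokens
```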
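Finally, an illustrative pytest case for the chat formatters, in the spirit of the unit tests added above (not the actual test file); it exercises the `Llama3ChatFormatter` sketch and the trailing-`\n` fix. Run with `pytest tests/` once pytest is installed.

```python
def test_llama3_formatter_single_turn():
    fmt = Llama3ChatFormatter()  # from the formatter sketch above
    out = fmt.encode_dialog_prompt([{"role": "user", "content": "hi"}])
    assert out.startswith("<|begin_of_text|><|start_header_id|>user<|end_header_id|>")
    assert out.endswith("<|start_header_id|>assistant<|end_header_id|>\n\n")
    assert "<|eot_id|>\n" not in out  # no trailing \n after <|eot_id|>
```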