
feat: llama-3.2 on bedrock #10

Merged
merged 1 commit into main on Nov 22, 2024
Conversation

@p0deje (Contributor) commented Nov 21, 2024

This is initial support for the Llama 3.2 90B Vision Instruct model!

For such a big model, it's very hard to make it work locally with all of Alumnium's requirements (tool calling, structured output, multimodal input). For the time being, AWS Bedrock is a provider that has proven to work fine in this initial implementation.

There are a few things to keep in mind in this initial implementation:

  1. Tool-calling types are less strict (e.g., it's common for the model to return a str instead of an int/bool). Pydantic coercion helps with this; see the sketch right after this list.
  2. Vision is disabled for now: when the model is used with both an image and structured output, the latter does not work. This could probably be worked around with custom response parsing, but that is left for the future (maybe AWS will fix it eventually).
  3. Images need to be resized to a maximum of 1120x1120, but this is not implemented yet due to the previous point; see the resizing sketch at the end of this description.
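
To illustrate point 1, here is a minimal sketch of how Pydantic's default (lax) validation coerces the string arguments Llama 3.2 tends to emit into the declared types. The `ClickTool` model and its fields are hypothetical stand-ins, not Alumnium's actual tool schema:

```python
from pydantic import BaseModel

# Hypothetical tool schema, standing in for Alumnium's real tool classes.
class ClickTool(BaseModel):
    element_id: int  # the schema expects an int...
    double: bool     # ...and a bool

# Llama 3.2 commonly returns every argument as a string; Pydantic v2's
# default (lax) mode coerces "42" -> 42 and "true" -> True on validation.
args = ClickTool.model_validate({"element_id": "42", "double": "true"})
print(args)  # element_id=42 double=True
```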

It would be great to use Ollama or Llama.cpp to support true local inference. This commit, however, proves that Alumnium can be used with open models!
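
For whenever point 3 above gets implemented, a minimal sketch of the 1120x1120 cap using Pillow might look like the following (the function name is hypothetical, and the actual screenshot pipeline may differ):

```python
from PIL import Image

MAX_SIDE = 1120  # maximum width/height per the limit noted above

def resize_for_llama(image: Image.Image) -> Image.Image:
    """Downscale so neither side exceeds 1120 px, keeping aspect ratio."""
    resized = image.copy()
    # thumbnail() only ever shrinks, never enlarges, and resizes in place.
    resized.thumbnail((MAX_SIDE, MAX_SIDE), Image.Resampling.LANCZOS)
    return resized
```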

@p0deje marked this pull request as ready for review November 21, 2024 03:26
@p0deje requested a review from sh3pik November 21, 2024 03:27
@sh3pik merged commit ce07ce8 into main Nov 22, 2024
4 checks passed
@sh3pik deleted the llama-32 branch November 22, 2024 01:35