LLM inference - Openai Chat Api or Llama compatible chat params #5558

Open
raymon-io opened this issue Aug 4, 2024 · 0 comments
Assignees: kuaashish
Labels: stat:awaiting googler (Waiting for Google Engineer's Response), task:LLM inference (Issues related to MediaPipe LLM Inference Gen AI setup), type:feature (Enhancement in the New Functionality or Request for a New Solution)

Comments

raymon-io commented Aug 4, 2024

Describe the feature and the current behaviour/state

Does the GenAI LLM Inference API support the OpenAI Chat API, or accept parameters similar to those of Llama models? If not, I am requesting an API similar to the OpenAI Chat Completions API.
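For reference, this is roughly the request shape the OpenAI Chat Completions API accepts (a minimal sketch; the field names follow OpenAI's public API, while the model name and message contents are placeholders):

```python
import json

# Minimal sketch of an OpenAI-style chat completion request body.
# Field names follow the public Chat Completions API; the model name
# and prompt are illustrative placeholders.
request_body = {
    "model": "gpt-4o-mini",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "stop": ["\n\n"],   # halt generation when this sequence appears
    "max_tokens": 128,
}

print(json.dumps(request_body, indent=2))
```

Supporting this shape (role-tagged `messages`, `stop`, `max_tokens`) would make the inference API a drop-in target for tooling built around the OpenAI format.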

Please specify the use cases for this feature

For example, having a stop parameter to halt generation during LLM inference would be great. A Llama-compatible chat format would also make it easier to integrate with frameworks such as LangChain.
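To illustrate the requested behaviour, here is a hypothetical sketch of how a stop parameter could be layered on top of a streaming text generator. `fake_stream` stands in for the model and is not a real MediaPipe API; the point is only the truncate-at-first-stop-sequence logic:

```python
def fake_stream():
    """Stand-in for a streaming LLM: yields text pieces as they are generated."""
    for piece in ["Paris", " is", " the", " capital", ".", "\n\n", "More"]:
        yield piece

def generate_with_stop(stream, stop):
    """Concatenate streamed pieces, truncating at the first stop sequence found."""
    text = ""
    for piece in stream:
        text += piece
        for s in stop:
            idx = text.find(s)
            if idx != -1:
                # Stop sequence reached: return everything before it and
                # abandon the rest of the stream.
                return text[:idx]
    return text

result = generate_with_stop(fake_stream(), stop=["\n\n"])
print(result)  # -> "Paris is the capital."
```

A native stop parameter would be preferable to this kind of client-side wrapper, since the runtime could actually halt decoding instead of generating tokens that are then discarded.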

Any other info

There is a mention of Llama in a comment in llm.h, although I am not sure what it refers to.

@raymon-io raymon-io added the type:feature Enhancement in the New Functionality or Request for a New Solution label Aug 4, 2024
@kuaashish kuaashish assigned kuaashish and unassigned ayushgdev Aug 5, 2024
@kuaashish kuaashish added the task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup label Aug 5, 2024
@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label Aug 5, 2024