Tokenization endpoint #1649

benniekiss · 2024-01-26T13:06:21Z

Is your feature request related to a problem? Please describe.

Many generative models are limited to a maximum number of tokens. In some workflows, prompts are generated dynamically to use as much of the context as possible, which means tokenizing the responses first to ensure that they will fit in the context.

Currently, this requires a local tokenization scheme, which prevents a workflow that relies entirely on the API.
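To illustrate the workflow described above, here is a minimal sketch of packing text into a token budget. The names `pack_into_context`, `count_tokens`, and `whitespace_count` are hypothetical and not part of any backend; `whitespace_count` is only a stand-in for a real tokenizer, which today would have to run locally.

```python
from typing import Callable, List, Sequence


def pack_into_context(
    pieces: Sequence[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
    reserved_for_reply: int = 0,
) -> List[str]:
    """Keep pieces, in order, until the next one would exceed the budget."""
    # Leave room for the model's answer if the caller asks for it.
    budget = max_tokens - reserved_for_reply
    kept: List[str] = []
    used = 0
    for piece in pieces:
        cost = count_tokens(piece)
        if used + cost > budget:
            break
        kept.append(piece)
        used += cost
    return kept


def whitespace_count(text: str) -> int:
    # Stand-in tokenizer for illustration only (assumption);
    # real models use subword tokenizers and give different counts.
    return len(text.split())


history = ["a b c", "d e", "f g h i"]
print(pack_into_context(history, whitespace_count, max_tokens=6, reserved_for_reply=1))
```

Summing per-piece counts is an approximation: a real tokenizer may produce a slightly different total for the concatenated prompt, so a small safety margin is probably sensible.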

Describe the solution you'd like

Backends like transformers and llama.cpp both offer tokenization methods that tokenize text without generating a response. Attaching these methods to a tokenization API endpoint would help remove local processing requirements.
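As a rough sketch of what a client for such an endpoint might look like, assuming a route of `/v1/tokenize`, a request body with `model` and `content` fields, and a response with a `tokens` list. None of these names are confirmed by this issue; they are placeholders for whatever the endpoint ends up exposing.

```python
import json
import urllib.request
from typing import Callable


def build_tokenize_request(base_url: str, model: str, text: str) -> urllib.request.Request:
    # Route and field names are assumptions, not a confirmed API.
    payload = json.dumps({"model": model, "content": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/tokenize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def remote_token_count(
    base_url: str,
    model: str,
    text: str,
    send: Callable = urllib.request.urlopen,
) -> int:
    """Return the number of tokens the server reports for `text`."""
    request = build_tokenize_request(base_url, model, text)
    # `send` is injectable so the function can be exercised without a server.
    with send(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    # Assumes the response looks like {"tokens": [...]}.
    return len(body["tokens"])
```

With something like this, the token count could come from the same server that runs the model, so the client would not need its own copy of the tokenizer.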

Describe alternatives you've considered

Additional context

mudler · 2024-01-26T15:17:44Z

Good point, it should indeed be relatively easy to expose.

benniekiss added the enhancement label Jan 26, 2024

mudler added the area/api, area/backends, roadmap, and up for grabs labels Jan 26, 2024

mudler mentioned this issue Jan 26, 2024: [EPIC] Model support dashboard (v2) #1126 (Open, 89 tasks)

shraddhazpy mentioned this issue Oct 1, 2024: feat: tokenization endpoint #3710 (Merged, 1 task)
