tokenizers: support apply_tool_use_template #662

Closed
gary149 opened this issue Mar 24, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

gary149 commented Mar 24, 2024

Feature request

Support apply_tool_use_template, as implemented in transformers: https://github.com/huggingface/transformers/blob/76a33a10923ccc1074917f6b6a1e719e626b7dc9/src/transformers/models/cohere/tokenization_cohere_fast.py#L420

Motivation

This could be useful to support https://huggingface.co/CohereForAI/c4ai-command-r-v01

gary149 added the enhancement (New feature or request) label Mar 24, 2024
xenova (Collaborator) commented Mar 24, 2024

Sure, I can add that! The functionality is actually already supported by the apply_chat_template function (just with different arguments); see #647. For example:

import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained("Xenova/c4ai-command-r-v01-tokenizer")

// Define the conversation input:
const conversation = [
  { role: "user", content: "Whats the biggest penguin in the world?" }
]
// Define tools available for the model to use:
const tools = [
  {
    name: "internet_search",
    description: "Returns a list of relevant document snippets for a textual query retrieved from the internet",
    parameter_definitions: {
      query: {
        description: "Query to search the internet with",
        type: "str",
        required: true
      }
    }
  },
  {
    name: "directly_answer",
    description: "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
    parameter_definitions: {}
  }
]

// Render the tool-use prompt as a string:
const tool_use_prompt = tokenizer.apply_chat_template(
  conversation,
  {
    chat_template: "tool_use",
    tokenize: false,
    add_generation_prompt: true,
    tools,
  }
)
console.log(tool_use_prompt)
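
Until a dedicated method exists, a thin wrapper along these lines might be enough. This is only a sketch: `applyToolUseTemplate` is a hypothetical name, not an existing export of @xenova/transformers, and it assumes the tokenizer's config ships a chat template named "tool_use", as in the example above. The defaults for `tokenize` and `add_generation_prompt` are likewise just the values used in that example.

```javascript
// Hypothetical helper (NOT part of @xenova/transformers): forwards to
// apply_chat_template with the "tool_use" template selected.
// Assumes `tokenizer` exposes apply_chat_template(conversation, options).
function applyToolUseTemplate(tokenizer, conversation, tools, options = {}) {
  return tokenizer.apply_chat_template(conversation, {
    // Defaults taken from the example above; callers can override them.
    tokenize: false,
    add_generation_prompt: true,
    ...options,
    // Set after the spread so the template name and tools always win.
    chat_template: "tool_use",
    tools,
  });
}
```

With the `tokenizer`, `conversation` and `tools` defined earlier, `applyToolUseTemplate(tokenizer, conversation, tools)` would make the same call as the `apply_chat_template` example.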

@gary149 gary149 closed this as completed Mar 29, 2024