feat: Expand LLM support with PromptModel, PromptNode, and PromptTemplate #3667
Conversation
@julian-risch @sjrl @ZanSara @mathislucka @agnieszka-m I think this one is ready to go. I addressed Aga's and Julian's comments - excellent feedback. Aside from trivial package name changes, I also added two more default prompt templates.
@bogdankostic, how can we ensure that OPENAI_API_KEY gets injected even for PRs on forks? I've tested locally on my machine with OPENAI_API_KEY, but on CI it is not injected, and I suspect it is because this PR is on my fork.
Looks superb to me! Let's get this going. 👍
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
# attempt to resolve args first
if args:
    if len(args) != len(self.prompt_params):
        logger.warning(
I think we should continue to avoid f-strings in our logging statements, since they can unintentionally affect performance, as found in PR #3212. Instead, something like
logger.warning("For %s, expected %s arguments, ...", self.name, self.prompt_params)
would be preferred.
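A minimal sketch of the difference, using a hypothetical helper name (not the actual PR code): with %-style arguments, the message is interpolated only when the record is actually emitted, whereas an f-string is built eagerly even if the log level filters the record out.

```python
import logging

logger = logging.getLogger("prompt_node_example")
logger.setLevel(logging.WARNING)

# Hypothetical helper illustrating lazy %-formatting in logging calls.
# The format string and its args are passed separately; interpolation is
# deferred until a handler actually emits the record.
def warn_arg_mismatch(log, name, prompt_params, args):
    log.warning("For %s, expected %s arguments, got %s", name, prompt_params, len(args))
```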
),
PromptTemplate(
    name="question-answering-check",
    prompt_text="Does the following context contain the answer to the question. "
Should we add a question mark at the end of "Does the following context contain the answer to the question." so it is grammatically correct?
),
PromptTemplate(
    name="multiple-choice-question-answering",
    prompt_text="Question:$questions ; Choose the most suitable option to answer the above question. "
Consider adding a space after the colon (and moving the space before the semicolon) to follow the format of the other prompts:
prompt_text="Question:$questions ; Choose the most suitable option to answer the above question. "
prompt_text="Question: $questions; Choose the most suitable option to answer the above question. "
PromptTemplate(
    name="topic-classification",
    prompt_text="Categories: $options; What category best describes: $documents; Answer:",
    prompt_params=["documents", "options"],
Should we use categories as the variable name to be consistent with how the prompt is structured? Also, should prompt_params be updated to prompt_params=["options", "documents"], since options comes before documents in the prompt?
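Since the default templates use $-placeholders, the stdlib string.Template can illustrate the suggestion: substitution is by name, so the reordering is a readability convention rather than a functional change. The example values below are made up for illustration.

```python
from string import Template

# Suggested (hypothetical) convention: prompt_params lists the placeholders
# in the order they appear in the prompt text.
prompt_text = "Categories: $options; What category best describes: $documents; Answer:"
prompt_params = ["options", "documents"]

# $-placeholders are resolved by name, so listing order does not affect output.
filled = Template(prompt_text).substitute(
    options="sports, politics", documents="The match ended 2-1."
)
```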
PromptTemplate(
    name="language-detection",
    prompt_text="Detect the language in the following context and answer with the "
    "name of the language. Context: $documents; Answer:",
Are the prompt_params fields missing for this template? Or do they not need to be specified?
PromptTemplate(
    name="translation",
    prompt_text="Translate the following context to $target_language. Context: $documents; Translation:",
),
Same for this PromptTemplate. Should we specify the prompt_params?
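If prompt_params is allowed to be omitted, one plausible behavior is to infer the parameters from the $-placeholders in the prompt text. This is a hypothetical sketch of that idea, not necessarily what PromptTemplate actually does in the PR:

```python
import re

def infer_prompt_params(prompt_text: str) -> list:
    """Hypothetical helper: return $-placeholder names in order of appearance."""
    return re.findall(r"\$(\w+)", prompt_text)

params = infer_prompt_params(
    "Translate the following context to $target_language. Context: $documents; Translation:"
)
```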
def __init__(
    self,
    model_name_or_path: str = "google/flan-t5-base",
    max_length: Optional[int] = 100,
If the max_length value is optional, I think the default value should be None. What do you think? If we need the default value to be set, then I would recommend removing the Optional type.
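The two alternatives can be sketched side by side (function names are illustrative, not the actual Haystack signatures):

```python
from typing import Optional

# Alternative A (hypothetical): the default is always an int, so Optional is dropped.
def build_a(max_length: int = 100) -> int:
    return max_length

# Alternative B (hypothetical): None is a valid sentinel, resolved to a fallback internally.
def build_b(max_length: Optional[int] = None) -> int:
    return 100 if max_length is None else max_length
```

Either way, the annotation and the default agree, which is the reviewer's point: Optional[int] = 100 mixes the two conventions.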
model_name_or_path: str = "google/flan-t5-base",
max_length: Optional[int] = 100,
use_auth_token: Optional[Union[str, bool]] = None,
use_gpu: Optional[bool] = True,
I would remove the Optional type here, since initialize_device_settings requires use_gpu to be a bool, and the Optional type suggests that None would be a valid value for this variable.
    model_kwargs=model_input_kwargs,
)

def invoke(self, *args, **kwargs):
It doesn't seem like *args is ever used, and PromptModel.invoke() only passes through **kwargs. If it is not needed or used, I think we should consider removing it and making the new signature def invoke(self, **kwargs).
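A minimal sketch of the narrower signature; the class and keyword names below are illustrative only, not the PR's actual implementation:

```python
# Hypothetical invocation layer with the suggested keyword-only signature.
class InvocationLayerSketch:
    def invoke(self, **kwargs):
        # Only keyword arguments are passed through, so *args can be dropped
        # without losing any functionality.
        return kwargs

layer = InvocationLayerSketch()
result = layer.invoke(prompt="What is 2 + 2?", stop_words=["\n"])
```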
could be even remote, for example, a call to a remote API endpoint.
"""

def __init__(self, model_name_or_path: str, max_length: Optional[int] = 100, **kwargs):
If the max_length value is optional, I think the default value should be None. What do you think? If we need the default value to be set, then I would recommend removing the Optional type.
""" | ||
|
||
def __init__( | ||
self, api_key: str, model_name_or_path: str = "text-davinci-003", max_length: Optional[int] = 100, **kwargs |
If the max_length value is optional, I think the default value should be None. What do you think? If we need the default value to be set, then I would recommend removing the Optional type.
def __init__(
    self,
    model_name_or_path: str = "google/flan-t5-base",
    max_length: Optional[int] = 100,
If the max_length value is optional, I think the default value should be None. What do you think? If we need the default value to be set, then I would recommend removing the Optional type.
model_name_or_path: Union[str, PromptModel] = "google/flan-t5-base",
default_prompt_template: Optional[Union[str, PromptTemplate]] = None,
output_variable: Optional[str] = None,
max_length: Optional[int] = 100,
If the max_length value is optional, I think the default value should be None. What do you think? If we need the default value to be set, then I would recommend removing the Optional type.
…late (#3667) Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Related Issues
Proposed Changes:
Adds PromptModel, PromptNode, and PromptTemplate components
Adds support to prompt the LLM model directly via PromptNode
Adds support to use PromptNode in Haystack pipelines
Adds support for T5-Flan model invocation
Adds support for OpenAI InstructGPT models
See #3665 and #3306 for more details.
How did you test it?
Unit tests and Google Colab notebook
Notes for the reviewer
Component names are finalized. More details in #3665
Default supported tasks to be expanded (see Appendix F, figure 17 of https://arxiv.org/abs/2210.11416)
Let's adhere thoroughly to #3665 and ensure community requests are honoured (see Discord discussions).
Checklist