
ValueError: Missing some input keys: - passed data requires input keys if containing { } #13055

Closed
Kaotic3 opened this issue Nov 8, 2023 · 4 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature · Ɑ: models Related to LLMs or chat model modules

Comments


Kaotic3 commented Nov 8, 2023

System Info

Python 3.11.5
Langchain (pip show) 0.0.327
Windows OS
Visual Studio Code

Who can help?

@eyurtsev

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

I searched and was surprised this has not come up.

I am using LangChain for a RAG workflow. When I send a document whose content contains { }, it throws a missing-key error. The document content is treated the same way as a normal prompt template, where "question {question}" implies an input key of 'question', so every { } in the document is reported as a separate missing key.

For example, my data contains this:

"...1 2 ------------------------------------ {w14 w15 w16se w16cid w16 w16cex w16sdtdh wp14}{DP}{AD}{S::}"

It will say that we are missing numerous keys:

ValueError: Missing some input keys: {'AD', 'w14 w15 w16se w16cid w16 w16cex w16sdtdh wp14', ...}

Now, I can clean the data before sending it (a sketch of what that could look like follows), but I was wondering whether it should behave like this at all, given that the braces arrive inside the document as content rather than as part of the template.
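For illustration, the cleaning step could look roughly like this. escape_braces is a hypothetical helper name, and the sketch assumes the document text will later end up inside a template that goes through exactly one str.format pass:

def escape_braces(text: str) -> str:
    # Double every brace so str.format treats it as a literal character.
    return text.replace("{", "{{").replace("}", "}}")

cleaned = escape_braces("...1 2 --- {w14 w15 w16se}{DP}{AD}{S::}")
# cleaned == "...1 2 --- {{w14 w15 w16se}}{{DP}}{{AD}}{{S::}}"
# After one formatting pass this renders back to the original single braces.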

I use the "FewShotPromptTemplate" to create a prompt which includes a "Suffix" and my suffix is:

def get_suffix():
    return """
    Document: {content}
    Question: {question}
    """

Here content is the content of the document that contains the { } set out above.

I build the prompt like this:

prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=get_prompt_template(example_template, example_variables),
    prefix=prefix,
    suffix=suffix,
    input_variables=input_variables,
)
prompt = prompt_template.format(question=question, context=context)
return prompt

I also did a test using another piece of code:

document_context = text_response + "{AD}"
prompt = ChatPromptTemplate.from_template("my_specific_prompt: {document}.\n{format_instructions}")
formatted_prompt = prompt.format(**{"document": document_context, "format_instructions": output_parser.get_format_instructions()})

This introduces a stray {AD} into the text response. It did not fail: it messed up the results, but it did not cause any missing input key errors.

So this may be limited to the FewShotPromptTemplate?
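For reference, this matches how Python's plain str.format behaves: values substituted into a placeholder are not re-scanned for further placeholders, so braces inside a value are harmless, while braces in the template string itself are parsed as placeholders.

template = "Document: {document}"
# Braces inside the *value* are left alone; the output keeps "{AD}" literally.
print(template.format(document="text with {AD} inside"))

# Braces inside the *template* are parsed as placeholders:
# "Document: text with {AD} inside".format()  # raises KeyError: 'AD'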

Expected behavior

I would have expected anything passed in via a placeholder to be treated as plain text: not parsed for further keys embedded inside it, and not raising an error when those keys cannot be found.

Maybe I am wrong, but that is what I would have expected, and it is what appears to happen when using ChatPromptTemplate.from_template.

dosubot bot commented Nov 8, 2023

🤖

Hello @Kaotic3,

I'm here to help! I'm an AI bot dedicated to assisting with issues related to LangChain. I can help you debug issues, answer your questions, and guide you on how to become a better contributor. While we wait for a human maintainer, let's see if we can tackle this issue together. I'm analyzing your problem now and will get back to you with a more detailed response soon.

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Nov 8, 2023

dosubot bot commented Nov 8, 2023

🤖

Based on the information you've provided and the context of similar issues in the LangChain repository, it seems that the issue you're experiencing is related to how Python's built-in string formatting mechanism works. When you use curly braces {} in a string that you're formatting with the format method, Python interprets anything inside the braces as a variable to be replaced. If a key in the template is not found in the keyword arguments, a KeyError is raised.

In your case, the document content you're passing to the FewShotPromptTemplate contains curly braces, which Python is interpreting as variables. Since these variables are not defined in your code, you're seeing a ValueError: Missing some input keys.

There are a few potential solutions to this issue:

  1. Escape the curly braces in your document content: You can escape the curly braces in your document content by doubling them. Here's how you can modify your code:
document_content = "...1 2 ------------------------------------ {{w14 w15 w16se w16cid w16 w16cex w16sdtdh wp14}}{{DP}}{{AD}}{{S::}}"

By doubling the curly braces, you're instructing Python to treat them as literal characters in the string, rather than as placeholders for variables.

  2. Use a different template format: The FewShotPromptTemplate class includes a template_format parameter that specifies the format of the prompt template. The options are 'f-string' and 'jinja2', and the choice determines how curly brackets are treated. With 'f-string' (the default), single curly braces mark variables, so any { } in the text is parsed as a placeholder. With 'jinja2', variables are written with double braces such as {{ content }}, so single braces in the document content are left alone. You might consider using 'jinja2' as your template format (see the note after this list about rewriting your own placeholders in jinja2 syntax).
prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=get_prompt_template(example_template, example_variables),
    prefix=prefix,
    suffix=suffix,
    input_variables=input_variables,
    template_format='jinja2'
)
  3. Upgrade to the latest version of LangChain: The latest version of LangChain is v0.0.331rc2, and you're currently using v0.0.327. It's possible that the issue you're experiencing has been fixed in a more recent version of LangChain.
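One thing to keep in mind with option 2: when template_format='jinja2', the placeholders in your own prefix and suffix must also use jinja2's double-brace syntax. The suffix from the original post would be rewritten roughly as:

def get_suffix():
    return """
    Document: {{ content }}
    Question: {{ question }}
    """

Single braces in the document content are then left untouched by the jinja2 renderer.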

I hope this helps! If you have any further questions or if the issue persists, please let me know.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.


Kaotic3 commented Nov 8, 2023

Thanks, jinja2 worked perfectly.

@Kaotic3 Kaotic3 closed this as completed Nov 8, 2023
@Kaotic3 Kaotic3 reopened this Nov 8, 2023
@Kaotic3 Kaotic3 closed this as completed Nov 8, 2023
@humphreyde


I got the same problem: there are many "{x}..{xx}" patterns in my system prompt. How can I handle this?
