-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kendra context summary encoding/decoding issues. #700
Comments
Hi @cboin1996 |
Yea it does look to be at the Kendra level, but I still wonder if it could be addressed in the bot fulfillment lambda. One potential idea I had, is to force specific encoding's within the bot fulfillment lambda.
ex:
in the bot-fulfillment lambda, do something like '•'.encode('cp1252').decode('utf-8') but instead of just When I run the above code I get the correct character - The caveat is this only would work for a single document encoding type.. so maybe I need to figure out if I can specify document encodings within Kendra so that when the fulfillment lambda call's kendra the document is forced to be encoded/decoded with the right encoding that matches the document. |
Hi @cboin1996, have you tried implementing this logic in post-processing lambda hook which could improve this response prior to passing it off to the client interface? With post-processing lambda hooks, you will be able to customize output from QnABot as per your needs. |
Hi @cboin1996 - I would suggest add that to post processing lambda hook and reference that lambda in the content designer under the field "LAMBDA_POSTPROCESS_HOOK" - you can catch all special characters as needed. Please close this issue if that resolves it. Thank you. |
Closing this issue, thank you. |
Describe the bug
I have a QnaBot that has a kendra index loading CSV's and PDF documents from an S3 bucket.
The bot is integrated with bedrock via the LLM plugin. The summarized responses include the
Context
dropdown. When I click theContext
drop-down, it seems that•
's are appearing in place of the proper characters.I quickly read about encoding's here: https://stackoverflow.com/questions/2477452/%C3%A2%E2%82%AC-showing-on-page-instead-of
When checking the
FulfillmentLambda
logs, it appearsKendra
is in fact returning it's queries with thesecharacters, so it gets passed along through bot fulfilliment, to the llm lambda, and then back to the client interface.
To Reproduce
Create a Kendra index, upload a PDF containing
&
or’
and perhaps the issue will re-produce.Expected behavior
I wonder if it is possible to intercept these symbols, or re-encode the queries produced by kendra in bot-fulfillment with UTF-8 before handing it off to the llm lambdas.
Please complete the following information about the solution:
These changes were merged via a yaml merge.
Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: