Integrations with observability platforms like LangSmith have been great for tracing calls to OpenAI. However, when a cache is specified, the callbacks (`handleLLMStart`, `handleLLMEnd`, and `handleLLMError`) are never invoked, unlike in the uncached case.

I presume this is because the callbacks don't currently handle cached responses, so there would be no way to tell whether tokens were actually consumed. That said, I think it would be worthwhile to add this, even if initially the callback metadata just indicates that the response came from the cache. I would very much like to see the generated messages in LangSmith, not just the calls to the tools.
Uncached (screenshot)
Cached (screenshot)
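For reference, here is a minimal sketch of the kind of setup where the difference shows up. It assumes the current package layout (`@langchain/openai` and `@langchain/core`; older releases exported the same classes from the monolithic `langchain` package), and the `LoggingHandler` class and the prompt text are just illustrative:

```ts
import { ChatOpenAI } from "@langchain/openai";
import { BaseCallbackHandler } from "@langchain/core/callbacks/base";

// Bare-bones handler that only reports which lifecycle hooks actually fire.
class LoggingHandler extends BaseCallbackHandler {
  name = "logging-handler";

  async handleLLMStart() {
    console.log("handleLLMStart fired");
  }

  async handleLLMEnd() {
    console.log("handleLLMEnd fired");
  }

  async handleLLMError(err: Error) {
    console.log("handleLLMError fired:", err.message);
  }
}

const model = new ChatOpenAI({
  cache: true, // default in-memory cache
  callbacks: [new LoggingHandler()],
});

async function main() {
  // First call is a cache miss: the OpenAI API is hit and the hooks fire.
  await model.invoke("Hello!");

  // An identical second call is served from the cache; with the behaviour
  // described above, none of the hooks fire, so nothing reaches LangSmith.
  await model.invoke("Hello!");
}

main();
```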