Replies: 2 comments 1 reply
-
Hey @dwstevens! I'm Dosu, a bot here to help you out with bugs, questions, and more. Let's tackle this issue together while we wait for a human maintainer. I found a similar discussion regarding token cost counting when using Langchain's LLM cache. In that case, it was noted that the Langfuse UI might show costs and tokens for both cached and non-cached calls, which could be a discrepancy in how Langfuse interprets or displays cached results [1]. Additionally, there is an open issue tracking this behavior, suggesting that the current implementation might not accurately reflect cost savings from prompt caching [2]. If you're experiencing similar issues, it might be related to this known discrepancy.
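If you want to see this for yourself, here is a minimal sketch of the setup being discussed, assuming the `langchain`, `langchain-community`, `langchain-openai`, and `langfuse` packages plus `OPENAI_API_KEY` and `LANGFUSE_*` credentials in the environment; the model name and prompt below are arbitrary placeholders, not anything specific from this thread:

```python
# Minimal sketch: run the same prompt twice with Langchain's LLM cache
# enabled and the Langfuse callback attached, then compare the two
# generations in the Langfuse UI to see how the cached call is costed.
from langchain.globals import set_llm_cache
from langchain_community.cache import InMemoryCache
from langchain_openai import ChatOpenAI
from langfuse.callback import CallbackHandler  # import path may differ by Langfuse SDK version

set_llm_cache(InMemoryCache())  # enable Langchain's in-memory LLM cache

handler = CallbackHandler()  # picks up LANGFUSE_* env vars for credentials
llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model

# The first call hits the provider; the second identical call is served
# from the cache. Per the discussion above, the cached call may still be
# shown with the same token counts and cost in the Langfuse UI.
for _ in range(2):
    llm.invoke("What is prompt caching?", config={"callbacks": [handler]})
```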
-
This is a very near-term product update that's coming, thanks for sharing that this is important to you!
-
Wondering if Langfuse can accurately calculate the cost when prompt caching is used.