Support returning logprobs in Predictor #1895

Merged: 4 commits into stanfordnlp:main on Dec 10, 2024

Conversation

@veronicalyu320 (Contributor) commented on Dec 6, 2024

This PR allows Predict to return the logprobs of each token as part of the Prediction.

Example usage

If you want logprobs:
Set logprobs=True and optionally top_logprobs (an int between 0 and 20; see the OpenAI docs for details):

import dspy
from dspy import Predict

predict_instance = Predict("question -> answer")
lm = dspy.LM("gpt-4o-mini", logprobs=True)
dspy.configure(lm=lm)
result = predict_instance(question="Where is the Eiffel Tower located?")
print(result)

Output:

Prediction(
    answer='The Eiffel Tower is located in Paris, France, on the Champ de Mars near the Seine River.',
    logprobs={'content': [{'token': '[[', 'bytes': [91, 91], 'logprob': -2.1008714e-06, 'top_logprobs': []}, {'token': ' ##', 'bytes': [32, 35, 35], 'logprob': -1.9361265e-07, 'top_logprobs': []}, {'token': ' answer', 'bytes': [32, 97, 110, 115, 119, 101, 114], 'logprob': 0.0, 'top_logprobs': []}, {'token': ' ##', 'bytes': [32, 35, 35], 'logprob': -8.180258e-06, 'top_logprobs': []}, {'token': ' ]]\n', 'bytes': [32, 93, 93, 10], 'logprob': -0.00010926496, 'top_logprobs': []}, {'token': 'The', 'bytes': [84, 104, 101], 'logprob': 0.0, 'top_logprobs': []}, {'token': ' Eiffel', 'bytes': [32, 69, 105, 102, 102, 101, 108], 'logprob': 0.0, 'top_logprobs': []}, ...], 'refusal': None}
)
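
To inspect individual tokens, read the logprobs field of the Prediction. A minimal sketch, assuming the field is attribute-accessible like answer and follows the OpenAI-style structure shown in the output above:

import math

# Each entry mirrors the OpenAI logprobs format: token text, raw bytes,
# logprob, and (if requested) top_logprobs alternatives.
for entry in result.logprobs["content"]:
    prob = math.exp(entry["logprob"])  # convert log probability to probability
    print(f"{entry['token']!r}: logprob={entry['logprob']:.4f}, prob={prob:.4f}")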

If you don't want logprobs:
The usage and behavior are the same as before:

... # everything else stays the same
lm = dspy.LM("gpt-4o-mini")
...

Output:

Prediction(
    answer='The Eiffel Tower is located in Paris, France, on the Champ de Mars near the Seine River.'
) 

Caveat

If an LM (e.g. o1-mini) doesn't take logprobs as an argument and you still set dspy.LM(..., logprobs=True), you will get an error like openai does not support parameters: {'logprobs': True}, for model=o1-mini. Please check the corresponding LM documentation before setting this parameter.
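
One defensive pattern is to only pass logprobs=True for models you know accept the parameter. A rough sketch; the allow-list below is illustrative, not an official list:

# Hypothetical allow-list of models that accept the logprobs parameter.
MODELS_WITH_LOGPROBS = {"gpt-4o-mini", "gpt-4o"}

model_name = "gpt-4o-mini"
lm_kwargs = {"logprobs": True} if model_name in MODELS_WITH_LOGPROBS else {}
lm = dspy.LM(model_name, **lm_kwargs)
dspy.configure(lm=lm)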

@veronicalyu320 marked this pull request as ready for review on December 6, 2024 at 01:19
@veronicalyu320 changed the title from "[WIP] Support returning logprobs in Predictor" to "Support returning logprobs in Predictor" on Dec 6, 2024
else:
    outputs = [
        {
            "text": c.message.content if hasattr(c, "message") else c["text"],
Collaborator:

This appears to change the return type of __call__ even if the user doesn't request logprobs? We normally return a list of strings. This seems to return a list of dicts even if logprobs=False.

Contributor Author (@veronicalyu320):

SG, updated.
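
For context, the resolution (per the later commit "return outputs as a list of strings if user doesn't set logprobs") keeps the original list-of-strings return type when logprobs is off. A rough sketch of that branching; the variable names are approximate, not the exact lm.py code:

# Return plain strings unless the caller asked for logprobs.
if kwargs.get("logprobs"):
    outputs = [
        {
            "text": c.message.content if hasattr(c, "message") else c["text"],
            "logprobs": c.logprobs if hasattr(c, "logprobs") else c["logprobs"],
        }
        for c in choices
    ]
else:
    outputs = [c.message.content if hasattr(c, "message") else c["text"] for c in choices]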

@okhat removed their request for review on December 10, 2024 at 15:54
@okhat merged commit e690743 into stanfordnlp:main on Dec 10, 2024
4 checks passed
isaacbmiller pushed a commit that referenced this pull request Dec 11, 2024
* support returning logprobs in Predictor

* allow output to be either str or dict

* return outputs as a list of strings if user doesn't set logprobs

* Update lm.py

---------

Co-authored-by: Omar Khattab <okhat@users.noreply.github.com>