Prompt_model does not check if type is lm, catastrophically fails without a clear error #1930

Open
gregschwartz opened this issue Dec 13, 2024 · 1 comment

Error:

2024/12/12 17:15:04 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2024/12/12 17:15:04 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
Error getting source code: unhashable type: 'dict'.

Running without program aware proposer.
WARNING:root:   *** In DSPy 2.5, all LM clients except `dspy.LM` are deprecated, underperform, and are about to be deleted. ***
                You are using the client str, which will be removed in DSPy 2.6.
                Changing the client is straightforward and will let you use new features (Adapters) that improve the consistency of LM outputs, especially when using chat LMs. 

                Learn more about the changes and how to migrate at
                https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb
Error getting data summary: 'str' object is not callable.

Running without data aware proposer.

2024/12/12 17:15:04 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...

Traceback (most recent call last):
  File "/Users/greg/code/airbnbRating/python UI/dspy_optimize/optimize.py", line 135, in <module>
    optimized_program = teleprompter.compile(
        program.deepcopy(),
    ...<3 lines>...
        requires_permission_to_run=True,
    )
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/dspy/teleprompt/mipro_optimizer_v2.py", line 172, in compile
    instruction_candidates = self._propose_instructions(
        program,
    ...<6 lines>...
        fewshot_aware_proposer,
    )
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/dspy/teleprompt/mipro_optimizer_v2.py", line 449, in _propose_instructions
    instruction_candidates = proposer.propose_instructions_for_program(
        trainset=trainset,
    ...<4 lines>...
        trial_logs={},
    )
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/dspy/propose/grounded_proposer.py", line 340, in propose_instructions_for_program
    self.propose_instruction_for_predictor(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        program=program,
        ^^^^^^^^^^^^^^^^
    ...<7 lines>...
        tip=selected_tip,
        ^^^^^^^^^^^^^^^^^
    ),
    ^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/dspy/propose/grounded_proposer.py", line 386, in propose_instruction_for_predictor
    original_temp = self.prompt_model.kwargs["temperature"]
                    ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'kwargs'

Example code:

teleprompter = MIPROv2(
    prompt_model="openai/gpt-4o",
    metric=rating_metric,
    auto="heavy", # Can choose between light, medium, and heavy optimization runs
)

Expected behavior:

  1. Provide a helpful, clear error message that points to the real problem, e.g.
  File "/Users/greg/code/airbnbRating/python UI/dspy_optimize/optimize.py", line 135, in <module>
prompt_model does not match type dspy.LM
  2. If it is a string, automatically wrap it in an LM object, since it isn't obvious from the documentation that it needs to be an object.

Also, example usage of that configuration would be helpful :)
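
For illustration, the two expectations above could be sketched roughly as follows. This is only a sketch under assumptions, not DSPy's actual code: `LM` here is a stand-in class so the snippet runs without DSPy installed, and `coerce_prompt_model` is a hypothetical helper name that does not exist in the library.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LM:
    """Stand-in for dspy.LM (assumption) so this sketch runs without DSPy."""
    model: str
    kwargs: dict[str, Any] = field(default_factory=lambda: {"temperature": 0.0})


def coerce_prompt_model(prompt_model: Any) -> LM:
    """Hypothetical helper: return an LM, wrapping a bare model-name string."""
    if isinstance(prompt_model, LM):
        return prompt_model
    if isinstance(prompt_model, str):
        # Expected behavior 2: wrap a model-name string automatically.
        return LM(prompt_model)
    # Expected behavior 1: fail early with a message that names the real problem.
    raise TypeError(
        "prompt_model must be an LM instance or a model-name string, "
        f"got {type(prompt_model).__name__}"
    )


wrapped = coerce_prompt_model("openai/gpt-4o")
# The attribute access that failed in the traceback now works on the wrapped object.
print(wrapped.kwargs["temperature"])
```

Doing a check like this once in the optimizer's constructor would surface the problem at the user's call site instead of deep inside the proposer.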

gregschwartz commented Dec 13, 2024

In fact, I think it's actually worse. Now I get:

Bootstrapped 3 full traces after 5 examples for up to 1 rounds, amounting to 6 attempts.
2024/12/12 18:06:36 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2024/12/12 18:06:36 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
Error getting source code: unhashable type: 'dict'.

Running without program aware proposer.
2024/12/12 18:06:36 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...

That's with

teleprompter = MIPROv2(
    prompt_model=dspy.LM("openai/gpt-4o"),
    metric=rating_metric,
    auto="heavy",
)

Whole file:

import dspy
from dspy.teleprompt.random_search import BootstrapFewShotWithRandomSearch
from dspy.evaluate import Evaluate
import os
from datetime import datetime
from load_csv import load_csv_as_examples, sample_examples
from dotenv import load_dotenv
load_dotenv()

############ Tracking ############
from langfuse import Langfuse
from langfuse.decorators import observe
langfuse = Langfuse(secret_key=os.getenv("LANGFUSE_SECRET_KEY"), public_key=os.getenv("LANGFUSE_PUBLIC_KEY"), host=os.getenv("LANGFUSE_HOST"))
##################################

lm = dspy.LM('openai/gpt-4o-mini', max_tokens=16384)
dspy.configure(lm=lm)

devset = load_csv_as_examples('data/small_test_set.csv')
trainset = load_csv_as_examples('data/training_set.csv')
print(f"loaded trainset: {len(trainset)} rows, devset: {len(devset)} rows")


class RatingSignature(dspy.Signature):
    """We want to rate Experience listings for completeness and quality."""
    market = dspy.InputField(description="Market eg city. Used to see if the activity is relevant to the market")
    title = dspy.InputField(description="""Activity title""")
    host = dspy.InputField()
    activity = dspy.InputField()
    location = dspy.InputField()
    answer:int = dspy.OutputField(description="The rating, 1-3. 1 is definitely not good enough quality listing. 2 is ok or maybe good, 3 is great")

class RatingModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought(RatingSignature)

    def forward(self, market, title, host, activity, location):
        return self.prog(market=market, title=title, host=host, activity=activity, location=location)

program = RatingModule()

# Load last optimized program, if any
saved_programs = [f for f in os.listdir() if f.startswith("v_") and f.endswith("_miprov2.json")]
if saved_programs:
    # Sort by extracting the version number after "v_" and before the next "_"
    filename = sorted(saved_programs, key=lambda x: int(x.split('_')[1]))[-1]
    program.load(filename)
    print(f"Loaded optimized program: {filename}")


####### Evaluator ###########
def rating_metric(example, predictor, traceback=None):
    return example.answer == predictor.answer

evaluate = Evaluate(devset=trainset[:], metric=rating_metric, num_threads=8, display_progress=True, display_table=False, provide_traceback=True)
# evaluate(RatingModule(), devset=trainset[:])

######### Optimizer ###########

from dspy.teleprompt import MIPROv2
teleprompter = MIPROv2(
    prompt_model=dspy.LM("openai/gpt-4o"),     # ⬅️⬅️⬅️ THIS APPARENTLY DOES NOT WORK
    metric=rating_metric,
    auto="heavy", # Can choose between light, medium, and heavy optimization runs
)
optimized_program = teleprompter.compile(
    program.deepcopy(),
    trainset=trainset,
    max_bootstrapped_demos=4,
    max_labeled_demos=3,
    requires_permission_to_run=False,
)

# Evaluate optimized program
print("✅ Evaluate optimized program on devset...")
evaluate(optimized_program, devset=devset[:])

######### Save optimized program ###########
version = 1
while any(f.startswith(f"v_{version}_") for f in os.listdir() if f.endswith(".json")):
    version += 1
date_added = datetime.now().strftime('%m-%d_%H-%M')
optimized_program.save(f"v_{version}_{date_added}_miprov2.json")
