
Improve DSPy saving #1889

Merged
merged 11 commits into from
Dec 11, 2024

@chenmoneygithub (Collaborator) commented Dec 5, 2024

We are reworking DSPy saving in order to make it more robust:

  • Introduce whole model saving: users can specify save_program=True to save the model architecture along with the state. This is done via cloudpickle.
  • Improve state-only saving to support both JSON and pickle, which resolves the problem that some demos are not JSON-serializable.
  • Add unit tests for both saving formats.
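
To illustrate the serialization problem named in the second bullet, here is a minimal sketch using only the standard library. The PR itself mentions cloudpickle; plain `pickle` stands in for it here, and the `demo` record is a made-up example rather than data from DSPy.

```python
import datetime
import json
import pickle

# Hypothetical demo record; the field names are invented for illustration.
demo = {"question": "When was the PR merged?", "answer": datetime.date(2024, 12, 11)}

# The json module cannot encode datetime.date by default and raises TypeError.
try:
    json.dumps(demo)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the same object, preserving the datetime.date value.
restored = pickle.loads(pickle.dumps(demo))
```

This is the kind of case where a `.pkl` state file is needed in place of a `.json` one.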

From the user's perspective, this gives three ways to save a DSPy model.

Method 1: Save everything.

# Saving.
predict = dspy.Predict("question->answer")
optimizer = dspy.BootstrapFewShot(max_bootstrapped_demos=4, max_labeled_demos=4, max_rounds=5, metric=some_metric)
compiled_predict = optimizer.compile(predict, trainset=trainset)
compiled_predict.save("/tmp/my_model", save_program=True)   # `my_model` is a directory.

# Loading.
loaded_predict = dspy.load("/tmp/my_model")

Method 2 (our old approach): Save state with JSON

# Saving.
predict = dspy.Predict("question->answer")
optimizer = dspy.BootstrapFewShot(max_bootstrapped_demos=4, max_labeled_demos=4, max_rounds=5, metric=some_metric)
compiled_predict = optimizer.compile(predict, trainset=trainset)
compiled_predict.save("/tmp/my_model/model.json", save_program=False)

# Loading.
loaded_predict = dspy.Predict("question->answer")
loaded_predict.load("/tmp/my_model/model.json")

Method 3: Save state with cloudpickle

# Saving.
predict = dspy.Predict("question->answer")
optimizer = dspy.BootstrapFewShot(max_bootstrapped_demos=4, max_labeled_demos=4, max_rounds=5, metric=some_metric)
compiled_predict = optimizer.compile(predict, trainset=trainset)
compiled_predict.save("/tmp/my_model/model.pkl", save_program=False)

# Loading.
loaded_predict = dspy.Predict("question->answer")
loaded_predict.load("/tmp/my_model/model.pkl")
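
Methods 2 and 3 share one property: because only the state is written, the caller has to rebuild the module before loading. The toy sketch below shows that round trip with the standard `json` module. `ToyPredict` and its attributes are hypothetical stand-ins and not the real `dspy.Predict`.

```python
import json
import os
import tempfile


class ToyPredict:
    """Hypothetical stand-in for a DSPy module; not the real dspy.Predict."""

    def __init__(self, signature):
        self.signature = signature  # the "architecture", fixed at construction
        self.demos = []             # the "state", normally filled in by an optimizer

    def dump_state(self):
        return {"demos": self.demos}

    def load_state(self, state):
        self.demos = state["demos"]


compiled = ToyPredict("question->answer")
compiled.demos = [{"question": "2+2?", "answer": "4"}]

# Save only the state to a JSON file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump(compiled.dump_state(), f)

# State-only loading: the caller must recreate the architecture first.
loaded = ToyPredict("question->answer")
with open(path) as f:
    loaded.load_state(json.load(f))
```

Method 1 avoids the reconstruction step because the serialized artifact carries the architecture as well.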

dspy/predict/predict.py (outdated)
dspy/primitives/module.py (outdated)
def load(self, path, use_legacy_loading=False):
    with open(path) as f:
        self.load_state(ujson.loads(f.read()), use_legacy_loading=use_legacy_loading)
def save(self, path, save_field_meta=False, state_only=True, metadata=None, use_json=True):
Collaborator

Just thinking out loud: I don't like that we already have so many flags, since this is our opportunity to simplify save/load a lot, rather than to make it more complicated from a DX standpoint.

More actionable: I don't like that state_only is a "double negative". Turning it off does more work. It should probably be reversed, e.g. save_program=False by default.

Separately, do we need use_json? It seems to me that we use JSON iff save_program == False.

Collaborator Author

That makes sense! We allow users to pickle the state when save_program == False and use_json == False, which handles the problem of serializing data like datetime.date.

What about this:

  • I will remove the metadata arg, which is not useful now.
  • I will remove the use_json arg, and use the file suffix (.json or .pkl) to determine the format for saving state.
  • I will replace state_only with save_program.
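
The suffix-based dispatch proposed in the second bullet could look roughly like the sketch below. It is an illustration under stated assumptions, not the code merged in this PR: `save_state` and `load_state` are hypothetical free functions, and the standard `json` and `pickle` modules stand in for ujson and cloudpickle.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_state(state, path):
    """Write `state` as JSON or pickle, chosen by the file suffix (hypothetical helper)."""
    path = Path(path)
    if path.suffix == ".json":
        with open(path, "w") as f:
            json.dump(state, f)
    elif path.suffix == ".pkl":
        with open(path, "wb") as f:
            pickle.dump(state, f)
    else:
        raise ValueError(f"Unsupported suffix {path.suffix!r}; use .json or .pkl")


def load_state(path):
    """Read a state file written by save_state, again dispatching on the suffix."""
    path = Path(path)
    if path.suffix == ".json":
        with open(path) as f:
            return json.load(f)
    if path.suffix == ".pkl":
        with open(path, "rb") as f:
            return pickle.load(f)
    raise ValueError(f"Unsupported suffix {path.suffix!r}; use .json or .pkl")


tmp = Path(tempfile.mkdtemp())
state = {"demos": [{"question": "2+2?", "answer": "4"}]}
save_state(state, tmp / "model.json")
save_state(state, tmp / "model.pkl")
```

Dispatching on the suffix removes the need for a separate use_json flag, at the cost of rejecting paths with any other extension.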

@@ -136,17 +138,20 @@ def _load_state_legacy(self, state):

        return self

-    def load(self, path, return_self=False):
+    def load(self, path, use_legacy_loading=False, return_self=False):
Collaborator

Does this method even need to exist?

Collaborator Author

We can delete it since we have made our new plan for backward compatibility.

@okhat okhat merged commit 4be87ef into stanfordnlp:main Dec 11, 2024
4 checks passed
@chenmoneygithub chenmoneygithub deleted the dspy-saving branch December 27, 2024 22:02