Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileCallbackHandler #5589

Merged
merged 9 commits into from
Jun 3, 2023
Merged

Conversation

mbchang
Copy link
Contributor

@mbchang mbchang commented Jun 2, 2023

like StdoutCallbackHandler, but writes to a file

When running experiments I have found myself wanting to log the outputs of my chains in a more lightweight way than using WandB tracing. This PR contributes a callback handler that writes to file what StdoutCallbackHandler would print.

Example Notebook

See the included filecallbackhandler.ipynb notebook for usage. Would it be better to include this notebook under modules/callbacks or under integrations/?
image

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

@agola11

Copy link
Collaborator

@agola11 agola11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of duplicated code between StdOutCallbackHandler and this one, can we just add a file option to StdoutCallbackHandler and if it's set, call print_text with the file?

"""Destructor to cleanup when done."""
self.file.close()

def on_llm_start(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to implement methods you aren't using, BaseCallbackHandler is not an abstract class

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agola11 is there any code to guarantee that cleaned up is invoked by handlers on failure? Not critical in this case, but occurred to me that maybe there are some situations we'd want to have cleanup invoked

I don't know enough about the python data model... del i assume dell might not be invoked on exception

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good; I removed them.

@mbchang
Copy link
Contributor Author

mbchang commented Jun 2, 2023

Modifying StdoutCallbackHandler could work, but I wonder if it could cause difficulties in separating functionality, which was why I thought that using an explicitly separate handler could make things clearer to the user.

Specifically, there are four cases the user might want:

  1. do not print to stdout, do not print to file
  2. print to stdout, do not print to file (this is the current behavior that verbose=True triggers)
  3. do not print to stdout, print to file
  4. print to stdout, print to file

If we modify StdoutCallbackHandler to also print to file if a file is given, this would allows us to handle cases 1, 2, 4. Any suggestions on how we could handle case 3?

If StdoutCallbackHandler handles both printing to file and printing to stdout could create ambiguity on the user side with respect to what verbose=True would do. If the user simply sets verbose=True, the behavior would be that it prints to stdout, but does not write to file. If the user wants to both print to stdout and also write to file, then the user would pass the StdoutCallbackHandler in explicitly. This ambiguity might not be a big issue, but just flagging this as a possible concern. What do you think?

@mbchang
Copy link
Contributor Author

mbchang commented Jun 2, 2023

UPDATE: I realized that one solution to the above concerns could be for the user to pass in two StdoutCallBackHandlers, one with a file and one without a file, to the chain. So I will modify StdoutCallBackHandlers to take a file in as an optional input.

@mbchang
Copy link
Contributor Author

mbchang commented Jun 2, 2023

On second thought, passing in two StdoutCallBackHandlers might not work. This is because when the callbacks are configured, if verbose=True, then this line checks instantiates a new StdoutCallBackHandler to log to stdout. But if I pass an StdoutCallBackHandler with a file argument, then setting verbose=True will not instantiate another StdoutCallBackHandler to print to stdout.

We would get the following unexpected behavior below, where setting verbose=True does not result in printing stuff to stdout.
image

@mbchang mbchang requested a review from agola11 June 2, 2023 15:57
@hwchase17
Copy link
Contributor

@agola11 mind taking another look?

Copy link
Collaborator

@agola11 agola11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense thanks!

@agola11 agola11 merged commit d3bdb8e into langchain-ai:master Jun 3, 2023
Undertone0809 pushed a commit to Undertone0809/langchain that referenced this pull request Jun 19, 2023
# like
[StdoutCallbackHandler](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/stdout.py),
but writes to a file

When running experiments I have found myself wanting to log the outputs
of my chains in a more lightweight way than using WandB tracing. This PR
contributes a callback handler that writes to file what
`StdoutCallbackHandler` would print.

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

## Example Notebook

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use



See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

See the included `filecallbackhandler.ipynb` notebook for usage. Would
it be better to include this notebook under `modules/callbacks` or under
`integrations/`?

![image](https://github.com/hwchase17/langchain/assets/6439365/c624de0e-343f-4eab-a55b-8808a887489f)


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

@agola11

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
This was referenced Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants