Include output from interactive cells in Foyle requests #1756

Merged: 5 commits, Oct 30, 2024

@jlewi jlewi commented Oct 23, 2024

This fixes a bug in the serialization of the notebook before sending it to Foyle that caused the output of interactive cells not to be included in the requests.

The problem is that we need to call addExecInfo before converting the VSCode NotebookData representation to the proto. That handles copying the output of the interactive terminals into the NotebookData structure.

This necessitated some code refactoring. In order to call addExecInfo we need an instance of the kernel.

We create a new Converter class to keep track of the kernel and also provide reuse in the logic for converting notebook data to protos for Foyle.

Since addExecInfo is async, we need to change buildReq to return a promise and refactor some of the logic to be non-blocking.
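
The refactoring described above might look roughly like the following sketch. Only addExecInfo and the Converter class name come from the description; every other name here (KernelLike, NotebookDataLike, NotebookProto, notebookDataToProto, fakeKernel) and the assumed addExecInfo signature are hypothetical stand-ins, not the actual vscode-runme code.

```typescript
// Hypothetical stand-ins for the real VS Code / Runme types.
interface CellLike {
  value: string
  outputs: string[]
}

interface NotebookDataLike {
  cells: CellLike[]
}

interface NotebookProto {
  cells: { value: string; outputs: string[] }[]
}

// Assumed shape of the kernel: only the piece this sketch needs.
interface KernelLike {
  // Copies interactive terminal output into the notebook data (assumed signature).
  addExecInfo(data: NotebookDataLike): Promise<NotebookDataLike>
}

class Converter {
  private readonly kernel: KernelLike

  constructor(kernel: KernelLike) {
    this.kernel = kernel
  }

  // Async because addExecInfo is async; callers must await or chain the result.
  async notebookDataToProto(data: NotebookDataLike): Promise<NotebookProto> {
    // Enrich first so interactive cell output is present before conversion.
    const enriched = await this.kernel.addExecInfo(data)
    return {
      cells: enriched.cells.map((c) => ({ value: c.value, outputs: [...c.outputs] })),
    }
  }
}

// A fake kernel that appends one line of output to every cell.
const fakeKernel: KernelLike = {
  async addExecInfo(data) {
    return {
      cells: data.cells.map((c) => ({ ...c, outputs: [...c.outputs, 'terminal output'] })),
    }
  },
}

async function demoConverter(): Promise<NotebookProto> {
  const converter = new Converter(fakeKernel)
  return converter.notebookDataToProto({ cells: [{ value: 'ls', outputs: [] }] })
}
```

Keeping the kernel inside one class means every call site that builds a Foyle request goes through the same enrich-then-convert path, which is the reuse the description mentions.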

* Fix jlewi/foyle#286
@jlewi jlewi marked this pull request as ready for review October 23, 2024 22:39
@jlewi jlewi requested a review from sourishkrout October 23, 2024 22:39

jlewi commented Oct 24, 2024

@sourishkrout this is ready when you are.

jlewi added a commit to jlewi/vscode-runme that referenced this pull request Oct 24, 2024
stateful#1756
branch: jlewi/outputs

commit d279e74
Author: Jeremy Lewi <jeremy@lewi.us>
Date:   Wed Oct 23 15:25:38 2024 -0700

    Include output from interactive cells in Foyle requests

@sourishkrout sourishkrout left a comment

✅ LGTM

I will let you resolve the merge conflict.

@@ -82,45 +82,45 @@ export class StreamCreator {
 }

 log.info('handleEvent: building request')
-let req = this.handlers.buildRequest(event, firstRequest)
+this.handlers.buildRequest(event, firstRequest).then((req) => {

@sourishkrout sourishkrout Oct 25, 2024

You could likely make handleEvent = async (event: ... and use await here instead of then. Not an issue but more readable.

There are a few other instances of this in the code. Not a merge-stopper.
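
To illustrate the suggestion, here is a minimal sketch of the same handler written both ways. The Req type, the buildRequest stub, and the sent array are assumptions made for the example; the real handleEvent does more than this.

```typescript
// Hypothetical request type and builder; stand-ins for the real buildRequest.
type Req = { event: string; first: boolean }

function buildRequest(event: string, first: boolean): Promise<Req> {
  return Promise.resolve({ event, first })
}

// Records what each variant would have sent.
const sent: Req[] = []

// Variant 1: chain with then, in the style of the diff above.
function handleEventThen(event: string): Promise<void> {
  return buildRequest(event, true).then((req) => {
    sent.push(req)
  })
}

// Variant 2: the same logic with async/await.
const handleEventAwait = async (event: string): Promise<void> => {
  const req = await buildRequest(event, true)
  sent.push(req)
}
```

Both variants return a promise that settles after the request has been recorded, so callers can treat them interchangeably; the second simply reads top to bottom.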

@jlewi jlewi (Contributor Author)

> You could likely make handleEvent = async (event: ... and use await here instead of then. Not an issue but more readable.

I'll take a look at making that change. I thought about making the function async, but I wasn't sure how far that would propagate if I had to make the callers async as well. I'll give it a shot.

),
)
// N.B. handleEvent is async. So we need to use "then" to make sure the event gets processed
this.streamCreator
@jlewi jlewi (Contributor Author)

@sourishkrout Is this the right way to call an async function from a non async function?
Since the return type is null we don't need to await the async function.
However, I think in the past you mentioned that if you invoke an async function but don't do anything with its return value it might not get scheduled.
Does using "then(()=> {})" solve this problem?


@sourishkrout sourishkrout Oct 29, 2024

It does not solve the problem. At JS runtime, there isn't such a thing as an "async function". What makes it async is returning a Promise<> type. Async/await uses a generator under the hood to pause execution to unravel promises; however, semantically, there's no difference to .then(...). It just makes you write async code that looks much more like sync code for readability.

That being said, if a promise type ("a future") is not being then'd or awaited upstream, you wind up with the same exact problem no matter how the promise is "returned."

The only reason this likely appears to work is that promises start running immediately (as opposed to when you await/then them), and VS Code is a long-running process, so the scopes live long enough to allow the promises to complete. I'm making assumptions here because I haven't thoroughly inspected the upstream code.
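
A small sketch of the "promises start running immediately" point. The work function and the log array are hypothetical names used only for this example.

```typescript
const log: string[] = []

// An async function's body runs synchronously up to its first await,
// whether or not the caller ever awaits the returned promise.
async function work(name: string): Promise<void> {
  log.push(`${name}: started`)
  await new Promise<void>((resolve) => setTimeout(resolve, 10))
  log.push(`${name}: finished`)
}

// Fire and forget: nothing awaits or chains this promise.
void work('job')
log.push('caller: continues')

// At this point log holds 'job: started' followed by 'caller: continues'.
// 'job: finished' is appended about 10 ms later, provided the process is still alive.
```

The work begins as soon as the function is called. Awaiting or chaining the promise only controls when the caller observes the result, which is why an unawaited call still completes in a long-running host.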

As a Golang programmer, this is the same as running three go func() inside a function but not using a WaitGroup to synchronize them. Similarly, this might work if the "main thread" runs long enough for all three to complete and no downstream processing requires their completion. Otherwise, all concurrent functions get killed when the main thread dies.

In other words, my suggestion to use async was only intended to be cosmetic. If there is a problem with async execution, both then and await will be prone to it.

@jlewi jlewi (Contributor Author)

> The only reason this likely appears to work is that promises start running immediately (as opposed to when you await/then them), and VS Code is a long-running process, so the scopes live long enough to allow the promises to complete

That seems fine to me. Concretely, it seems fine to treat it as fire and forget and accept that either

  1. VS Code runs long enough for the promises to complete
    or
  2. VS Code shuts down early and the request might not have completed.

Here's the problem I'm trying to solve. handleEvent is an async function being called from a non-async function, handleOnDidChangeNotebookCell, which is the listener for events:

vscode.workspace.onDidChangeTextDocument(eventGenerator.handleOnDidChangeNotebookCell),

I originally thought I couldn't declare handleEvent as async and await it, because I thought the listener couldn't be an async function. However, it looks like declaring handleOnDidChangeNotebookCell as async still works fine, so I could make it async and then await handleEvent inside it.

However, my supposition is that if the listener (handleOnDidChangeNotebookCell) is async, it's returning a promise that is not being awaited. So we still wind up with a promise that we aren't awaiting; it's just happening in a different part of the code.

Fundamentally, I think this is alright. The listener (handleOnDidChangeNotebookCell) fires off a request to generate a completion, and the response is handled asynchronously. If VS Code exits before that async function can get scheduled and finish processing, then we are dropping the completion generation on the floor rather than doing some graceful handling/shutdown. But dropping it on the floor seems fine; I'm not sure what graceful handling would actually look like.
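
The fire-and-forget arrangement discussed above can be sketched as follows. TinyEmitter is a hand-written stand-in for an event source such as vscode.workspace.onDidChangeTextDocument, and it assumes the event source ignores whatever the listener returns. The handleEvent and handleOnDidChange functions here are simplified, hypothetical versions, and the try/catch is one option for containing failures, not something this PR is stated to do.

```typescript
// Minimal stand-in for an event source; listener return values are ignored (assumption).
type Listener<T> = (e: T) => unknown

class TinyEmitter<T> {
  private listeners: Listener<T>[] = []

  on(listener: Listener<T>): void {
    this.listeners.push(listener)
  }

  fire(e: T): void {
    for (const listener of this.listeners) {
      listener(e) // any returned promise is dropped, never awaited
    }
  }
}

const handled: string[] = []
const failures: string[] = []

// Hypothetical async handler standing in for handleEvent.
async function handleEvent(doc: string): Promise<void> {
  await Promise.resolve()
  if (doc === '') {
    throw new Error('empty document')
  }
  handled.push(doc)
}

// Async listener: the emitter never awaits it, so errors are caught here
// instead of surfacing as unhandled rejections.
const handleOnDidChange = async (doc: string): Promise<void> => {
  try {
    await handleEvent(doc)
  } catch (err) {
    failures.push(err instanceof Error ? err.message : String(err))
  }
}

const emitter = new TinyEmitter<string>()
emitter.on(handleOnDidChange)
emitter.fire('cell-1')
emitter.fire('')
```

The unawaited promise has simply moved from handleEvent's call site to the emitter's call of the listener, which matches the supposition above. Catching inside the listener keeps a failed request from becoming an unhandled rejection.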

@jlewi jlewi (Contributor Author)

So I changed handleOnDidChangeNotebookCell and added some awaits. I made this change because I agree with your earlier comment that using await is cleaner.

@sourishkrout so I think this is good to go but let me know if you think otherwise.

jlewi commented Oct 26, 2024

@sourishkrout I updated the code to use async and left a question for you in the comments. Can you please take a look?

The other thing is that interactive cells don't appear to be properly triggered if you have a long-running command. I was hoping #1744 fixed this, but it looks like #1744 only fixed it for non-interactive cells. I'll update jlewi/foyle#309.

@sourishkrout sourishkrout merged commit f2d4dd5 into main Oct 30, 2024
1 check passed
@sourishkrout sourishkrout deleted the jlewi/outputs branch October 30, 2024 00:36
hotpocket pushed a commit to hotpocket/vscode-runme that referenced this pull request Nov 5, 2024
* Include output from interactive cells in Foyle requests

* Update to use await.

* Add a comment.

* Use await.

Successfully merging this pull request may close these issues.

Cell Outputs aren't actually included in the LLM request.
2 participants