Crash when notebook scrolling is on and a lot of output is streamed in #182636

r3m0t · 2023-05-16T14:34:47Z

Does this issue occur when all extensions are disabled?: Yes/No

VS Code Version:
OS Version:
Version: 1.79.0-insider (user setup)
Commit: 2575777
Date: 2023-05-12T05:21:06.392Z
Electron: 22.5.2
Chromium: 108.0.5359.215
Node.js: 16.17.1
V8: 10.8.168.25-electron.0
OS: Windows_NT x64 10.0.19045
Sandboxed: Yes

Steps to Reproduce:

Install Jupyter and Python
Ensure user setting "notebook.output.scrolling": true, "notebook.output.textLineLimit": 30 (this is the default)
Create interactive window
Run cell containing

import sys
for i in range(31500):
    print('{!r} {!r}'.format(list(range(10)), i), file=sys.stderr)

This works, it's only abou 1 MB of output.

Run cell containing

import sys
for i in range(31500):
    print('{!r} {!r}'.format(list(range(10)), i), file=sys.stderr)
    sys.stderr.flush()

The renderer crashes!

The text was updated successfully, but these errors were encountered:

r3m0t · 2023-05-16T16:03:55Z

@rebornix @DonJayamanne are you accepting PRs for this?

It looks like, there is no batching of events from the notebook, although you made processing each event more efficient last year.

The renderer is calculating the scroll height each time stderr is flushed

r3m0t · 2023-05-16T19:11:21Z

I guess OutputCell.updateContentAndRender needs to be debounced with requestAnimationFrame, but, when I tried it the resize observer started intermittently sending height of 0 back to the extension.

rebornix · 2023-05-25T17:40:26Z

@r3m0t thanks for offering help, it would be very great. We might need to carefully design the debouncing to ensure a smooth incremental update experience though.

r3m0t · 2023-05-25T21:16:46Z

I had a look last week, but I found many concerns. It's not just the renderer. Basically, updateContentAndRender is an idempotent operation (an upsert) of the output into the rendered DOM. The approach of sending the full output to the renderer each time is quadratic-

import sys, json
res = []
total_message_length = 0

for i in range(31500):
    res.append('{!r} {!r}\n'.format(list(range(10)), i))
    message = json.dumps({'event': 'update_output', 'output_id': '01010239032-3210-321-312321321', 'contents': ''.join(res)})
    total_message_length += len(message)
print(f'Sending {total_message_length:,.0f} bytes with a final output of {len(message)}')
# Sending 18,556,686,495 bytes with a final output of 1185979

That's really excessive, and there's a lot of work needed in the output renderer as well. Each time it processes all the last 5000 lines for ANSI colors and links, instead of just the new part, and creates new div elements -

vscode/extensions/notebook-renderers/src/textHelper.ts

Lines 94 to 104 in 143d967

    
           	element.appendChild(handleANSIOutput(buffer.slice(-5000).join('\n'), trustHtml)); 
        
           	return element; 
        
           } 
        
           export function createOutputContent(id: string, outputs: string[], linesLimit: number, scrollable: boolean, trustHtml: boolean): HTMLElement { 
        
           	const buffer = outputs.join('\n').split(/\r\n|\r|\n/g); 
        
           	if (scrollable) { 
        
           		return scrollableArrayOfString(id, buffer, trustHtml);

it would make more sense to send incremental content -

vscode/src/vs/workbench/contrib/notebook/browser/view/renderers/backLayerWebView.ts

Line 1492 in 143d967

this._sendMessageToWebview({

Separate issue as it's on the main host side- there is also quadratic behaviour in formatStreamText - this is called 31,500 times and searches the full string each time with these two includes calls -

vscode/src/vs/workbench/contrib/notebook/common/notebookCommon.ts

Line 1071 in a54803a

    
           if (!buffer.buffer.includes(BACKSPACE_CHARACTER) && !buffer.buffer.includes(CARRIAGE_RETURN_CHARACTER)) {

Basically, debouncing would require updateContentAndRender to stay idempotent which I don't agree with, and it would make the logic more complicated if you want to introduce a mix of idempotent and incremental messages later.

I might be able to send you some small changes to optimize https://github.com/microsoft/vscode/blob/143d9674dd470c8f3d3a5e4b930f5eb1fe9a91d3/extensions/notebook-renderers/src/index.ts , but I don't think performance will be good until you can tackle the above.

Or should I just tell my users to go to Terminal 🤣

r3m0t · 2023-05-25T21:39:52Z

Oh, and I think I'm basically suggesting unpicking #161023 - or optimizing for both cases (append usually but replace when \r or \b used as stream characters come into play)

amunger · 2023-05-25T22:35:32Z

We were considering incremental updates to the output here, but decided for a simpler change to fix that issue. As you noted, we look for characters that edit previous outputs and compress the buffer into a single output, which would be very difficult to support if we only sent text to append. We also need to consider 3rd party renderers that would break if we changed the message in such a way.

We have a hard limit of 5000 lines for text output in the renderer, so it might make sense to also limit the message sent to help mitigate this issue.

r3m0t · 2023-05-26T07:03:55Z

I don't think third party renderers need to be affected as the escape sequences are already only handled for a few hard coded MIME types. Are there other renderers for these types? Are there other MIME types that need to be handled incrementally? In the Jupyter protocol, you can't incrementally update an output of a MIME type. This can be private API to the built in stream renderers.

I do agree OutputItem[] is the wrong type when the escape sequence can "affect data behind it". I was thinking of an OutputItem.replaceVersionId which increments exactly when the new contents is different from the old contents in the range [0, oldContents.length). If it's unchanged then linkify, ANSI color processing and <span> creation can be skipped for the existing output.

r3m0t · 2023-06-12T10:08:49Z

Thanks for putting this on the iteration plan, can you tag it freeze-crash?

I'll leave you with my thoughts which may be useful in implementation, hopefully this doesn't send you down the wrong way!

I'm not sure my 18,556,686,495 bytes was accurate as I didn't check the behaviour of copyBufferIfNeeded in this case.
The first question if making big changes to this area is whether to keep stdout and stderr separate or mix them- more like a terminal or Visual Studio 2022 Python support. In VS the stdout and stderr are shown in the same place with different colours. It makes more sense to me and users at my employer than what Jupyter does. Jupyter wire protocol doesn't say how stdout/stderr should be displayed
As I mentioned there's inefficiencies at almost every layer of the stack wrt output.
- The efficient one is the ext host and messages to the main process due to the inclusion of ICellExecuteOutputEditDto.append.
- But when we reach ExtHostCell.setOutputItems it gets quadratic behaviour and we forget it's an append event.
- Then backLayerWebview.ts and a couple of stack frames in the webview side are paying the quadratic price.
- As well as the application/vnd.code.notebook.stdout renderer.
I would split CellExecutionUpdateType.OutputItems into two update types - one is always append and only contains application/vnd.code.notebook.stdout and/or stderr and the other is the normal one (which no longer needs an append case).
This follows all down the stack, if \r or \b is encountered then the event gets converted to the normal type of event before it gets to backLayerWebview & etc.
The debouncing is probably worthwhile even with these optimisations, as I couldn't find a generic way to avoid that Layout 10ms cost shown in the profiler, even if it was fully incremental all the way down. Once webviewPreloads.ts knows which showOutput messages are appendy, it'll be easy to implement the debouncing there.
A non-generic way I just thought of is to not measure the offsetHeight of the output element after rendering (

vscode/src/vs/workbench/contrib/notebook/browser/view/renderers/webviewPreloads.ts

Line 2563 in 70dff6d

const offsetHeight = this.element.offsetHeight;

) . After all, if there are enough lines for the scrollbar to appear, you know the output height will stay constant as line count increases. But, IMO you will hit inefficiencies elsewhere.

PS While writing I noticed MirrorNotebookDocument ignores NotebookOutputItemChangedEvent.append / NotebookOutputChangedEvent.append, I don't know the impact and there is a comment that might mean you already know this but I didn't understand it so I thought it was worth mentioning.

rebornix · 2023-07-12T18:31:04Z

@amunger and I discussed how we can introduce incremental rendering to the stream output renderer and his experiments are looking promising. Here are some perf logs we captured

With incremental rendering

Every output append takes around 2ms and it's very consistent. Our hypothesis is once we reach the max height of the scrollable element, the height will not change anymore and it won't trigger any resize to its ancestor elements. Every append operation is only changing its internal content height, that's why the layout is very minimal and almost identical each time. With https://developer.mozilla.org/en-US/docs/Web/CSS/contain, it can probably be reduced further.

Without incremental rendering

Without incremental update, every update is a "remove" and then "insert" and the "insert" operation is getting bigger, thus the rendering time for each operation is getting longer. Starting from around 2ms, all the way to tens or hundreds of ms

vscodenpa assigned rebornix May 16, 2023

rebornix assigned amunger May 25, 2023

rebornix added the notebook-output label May 25, 2023

r3m0t mentioned this issue May 25, 2023

VS Code crashes when running a cell that produces a lot of output microsoft/vscode-jupyter#11031

Closed

kieferrm mentioned this issue Jun 11, 2023

Iteration Plan for June 2023 #184844

Closed

74 tasks

kieferrm mentioned this issue Jul 9, 2023

Iteration Plan for July 2023 #187398

Closed

54 tasks

amunger mentioned this issue Jul 12, 2023

perf optimizations for large stream notebook outputs #187766

Merged

amunger mentioned this issue Jul 24, 2023

Test: large notebook output performance #188699

Closed

3 tasks

vscodenpa added the on-testplan label Jul 24, 2023

amunger closed this as completed Dec 5, 2023

github-actions bot locked and limited conversation to collaborators Jan 20, 2024

aiday-mar modified the milestones: Backlog Candidates, December / January 2024 Feb 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Crash when notebook scrolling is on and a lot of output is streamed in #182636

Crash when notebook scrolling is on and a lot of output is streamed in #182636

r3m0t commented May 16, 2023

r3m0t commented May 16, 2023

r3m0t commented May 16, 2023

rebornix commented May 25, 2023

r3m0t commented May 25, 2023

r3m0t commented May 25, 2023

amunger commented May 25, 2023

r3m0t commented May 26, 2023

r3m0t commented Jun 12, 2023

rebornix commented Jul 12, 2023

Crash when notebook scrolling is on and a lot of output is streamed in #182636

Crash when notebook scrolling is on and a lot of output is streamed in #182636

Comments

r3m0t commented May 16, 2023

r3m0t commented May 16, 2023

r3m0t commented May 16, 2023

rebornix commented May 25, 2023

r3m0t commented May 25, 2023

r3m0t commented May 25, 2023

amunger commented May 25, 2023

r3m0t commented May 26, 2023

r3m0t commented Jun 12, 2023

rebornix commented Jul 12, 2023