Ark: Break up long inputs #4745
Labels: area: console, area: kernels, lang: r
Joint work with @DavisVaughan.
Parent issue: #1326
When a code execution request is sent to Ark, Ark passes it to R via the `ReadConsole()` hook. This hook takes input via a buffer that has a fixed size of 4096 bytes: https://github.com/r-devel/r-svn/blob/08656ceb6a8c0b6fd31f436a16cea03fb614327a/src/include/Defn.h#L1896

When the input exceeds that size there is no straightforward way to break it up from our `read_console()` handler. Our previous solutions to this problem all had issues:

- We truncated the input to the buffer size, causing Ark and Positron to get into unexpected states. Positron only sends complete expressions but we ended up evaluating incomplete ones because of the truncation (see Ark: Truncated input causes weirdness if incomplete #2675).
- We discarded the input and failed with an error message instead: Write an R error to the buffer when user input is too large ark#377. This solved the unexpected state, but it creates unexpected behaviour for users when they send large portions of R code to the console.

So instead we are going to change `read_console()` to break up input into multiple lines that will be sent one by one to R. With multi-line expressions, the input will be incomplete until the last line is reached and R calls us back for the next input. This turns `read_console()` into a simple state machine: If a line of input is pending, send it right away. Otherwise, proceed as normal.

For simplicity we would prefer not to use the input boundaries routine implemented in posit-dev/ark#522. Instead we'll send the lines one by one. We can rely on the prompt type: a prompt of type "incomplete" means R is still waiting for the rest of the current expression, and any other prompt means the previous line completed an expression that the R evaluator could interpret.

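The state machine described above could be sketched roughly as follows. This is only an illustration, not Ark's code: `PendingLines`, `Action`, `push_request()` and this `read_console()` method are hypothetical names, and rejecting a single line that does not fit in the buffer is an assumption, since the issue does not say what should happen in that case.

```rust
use std::collections::VecDeque;

/// Size of the buffer R passes to `ReadConsole()`, as stated in the issue.
const BUFFER_SIZE: usize = 4096;

/// Hypothetical queue of lines waiting to be handed to R.
#[derive(Default)]
struct PendingLines {
    lines: VecDeque<String>,
}

/// Hypothetical outcome of one `read_console()` callback.
#[derive(Debug, PartialEq)]
enum Action {
    /// A line was pending: copy it into R's buffer right away.
    Send(String),
    /// Nothing pending: proceed as normal and wait for the next request.
    WaitForRequest,
}

impl PendingLines {
    /// Break an execution request into newline-terminated lines.
    fn push_request(&mut self, code: &str) -> Result<(), String> {
        let mut lines = Vec::new();
        // Trailing whitespace is trimmed so no empty line follows the last expression.
        for line in code.trim_end().lines() {
            // Reserve room for the newline and the NUL terminator.
            // Assumption: a single oversized line is rejected.
            if line.len() + 2 > BUFFER_SIZE {
                return Err(format!("line of {} bytes does not fit in the buffer", line.len()));
            }
            lines.push(format!("{}\n", line));
        }
        // Only queue the request once every line is known to fit.
        self.lines.extend(lines);
        Ok(())
    }

    /// If a line of input is pending, send it right away. Otherwise, proceed as normal.
    fn read_console(&mut self) -> Action {
        match self.lines.pop_front() {
            Some(line) => Action::Send(line),
            None => Action::WaitForRequest,
        }
    }
}
```

With a multi-line function definition, each call to the sketched `read_console()` hands over one line, and R keeps prompting with an incomplete prompt until the closing brace arrives.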
Things that require care:
- When an error occurs, discard the remaining lines of input. This is similar to what RStudio does. When the frontend breaks up multiple expressions (Frontend should break up multiline selections that get sent to Console (by expression or by `\n`) #1326), it will have to do the same for consistency.
- When we get into the debug browser, just continue sending inputs? Again this is similar to RStudio behaviour. This differs from R's own behaviour when it gets multiple expressions at once, because it runs nested REPLs. This means there is a stack of input buffers, one for each nested console. Achieving the same in Positron would be challenging for little benefit.

  Alternatively we could discard the pending inputs like we do in case of error. It's unlikely the pending expressions will make sense in the debug context. However the user might wonder where the pending inputs went.
- There might be multiple expressions in a single input. For instance if a Jupyter chunk has multiple expressions, or when a selection is executed from Positron since we don't break up expressions yet (see Frontend should break up multiline selections that get sent to Console (by expression or by `\n`) #1326).

  When that is the case, the results of intermediate expressions should be emitted immediately to IOPub as `Stdout` stream to preserve the correct order of output lines. Only the result of the final expression should be emitted as an `ExecuteResult` on Shell.
- We need to trim trailing whitespace so that the last complete expression is not treated as an intermediate input.
- If we rely on an incomplete prompt type to determine our state, we should make sure the prompt type is correct. I think we could improve https://github.com/posit-dev/ark/blob/a0c4890e2e52a670f629ba12c17a638c258c3a52/crates/ark/src/interface.rs#L781 with:

  In conjunction with a fix for Unreliable readline prompt detection with active browser sessions #4742.
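
To make the points about error handling, output routing and whitespace trimming more concrete, here is a rough sketch of how they might fit together. None of these names exist in Ark: `Prompt`, `Output`, `Step`, `Request`, `on_prompt()` and `on_error()` are hypothetical, and the sketch assumes the handler is told the prompt type and the printed result (if any) each time R calls back for input.

```rust
use std::collections::VecDeque;

/// Hypothetical prompt classification, not Ark's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Prompt {
    /// R is ready for a new expression: the previous line completed one.
    TopLevel,
    /// R is waiting for the rest of the current expression.
    Incomplete,
}

/// Where a result would be emitted.
#[derive(Debug, PartialEq)]
enum Output {
    /// Intermediate result, emitted right away as a Stdout stream.
    Stdout(String),
    /// Result of the final expression, emitted as the ExecuteResult.
    ExecuteResult(String),
}

/// What the handler would do next.
#[derive(Debug, PartialEq)]
enum Step {
    Send(String),
    Done,
}

/// Hypothetical state for one execution request.
struct Request {
    pending: VecDeque<String>,
}

impl Request {
    fn new(code: &str) -> Self {
        // Trim trailing whitespace so that nothing is pending after the last
        // complete expression and it is not mistaken for an intermediate one.
        let pending = code.trim_end().lines().map(|l| format!("{}\n", l)).collect();
        Request { pending }
    }

    /// Called each time R asks for input. `result` is the printed value of
    /// the expression that just completed, if any.
    fn on_prompt(&mut self, prompt: Prompt, result: Option<String>, out: &mut Vec<Output>) -> Step {
        if prompt == Prompt::TopLevel {
            if let Some(value) = result {
                if self.pending.is_empty() {
                    out.push(Output::ExecuteResult(value));
                } else {
                    out.push(Output::Stdout(value));
                }
            }
        }
        match self.pending.pop_front() {
            Some(line) => Step::Send(line),
            None => Step::Done,
        }
    }

    /// On an R error, discard the remaining lines of input.
    fn on_error(&mut self) {
        self.pending.clear();
    }
}
```

For an input such as `"1 +\n  1\n2\n\n"`, this sketch would route the result of the first expression to `Stdout` and the result of `2` to `ExecuteResult`. Without the trimming step, the trailing blank line would still be pending when `2` completes, and its result would be treated as intermediate. The debug browser case is left out on purpose, since the issue leaves that question open.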