-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for saving named requests in variables and improved json parsing #3642
Support for saving named requests in variables and improved json parsing #3642
Conversation
src/Microsoft.DotNet.Interactive.Http.Parsing/Parsing/HttpRequestParser.cs
Show resolved
Hide resolved
if (command.Parameters.TryGetValue("Document", out doc)) | ||
{ | ||
var parsedDoc = HttpRequestParser.Parse(doc); | ||
foreach (DeclaredVariable dv in parsedDoc.SyntaxTree.RootNode.TryGetDeclaredVariables(BindExpressionValues).declaredVariables.Values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above approach means that the user could see unexpected delays when executing requests in a large document (since we would need to bind all variables including some that are completely unrelated to the user's request).
An alternate approach may be to support a workflow where the code in the SubmitCode
command can either be the entire document or a single request. We could then include a parameter to indicate which case it was - for example the parameter could be called Execute
. If the entire document is being passed, we could set Execute
to false
and the kernel could parse and bind all the file level variables without executing any request. If Execute
is set to true
(or omitted as will be the case in notebooks and dotnet-repl), we could execute the specified request.
One potential advantage of this approach would be that as the user makes changes to the document, we could asynchronously pass the entire document down to the kernel with Execute
set to false
whenever there is a long enough pause in the user's typing. This would allow the kernel to parse and bind the variables in the background after the user has made the file changes but before the user wants to execute a request. Then when the user clicks on Send request
, we could pass down a SubmitCode
command with just the request that was selected for execution with Execute
set to true
(or with the parameter omitted...). This could reduce any delays caused due to binding in larger documents.
There may also other ways in which something like above could be implemented...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whichever approach we decide to go with should also work in the case of notebooks and dotnet-repl. So I think we would need to choose an approach where the 'default' implementation (i.e. where no additional parameters are specified) naturally supports http cells in notebooks and dotnet-repl...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bleaphar This may be worth a discussion. Lets chat offline once you are back.
I don't think this can be resolved merely by checking whether the file has changed since even minor whitespace changes to one character in the file could end up resulting in a long delay for the subsequent request. And storing and comparing contents of large files may also be problematic from a memory perspective. It may be better to hold on to a checksum / hash of some sort...
In general, and especially in large files, we need ways to
a) amortize some of the cost of parsing and binding so that we don't pay all this cost at the point when user wants to execute a request and
b) avoid redundant binding (and if possible, even avoid redundant parsing) when large portion of the file stays unchanged
For a, we would need to do something like what I mentioned above where we kick off parsing and binding in the background soon after the user stops typing so that we get a head start on parsing and binding before the user executes a request.
For b, it will be hard to avoid redundant parsing (unless we can do something like Roslyn does for C# by incrementally parsing just the lines that changed and fixing up the syntax tree to reflect the changes). But there may be ways to avoid redundant binding (e.g. check if the changed lines contain any variable declarations and only bind new variables that appear in changed code).
Otherwise, users could end up seeing large delays when executing requests and we could end up seeing feedback like the below one even in files that are not as large as the one the user mentions in the below case -
https://developercommunity.visualstudio.com/t/Big-HTTP-file-is-not-interpreted-by-Visu/10717885
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments.
The above approach means that the user could see unexpected delays when executing requests in a large document (since we would need to bind all variables including some that are completely unrelated to the user's request).
We should measure this before making a design decision. I strongly suspect that the difference is not perceptible and that, to the extent that it is, serialization is a larger overhead than redundant parsing. But we should measure it.
We could then include a parameter to indicate which case it was - for example the parameter could be called Execute.
The intent is different so I would suggest this is no longer a SubmitCode
. Does the RequestDiagnostics
command not already do what's needed here?
src/Microsoft.DotNet.Interactive.Http/Model/HttpNamedRequest.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.DotNet.Interactive.Http/Model/HttpNamedRequest.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.DotNet.Interactive.Http/Model/HttpNamedRequest.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.DotNet.Interactive.Http/Model/HttpNamedRequest.cs
Outdated
Show resolved
Hide resolved
return Task.CompletedTask; | ||
} | ||
|
||
async Task IKernelCommandHandler<SubmitCode>.HandleAsync(SubmitCode command, KernelInvocationContext context) | ||
{ | ||
string? doc = ""; | ||
if (command.Parameters.TryGetValue("Document", out doc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we planning to make this would work in the case of notebooks in VS Code or the dotnet-repl tool? Would we need changes for those clients to repeat the notebook cell contents in both 'Document' parameter as well as in SubmitCode.Code
to ensure that any variables included in the code are parsed and bound?
Do we need code in the http kernel above to also parse and bind variables present in the SubmitCode.Code
(in addition to the Document
parameter) to support the notebooks and dotnet-repl scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a property to the kernel that tracks the current document and if there are any changes from it then we'll go ahead with the re-parse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the potential to introduce bugs where the kernel state gets out of sync with the editor state. What's the performance impact of resending and reparsing the whole document?
var parsedDoc = HttpRequestParser.Parse(doc); | ||
foreach (DeclaredVariable dv in parsedDoc.SyntaxTree.RootNode.TryGetDeclaredVariables(BindExpressionValues).declaredVariables.Values) | ||
{ | ||
_variables[dv.Name] = dv.Value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to think about... There is a potential for _variables
to retain bunch of stale variables / named requests over time. Not super high pri - but especially for named requests which can have large payloads, it may be nice to clean up stale entries...
Now that we have a specific code path that re-binds all file-level variables, we may be able to drop all old file-level variables whenever we get a new request to parse and bind the document.
To do this, we would need to separate out the dictionaries for file-level variables, named requests and the remaining (.env) variables. The file-level dictionary can be cleared out whenever we get a new request to bind the document. Any named requests that no longer exist in the document (based on the new syntax tree) could also be cleared out from the dictionary for named requests at this point. But for named requests that still exist in the document, we would need to preserve the old responses.
The dictionary for .env vars would be harder to clean up since we don't have a way to know when .env files change at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would measure this before complicating the design. Separate dictionaries will increase cyclomatic complexity and the potential for bugs.
An alternative might be to store each variable with some metadata including a scope or origin concept, e.g. to differentiate variables whose values came from responses (maybe these are transient) versus those that came from a SendValue
command (which the kernel shouldn't have a say in clearing.)
Consider this. In the typical REPL or notebook workflow, a single cell can be edited and resubmitted. Let's say I run this in the C# kernel:
var x = 123;
Then let's say I delete that code and run this:
Console.WriteLine(x);
It would be incorrect for the variable x
to be cleared, and probably very surprising if the HTTP kernel had a different behavior.
It's very typical in REPL and notebook workflows for memory to grow over time, which is why restarting the kernel is a familiar gesture. But if we find there's a performance issue here that's significantly different from other use cases, I'd prefer to address it with an explicit user gesture.
@@ -2233,6 +2233,51 @@ public async Task responses_to_named_requests_can_be_accessed_as_symbols_in_late | |||
secondResult.Events.Should().NotContainErrors(); | |||
} | |||
|
|||
[Theory] | |||
[InlineData("json.response.body.$.slideshow.slides.title")] | |||
[InlineData("json.response.body.$.slideshow.slides.type")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we relying on httpbin.org to return a json that looks like this? I think it would be better to use a mock client with our own handler where we can provide our own json response (like we do in some other tests).
Otherwise, we could end up having to deal with failed builds in CI if httpbin.org happens to be down or if the format for the response ends up changing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be nice to see what the response shape looks like in the test so that it is easier to understand what the test is validating...
Looking forward to seeing this feature live! |
node.AddDiagnostic(diagnostic); | ||
wordParsedOnce = false; | ||
} | ||
if (CurrentToken is not null && (!(CurrentToken is { Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (CurrentToken is not null && (!(CurrentToken is { Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce)) | |
if (CurrentToken is not null && (CurrentToken is not ({ Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a CurrentToken is { ... }
pattern won't match null
, so the initial null check shouldn't be necessary.
); | ||
|
||
var variables = result.SyntaxTree.RootNode.TryGetDeclaredVariables().declaredVariables; | ||
variables.Should().Contain(n => n.Key == "host").Which.Value.Should().BeOfType<DeclaredVariable>().Which.Value.Should().Be("\"https://httpbin.org\""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a little surprising to me that the quotes are part of the variable value.
Maybe related, are spaces in variable values supported like this then?
@blah = one two three
} | ||
|
||
[Fact] | ||
public async Task variables_submited_in_requests_are_not_valid_across_clients() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a particularly useful test.
@@ -111,6 +111,158 @@ public async Task can_handle_multiple_request_in_a_single_submission() | |||
requests.Select(r => r.RequestUri.AbsoluteUri).ToArray().Should().BeEquivalentTo(new[] { "https://location1.com:1200/endpoint", "https://location2.com:1200/endpoint" }); | |||
} | |||
|
|||
[Fact] | |||
public async Task variables_in_submit_request_are_saved() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplest way to verify the variable is to use the RequestValue
command rather than via a second SubmitCode
. This approach should make the test simpler and easier to understand.
This PR introduces variable support for the kernel and changes the flow of Json grandchildren parsing
Highlights:
PR: #3632