Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for saving named requests in variables and improved json parsing #3642

Merged
merged 6 commits into from
Sep 11, 2024

Conversation

bleaphar
Copy link
Contributor

@bleaphar bleaphar commented Aug 16, 2024

This PR introduces variable support for the kernel and changes the flow of Json grandchildren parsing

Highlights:

  • HandleAsync looks if a submit code has the entire associated document in which case it will use that in creating declared variables which enables the support for saving the named request values in variables
  • Changed the chaining of the Json parsing that at the last node to evaluate one of the derived types and then convert to a string
  • Miscellaneous reformatting in the request parser

PR: #3632

if (command.Parameters.TryGetValue("Document", out doc))
{
var parsedDoc = HttpRequestParser.Parse(doc);
foreach (DeclaredVariable dv in parsedDoc.SyntaxTree.RootNode.TryGetDeclaredVariables(BindExpressionValues).declaredVariables.Values)
Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above approach means that the user could see unexpected delays when executing requests in a large document (since we would need to bind all variables including some that are completely unrelated to the user's request).

An alternate approach may be to support a workflow where the code in the SubmitCode command can either be the entire document or a single request. We could then include a parameter to indicate which case it was - for example the parameter could be called Execute. If the entire document is being passed, we could set Execute to false and the kernel could parse and bind all the file level variables without executing any request. If Execute is set to true (or omitted as will be the case in notebooks and dotnet-repl), we could execute the specified request.

One potential advantage of this approach would be that as the user makes changes to the document, we could asynchronously pass the entire document down to the kernel with Execute set to false whenever there is a long enough pause in the user's typing. This would allow the kernel to parse and bind the variables in the background after the user has made the file changes but before the user wants to execute a request. Then when the user clicks on Send request, we could pass down a SubmitCode command with just the request that was selected for execution with Execute set to true (or with the parameter omitted...). This could reduce any delays caused due to binding in larger documents.

There may also other ways in which something like above could be implemented...

Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whichever approach we decide to go with should also work in the case of notebooks and dotnet-repl. So I think we would need to choose an approach where the 'default' implementation (i.e. where no additional parameters are specified) naturally supports http cells in notebooks and dotnet-repl...

Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleaphar This may be worth a discussion. Lets chat offline once you are back.

I don't think this can be resolved merely by checking whether the file has changed since even minor whitespace changes to one character in the file could end up resulting in a long delay for the subsequent request. And storing and comparing contents of large files may also be problematic from a memory perspective. It may be better to hold on to a checksum / hash of some sort...

In general, and especially in large files, we need ways to
a) amortize some of the cost of parsing and binding so that we don't pay all this cost at the point when user wants to execute a request and
b) avoid redundant binding (and if possible, even avoid redundant parsing) when large portion of the file stays unchanged

For a, we would need to do something like what I mentioned above where we kick off parsing and binding in the background soon after the user stops typing so that we get a head start on parsing and binding before the user executes a request.

For b, it will be hard to avoid redundant parsing (unless we can do something like Roslyn does for C# by incrementally parsing just the lines that changed and fixing up the syntax tree to reflect the changes). But there may be ways to avoid redundant binding (e.g. check if the changed lines contain any variable declarations and only bind new variables that appear in changed code).

Otherwise, users could end up seeing large delays when executing requests and we could end up seeing feedback like the below one even in files that are not as large as the one the user mentions in the below case -

https://developercommunity.visualstudio.com/t/Big-HTTP-file-is-not-interpreted-by-Visu/10717885

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments.

The above approach means that the user could see unexpected delays when executing requests in a large document (since we would need to bind all variables including some that are completely unrelated to the user's request).

We should measure this before making a design decision. I strongly suspect that the difference is not perceptible and that, to the extent that it is, serialization is a larger overhead than redundant parsing. But we should measure it.

We could then include a parameter to indicate which case it was - for example the parameter could be called Execute.

The intent is different so I would suggest this is no longer a SubmitCode. Does the RequestDiagnostics command not already do what's needed here?

return Task.CompletedTask;
}

async Task IKernelCommandHandler<SubmitCode>.HandleAsync(SubmitCode command, KernelInvocationContext context)
{
string? doc = "";
if (command.Parameters.TryGetValue("Document", out doc))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we planning to make this would work in the case of notebooks in VS Code or the dotnet-repl tool? Would we need changes for those clients to repeat the notebook cell contents in both 'Document' parameter as well as in SubmitCode.Code to ensure that any variables included in the code are parsed and bound?

Do we need code in the http kernel above to also parse and bind variables present in the SubmitCode.Code (in addition to the Document parameter) to support the notebooks and dotnet-repl scenarios?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a property to the kernel that tracks the current document and if there are any changes from it then we'll go ahead with the re-parse

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the potential to introduce bugs where the kernel state gets out of sync with the editor state. What's the performance impact of resending and reparsing the whole document?

var parsedDoc = HttpRequestParser.Parse(doc);
foreach (DeclaredVariable dv in parsedDoc.SyntaxTree.RootNode.TryGetDeclaredVariables(BindExpressionValues).declaredVariables.Values)
{
_variables[dv.Name] = dv.Value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to think about... There is a potential for _variables to retain bunch of stale variables / named requests over time. Not super high pri - but especially for named requests which can have large payloads, it may be nice to clean up stale entries...

Now that we have a specific code path that re-binds all file-level variables, we may be able to drop all old file-level variables whenever we get a new request to parse and bind the document.

To do this, we would need to separate out the dictionaries for file-level variables, named requests and the remaining (.env) variables. The file-level dictionary can be cleared out whenever we get a new request to bind the document. Any named requests that no longer exist in the document (based on the new syntax tree) could also be cleared out from the dictionary for named requests at this point. But for named requests that still exist in the document, we would need to preserve the old responses.

The dictionary for .env vars would be harder to clean up since we don't have a way to know when .env files change at the moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would measure this before complicating the design. Separate dictionaries will increase cyclomatic complexity and the potential for bugs.

An alternative might be to store each variable with some metadata including a scope or origin concept, e.g. to differentiate variables whose values came from responses (maybe these are transient) versus those that came from a SendValue command (which the kernel shouldn't have a say in clearing.)

Consider this. In the typical REPL or notebook workflow, a single cell can be edited and resubmitted. Let's say I run this in the C# kernel:

var x = 123;

Then let's say I delete that code and run this:

Console.WriteLine(x);

It would be incorrect for the variable x to be cleared, and probably very surprising if the HTTP kernel had a different behavior.

It's very typical in REPL and notebook workflows for memory to grow over time, which is why restarting the kernel is a familiar gesture. But if we find there's a performance issue here that's significantly different from other use cases, I'd prefer to address it with an explicit user gesture.

@@ -2233,6 +2233,51 @@ public async Task responses_to_named_requests_can_be_accessed_as_symbols_in_late
secondResult.Events.Should().NotContainErrors();
}

[Theory]
[InlineData("json.response.body.$.slideshow.slides.title")]
[InlineData("json.response.body.$.slideshow.slides.type")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we relying on httpbin.org to return a json that looks like this? I think it would be better to use a mock client with our own handler where we can provide our own json response (like we do in some other tests).

Otherwise, we could end up having to deal with failed builds in CI if httpbin.org happens to be down or if the format for the response ends up changing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be nice to see what the response shape looks like in the test so that it is easier to understand what the test is validating...

@a-t-k
Copy link

a-t-k commented Sep 2, 2024

Looking forward to seeing this feature live!

node.AddDiagnostic(diagnostic);
wordParsedOnce = false;
}
if (CurrentToken is not null && (!(CurrentToken is { Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (CurrentToken is not null && (!(CurrentToken is { Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce))
if (CurrentToken is not null && (CurrentToken is not ({ Kind: TokenKind.Word or TokenKind.Whitespace } or { Text: "_" or "@" or "." }) || CurrentToken is { Kind: TokenKind.Word } && wordParsedOnce))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a CurrentToken is { ... } pattern won't match null, so the initial null check shouldn't be necessary.

);

var variables = result.SyntaxTree.RootNode.TryGetDeclaredVariables().declaredVariables;
variables.Should().Contain(n => n.Key == "host").Which.Value.Should().BeOfType<DeclaredVariable>().Which.Value.Should().Be("\"https://httpbin.org\"");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a little surprising to me that the quotes are part of the variable value.

Maybe related, are spaces in variable values supported like this then?

@blah = one two three

}

[Fact]
public async Task variables_submited_in_requests_are_not_valid_across_clients()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a particularly useful test.

@@ -111,6 +111,158 @@ public async Task can_handle_multiple_request_in_a_single_submission()
requests.Select(r => r.RequestUri.AbsoluteUri).ToArray().Should().BeEquivalentTo(new[] { "https://location1.com:1200/endpoint", "https://location2.com:1200/endpoint" });
}

[Fact]
public async Task variables_in_submit_request_are_saved()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest way to verify the variable is to use the RequestValue command rather than via a second SubmitCode. This approach should make the test simpler and easier to understand.

@jonsequitur jonsequitur merged commit cf9998d into dotnet:main Sep 11, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants