Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Web version of the DataFrame #10604

Closed
wants to merge 22 commits into from
Closed

Conversation

sadasant
Copy link
Contributor

@sadasant sadasant commented Jun 28, 2022

The only thing missing on the web to enable the DataFrame features was to have a web-ready version of the DataViewerDependencyService.

After discussing ideas with the team, our immediate goals are:

  • To ensure the web variable viewer can use the data viewer.
  • In the case the data viewer is accessed through the variable viewer, it should confirm that it has pandas and install it if it does not have it.
    • This is possible because we have a Kernel at hand thanks to the variable viewer.
  • In the case theta viewer is accessed through the debuggers:
    • It should work on Node.js as is today.
    • On the web, it should check if it has pandas installed. If so, it should work. If it doesn't have pandas, it should throw an error, asking users to install pandas.

To accomplish the above, this PR:

  • Changes the Node.js version of the DataViewerDependencyService to accept an IKernel besides the interpreter that it accepts today, and then retrieve the interpreter from the kernel to keep the current behavior intact.
  • Add a web version of the DataViewerDependencyService that would accept an IKernel and would also load the IJupyterVarialbes on the constructor, to either use the kernel or the debugger to at least get the version of pandas.
  • On the web, if no kernel is provided and if IJupyterVarialbes did load, the debugger is able to read the version of pandas if it exists, and the data viewer works as is then. If pandas does not exist, it will throw an error asking the user to install pandas.
  • When an IKernel is provided, it would use either %conda or %pip based on the IKernel's KernelConnectionMetadata. %conda if it had an interpreter (local case) that was conda based. Otherwise %pip.

Besides the above, which are the main changes of this PR, this PR also:

  • Moves the parseSemVer function from platform/common/utils.node to platform/common/utils.
  • Adds IDataViewerDependencyService to the extension-side web service registry.
  • removes the isWebconditional from thevariableExplorerButtonCellFormatter.tsx`.

Thus:
Fixes #9665
Fixes #10310

Here are some issues that are not fixed with this PR:

Finally, a comment about the current state of the debugger on the web:

  • The Python debugger (from the vscode-python extension) does not currently work on the openInBrowser approach to test our web features.

Feedback always appreciated!


  • Pull request represents a single change (i.e. not fixing disparate/unrelated things in a single PR).
  • Title summarizes what is changing.
  • Has a news entry file (remember to thank yourself!).
  • Appropriate comments and documentation strings in the code.
  • Has sufficient logging.
  • Has telemetry for feature-requests.
  • Unit tests & system/integration tests are added/updated.
  • Test plan is updated as appropriate.
  • package-lock.json has been regenerated by running npm install (if dependencies have changed).

@@ -311,7 +310,7 @@ export class DebuggerVariables
return results;
} else {
traceError(`Cannot evaluate ${code}`);
return undefined;
throw new Error(`Cannot evaluate ${code}`);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have code that cannot be evaluated by these methods in the debugger?

@@ -297,8 +297,7 @@ export class DebuggerVariables
this.importedGetVariableInfoScriptsIntoKernel.delete(key);
}

// eslint-disable-next-line @typescript-eslint/no-explicit-any
private async evaluate(code: string, frameId?: number): Promise<any> {
async evaluate(code: string, frameId?: number): Promise<DebugProtocol.EvaluateResponse['body']> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm exposing this function so I can use it in the debugger path within the data viewer dependency service.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of this

@@ -0,0 +1,121 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same file that was in src/test/datascience/data-viewing/dataViewerDependencyServic.node.test.ts. Only that checkAndInstallMissingDependencies now receives an object.

false
);
kernel = instance(mock<IKernel>());
dependencyService = new DataViewerDependencyService(instance(appShell), false, undefined);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the kernel-specific tests. This code works in Node and in the browsers.

};
appShell = mock(ApplicationShell);
variableProvider = instance(mock<IJupyterVariables>());
dependencyService = new DataViewerDependencyService(instance(appShell), false, variableProvider);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests use the debugger.

pythonEnv &&
(await this.dataViewerDependencyService.checkAndInstallMissingDependencies({
interpreter: pythonEnv
}));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interpreter is only being used in Node.js, in the web the interpreter is ignored and the debugger is used (in this case, because here we don't have a kernel attached).

_VSCODE_subprocess.check_call([_VSCODE_sys.executable, "-m", "pip", "install", "pandas"])`,
'_VSCODE_install_pandas()',
'del _VSCODE_subprocess, _VSCODE_sys, _VSCODE_install_pandas'
];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking on leaving this here for future reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonJayamanne you can see the approach I took here. Here I used sys.executable. This code above does work — except that it crashes after a while and makes the debugger unusable.

@codecov-commenter
Copy link

codecov-commenter commented Jun 28, 2022

Codecov Report

Merging #10604 (4de0be9) into main (440397a) will decrease coverage by 0%.
The diff coverage is 76%.

❗ Current head 4de0be9 differs from pull request most recent head 6648dc2. Consider uploading reports for the commit 6648dc2 to get more accurate results

@@           Coverage Diff           @@
##            main   #10604    +/-   ##
=======================================
- Coverage     71%      71%    -1%     
=======================================
  Files        472      474     +2     
  Lines      27993    28144   +151     
  Branches    4691     4720    +29     
=======================================
+ Hits       19946    20052   +106     
- Misses      6179     6205    +26     
- Partials    1868     1887    +19     
Impacted Files Coverage Δ
src/kernels/variables/jupyterVariables.ts 81% <0%> (-6%) ⬇️
src/kernels/variables/kernelVariables.ts 56% <0%> (-1%) ⬇️
src/kernels/variables/types.ts 100% <ø> (ø)
src/platform/common/utils.node.ts 53% <ø> (-5%) ⬇️
src/telemetry.ts 100% <ø> (ø)
src/webviews/extension-side/dataviewer/types.ts 100% <ø> (ø)
src/kernels/variables/debuggerVariables.ts 69% <50%> (ø)
...ide/dataviewer/dataViewerDependencyService.node.ts 82% <62%> (-4%) ⬇️
src/platform/common/utils/localize.ts 76% <64%> (+<1%) ⬆️
src/platform/common/utils.ts 80% <75%> (-1%) ⬇️
... and 27 more

@sadasant sadasant marked this pull request as ready for review June 28, 2022 03:18
@sadasant sadasant requested a review from a team as a code owner June 28, 2022 03:18
@@ -56,6 +57,7 @@ export interface IJupyterVariables {
): Promise<IJupyterVariable | undefined>;
// This is currently only defined in kernelVariables.ts
getVariableProperties?(name: string, kernel?: IKernel, cancelToken?: CancellationToken): Promise<JSONObject>;
evaluate(code: string, frameId?: number): Promise<DebugProtocol.EvaluateResponse['body']>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think IJuptyerVariables1 needs such a generic method.
Basically we're exposing the ability to execute random code in an interface used specifically to get variables.
Also frameId is specific to debugger, hence this doesn't belong here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could make a more generic service that you can execute a piece of code on. The 'executeSilently' method we use elsewhere seems like it would be a good candidate. It could keep track of the topmost frame id and use the debugger if the kernel was 'frozen' at a breakpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sounds good, however would be good to have one that takes DebugSession as an argument instead.
However I believe this would have to be a class, something that can listen to debugger events and track the frames.
Either way, we can discuss this at standup or next week

@@ -406,4 +407,8 @@ export class KernelVariables implements IJupyterVariables {
}
return this.enhancedTooltipsExperimentPromise;
}

evaluate(): Promise<DebugProtocol.EvaluateResponse['body']> {
throw new Error('Not implemented.');
Copy link
Contributor

@DonJayamanne DonJayamanne Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this isn't implemented here means that something is not quite right.
E.g. its possible for someone to end up calling this in the future and things will not work as expected.
I.e. there's nothing preventing anyone from using this API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this evaluate approach is approved (considerable if) I promise to create an issue to implement this in the next cycle.

Or I can implement it as part of this PR during the next cycle, given that it seems unlikely to get this PR merged soon.

The takeaway is that I do want to fix this asap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can implement it as part of this PR during the next cycle, given that it seems unlikely to get this PR merged soon.

I prefer this option.

export const IDataViewerDependencyService = Symbol('IDataViewerDependencyService');
export interface IDataViewerDependencyService {
checkAndInstallMissingDependencies(interpreter: PythonEnvironment): Promise<void>;
checkAndInstallMissingDependencies(options: IDataViewerDependencyServiceOptions): Promise<void>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should change to IKernel, we don't need an overload to pass an interprereter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that when we are in the debugger, we don’t have a kernel.

Copy link
Contributor

@DonJayamanne DonJayamanne Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd suggest passing either Ikernel or DebugSession so that its more obvious what the choices are.
From DebugSession the method can then determine the sys.executable & then get the interpreter & run the same old code as in the past.

This way the caller knows about the choises,

  • I.e. pass a kernel when we have one & when we don't then pass the debug session (or vice versa).
  • Else passing an interpreter means the caller can pass an interpreter in all scenarios (even in remote debugging - which isn't correct)

commands: string[],
options: IDataViewerDependencyServiceOptions
): Promise<(string | undefined)[]> {
const kernel = (options as IDataViewerDependencyServiceOptionsWithKernel).kernel;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right.
Dependency service shouldn't take a dependency on a variable provider.
At this point, all we need to do is:

  • Check if we have an active debug session
  • If we have a debug session, then use the debug sessions evaluate method to run some aribtrary code & get the desired output from there.

Today we check in other places whther we have an active Debug session using IDebugService.activeDebugSession and if that is not undefined then we know we have a debug session.
Next you can use IDebugService.activeDebugSesion.customRequest method to send a custom request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • To use activeSession.customRequest we need a frame Id, and the frame Id of the variable in the Python debugger used to show the data viewer does not work. It runs the code and shows to succeed even in the logs, but it does not yield any value, therefore that frameId is not useful, and no frameId doesn’t work either.
  • The debugger cannot currently be used to install pandas, the installation doesn’t finish and the debugger becomes unusable afterwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comments here #10605 (comment)

Again, I don't think we should expose evaluate method onn a vairbla provider class for anyone to use.

Copy link
Contributor

@DonJayamanne DonJayamanne Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debugger cannot currently be used to install pandas, the installation doesn’t finish and the debugger becomes unusable afterwards.

I don't think we should use the debugger to install packages.
We can get the path to the executable using import sys;sys.executable and use the old code to install using the interpreter information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctiveSession.customRequest we need a frame Id, and the frame Id of the variable

Not following your comments here.
Are you saying that frameId is useless or it is required.
Looking at the code you have exposed frameId as an optional argument in some new methods. & if its not used, then I dont see the need to expose such an argument to methods.

ut it does not yield any value, therefore that frameId is not useful, and no frameId doesn’t work either.

Have you tried a frame id of 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did use sys.executable here and it (later, in the middle of the installation) causes a major issue in the web. I believe that you mean only to extract the interpreter path and use the existing interpreter code then, if that’s correct I need to investigate how to make that code work on the web.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(later, in the middle of the installation) causes a major issue in the web. I believe that you mean only to extract the interpreter path and use the existing interpreter code then, if that’s correct I need to investigate how to make that code work on the web.

If we pass the IKernel, then we don't need the sys.executable in web. In web we know the kernel when debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In web we know the kernel when debugging.

That is not the case if a file is debugged before the user connects to the kernel.

(Though Python debugging doesn't quite work in the web)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python debugging doesn't quite work in the web)

Exactly, hence the only time we're debugging is when we launch the debugger for IW or notebooks & we know exactly what the IKernel is in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, let me summarize:

  • For the web, IKernel is enough.
  • For Node, interpreter is enough.

The edge cases are related to debugging where:

  • In IW and Notebooks, I can use IKernel.
  • In Python, it does not work, and if it ever gets implemented we can check for the python version with the debugger and ask users to install it themselves.

If that's correct, then the main issue with this PR I believe is that I'm using execute from the IJupyterVariables, which I need to do since I haven't figured out a way to get a response from the variable's frameId nor from frameId 0. If we solve the frameId issue, I can avoid using IJupyterVariables.


For the python debugger in the web case, we can either keep the two possible parameters to the dependency service and use the debugger to check the pandas version like how we're doing in this PR. Otherwise, we can discuss that design after we agree on the points above, before the break in this comment.

Copy link
Contributor

@DonJayamanne DonJayamanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable viewer should not expose any method to execute arbitarry code, that breaks the API.
We should only have one way to execute code:

  • Debugger session using the customRequest method
  • & methods on IKernel

Having other classes call into a variable class to evaluate code when it doesn't need anything from variables seems in correct.

I've provided a high level solution on how we could achieve this by passig in the IKernel instead of any other object.

@IanMatthewHuff
Copy link
Member

@DonJayamanne and @sadasant I don't believe there is a rush here for this particular release. At this point we are already in endgame. I don't believe we should be taking this big a change at this point regardless of the review state. It's the plus of the shorter release cycle it should just go on the next train.

I'd agree with Don that I don't want the evaluate access on the variables, that doesn't seem like the right spot for it to me. I'd expect that to be via a more general mechanism of some type, not just arbitrary code execution. That adds a really broad access point into a scoped interface. Regarding the IKernel access my understanding is that when debugging is launched via normal .py debugging (for the open in data viewer scenario) we won't have access to the IKernel so that scenario would need debugger install for the web.

I worry that we're yanking this PR around too much which isn't good for Daniel's time. This is just a proposal from my side, but what about we start with maybe a meeting on the broader architecture here. I feel like it might be more time efficient at a general interface level before the implementation get's fully hashed out.

@sadasant sadasant closed this Jun 28, 2022
@sadasant sadasant deleted the web/9665-6 branch July 20, 2022 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants