-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Web version of the DataFrame viewer #10825
Conversation
src/webviews/extension-side/dataviewer/dataViewerDependencyService.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/dataViewerDependencyService.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/dataViewerDependencyService.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/dataViewerDependencyService.ts
Outdated
Show resolved
Hide resolved
Codecov Report
@@ Coverage Diff @@
## main #10825 +/- ##
=======================================
Coverage 63% 63%
=======================================
Files 482 486 +4
Lines 33652 33736 +84
Branches 5488 5499 +11
=======================================
+ Hits 21295 21413 +118
+ Misses 10304 10274 -30
+ Partials 2053 2049 -4
|
src/webviews/extension-side/dataviewer/dataViewerDependencyService.node.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/dataViewerDependencyService.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/kernelDataViewerDependencyImplementation.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/kernelDataViewerDependencyImplementation.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/kernelDataViewerDependencyImplementation.ts
Outdated
Show resolved
Hide resolved
src/webviews/extension-side/dataviewer/kernelDataViewerDependencyImplementation.ts
Outdated
Show resolved
Hide resolved
constructor(private readonly applicationShell: IApplicationShell, private isCodeSpace: boolean) {} | ||
|
||
protected async execute(command: string, kernel: IKernel): Promise<(string | undefined)[]> { | ||
const outputs = await executeSilently(kernel.session!, command); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption that the kernel has a session (with the !) seems like it might be sneaking around the type system a bit. I don't see, at least at this point, an explicit reason the session would be undefined. Per the usage of this function it look like it should maybe just take a Session as a non optional parameter? I don't believe anything else on the Kernel is used, and the caller should guarantee that there is a session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see the check at the start of the dependency check later in the code. Seems like after that you could just use the session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called only by a function that checks for the session, but I like your idea! Passing the session directly, rather than the kernel. Thank you 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this? c3341d7
const pandasVersion = await this.getVersion(kernel); | ||
|
||
if (pandasVersion) { | ||
if (pandasVersion.compare(pandasMinimumVersionSupported) > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to have another telemetry event here? It's the type of thing that we can probably calculate using other events, but an explict Pandas === Ok event might help calculating how many Pandas failures we see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this? c0f10c7
export const kernelGetPandasVersion = | ||
'import pandas as _VSCODE_pandas;print(_VSCODE_pandas.__version__);del _VSCODE_pandas'; | ||
|
||
function kernelPackaging(kernel: IKernel): '%conda' | '%pip' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DonJayamanne might know best here. But this will use %pip for envs like poetry. In those cases should we just not do the install and ask them to install manually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes agreed,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does that mean? how do I catch environments like poetry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's unlikely this will ever pick 'conda' either. When we have an IKernel, we're in a remote situation. Otherwise we'd be using the interpreter.
I think we can leave this as is for now, and then figure out some other way to detect the presence of the package manager later.
sendTelemetryEvent(Telemetry.PandasTooOld); | ||
// Warn user that we cannot start because pandas is too old. | ||
const versionStr = `${pandasVersion.major}.${pandasVersion.minor}.${pandasVersion.build}`; | ||
throw new Error(DataScience.pandasTooOldForViewingFormat().format(versionStr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the prompt be displayed at this point in time?
I.e. if the version is too old, shouldn't we display the prmonpt and ask to install.
sendTelemetryEvent(Telemetry.UserInstalledPandas); | ||
} | ||
} else { | ||
sendTelemetryEvent(Telemetry.UserDidNotInstallPandas); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines are still duplicated, and will be duplicated in 3 places again.
Here's my suggestion
protected async installMissingDependencies(token) {
const pandasVersions = await this.getPandasVersion(token);
if (pandasVersion && ) {
if (pandasVersion.compare(pandasMinimumVersionSupportedByVariableViewer) > 0) {
return;
}
sendTelemetryEvent(Telemetry.PandasTooOld);
} else {
sendTelemetryEvent(Telemetry.PandasNotInstalled);
await this.installMissingDependencies(tokenSource);
}
protected abstract getPandasVersion(token);
protected installMissingDependencies(token) {
sendTelemetryEvent(Telemetry.PythonModuleInstall, undefined, {
action: 'displayed',
moduleName: ProductNames.get(Product.pandas)!,
pythonEnvType: interpreter?.envType
});
const promptResult = ....
sendTelemetry(...)
if (doNotInstall) {
sendTelemetryEvent(....)
} else {
sendTelemetryEvent(....)
}
}
this way most of the common code is in the base class, and the child classes will inherit the base class and only implement the code to provide the versions and fill inn the blanks for installing.
& all of the code for sending telemetry if prompt is cancelled or clicked ok, or the like is in one place.
Right now its still copied around and still duplicated & I believe it can still be improved to avoid the duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These kind of abstractions seem like early optimizations to me, given that they give the impression that the installer can become fully generic, when it isn't ready for doing so. For example:
- Why tie this to Pandas specifically?
- How do we pass the kernel or interpreter around? Would we use an abstract type on the abstract class to determine the type of "evaluator" that will be passed along?
- If not passed around, should the evaluator (kernel / interpreter) be part of the state of the class? If so, should we have a form of setup function that ensures that we always have that evaluator? because we can't use the constructor for this at the moment.
- Alternatively, should we move the instantiaton of the
*DataViewerDependencyImplementation
to thecheckAndInstallMissingDependencies
rather than keeping it in the constructor of theDataViewerDependencyService
? That would come at the compromise of having to instantiate a class on everycheckAndInstallMissingDependencies
call.
- Alternatively, should we move the instantiaton of the
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Don is just suggesting that shared code should be in a base class. Simply to eliminate duplicate code. The design should be relatively close to the same I'd think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides that, the behavior is not exactly the same with the kernel as with the interpreter, because the interpreter version does have some of the installing abstracted out into the IInstaller
.
I'd like to reach to a level of compromise for this PR, then get the Plot Viewer working on the web, then come back to the debugger installer, then move all to the IInstaller
as the last step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, what do you think of this?
export abstract class BaseDataViewerDependencyImplementation<TExecuter> implements IDataViewerDependencyService {
protected abstract _getVersion(executer: TExecuter): Promise<string | undefined>;
protected async getVersion(executer: TExecuter): Promise<SemVer | undefined> {
try {
const version = await this._getVersion(executer);
return typeof version === 'string' ? parseSemVer(version) : version;
} catch (e) {
traceWarning(DataScience.failedToGetVersionOfPandas(), e.message);
return;
}
}
Then
export class KernelDataViewerDependencyImplementation extends BaseDataViewerDependencyImplementation<IKernel> {
protected async _getVersion(kernel: IKernelWithSession): Promise<string | undefined> {
const outputs = await this.execute(kernelGetPandasVersion, kernel);
return outputs.map((text) => (text ? text.toString() : undefined)).find((item) => item);
}
I'm thinking I'd rather focus on telemetry wrappers than on splitting the steps into common functions.
Feedback appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above is just an idea. Once I understand what kind of approach/pattern to take, I can extrapolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps more like this?
protected abstract _getVersion(executer: TExecuter): Promise<string | undefined>;
@traceDecoratorWarn(DataScience.failedToGetVersionOfPandas())
protected async getVersion(executer: TExecuter): Promise<SemVer | undefined> {
const version = await this._getVersion(executer);
return typeof version === 'string' ? parseSemVer(version) : version;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this look? sadasant#1
cc: @DonJayamanne , @IanMatthewHuff , @rchiodo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I merged that PR!
throw new Error(DataScience.failedToInstallPandas()); | ||
} | ||
} else { | ||
sendTelemetryEvent(Telemetry.UserDidNotInstallPandas); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sending the telemetry after installing and when failing, or when not required to install is still duplicated and feel this can be moved into the base class.
} | ||
sendTelemetryEvent(Telemetry.PandasTooOld); | ||
// Warn user that we cannot start because pandas is too old. | ||
const versionStr = `${pandasVersion.major}.${pandasVersion.minor}.${pandasVersion.build}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code can also be moved into the base class, all the implementations need to do is get the version and return it, and the base class can check if a version was available or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think a lof of the code can be moved into a base class, even if debugger implementation comes in later.
import { BaseDataViewerDependencyImplementation } from './baseDataViewerDependencyImplementation'; | ||
|
||
export const kernelGetPandasVersion = | ||
'import pandas as _VSCODE_pandas;print(_VSCODE_pandas.__version__);del _VSCODE_pandas'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems weird to have this one one line? Why not have line feeds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the debugger I'll definitely need to work in line feeds, at least the last time I tried it. So, we'll get there 👍
Reported a (new?) flaky test: #10860 |
Now I'm getting two different failures, one regarding DataScience Exports, and another one where the |
(The prior take on solving #9665 was PR: #10604)
Last month we tried to get the DataFrame viewer working on the web. In summary, we arrived at a set of bigger goals that are available here: #10638 . The first step actually does solve #9665 , but with an imperfect approach that is nonetheless good enough to get this feature out and get us started towards our longer term goals.
The first step, solved in this PR, goes as follows:
This PR contains those changes, and includes unit tests.
Fixes #9665
Feedback always appreciated!