Kedro Language Server for VSCode to make navigation easier #3691

Closed
noklam opened this issue Mar 8, 2024 · 4 comments

@noklam
Contributor

noklam commented Mar 8, 2024

Introduction

Slack: https://kedro-org.slack.com/archives/C03R8N2M8KT/p1709892536797619

This ticket proposes a VSCode plugin for Kedro. Its goal is to improve the developer experience and lower Kedro's learning curve. It could also be a good driver for 0.19 adoption.

[GIF: lsp]

Background

Related: #2821
Kedro has an opinionated project structure, and users often need to work with multiple files, including:

  • catalog(s).yml
  • parameter(s).yml
  • pipelines.py

The reason is Kedro's distinctive DataCatalog, Node and Pipeline abstractions: datasets and parameters are referenced by plain strings in these files, so traditional IDE features such as navigation and static analysis do not work.

Problem

The problem is fairly obvious: using Kedro requires some muscle memory to know which files to edit, and jumping between multiple files is common. In traditional software development it is natural to rely on the IDE for this.

What's in scope

  • A VSCode plugin MVP
    • Support navigation of Config
    • Support tooltip (See the gif above)
  • Navigation works for spaceflights-pandas (no advanced dynamic features)

What's not in scope

  • Other editors (VSCode only for now; no PyCharm extension yet, and Jupyter support is unclear)
  • environment
  • dataset patterns
  • variable interpolation
  • custom resolver

Measure of success

Measuring usage of this plugin is straightforward and needs no extra telemetry: PyPI package download stats and VSCode extension download stats should be sufficient.

Design

This is not a new idea: a PoC (https://pypi.org/project/kedro-lsp/) was implemented during the 0.17.x series. The idea is to implement a Kedro Language Server and a client (in this case, a VSCode extension).

The plugin may also need its own settings for configuration that is only known at runtime:

  • environment (without knowing which environment the user is using, it is impossible to resolve configuration properly). The extension runs in a separate process, so it cannot know what the user actually ran.

Challenges

Challenge 1 - Resolve Origin of Configuration

Related:

The question "Where is this configuration coming from?" is crucial for this plugin in order to make IDE navigation work seamlessly.

Challenge 2 - Limited support for dynamic pipelines

Most IDE features rely more or less on static information, so pipelines that are generated dynamically at runtime are hard to support.

Alternatives considered

Do nothing, or get rid of the "string" references in Kedro so that everything is a Python object and no extra implementation is required. Earlier research suggests most users prefer YAML over Python for configuration, though.

Testing

TBD

Rollout strategy

This plugin will not be backward compatible and will only work with 0.19+. On the other hand, bugs or broken features shouldn't pose a concern for users, as the most likely failure mode is that the IDE simply doesn't respond.

Future iterations

See "What's not in scope"; handle dynamically generated configuration better.

@noklam noklam changed the title Kedro LSP for VSCode Kedro Language Server for VSCode to make navigation easier Mar 8, 2024
@noklam noklam added this to the Create IDE plugins for major editors milestone Mar 8, 2024 — with Slack
@merelcht merelcht added this to Roadmap Mar 28, 2024
@merelcht merelcht moved this to Near term in Roadmap Mar 28, 2024
@merelcht merelcht moved this to To Do in Kedro Framework Apr 10, 2024
@noklam
Contributor Author

noklam commented Apr 18, 2024

I started working on this this week. So far my top priority is getting it packaged properly so I can easily ask someone to test it.

Challenge 1 - Resolve Origin of Configuration
Some additional things need to be implemented:

  • Environment resolution (there should be a setting in the IDE that lets the user choose an environment other than "base"); not the top priority

Essentially I need a magic function that looks like this:

def find_config(some_string):  # the string under the cursor in the IDE
    """Locate the configuration that `some_string` refers to."""
    # Figure out which environment should be used.
    # Figure out whether it is a `params` or a `dataset`.
    # Figure out its value.
    # Figure out the source of the value: which file and line numbers,
    # e.g. conf/base/parameters.yml:1-3.
    # Is it in base/parameters.yml, local/parameters.yml or
    # base/parameters_data_science.yml?
    # Return a structured response.
    raise NotImplementedError
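
To make the steps above concrete, here is a minimal sketch using only the standard library. It is an illustration under loud assumptions, not how the extension works: `ConfigLocation`, the `conf_root` and `env` arguments, and the regex-based scan are all hypothetical. It only handles top-level keys in `catalog*.yml` / `parameters*.yml` files directly inside each environment folder, and it ignores everything listed as out of scope (dataset patterns, interpolation, resolvers).

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Matches a top-level YAML key such as "companies:" starting at column 0.
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w.\-]*)\s*:")


@dataclass
class ConfigLocation:  # hypothetical "structured response"
    kind: str   # "params" or "dataset"
    key: str    # top-level key that was matched
    path: Path  # file that defines the key
    line: int   # 1-indexed line of the definition


def find_config(name: str, conf_root: Path, env: str = "local") -> Optional[ConfigLocation]:
    """Locate a top-level key, preferring `env` over "base"."""
    if name.startswith("params:"):
        kind, key, prefix = "params", name[len("params:"):], "parameters"
    else:
        kind, key, prefix = "dataset", name, "catalog"
    key = key.split(".")[0]  # nested keys are resolved to their top-level parent

    # dict.fromkeys removes the duplicate when env == "base" and keeps the order.
    for environment in dict.fromkeys((env, "base")):
        for path in sorted((conf_root / environment).glob(f"{prefix}*.yml")):
            lines = path.read_text().splitlines()
            for line_no, line in enumerate(lines, start=1):
                match = _TOP_LEVEL_KEY.match(line)
                if match and match.group(1) == key:
                    return ConfigLocation(kind, key, path, line_no)
    return None
```

A call such as `find_config("params:model_options", Path("conf"))` would return the first definition found in `conf/local`, falling back to `conf/base`. A real implementation would need a proper YAML parser that keeps line information rather than a regex.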

Approach 1 - Navigation to an unresolved version

  • Need to record information such as line numbers and source files, and record the resolution order so the "winner" can be traced properly
  • Handle resolvers properly

This could be tricky because OmegaConf.load seems to be more destructive than plain yaml loading: some of the information is lost. In addition, OmegaConfigLoader loads and merges everything in one big function, so all the intermediate results are thrown away and the LSP has no way to access them.
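
As a rough illustration of what "recording the resolution" could mean, the sketch below merges already-loaded dictionaries while keeping a per-key history of the sources that defined them. `merge_with_provenance` is a hypothetical helper, not part of Kedro or OmegaConf, and it assumes simple top-level override semantics where later sources win.

```python
from typing import Any, Dict, List, Tuple

Provenance = Dict[str, List[str]]


def merge_with_provenance(
    sources: List[Tuple[str, Dict[str, Any]]],
) -> Tuple[Dict[str, Any], Provenance]:
    """Merge top-level keys in order; later sources override earlier ones.

    Returns the merged config plus, for every key, the ordered list of
    sources that defined it. The last entry of each list is the "winner".
    """
    merged: Dict[str, Any] = {}
    history: Provenance = {}
    for source_name, config in sources:
        for key, value in config.items():
            merged[key] = value
            history.setdefault(key, []).append(source_name)
    return merged, history
```

Given `("conf/base/parameters.yml", {...})` followed by `("conf/local/parameters.yml", {...})`, `history[key][-1]` names the file an editor should jump to, while the earlier entries show what was overridden.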

Approach 2 - Navigation to a Resolved version of config

This approach is simpler for the LSP, but requires kedro compile to exist first.

Approach 1 arguably favors the "read" side, while Approach 2 favors the "write" side.

@datajoely
Contributor

datajoely commented Apr 22, 2024

Just had a really great demo from @noklam on this:

  • I'm still supportive of kedro compile but that's a bigger conversation
  • I think 80% of the user value is found in the two user journeys:
    • Command+Click - YAML Catalog to 1 or more Python Pipeline references
    • Command+Click - Python Pipeline reference to YAML catalog entry (1:1)

Focus on that for MVP state and we're in business 💪

@astrojuanlu
Member

This plugin now exists 🎉 https://marketplace.visualstudio.com/items?itemName=kedro.Kedro

We addressed some bugs in the first releases, but the next steps here are:

  • Do a bit more testing
  • Record a video about it
  • Announce it more publicly
  • Continue collecting user feedback

I'm closing this issue as "complete" 🔶 well done @noklam!

@github-project-automation github-project-automation bot moved this from To Do to Done in Kedro Framework May 13, 2024
@astrojuanlu
Member

In the future we'll continue addressing the technical challenges @noklam exposed in #3691 (comment) to make the implementation more robust

@merelcht merelcht moved this from Near term to Current in Roadmap May 30, 2024