Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notebook related rule updates #8669

Closed
4 tasks done
dhruvmanila opened this issue Nov 14, 2023 · 6 comments · Fixed by #8872
Closed
4 tasks done

Notebook related rule updates #8669

dhruvmanila opened this issue Nov 14, 2023 · 6 comments · Fixed by #8872
Assignees
Labels
rule Implementing or modifying a lint rule

Comments

@dhruvmanila
Copy link
Member

dhruvmanila commented Nov 14, 2023

This is an open discussion around linting Jupyter Notebooks using certain rules that doesn't map well with the notebook model.

For some context, Notebooks are made up of cells, each cell can be assumed to act as a single Python file but that's not exactly true as anything defined in a cell is available throughout the notebook via the global scope.

There are certain lint rules which doesn't map well with notebooks and this issue is to keep track of them in a central place. We can discuss on possible solutions here.

  • E402: module-import-not-at-top-of-file - Usually imports definition in a notebook is scattered across multiple cells. The current workaround is to ignore this rule but one could argue that this rule should work at cell level i.e., "Module level import not at top of cell".
  • B018: useless-expression, B015: useless-comparison - Expressions in a cell are not useless if they're at the end because the result of that expression is the output of the cell. So, these rules should be updated to avoid raising any violation if it's the last expression in the cell.
  • E703: useless-semicolon - On the other hand, semicolons are used to avoid the output mentioned in the previous point. If there's a trailing semicolon for the last expression in a cell, it's probably best to avoid raising a violation for it. The formatter does a similar thing: Preserve trailing semicolon for Notebooks #8590
  • D100: undocumented-public-module - This is irrelevant for Notebooks.
@alkatar21
Copy link

I think another rule that makes less sense in notebooks is D100. Module docstrings are there most likely in markdown.
And I'm also not shure about T201, because you need this from time to time if something within the cell should also be output and not just the last line?

@dhruvmanila
Copy link
Member Author

dhruvmanila commented Nov 20, 2023

Thanks for chiming in on the issue!

Can you help me understand when does D100 would give a false positive in Notebook? Do you mean that the content of the module docstring will be as a separate markdown cell? If so, then I think yeah we should avoid triggering this rule. Also, I think D100 in general isn't relevant for Notebooks as they can't be imported.

For flake8-print rules, I would actually recommend to disable them for Notebooks as some users might actually not want them to be present.

@alkatar21
Copy link

Do you mean that the content of the module docstring will be as a separate markdown cell?

Yes, I would put everything I write in a module in the module docstring in a notebook in a markdown cell, which is why it makes no sense to have a module docstring.

If so, then I think yeah we should avoid triggering this rule. Also, I think D100 in general isn't relevant for Notebooks as they can't be imported.

That's what I basically wanted to say. That it makes no sense to me in notebooks that the rule triggers.

@dhruvmanila dhruvmanila self-assigned this Nov 21, 2023
dhruvmanila added a commit that referenced this issue Nov 22, 2023
This PR avoids triggering `D100` for Jupyter Notebooks.

Part of #8669
dhruvmanila added a commit that referenced this issue Nov 22, 2023
## Summary

This PR updates `B015` and `B018` to ignore last top-level expressions
in each cell of a Jupyter Notebook.

Part of #8669

## Test Plan

Add test cases for both rules and update the snapshots.
dhruvmanila added a commit that referenced this issue Nov 23, 2023
## Summary

This PR updates the `E703` rule to avoid flagging any semicolons if
they're present after the last expression in a notebook cell. These are
intended to hide the cell output.

Part of #8669 

## Test Plan

Add test notebook and update the snapshots.
@dhruvmanila
Copy link
Member Author

I'm a bit torn on how to go for E402. There are a few solutions for this:

  1. We recommend disabling them for notebooks
  2. By default we disable them for notebooks
  3. For notebooks, we'd check the rule for each cell so "Module level import not at top of cell"

I'm thinking of going with (3) in preview.

dhruvmanila added a commit that referenced this issue Nov 29, 2023
## Summary

This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.

The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.

Another potential solution is to introduce `E403` rule that is
specifically for notebooks that works at cell level while `E402` will be
disabled for notebooks.

## Test Plan

Add a notebook with imports in multiple cells and verify that the rule
works as expected.

resolves: #8669
@redeboer
Copy link

Thanks for maintaining this great project!

Another potential solution is to introduce E403 rule that is specifically for notebooks that works at cell level while E402 will be disabled for notebooks.
#8872 (comment)

In some of our projects, we work with notebooks where we prefer to put all imports in one cell at the top of the notebook. If I understand correctly, there is no rule anymore that can enforce this. Might it be an idea to define a new rule E403 specifically for notebooks, like mentioned by @dhruvmanila, but that works on the whole notebook?

If this idea is worth pursuing, I can post it as a separate issue.

@dhruvmanila
Copy link
Member Author

@redeboer Yeah, I think we can discuss this in a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants