Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formatter: Add --skip-string-normalization option (parity with Black)? #7525

Closed
mikeckennedy opened this issue Sep 19, 2023 · 27 comments · Fixed by #8822
Closed

Formatter: Add --skip-string-normalization option (parity with Black)? #7525

mikeckennedy opened this issue Sep 19, 2023 · 27 comments · Fixed by #8822
Labels
cli Related to the command-line interface formatter Related to the formatter wish Not on the current roadmap; maybe in the future

Comments

@mikeckennedy
Copy link

mikeckennedy commented Sep 19, 2023

Hey folks. I see that Black added --skip-string-normalization ( psf/black#118 (comment) ) but ruff format --skip-string-normalization leads to error: unexpected argument '--skip-string-normalization' found. Any chance of adding this feature to ruff format?

Alternatively, preferably, have ruff potentially prefer '' rather than "" for strings as an option. That is, still normalize strings, but to '' if specified.

I saw two other issues include this text, but neither seemed to be related to this actual functionality. Sorry if it's already addressed.

@charliermarsh
Copy link
Member

Hey Michael! We haven't implemented it yet but we're considering it -- there's some discussion here (#7305). The formatter does support parameterizing on preferred quote style (single vs. double) -- it's not exposed on the CLI or configuration yet (just internally), but will be soon. So the main thing we're trying to figure out is whether the ability to select a preferred quote style obviates the need for --skip-string-normalization, or whether there are still important use-cases where --skip-string-normalization is required. Any thoughts? What do you tend to use it for?

@charliermarsh charliermarsh added the formatter Related to the formatter label Sep 19, 2023
@mikeckennedy
Copy link
Author

mikeckennedy commented Sep 19, 2023

Thanks for the quick thoughts @charliermarsh! On the Black conversation/issue there are lots of times that people mentioned 'string' has less visual noise than "string" and I totally agree and prefer '' for that reason.

Another, albeit less common one, that applies to me is RSI issues. I've long had to deal with RSI issues so I'm very careful about how and how much I type (not a huge deal these days but do have to be careful).

When typing "k", you press 5 keys and use two hands. When typing 'k' you press three and only have to use one hand. Minor but when typing code all day, it adds up.

Those are my two reasons.

If you gave us a way to prefer ' over " for normalization, I would not use --skip-string-normalization personally. I do want them normalized but want less visual noise and fewer incentives to type more (striving for parity visually).

@charliermarsh
Copy link
Member

Thanks, that's super helpful. We are planning to expose an option to prefer ' over " (it's already implemented internally, but not yet exposed -- coming in one of the next few releases). We'll probably start there and then see if folks still have a need for --skip-string-normalization.

@mikeckennedy
Copy link
Author

Awesome, I'll keep my eye out for it. :)

@T-256
Copy link
Contributor

T-256 commented Sep 19, 2023

@charliermarsh Should optional features different between Ruff and Black documented? What Ruff has, Black has not, and vice versa.

@coady
Copy link

coady commented Sep 19, 2023

I would still be interested in the neutral --skip-string-normalization. Before black, there was a common (but seemingly forgotten) convention to use ' for short identifiers and " for text. Which was also convenient for text with apostrophes.

@MichaReiser
Copy link
Member

I would still be interested in the neutral --skip-string-normalization. Before black, there was a common (but seemingly forgotten) convention to use ' for short identifiers and " for text.

Can you give an example for a short identifier where you prefer using '?

Which was also convenient for text with apostrophes.

Ruff doesn't change the quote if it would lead to more escapes. Ruff preserves the double quotes for "Ruff doesn't change the quotes of this string" because rewriting it to use single quotes would require escaping ' which reduces readability.

@coady
Copy link

coady commented Sep 21, 2023

Can you give an example for a short identifier where you prefer using '?

if 'id' not in data:
    warnings.warn("Data is missing id field.")

@MichaReiser MichaReiser added this to the Formatter: Beta milestone Sep 25, 2023
@MichaReiser MichaReiser added the needs-decision Awaiting a decision from a maintainer label Sep 25, 2023
@charliermarsh
Copy link
Member

@mikeckennedy - Just trying to gather data: do you use single quotes for docstrings and multi-line strings? E.g.:

def function(x):
    '''Some docstring.'''

Or only for "inline" strings, like function('argument')?

@153957
Copy link
Contributor

153957 commented Sep 25, 2023

Personally I use triple double quotes for docstrings, and single quotes (including triple single quotes) for most other strings except if the string contains single quotes (and no double quotes) then I use double quotes.

But I might be fine with a preference for single quotes everywhere (so also for docstrings).

@MichaReiser MichaReiser removed this from the Formatter: Beta milestone Sep 27, 2023
@MichaReiser MichaReiser added wish Not on the current roadmap; maybe in the future and removed needs-decision Awaiting a decision from a maintainer labels Sep 27, 2023
@charliermarsh
Copy link
Member

Our current plan is not to implement this setting as it exists in Black, and first see how far folks can get with our quote-style configuration (see: #7615). If there's still strong demand, we'll of course revisit this flag (and so we'll leave this issue open).

Another potential outcome here is that we add something like quote-style = "preserve", to avoid changing quotes on strings but still reformat unnecessary escape sequences and similar.

@mikeckennedy
Copy link
Author

mikeckennedy commented Sep 28, 2023

@charliermarsh Exciting to hear, thanks. For me, I think it's always " for multi-line strings for me. I recall some sort of recommendation (at least via PyCharm but where did they get it?) for " to be used for doc strings (and kinda all the """ this style """). But I don't have strong feelings here.

@kbd
Copy link

kbd commented Oct 19, 2023

+1 for quote-style = "preserve". A habit I picked up a long time ago is to use single quotes for identifiers, like dictionary keys, and use double quotes for strings.

@MartinCura
Copy link

MartinCura commented Oct 25, 2023

Chiming in as i test the new formatter (awesome work you're doing, btw!):

I'm testing ruff fmt in an existing codebase that has skip-string-normalization = true and the result is over 150 files changed just because of normalizing strings 😅

If you ask me, i prefer to just normalize strings in all my projects and move on... but on the other hand we have cases like this project i'm working on with a team that doesn't want to bother with this. I am a fan of opinionated tools so i'm not really asking that you implement this, but i can see why it's useful for migrating between tools (black -> ruff fmt) as painlessly as possible. my 2 ¢!

@charliermarsh
Copy link
Member

Thanks @MartinCura -- totally understand, this kind of feedback is really valuable for us as we try to make a decision here :)

@AndydeCleyre
Copy link

AndydeCleyre commented Oct 26, 2023

I usually use " for strings I expect to be printed or seen during the program's use, and ' for all internal strings, making exceptions to avoid escaping. This will keep some of my projects using black for now.

For triple quoted docstrings, whatever makes the docstring linter happy.

For other triple quoted docstrings, I prefer single quotes because it's less noisy and more obvious what I'm looking at, but it's not important to me.


EDIT: Oh yeah, and I use " for f-strings.

@MichaReiser
Copy link
Member

MichaReiser commented Oct 26, 2023

One consideration with supporting the option is (we could look at black) what the expected behavior is when using the preview formatting that automatically joins and splits strings to make them fit. We then face the situation that we need to join multiple strings, where each string might use different quotes. In which case at least some sort of normalization becomes necessary.

@AndydeCleyre
Copy link

FWIW those long strings usually fall under the user-facing-English-content category for me, which I associate with double quotes.

@MartinCura
Copy link

but on the other hand we have cases like this project i'm working on with a team that doesn't want to bother with this

small unnecessary update: normalized strings in my repo when no one was looking 👀 😅

@mmerickel
Copy link
Contributor

mmerickel commented Oct 27, 2023

As someone converting some codebases to ruff, I ran into this immediately as we are using skip-string-normalization with black. I personally do not see the need to normalize what quote-style is used by folks and would like the option for a quote-style = "preserve" to keep us operating as we were with black. Thanks for this amazing tool regardless!

@MichaReiser MichaReiser added this to the Formatter: Stable milestone Oct 30, 2023
@pmhahn
Copy link

pmhahn commented Nov 6, 2023

I inherited some 20y old code, where " and ' are both used. Sadly I'm not the only code owner so I cannot enforce one style. Therefore a quote-style = "preserve" would help until the powers above me withdraw their blockage to just move to just one consistent style 😞

@chockseswaramurthy
Copy link

Adding comment for more data points :) . We also ran into this issue when trying to replace Black with ruff formatter in our 10yr old big’ish Django code base with skip-string-normalization=true configured currently. We use both single and double quotes and have preferred single quotes historically but also allow double quotes for some things. Internal consensus is leaning towards preserving existing quote styles to co-exist. From reading above, I think quote-style = "preserve" should help us adopt ruff formatter with minimal moving changes.

@sciyoshi
Copy link
Contributor

Re-reading the comments in this thread, it doesn't seem like there's been a final decision on what the best approach is. I've opened a PR which adds "preserve" as a quote-style option, which is what we would want in order to adopt ruff format for a codebase which uses skip-string-normalization. We don't particularly care about having consistent quotes across the codebase which is why that option is useful for us. Also, maybe "ignore" would be more descriptive than "preserve"?

sciyoshi added a commit to sciyoshi/ruff that referenced this issue Nov 22, 2023
DimitriPapadopoulos added a commit to DimitriPapadopoulos/skeleton that referenced this issue Nov 23, 2023
I have based this commit on:
- ruff-pre-commit README.md | Using Ruff with pre-commit
  https://github.com/astral-sh/ruff-pre-commit/blob/main/README.md
- The Ruff Formatter | Conflicting lint rules
  https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules

Support for the ruff formatter was added in pytest-ruff by commits from
October 2023, released the same day as versions 0.2 and 0.2.1. Hence, it
makes sense to require pytest-ruff ≥ 0.2.1 now that we use the ruff
formatter.

I cannot find in ruff the equivalent for `skip-string-normalization` in
black. This is currently an open issue in ruff:
  astral-sh/ruff#7305
  astral-sh/ruff#7525
I have settled on single quotes, for now.
DimitriPapadopoulos added a commit to DimitriPapadopoulos/skeleton that referenced this issue Nov 23, 2023
I have based this commit on:
- ruff-pre-commit README.md | Using Ruff with pre-commit
  https://github.com/astral-sh/ruff-pre-commit/blob/main/README.md
- The Ruff Formatter | Conflicting lint rules
  https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules

Support for the ruff formatter was added in pytest-ruff by commits from
October 2023, released the same day as versions 0.2 and 0.2.1. Hence, it
makes sense to require pytest-ruff ≥ 0.2.1 now that we use the ruff
formatter.

I cannot find in ruff the equivalent for `skip-string-normalization` in
black. This is currently an open issue in ruff:
  astral-sh/ruff#7305
  astral-sh/ruff#7525
  astral-sh/ruff#8822
I have settled on single quotes, for now.
DimitriPapadopoulos added a commit to DimitriPapadopoulos/skeleton that referenced this issue Nov 23, 2023
I have based this commit on:
- ruff-pre-commit README.md | Using Ruff with pre-commit
  https://github.com/astral-sh/ruff-pre-commit/blob/main/README.md
- The Ruff Formatter | Conflicting lint rules
  https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules

Support for the ruff formatter was added in pytest-ruff by commits from
October 2023, released the same day as versions 0.2 and 0.2.1. Hence, it
makes sense to require pytest-ruff ≥ 0.2.1 now that we use the ruff
formatter.

I cannot find in ruff the equivalent for `skip-string-normalization` in
black. This is currently an open issue, in progress:
  astral-sh/ruff#7305
  astral-sh/ruff#7525
  astral-sh/ruff#8822
I have settled on single quotes, for now.
@charliermarsh
Copy link
Member

FWIW I support adding a "preserve" option. @MichaReiser is back next week so will wait to get his reaction before moving forward though.

sciyoshi added a commit to sciyoshi/ruff that referenced this issue Nov 27, 2023
@MichaReiser
Copy link
Member

Could people who want the --skip-string-normalization feature please take a look at #8822 to verify if it satisfies their needs and, if not, leave a comment. Thanks

sciyoshi added a commit to sciyoshi/ruff that referenced this issue Nov 30, 2023
sciyoshi added a commit to sciyoshi/ruff that referenced this issue Dec 3, 2023
MichaReiser pushed a commit to sciyoshi/ruff that referenced this issue Dec 7, 2023
@mmerickel
Copy link
Contributor

mmerickel commented Jan 3, 2024

I ran this against our large codebase at $work and the two differences from black that I see are:

  • This always converts docstrings to use double quotes. Either " or """ based on what version of single quotes was used before.
  • This seems to prefer to change ''' or r''' to double quotes, at least in the examples I saw. Even when not in a docstring.

This isn't what I want (would prefer for it to just not touch the string quoting ever), but not the end of the world. Hopefully the observations are helpful since the feature is still in preview.

@charliermarsh
Copy link
Member

@mmerickel - Thank you, it's super helpful to get feedback like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli Related to the command-line interface formatter Related to the formatter wish Not on the current roadmap; maybe in the future
Projects
None yet
Development

Successfully merging a pull request may close this issue.