Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leading commas are unhandy #95

Closed
DavHau opened this issue Feb 14, 2022 · 20 comments
Closed

Leading commas are unhandy #95

DavHau opened this issue Feb 14, 2022 · 20 comments

Comments

@DavHau
Copy link

DavHau commented Feb 14, 2022

I know that leading commas seem to be a standard in haskell and somehow everyone does it, but actually it's unhandy when editing code.

Explanation of why leading commas negatively impact editing experience is already given in the equivalent issue on nixpkgs-fmt: nix-community/nixpkgs-fmt#248 (comment)
To summarize what has already been said:

  • leading commas make argument lists inconsistent. The first line starts with { and subsiding ones with ,.
  • it makes re-ordering arguments more complicated, one cannot just shift lines around without braking stuff.
  • inserting a new argument in front of the first one is more complicated, as one cannot just duplicate the first line to then edit it.

In the other thread, there were a lot of opposing opinions to what I'm saying, but if we are having this discussion on a scientific basis, any opposing argument to this should either contain an explanation of why my statements are wrong or irrelevant, or explain disadvantages of trailing commas.
I am still waiting for such a response.

I think if we want to achieve the best possible result, we need to leave personal preferences out of this as well as statements like but this is what everyone does.

@kamadorueda
Copy link
Owner

I know that leading commas seem to be a standard in haskell

What's Haskell ? 😉

but actually it's unhandy when editing code.

Un-unhandied in: #98

Explanation of why leading commas negatively impact editing experience is already given in the equivalent issue on nixpkgs-fmt: nix-community/nixpkgs-fmt#248 (comment) To summarize what has already been said:

  • leading commas make argument lists inconsistent. The first line starts with { and subsiding ones with ,.
  • it makes re-ordering arguments more complicated, one cannot just shift lines around without braking stuff.
  • inserting a new argument in front of the first one is more complicated, as one cannot just duplicate the first line to then edit it.

In the other thread, there were a lot of opposing opinions to what I'm saying, but if we are having this discussion on a scientific basis, any opposing argument to this should either contain an explanation of why my statements are wrong or irrelevant, or explain disadvantages of trailing commas. I am still waiting for such a response.

Thank you for supporting your point with arguments, this is the way to go about style discussions

I think if we want to achieve the best possible result, we need to leave personal preferences out of this as well as statements like but this is what everyone does.

Alejandra is different

@TLATER
Copy link

TLATER commented Feb 16, 2022

Just for reference because this made me curious as to why people even consider leading commas; There are basically two arguments in favor of leading commas:

  1. It makes missing commas easier to spot
  2. At least in Haskell and SQL, apparently prepending is more common than appending, so trailing comma results in no fewer diff changes, but leading comma does.

I don't really see the former point as particularly relevant, because a missing comma isn't that hard to spot, at least if you have tooling shouting at you that your code is broken before it is deployed.

I can see the argument if the language implementation doesn't give you good error messages for missing tokens, or if checks for this are somehow entirely runtime-only, but I don't think that applies to nix, and I think error messages that bad are both uncommon in modern languages and significantly less problematic thanks to the in-line highlighting provided by virtually all editors/IDEs. Hell, GitHub's font-lock becomes very screwy if you mess up basic syntax.

The latter argument doesn't seem to really apply to nix, and the formatting suggested by Alejandra lends itself to both single-line diff appending and prepending, so it seems poorly placed for the language and irrelevant in practice.

There are also some dubious analyses of github projects that apparently show that projects with leading commas are more successful on average, but that is the definition of cargo culting and hence I just reject any argument based on that on principle.

@Atemu
Copy link

Atemu commented Feb 17, 2022

I don't think this is a good change.

Leading commas are used all over Nixpkgs for arguments and it's basically the standard next to long line argument blobs.

I take issue with the argument of not being able to edit the first one as easily as all the others in two ways:

  1. The first argument almost always stays the same (stdenv or lib for packages, lib or pkgs for modules)
  2. Not being able to edit them easily is an editor's problem, not a formatter's. Even with bog standard emacs -Q, you can easily move leading comma arguments around. Not to mention that you can simply break the formatting to make it easier to edit then and have the formatter fix it for you afterwards.

I don't buy the inconsistency either, the last argument in the postfix comma style has no comma. That's even less consistent IMO.

@piegamesde
Copy link
Contributor

the last argument in the postfix comma style has no comma

This is not true, Nix has had trailing commas for a while now. If Nix didn't have this, I'd agree with out that leading commas are the better choice, because as you said the first function argument usually stays the same. In fact, the style we currently use initially started as a workaround when there were no trailing commas.

@DavHau
Copy link
Author

DavHau commented Feb 18, 2022

Leading commas are used all over Nixpkgs for arguments and it's basically the standard next to long line argument blobs.

Again the everyone does it argument, which doesn't explain anything about advantages / disadvantages.

I take issue with the argument of not being able to edit the first one as easily as all the others in two ways:

  1. The first argument almost always stays the same (stdenv or lib for packages, lib or pkgs for modules)

Good point. But this is a specific property of the nixpkgs repo and the way arguments are ordered there (non alphabetic ordering etc.). You cannot even be sure if it will always stay like that in nixpkgs in the future. Also there can be function args in many places other than at the top of package expressions where neither lib, nor pkgs, nor stdenv will usually be the first argument.

  1. Not being able to edit them easily is an editor's problem, not a formatter's.

You can always fight issues with better tools and better skills. But wouldn't it be better if the design wouldn't require that in the first place?

Even with bog standard emacs -Q, you can easily move leading comma arguments around.

Not sure how this statement demonstrates superiority of either leading or trailing commas.

Not to mention that you can simply break the formatting to make it easier to edit then and have the formatter fix it for you afterwards.

Sounds not so convenient.
I think a good formatter should format in a way, so that the rules are simple to follow by hand even if the formatter is not present all the time. Requiring the formatter to consistently re-format the file while working on it because it is hard to do it right by hand, indicates poor style guideline decisions in my opinion.

I don't buy the inconsistency either, the last argument in the postfix comma style has no comma. That's even less consistent IMO.

This is why trailing commas are allowed even for the last argument (see last example below)

Examples:

let..in lines all ending with ;

let
   a = 1;
   b = 2;
in

attrsets all ending with ;

{
  a = 1;
  b = 2;
}

function arg lists all ending with ,

{
  a ? 1,
  b ? 2,
}:

This last example is valid nix code and is consistent with all other brace-like object structures.
Why would you break the rule? And why just only for function args?

@Atemu
Copy link

Atemu commented Feb 18, 2022

Again the everyone does it argument, which doesn't explain anything about advantages / disadvantages.

No, it doesn't. It's an argument for keeping the diff small (reducing formatting noise) and defaulting to what the majority is used to, minimising the need for mental adjustment when reading code.

But this is a specific property of the nixpkgs repo and the way arguments are ordered there (non alphabetic ordering etc.).

To my knowledge, Nixpkgs is the largest repository of Nix expressions and de-facto upstream. "Downstream" users don't have to oblige by its rules by any means but if we have to chose a default, choosing what the most significant code source does is the only sensible choice.

You cannot even be sure if it will always stay like that in nixpkgs in the future.

No but it's a good assumption and, should opinions change later, it'd be a major community effort like an RFC and that would warrant a change in alejandra at that point.

Also there can be function args in many places other than at the top of package expressions where neither lib, nor pkgs, nor stdenv will usually be the first argument.

Some, yes; a minority though I'd argue. And those probably don't need to change very often either.

You can always fight issues with better tools and better skills. But wouldn't it be better if the design wouldn't require that in the first place?

It would but not if it comes at the cost of readability. Code needs to be read much, much more often than it needs to be written and altering the first argument (the only place this matters) is even rarer.

The primary purpose of a code formatting making code better to read. Making it easier to edit is a nice side-effect of that at best.

Not sure how this statement demonstrates superiority of either leading or trailing commas.

It doesn't. It demonstrates that the supposed superiority of trailing commas doesn't matter when proper tools are used.

I'd argue anyone that cares about something as insignificant as adjusting a few more characters should also be interested in using better tools to do their job more efficiently.

This is why trailing commas are allowed even for the last argument (see last example below)

A comma implies a statement following it and the absence of which would confuse a reader.

Again, readability comes first.

I personally quite like enforcing optional following commas for keeping diffs tidy but readability of the code is more important than the readability of diffs.

This last example is valid nix code and is consistent with all other brace-like object structures.
Why would you break the rule? And why just only for function args?

Because that rule has already been broken by languages whose syntax almost everyone is used to. I'd like to see you try to find developers who haven't used C, Java and/or JavaScript (and many other languages with C-ish syntax) where commas are used for separating function arguments and commas without a following argument are either a syntax error or frowned upon. I don't think you'll find many.

No, it's not consistent. The syntax for attrset arguments isn't consistent with anything either (commas instead of semicolons, no key = value; but key ? value...), so I don't see why we should care about consistency here.


I think the formatter should keep as much about the original author's intent as possible, so I wouldn't be opposed to optionally allow following comma syntax if it's already present but I'd also prefer if we unified on one way of doing things because a formatter should also be a tool to combat bikeshedding.

Converting a block with multiple following comma arguments per line to a stack of leading comma, one argument per line is also something you could want off a formatter; to enforce a unified style a repository has agreed upon.

IMO any formatter preference should be configurable (a la editorconfig) and default to whatever the most significant part of the community most commonly does.
(Or wants to do. The way I see it is that, currently, Nixpkgs users prefer a one argument per line style.)

@TLATER
Copy link

TLATER commented Feb 18, 2022

It's an argument for keeping the diff small (reducing formatting noise)

I'd suggest trying to format using the web editor's random button for a while (or even just reading nixpkgs). The absolute majority of packages use a single line for their args. Even packages that use leading comma often have one line that has many args. Others used nixfmt in the past and are incompatible with either approach. I don't think any choice here will keep diffs meaningfully smaller.

If you really care, it would be interesting to see actual numbers for this. How many pieces of the codebase actually use leading comma syntax consistently, and would be negatively affected? How many already use trailing comma?

I'd argue anyone that cares about something as insignificant as adjusting a few more characters should also be interested in using better tools to do their job more efficiently.

The problem with that argument is that more changed characters means bigger diffs for the rest of eternity, and there simply is no alternative. I'm not sure I care much personally, but there is a distinct advantage here that no amount of tool reliance can circumvent. Trailing commas are objectively better for this detail, and this change is based on that detail.

Anyway, we're talking about making one of these better tools right now, so I don't think this is the time for insulting people about their inferior tool usage.

Again, readability comes first.

Code readability is extremely subjective, and, as you say yourself in reference to tool usage, this is a very minor point that is unlikely to actually confuse anyone. I personally find trailing comma less surprising, and it probably will be for many developers and non-developers alike - after all, written natural languages frown upon leading commas, so you don't need to subscribe to C-isms to be exposed to this; just any language using a Roman alphabet. But this is subjective, and we will likely never agree on which is more readable.

If you read around the nixpkgs repo today, you'll find a lof of inconsistency, including the style of function args, among which many formatting choices were made with not much prior thought (as is bound to happen over time without tools or a standard). Setting a new standard that improves on what is there currently should not be out of scope - so please, keep arguments as far from subjectivity as possible.

@SomeoneSerge
Copy link

I'm rather surprised to see this merged and so quickly. Personally, I'd tolerate whichever convention, as long as we have an actually "uncompromising formatter" (like black for python - I them, I use them).

I'm not reading the whole thread now, but have we checked just which convention causes fewer changes in nixpkgs?

@piegamesde
Copy link
Contributor

piegamesde commented Feb 19, 2022

have we checked just which convention causes fewer changes in nixpkgs?

No, but given that many parts of nixpkgs use leading comma at the moment I don't think there is much to check. However, this is besides the point. In my opinion, the important questions are:

  • Are trailing commas better? (IMO: yes)
  • Is the temporary disruption because of that change worth it in the long run?

@SomeoneSerge
Copy link

I don't immediately know what units of measurements are fit for assessing which commas are superior, but I can see how to measure the diff size, and how the diff size could be relevant to mass adoption in nixpkgs :)

@piegamesde
Copy link
Contributor

Ask the following questions (in no particular order):

  • How consistent is this style with the rest of how Nix code is written
  • How consistent is it compared to other programming languages? What might beginners expect? (principle of least surprise)
  • Do changes diff well?
  • Are changes easy to make locally or do they get intertwined with adjacent code?
  • Are there any pleasant symmetries? Is there any special casing (e.g. for first and last entry)?

the diff size could be relevant to mass adoption in nixpkgs

To be honest, I don't see a strong relationship there. I'd argue that it doesn't really matter whether the final formatting PR touches 80% or 85% of all the nixpkgs code. (I generally agree that a lower diff is preferable because it indicates that it is better aligned with the current, organic nixpkgs style. However, I consider it more as a heuristic than a rule to follow. Well-justified exceptions, like in this case, are acceptable to me.)

@SomeoneSerge
Copy link

To be honest, I don't see a strong relationship there

Well, I just ran alejandra on the PR I've been preparing and got what I consider a predictable reply:
2022-02-20_04-10

@TLATER
Copy link

TLATER commented Feb 20, 2022

That is indeed to be expected, and why an RFC for adopting a general code formatter across the repository is currently open.

The idea being that you can't easily use a formatter right now, because the repository is inconsistent, while formatters would be consistent, so if you modify someone else's code you're going to get irrelevant diffs at some point. Those cannot be accepted, because there isn't a standard, so the next person using a different formatter will just create the same irrelevant noise in their PR.

The problem isn't really the format you're using as much as it is the fact that you're changing the existing format for no reason, and causing a diff that contains things irrelevant to your actual change, which is confusing and puts additional burden on a reviewer - they need to understand if your change actually does anything now. Please try to avoid doing that, in general, as a courtesy to reviewers.

If a formatter for the nixpkgs repo is decided on, though, a big patch introducing the new format could be applied. It would be changing things for a very clear reason, i.e. freeing us all of manual code formatting chores. Alejandra specifically does its best to also change as few semantics as possible - ideally this would be none, but a small number can still be hand-reviewed, while the rest can be checked by asserting nix doesn't rebuild anything.

It doesn't matter that much if such a patch would change 80% of the codebase or 85% of it - regardless of the actual choices made here it probably will change most of the repo, because the formatting currently is very inconsistent. So we might as well pick a format that has actual design decisions behind it, rather than organic adaptations of other language's rules that people were used to before writing nix.

In the mean time, the pragmatic option is to apply alejandra only on your diffs somehow. You'll have to wait until that RFC is through and implemented to use a formatter for full files on nixpkgs, which is pending on any of the projects becoming mature and accepted enough for that to happen :)

@SomeoneSerge
Copy link

These were mine diffs, and I applying alejandra to them meets the resistance

@TLATER
Copy link

TLATER commented Feb 20, 2022

That being this change?

I think that's an edge case; the problem is that you put burden on the reviewer there because they had already looked at your code before, and now you changed it completely, so they would need to review it again. If you had written that file with alejandra originally I believe you would not have had any resistance (because there are other modules that don't follow this style, or at the very least because different people seem to have different, but clearly valid, opinions on the matter of which style is preferable, so the outcome would depend on the reviewer).

Anyway, we're way off topic here. We should pick this back up if the question of leading commas becomes contentious in the RFC, or if there are additional arguments for adopting leading commas.

@Atemu
Copy link

Atemu commented Feb 20, 2022

If you had written that file with alejandra originally I believe you would not have had any resistance

No, I don't think that is the case.

From my previous interactions with them, they are very strict about "minor" things like this. This can be annoying at times but it's for good reason I believe.

As one of the top Nixpgks reviewers, they read much, much more Nix than any of us do. We should ask them on their opinion on this matter.

@SuperSandro2000

@piegamesde
Copy link
Contributor

I have run into the same situation with the same person more than once. I am not okay with people unilaterally deciding on some code style and then enforcing it through mass code review. Note that this specific style question is not the only one: I remember bulk PRs changing {} to { } and similar stupidities.

This is the main reason why we are having this discussion right now. So that we finally decide on a style, and make these endless review comments that have nothing to do with the actual content of the changes stop.

@SuperSandro2000
Copy link

SuperSandro2000 commented Feb 20, 2022

This last example is valid nix code and is consistent with all other brace-like object structures.
Why would you break the rule? And why just only for function args?

, and ; stand for very different things. Having one in the beginning and one at the end of the line makes it easier to see if we have a listing or a line is ending. Having both at the end makes it actually harder because they only differ in one dot.


Almost all files in nixpkgs are formatted with the comma in the beginning of inputs. I didn't come up with that and there are only advantages to keep it like that and having a very similar style across all files. Changing that would be stupid and create a massive diff, would require to change a very big chunk of PRs and changing the default formatters everyone is using. If you want to do it different in your personal projects go ahead but introducing it to nixpkgs is something we shouldn't do. It should have been done different in the beginning but now we are stuck with that decision ages ago.


like black for python

black is a very good example why uncompromising formatters are not great. It tries to create shorter lines by putting ( ) on a single line which looks stupid. I am very okay with 120 chars or even 200 per line which reduces the stretching of the code.


Is the temporary disruption because of that change worth it in the long run?

No and it isn't temporary and also not short. It will take weeks or more likely months and a lot of time for people to adjust and change all things. Is that worth it for an opinionated change? No, especially because we have bigger issues where our time should be invested.


It doesn't matter that much if such a patch would change 80% of the codebase or 85% of it - regardless of the actual choices made here it probably will change most of the repo, because the formatting currently is very inconsistent. So we might as well pick a format that has actual design decisions behind it, rather than organic adaptations of other language's rules that people were used to before writing nix.

The problem with alexandria is that it changes significant more lines than nixpkgs-fmt and the default style and does not allow people to do opinionated formattings where it makes sense. So far we couldn't even agree on nixpkgs-fmt which changes much less and allows more freedom, so alexandria has a far higher burden. First agreeing on nixpkgs-fmt would be an entry to have an accepted formatter. If nixpkgs-fmt and alexandria would be compatible it would be ideal and I would happily accept both tools. If someone else later only formats the file with nixpkgs-fmt and isn't undoing half the changes both tools could coexist with each other and everyone could benefit.


I have run into NixOS/nixpkgs#137131 (comment) more than once

Hey, at least I am consistent and I treat every contributor the same. :)

I am not okay with people unilaterally deciding on some code style and then enforcing it through mass code review.

I told you that before and you are still ignoring it: I didn't decide on this. I didn't come up with this. It was the coding style I was introduced in the beginning and the one currently the most people use and most files are formatted in. It naturally only made sense to adopt that one to keep the friction with the majority of people the lowest.

This is the main reason why we are having this discussion right now. So that we finally decide on a style, and make these endless review comments that have nothing to do with the actual content of the changes stop.

Yes and I choose the path that has the least changes from the current situation while others try to redefine the style from the ground up which is far more work and way harder to get everyone to agree on.


So, I am now continuing playing Assassin's Creed Valhalla and viking more. 🎮 Happy Sunday everyone!

@kamadorueda
Copy link
Owner

[SuperSandro2000] I choose the path that has the least changes from the current situation

You can choose $ cat as formatter, it has the least changes from the current situation

[piegamesde] This is the main reason why we are having this discussion right now. So that we finally decide on a style, and make these endless review comments that have nothing to do with the actual content of the changes stop.

❤️

@SomeoneSerge
Copy link

s/black for python/elm-format for elm/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants