Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper order of imports with Hyphens and Periods in the paths #87

Closed
akaustav opened this issue Oct 1, 2021 · 17 comments · Fixed by #93
Closed

Improper order of imports with Hyphens and Periods in the paths #87

akaustav opened this issue Oct 1, 2021 · 17 comments · Fixed by #93

Comments

@akaustav
Copy link

akaustav commented Oct 1, 2021

Hi @lydell,

Kudos

We are trying to convert an Angular 10 application from the deprecated tsLint dependency to using eslint. There were strong recommendations to use this eslint plugin. And I agree with the recommendations - your extensive research is well documented with every reasoning and finding there is to know about sorting import statements within Javascript / TypeScript files. Good work!

Issue

However, there is one thing bothering me while using this plugin. In my opinion, your plugin seems to incorrectly sort import statements which have the same set of characters from the beginning of the "from path string" followed by:

  1. a hyphen character (- a.k.a. the hyphen-minus unicode character U+002D)
  2. versus a period character (. a.k.a the full-stop unicode character U+002E).

Example from path strings

  1. 'my-package' vs 'my.package' (Common part = 'my)
  2. './widget/abc-def.tsx' vs './widget/abc.defgh.tsx' (Common part = './widget/abc)
  3. '../notification-list-routing.module.ts' vs '../notification-list.component.ts' (Common part = '../notification-list)

According to normal string sorting, hyphens come before periods. However, the default configuration of this plugin, seems to sort the from path strings containing periods first followed by the hyphen character.

Example

The following import sequence seems to be correct in my opinion (as well as TSLint's ordered-import rule).

import { fileNameWithHyphens } from './file-with-hyphens.js';
import { fileNameWithPeriods } from './file.with.periods.js';

is auto-fixed into:

import { fileNameWithPeriods } from './file.with.periods.js';
import { fileNameWithHyphens } from './file-with-hyphens.js';

Sample Repository

To demonstrate this issue clearly, I have created a small sample plain-vanilla JavaScript-based NodeJS 14.x compatible application here: https://github.com/akaustav/import-order-test. The index.js file disables your plugin rule on line 1 - otherwise the simple-import-sort/imports rule keeps complaining about the order of the import lines on line 2 and 3. It even uses your documented comparator function - see this file.

Question

Is this expected behavior?

Ameet

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

Hi!

I think this happens due to the way I implemented directory structure sorting:

// Sort by directory level rather than by string length.
source: source
// Treat `.` as `./`, `..` as `../`, `../..` as `../../` etc.
.replace(/^[./]*\.$/, "$&/")
// Make `../` sort after `../../` but before `../a` etc.
// Why a comma? See the next comment.
.replace(/^[./]*\/$/, "$&,")
// Make `.` and `/` sort before any other punctation.
// The default order is: _ - , x x x . x x x / x x x
// We’re changing it to: . / , x x x _ x x x - x x x
.replace(/[./_-]/g, (char) => {
switch (char) {
case ".":
return "_";
case "/":
return "-";
case "_":
return ".";
case "-":
return "/";
// istanbul ignore next
default:
throw new Error(`Unknown source substitution character: ${char}`);
}
}),

Personally I have no idea which order punctuation is “usually” sorted in so I would never have noticed something like this myself.

I have one question: Does it matter to you that . and - are flipped in order? If so, can you explain more?

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

Thanks for responding and sharing the code where the problem might lie, @lydell. I will try to analyze each of the regular expressions and determine which code line might be causing this switch.

I have one question: Does it matter to you that . and - are flipped in order? If so, can you explain more?

Yes, it is one of those opinionated things, but apparently, it matters to me and our team. I am just trying ensure that my code changes are as minimally invasive as possible to all files within the current Angular 11 repository. The Angular 11 repository that I am working on is at-least 6+ years in the making. It started off as an AngularJS (Angular v1) application and has been steadily kept up to date with the latest - 1 version of Angular - while adding more and more functionality around the core code. It is hosted on a private GitHub repository and contains the code for a publicly available website.

Lot of the original developers who wrote the original code have left - and the tribal knowledge about many parts of the website code is lost with them - no comments, no documentation, etc. However, they did implement strict tslint rules which helps us keep the sequence and version history of every code line intact. That is one of the primary reasons to keep these eslint rules as close to the tslint rules as possible - hence keeping the sequence of statements the same and thereby keeping the version history of the code lines intact. Using tools like GitLens, the development team is trying to keep this codebase maintained while triaging defects in the existing code by determining when a particular line (including import statements) was added / modified / deleted and why. GitLens directly links to the old PRs where those lines were added / modified. Additionally, the good set of tslint rules were automatically applied on each commit thereby ensuring developers (especially new developers) are forced to write clean and clear cut code from the start.

We also have a strict 3 reviewer policy on our PRs. Longer PRs (especially these conversion PRs with 1000+ file changes) take longer turn around time to get reviews. At the moment I am dealing with this import order problem - because it's swapping out the hyphen and period import statements in around 300+ files. It was flagged by one of our senior developers and the PR wouldn't be approved until I fix this problem.

Ameet

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

I will try to analyze each of the regular expressions and determine which code line might be causing this switch.

It’s the one with the switch:

.replace(/[./_-]/g, (char) => {
switch (char) {
case ".":
return "_";
case "/":
return "-";
case "_":
return ".";
case "-":
return "/";
// istanbul ignore next
default:
throw new Error(`Unknown source substitution character: ${char}`);
}
}),

version history

Have you looked into using a .git-blame-ignore-revs file?

For me personally, I don’t care about git history for imports. To me, imports are basically “useless” code that I collapse in my editor. I automatically import what’s needed. There’s no logic in them, so it’s not interesting to git blame.

It was flagged by one of our senior developers and the PR wouldn't be approved until I fix this problem.

Note that tslint compatibility is not a goal of this plugin. I created this plugin before I even used TypeScript.

I think your next steps are:

  1. Research what can be done about this.
  2. Post your results here.
  3. Make a PR or fork this plugin. I might be able to fix this instead, depending on the results of your research. Note that if the code becomes much more complicated I might not be interested in merging.

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

@lydell - I think I have found the root cause of this problem. But I don't know why it happens. Posting my analysis here shortly.

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

Analysis so far

@lydell - From your switch statement:

  • periods (.) are converted into underscores (_), and
  • hyphens (-) are converted into forward-slashes (/)

When comparing hyphens and forward-slashes using the vanilla JavaScript comparator (without passing thru the ES Internationalization Collator API), I see that forward slashes are /:

// Vanilla JS comparator
const cmp = (a, b) => a < b ? -1 : a > b ? 1 : 0;

cmp('/', '_');
// Output: -1
// Indicating ascending sort order of ['/', `_`]
// Reversing your `switch` statement logic yields ['-', '.']
// This is GOOD for my use-case

However, when passing thru Intl.Collator.prototype.compare() method, I see the reverse behavior:

new Intl.Collator('en', { sensitivity: 'base', numeric: true }).compare('/', '_');
// Output: 1
// Indicating ascending sort order ['_', '/']
// Reversing your `switch` statement logic yields ['.', '-']
// This is BAD for my use-case

Need to find out why the Intl.Collator.prototype.compare() method is behaving differently than the vanilla JS comparator.

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

Yes, I noticed vanilla JS comparator and Intl.Collator sorted differently last time I fiddled with this too.

What I did last time was to generate a list with all ASCII characters and sort them to learn what order they end up in. I saved the start of the sorted list of ASCII punctuation in this comment:

   // The default order is: _ - , x x x . x x x / x x x 
   // We’re changing it to: . / , x x x _ x x x - x x x 

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

@lydell - Thanks for confirming my suspicions. It feels like this is a bug in the implementation of the Intl.Collator.prototype.compare() method specifically when comparing forward slash (/) and underscores (_).

When comparing period (.) and hyphen (-) it seems to yield the same result.

// Vanilla JS comparator
const cmp = (a, b) => a < b ? -1 : a > b ? 1 : 0;
cmp('.', '-');
// Output: 1

new Intl.Collator('en', { sensitivity: 'base', numeric: true }).compare('.', '-');
// Output: 1

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

@lydell - So, I asked this question on Stack Overflow - and I got one explanation. https://stackoverflow.com/questions/69405786/why-does-javascript-intl-collator-prototype-compare-method-yield-different-r

His explanation seems sound to me for real characters - however, he could not explain why the comparison differs specifically for _ and / characters.

A couple of questions / thoughts on my mind:

  1. I still do not clearly understand the purpose of that switch statement - I mean I know it's replacing specific characters with others, but why replace in the first place?

  2. I did not understand what this comment meant.

      // The default order is: _ - , x x x . x x x / x x x
      // We’re changing it to: . / , x x x _ x x x - x x x 
    
    1. What does "default order" mean?
    2. And why are we changing the order?
    3. What do the x x x denote?
  3. Clearly we can't rely on Internationalization collation comparison to match the default sorting techniques for specific characters. Are the underscore (_) and forward slashes (/) purely placeholder symbols? Or do they have a specific meaning and ca NOT be replaced with other characters? If they are merely placeholders, maybe we can replace them with some other set of characters which yield the same results when comparing using the Internationalization API vs the traditional way?

  4. If the above cannot be achieved, would you be willing to keep the sorting logic behind a boolean setting - let's say collateIntl? (I know you strongly feel the need of not having any options). If collateIntl === true, then the current logic stays. Else if collateIntl === false, the Internalization collation comparison is skipped and the radiational comparison remains. Something like this maybe?

    export function compare(a, b) {
      const regularComparison = (a < b ? -1 : a > b ? 1 : 0);
    
      if (settings.collateIntl === false) { // settings.collateIntl is the boolean setting
        return regularComparison;
      } else { // default = existing behaviour
        return collator.compare(a, b) || regularComparison;
      }
    }

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

  1. . and / are used in stuff like from "../../utils". Moving . and / sort first is a hacky part of directory structure sorting. Remove the replace and see which tests fail!

  2. I’ll try to explain:

    1. “Default” order means what you get when sorting with Intl.Collator.
    2. See above.
    3. Other chars that don’t matter for the directory structure sorting hacks. Make a list of all ASCII punctuation and sort it with Intl.Collator and compare with the comments and you’ll see.
  3. I’m not sure what you mean?

  4. No.

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

Is it just me? Or are unit tests already failing in the main branch without any code changes?

C:\dev\lydell\eslint-plugin-simple-import-sort>npm test 

> @ pretest C:\dev\lydell\eslint-plugin-simple-import-sort
> prettier --check . && eslint . --report-unused-disable-directives

Checking formatting...
[warn] .eslintrc.js        
[warn] .github\workflows\check.yml
[warn] .github\workflows\test.yml
[warn] .prettierrc.json
[warn] build.js
[warn] CHANGELOG.md
[warn] jest.config.js
[warn] package-lock.json
[warn] package-real.json
[warn] package.json
[warn] README.md
[warn] src\exports.js
[warn] src\imports.js
[warn] src\index.js
[warn] src\shared.js
[warn] test\examples.test.js
[warn] test\exports.test.js
[warn] test\helpers.js
[warn] test\imports.test.js
[warn] Code style issues found in the above file(s). Forgot to run Prettier?
npm ERR! Test failed.  See above for more details.

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

They pass for me. And in CI.

@lydell
Copy link
Owner

lydell commented Oct 1, 2021

But it seems like you use Windows. I don’t. My guess is that your git setup has changed the newlines in all the files to CRLF, while Prettier wants just LF.

@akaustav
Copy link
Author

akaustav commented Oct 1, 2021

But it seems like you use Windows. I don’t. My guess is that your git setup has changed the newlines in all the files to CRLF, while Prettier wants just LF.

You are correct, @lydell. I cloned your repo on my Windows 10 machine.
I verified all files of the cloned repo (including the list of failing files) in my local have line endings set to CRLF instead of LF.
This is despite the fact that I have set core.autocrlf to input.

C:\dev\lydell\eslint-plugin-simple-import-sort>git config --get core.autocrlf     
input

For now, let me try to change the line endings from CRLF to LF manually and run the unit tests on your repo.

@akaustav
Copy link
Author

akaustav commented Oct 2, 2021

@lydell - I was finally able to run the unit tests on your repo. All passed.
I removed that switch statement replacement code and - like you said - lots of tests failed.
Analyzing the results now.

------------|---------|----------|---------|---------|-------------------
File        | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
------------|---------|----------|---------|---------|-------------------
All files   |     100 |      100 |     100 |     100 |
 exports.js |     100 |      100 |     100 |     100 |
 imports.js |     100 |      100 |     100 |     100 |
 index.js   |     100 |      100 |     100 |     100 |
 shared.js  |     100 |      100 |     100 |     100 |
------------|---------|----------|---------|---------|-------------------
Snapshot Summary
 › 13 snapshots failed from 3 test suites. Inspect your code changes or run `npm test -- -u` to update them.

Test Suites: 3 failed, 3 total
Tests:       15 failed, 366 passed, 381 total
Snapshots:   13 failed, 98 passed, 111 total
Time:        16.719 s
Ran all test suites.

Side Note: I was also able to figure out the line-endings LF -> CRLF problem. Apparently, there is a bug in the GitHub Desktop software where it ignores the system level settings (bug details - desktop/desktop#10666). I had to set the core.autocrlf option on the global level instead of relying on the settings on the system level:

git config --global core.autocrlf input

And then re-clone your repo. All files show line endings as LF now.

@akaustav
Copy link
Author

akaustav commented Oct 2, 2021

@lydell - I think I understand your requirement a bit better now after looking at the sorting expectations outlined between lines 880 and 992 - why you wanted to sort . and / before other special characters. And I am starting to see the problem too - more to follow.

|import {} from "...";
|import {} from ".../";
|import {} from "../../..";
|import {} from "../../../";
|import {} from "../../../,";
|import {} from "../../../_a";
|import {} from "../../../[id]";
|import {} from "../../../a";
|import {} from "../../../utils";
|import {} from "../..";
|import {} from "../../";
|import {} from "../../,";
|import {} from "../../_a";
|import {} from "../../[id]";
|import {} from "../../-a";
|import {} from "../../a";
|import {} from "../../utils";
|import {} from "..";
|import {} from "../";
|import {} from "../,";
|import {} from "../_a";
|import {} from "../[id]";
|import {} from "../-a";
|import {} from "../a";
|import {} from "../a/..";
|import {} from "../a/...";
|import {} from "../a/../";
|import {} from "../a/../b";
|import {} from ".//";
|import {} from ".";
|import {} from "./";
|import {} from "./,";
|import {} from "./_a";
|import {} from "./[id]";
|import {} from "./-a";
|import {} from "./A";
|import {} from "./a";
|import {} from "./ä"; // “a” followed by “̈̈” (COMBINING DIAERESIS).
|import {} from "./ä";
|import {} from "./a/.";
|import {} from "./a/-";
|import {} from "./a/0";
|import {} from "./B"; // B1
|import {} from "./B"; // B2
|import {} from "./b";
|import img1 from "./img1";
|import img2 from "./img2";
|import img10 from "./img10";
|import {} from ".a";

@akaustav
Copy link
Author

akaustav commented Oct 3, 2021

@lydell - The deeper I follow the rabbit hole, the more I come to realize the following (sorry, if I repeated myself - trying to collect my thoughts together and document it clearly for myself here):

  1. Your eslint plugin is sorting using Internationalization collation sorting out of the box. And maybe I got side-tracked / distracted and was expecting to retrofit it to match regular sorting (a.k.a. ASCII sorting, or code-point sorting) that Array.prototype.sort() spits out of the box - because that is what tslint was doing.

  2. The sort orders for the five concerned characters - comma (,), hyphen (-), period (.), forward-slash (/) and underscore (_), - in the various methods are as follows (the x x x in your code comments was confusing to me at first and then it made sense when attempted to sort a bunch of special characters in my repository - via the sortSpecialCharacters function: https://github.com/akaustav/import-order-test/blob/029f18c327ac6b8a85366b4b087d972145932556/src/index.js#L22-L70):

    1. Regular sorting: , - . / _
    2. Intl sorting: _ - , . /
    3. Your directory structure sorting "hack": . / , _ -
  3. In both Regular as well as International sorting the hyphen (-) character comes before the period (.) character - so that is still a valid requirement and was my original request on this issue. However, your directory structure sorting "hack" changes the order so that period (.) character comes before the hyphen (-) character.

    1. Additionally, I found that the order of underscore (_) and hyphen (-) switches between regular and international sorting. Fortunately this is not a problem for our Angular 11 repository because it uses only hyphens in file names - and no underscores.
  4. The regular-expression hack to deal with the directory structure is messing with the order of periods (.) characters within file names. Maybe treat ./ and ../ differently? I know your goal was NOT to match tslint. I just searched thru the tslint codebase trying to figure out how this was being handled over there. Looks like instead of a regular expression replacement hack, this was being done (lines 749 thru 763) - I noticed no reference to the Intl API: https://github.com/palantir/tslint/blob/285fc1db18d1fd24680d6a2282c6445abf1566ee/src/rules/orderedImportsRule.ts#L749-L763

  5. I still fail to understand how to hack really works. I see that you appended comma (,). Why did you choose comma and not a different character - like semicolon (;) or colon (:) or exclamation (!), or really any other character?

  6. I was tinkering around in my fork and was able to achieve the correct sorting on period and hyphens - drafted a work-in-progress PR Fix Sorting of Special Characters #91. However, there are a few things which I will need to clean up and understand why and how it works:

    1. cleanup assumptions to replace based on ASCII order.
    2. cleanup the "hack" reversal (or maybe this is what fixed the problem?? - need to check)
    3. fix the unit test ordering of './/'
    4. fix the unit test ordering of hyphens vs commas (../-a vs ../,)
    5. cleanup the NodeJS 16 changes (or keep it based on your preference)

@lydell
Copy link
Owner

lydell commented Oct 9, 2021

As mentioned in my comment on your PR (#91 (comment)) this is not trivial to “fix.” But in my opinion there’s nothing to fix. Punctuation don’t have a natural order. The order just is what it is. Most people don’t notice. It seems like you have a very specific situation, so I think you’d be better off using a custom solution to your problem.

I’m updating the readme in #93.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants