This issue was moved to a discussion.

💡: Typos only or non-strict mode to only flag common issues or forbidden words #5687

Closed
1 task done
Jason3S opened this issue May 29, 2024 · 13 comments

Comments

@Jason3S
Collaborator

Jason3S commented May 29, 2024

Problem

One common frustration of CSpell users is that it flags variable names and imported libraries when they only want the spell checker to look for common spelling mistakes. A "Typos" mode would tell the spell checker to be less strict and mark only issues that are clear mistakes or are in the list of words to be flagged.

Solution

  • add a new boolean configuration option strict, which is true by default.
    • true - the current behavior - all unknown words are marked as errors.
    • false - only common spelling issues and flagged words will be marked as errors.
  • add document directives to enable/disable strict mode within a document.
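A rough sketch of what the proposed boolean could mean in practice. The `SpellingIssue` shape and the `issuesToReport` helper are hypothetical names made up for illustration; they are not CSpell's actual API.

```typescript
// Hypothetical shape of a spelling issue; field names are assumptions for illustration.
interface SpellingIssue {
  text: string;
  isFlagged: boolean; // the word is in the list of words to be flagged
  hasPreferredSuggestion: boolean; // the word is a known common misspelling
}

// strict = true: report every issue (the current behavior).
// strict = false: report only flagged words and common spelling mistakes.
function issuesToReport(issues: SpellingIssue[], strict: boolean): SpellingIssue[] {
  if (strict) return issues;
  return issues.filter((issue) => issue.isFlagged || issue.hasPreferredSuggestion);
}

const sampleIssues: SpellingIssue[] = [
  { text: "mkdirp", isFlagged: false, hasPreferredSuggestion: false }, // unknown library name
  { text: "teh", isFlagged: false, hasPreferredSuggestion: true }, // common misspelling
  { text: "blacklist", isFlagged: true, hasPreferredSuggestion: false }, // flagged word
];

const strictTexts = issuesToReport(sampleIssues, true).map((issue) => issue.text);
const relaxedTexts = issuesToReport(sampleIssues, false).map((issue) => issue.text);
console.log(strictTexts, relaxedTexts);
```

With `strict` set to false, the unknown library name drops out while the common misspelling and the flagged word are still reported.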

Alternatives

The other alternative is to provide more control over which parts of a document are checked, by parsing the document, tagging sections (e.g. string, variable, comment), and allowing the user to enable checking for each.

Note: this proposal can be used in conjunction with parsing.

Additional Context

image

Code of Conduct

  • I agree to follow this project's Code of Conduct
@Jason3S Jason3S changed the title 💡: Typos only or non-strict mode or to only flag common issues or forbidden words 💡: Typos only or non-strict mode to only flag common issues or forbidden words May 29, 2024
@ccoVeille

Do you have an example of the solution you are suggesting? I'm not sure I understand the proposal, or the problem you are trying to fix.

@Jason3S
Collaborator Author

Jason3S commented May 29, 2024

@ccoVeille,

Turning off strict mode would effectively have CSpell behave like codespell or typos.

image

See also image above.

The main difference is that it would be controllable from file to file based upon configuration and even within a file by using directives within the file.

@ccoVeille

It's clearer now.

Maybe "strict" might not be the best choice, then.

Because this term is already used for accent and case checking

-s, --no-strict Ignore case and accents when searching for words.

@nschonni
Collaborator

Maybe quick or simple, although I'm not sure either is really accurate.

@ccoVeille

Wording is always a real pain 😅

@Jason3S
Collaborator Author

Jason3S commented May 30, 2024

It's clearer now.

Maybe "strict" might not be the best choice, then.

Because this term is already used for accent and case checking

-s, --no-strict Ignore case and accents when searching for words.

Very good point.

Thinking out loud. This option will:

  • Hide Unknown Words - turn them into warnings/info instead of errors.
  • Keep showing flagged words
  • Show issues with preferred suggestions
  • Show common spelling mistakes (another way to say they have preferred suggestions).

@ccoVeille

ccoVeille commented May 30, 2024

I think you were somehow asking for help in terms of wording. Here are the few that came to my mind:

  • ignore-unknown-words: true/false
  • ignore-unrecognized-words: true/false
  • unknown-words-severity (or level?) = info, warning, error, debug, ignored

Things like that, but I'm not the best with wording ideas.

My suggestion is based on the fact that it will change almost nothing except for unrecognized words.

@ccoVeille

Thinking out loud. This option will:

  • Hide Unknown Words - turn them into warnings/info instead of errors.
  • Keep showing flagged words
  • Show issues with preferred suggestions
  • Show common spelling mistakes (another way to say they have preferred suggestions).

Now that I understand what you want to achieve, I can tell you that this is exactly what led me to prefer typos, vale, or codespell for years.

I love cspell, but because its logic is "not in the list, then failure", it prevented me from using it in CI.

I keep using it manually when I want to make sure nothing was left behind by typos or codespell.

So this feature would be very appreciated, yes.

@Jason3S
Collaborator Author

Jason3S commented May 31, 2024

I think you were somehow asking for help in terms of wording. Here are the few that came to my mind:

  • ignore-unknown-words: true/false
  • ignore-unrecognized-words: true/false
  • unknown-words-severity (or level?) = info, warning, error, debug, ignored

Things like that, but I'm not the best with wording ideas.

My suggestion is based on the fact that it will change almost nothing except for unrecognized words.

Thank you for the suggestions.

Some other ideas.

  • treat-unknown-words-as-misspelled
  • skip-unknown-words

Even though something like treat-unknown-words-as-misspelled is the most accurate, it is too long. For now, I'm going to go with ignore-unknown-words.

@Jason3S
Collaborator Author

Jason3S commented May 31, 2024

A long time ago, an issue was opened suggesting that words without simple fixes should be ignored (I looked for it, but didn't find it). At the time, the spell checker didn't have a list of common misspellings, so it didn't make a lot of sense.

I think it might be worth revisiting the idea.

ignore-unknown-words will ignore ALL unknown words, even ones with simple spelling mistakes, like simpel instead of simple because they are not in the list of common mistakes. In addition, all words from languages besides English will be ignored because there isn't a common misspelling dictionary for them.

image

One way to address this is to use these two options:

  • ignore-words-without-suggestions - false by default.
  • ignore-words-suggestion-search-level
    • 0 - no search. Same as ignore-unknown-words defined above.
    • 1 - single letter change or swap.
    • 2 - two changes.
    • higher than 2 doesn't make a lot of sense and will slow things down too much.
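To make the search levels concrete, here is a minimal sketch that assumes "changes" means edit distance where a swap of adjacent letters counts as one change (optimal string alignment distance). The function names and the flat word list are assumptions for illustration; CSpell's real suggestion search is not shown here and likely works differently.

```typescript
// Optimal string alignment distance: insertions, deletions, substitutions,
// and swaps of adjacent letters each count as one change.
function editDistance(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d: number[][] = Array.from({ length: rows }, () => new Array<number>(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
      // Adjacent swap, e.g. "el" <-> "le".
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// Assumes ignore-words-without-suggestions is enabled.
// Level 0: no search, so no unknown word is reported.
// Level 1 or 2: report the word only if a known word is within that many changes.
function shouldReportUnknownWord(word: string, knownWords: string[], searchLevel: number): boolean {
  if (searchLevel <= 0) return false;
  return knownWords.some((known) => editDistance(word, known) <= searchLevel);
}

const knownWords = ["simple", "sample"];
console.log(shouldReportUnknownWord("simpel", knownWords, 1));
console.log(shouldReportUnknownWord("kubectl", knownWords, 2));
```

A linear scan over the word list like this would be far too slow for a real dictionary; it is only meant to show what each level would accept or reject.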

@ccoVeille

I think it would lead to too many parameters.

Someone may find ignore-unknown-words useful, but if they don't look for the other settings that change its behavior, they might conclude that the setting is not the droid they are looking for.

Maybe a better approach would be to use one parameter that will try to mitigate everything

something like

  • unknown-words-behavior=ignore,report,whatever …
  • unknown-words-modifier=ignore,report,whatever …
  • unknown-words=ignore,report,whatever …

where report would be the default value (as it does now)

so there would be only one parameter and not 2.

I dislike ignore-words-without-suggestions because:

  • it's too long, an argument you already used for treat-unknown-words-as-misspelled
  • it requires understanding how cspell works internally, and what suggestions are
  • in a few weeks, you will discover you need to support other use cases, and its name won't match the behavior you want to add

@Jason3S
Collaborator Author

Jason3S commented May 31, 2024

@ccoVeille,

Thank you for the feedback.

unknown-words= could work. The challenge is the whatever.

  • ignore - clear - all unknown words are ignored -- I don't think this is what most people will want. Too many simple mistakes will get missed. Low noise, but high possibility of mistakes.
  • report - clear - all unknown words are reported. High noise, because of the high chance of false positives in code.
  • report-simple - report unknown words that can be fixed with simple changes. With a setting like this, the idea is to keep the noise relatively low while also picking up simple mistakes. (I think if someone doesn't want report, then this option is the next best)

Maybe it is better to flip it:

  • report - the default
  • ignore - ignore most unknown words except ones with simple fixes.
  • ignore-all - ignore all unknown words.
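As a sketch, the flipped values above could map to a decision like the following. The type and function names are hypothetical, and `hasSimpleFix` stands in for whatever check decides that a word can be fixed with a simple change. Flagged words and common misspellings are assumed to be reported regardless of this setting.

```typescript
// Hypothetical values for a single unknown-words setting, as proposed in this thread.
type UnknownWordsMode = "report" | "ignore" | "ignore-all";

// Decides whether an unknown word (not flagged, not a known common misspelling)
// should be reported under the given mode.
function reportsUnknownWord(mode: UnknownWordsMode, hasSimpleFix: boolean): boolean {
  if (mode === "report") return true; // the default: report every unknown word
  if (mode === "ignore-all") return false; // never report unknown words
  return hasSimpleFix; // "ignore": report only words with a simple fix
}

const modes: UnknownWordsMode[] = ["report", "ignore", "ignore-all"];
for (const mode of modes) {
  console.log(mode, reportsUnknownWord(mode, true), reportsUnknownWord(mode, false));
}
```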

Here is an example that reports common and simple spelling mistakes:
image

@ccoVeille

ccoVeille commented May 31, 2024

Wording is always a pain.

I think I shared my thoughts, and your suggestions are good.

But maybe it's now time to ask other people to look at the discussion. Discussing something as important as this with only one person may lead to issues.

I'm a random guy; my point of view matters as much as that of anyone else who has an interest in this project.

I would say we could wait.

@Jason3S Jason3S pinned this issue May 31, 2024
@streetsidesoftware streetsidesoftware locked and limited conversation to collaborators Jun 14, 2024
@Jason3S Jason3S converted this issue into discussion #5751 Jun 14, 2024
