Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨Feature/plugin non breaking spaces #55

Merged
merged 18 commits into from
Nov 5, 2021

Conversation

filip-dot-cube
Copy link

@filip-dot-cube filip-dot-cube commented Oct 20, 2021

— Adds plugin for replacing single-letter characters with non-breaking spaces

— So far only for Czech language, however patterns list could be extend via plugin options

— Later on we could base general find and replace plugin on this one

@filip-dot-cube filip-dot-cube requested a review from a team October 20, 2021 11:19
@filip-dot-cube filip-dot-cube self-assigned this Oct 20, 2021
Comment on lines 47 to 49
const replaced = line.value.replace(pattern, replacement);

line.setValue(replaced);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use this syntax

Suggested change
const replaced = line.value.replace(pattern, replacement);
line.setValue(replaced);
line.setValue(value => value.replace(pattern, replacement));

} else {
options.logger.warn(
`Pattern for current language ${language} was not found`
);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Are you sure you want to log warning for each line in language that doesn't have pattern to replace?
  2. What if I want use same patterns for all supported languages, am I forced to define same patterns for each language separately?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Good point, no. Moved it to transformFullOutput
  2. Using same pattern for all languages could not be wise. Languages have different rules. Sure it makes sense in english uk and us, but thats about it. So I don't know, what do you suggest?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Ok, I understand, lets keep it as it is, I didn't realize the gramatics differencies the feature is based on.

We could maybe provide default definitions for some most used languages like EN/DE. Please create an enhancement ticket for it 🙏

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, makes sense.

#57

Comment on lines 42 to 44

if (Object.keys(patterns).includes(language.toLowerCase())) {
const pattern = patterns[language.toLowerCase()];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (Object.keys(patterns).includes(language.toLowerCase())) {
const pattern = patterns[language.toLowerCase()];
const pattern = patterns[language.toLowerCase()];
if (pattern) {

Comment on lines 1 to 19
import { Patterns, CustomPatterns } from ".";

export const lowerCaseKeys = (patterns: CustomPatterns) =>
Object.keys(patterns).reduce((prev, current) => {
prev[current.toLowerCase()] = patterns[current];
return prev;
}, {} as CustomPatterns);

const convertStringToRegex = (string: string) => new RegExp(string, "gim");

export const regexifyValues = (patterns: CustomPatterns) => {
const regexified: Patterns = {};

Object.keys(patterns).forEach((key) => {
regexified[key] = convertStringToRegex(patterns[key]);
});

return regexified;
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty fine to use lodash in lokse as it is cli utility. With it you can simplify your utils 💡

import { lowerCase, mapKeys, mapValues } from 'lodash';

export const lowerCaseKeys = (patterns: CustomPatterns) =>
 mapKeys(patterns, lowerCase);

export const regexifyValues = (patterns: CustomPatterns) =>
 mapValues(patterns, convertStringToRegex);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, used a bit differently tho

// provide custom regex patter without flags
// default flag is gim
// use the language code as key (has to be the same as your lang in the spreadsheet)
"ad-HD": "(\\s|^)(a|i|k|o|s|u|v|z)(\\s+)"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to repeat (\\s|^)( and )(\\s+) in each definition? Isn't it a common pattern when looking for a prepositions (or other non standable characters) that could be included in the plugin?

}
]
}
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document default patterns that work out of the box 🙏

@filip-dot-cube filip-dot-cube requested a review from a team November 3, 2021 09:56
@filip-dot-cube
Copy link
Author

Copy link

@horaklukas horaklukas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes to make, but otherwise really good job 👏 after that you're good to merge it 😉

@@ -0,0 +1,12 @@
import { mapKeys, mapValues } from "lodash";

import { CustomPatterns } from ".";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure about this import from '.' ? 🙂


import nonBreakingSpacesPlugin from "..";

import Transformer from "../../../core/lib/transformer/json";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't break packages encapsulation 🙏

Suggested change
import Transformer from "../../../core/lib/transformer/json";
import { transformersByFormat, OutputFormat } from "@lokse/core";
const Transformer = transformersByFormat[OutputFormat.JSON];

} else {
options.logger.warn(
`Pattern for current language ${language} was not found`
);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Ok, I understand, lets keep it as it is, I didn't realize the gramatics differencies the feature is based on.

We could maybe provide default definitions for some most used languages like EN/DE. Please create an enhancement ticket for it 🙏

@filip-dot-cube
Copy link
Author

Minor changes to make, but otherwise really good job 👏 after that you're good to merge it 😉

Thanks :)

@filip-dot-cube filip-dot-cube merged commit c0bbd1d into master Nov 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants