Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix #69

Merged
merged 7 commits into from
Aug 26, 2021
Merged

Bugfix #69

merged 7 commits into from
Aug 26, 2021

Conversation

felginep
Copy link
Contributor

Fix #58

Add correct warning messages when spreadsheet is empty

WARN-- : Your spreadsheet is empty. Add content and retry.

Fix #59 #67

Add warning messages when spreadsheet is corrupted.

With the following spreadsheet:

,key,en,
,aaa.bbb,first,
,aaa.bbb.ccc,second,
,bbb.ccc.ddd,first,
,bbb.ccc,second,
,zzz,first

The warnings emitted are:

WARN -- : Corrupted input. Trying to insert a value for key 'aaa.bbb.ccc' but a value already exists for 'aaa.bbb'. Skipping.
WARN -- : Corrupted input. Trying to insert a value for key 'bbb.ccc' but values already exist for keys 'bbb.ccc.*'. Previous values will be lost.

Fix #61

There was a performance issue that was slowing down the export. The issue was that we looped through all the existing_keys (the keys of the csv we already treated) for each key of the csv. This was useless and inefficient.
The goal was to avoid issues when there are duplicates or plural / singular mismatch.

The solution is to use a hash table to store the treated keys along with their label. That way, if a label already exist in the hash table, that means this is not the first time we see the key.

Some warnings have been added to notify the user that the spreadsheet is bad. With the following spreadsheet:

,key,fr,comment fr,en,comment en
#Generic actions,,,,,
,duplicate,Duplicate,,Duplicate,
,duplicate,Erreur,,Error,
#Plurals,,,,,
,plural_before_singular##{one},Plural before Singular,,Plural before Singular,
,plural_before_singular##{one},Plural before Singular,,Plural before Singular,
,plural_before_singular##{other},Plural before Singular,,Plural before Singular,
,plural_before_singular,Erreur,,Error,,
,singular_before_plural,Singular before Plural,,Singular before Plural,,
,singular_before_plural##{one},Erreur,,Error,,

The warnings emitted are:

WARN -- : Some values already exist for key 'duplicate'. Remove duplicates.
WARN -- : Some values already exist for key 'plural_before_singular'. Remove duplicates.
WARN -- : A plural value already exist for key 'plural_before_singular'. Remove duplicates.
WARN -- : A singular value already exist for key 'singular_before_plural'. Remove duplicates.

When downloading an empty spreadsheet, the content type of the downloaded file is "inode/x-empty"
When shared configuration is misconfigured, the content type of the downloaded file is "text/html"
For each key, we looped through all the keys already read. This was very unefficient.
Now we use a hash table to store the previous keys.

Results:
- 5 minutes export before optimization
- less than 5 seconds after optimization

What's more, we added some warnings along the way when we see duplicates, or singular / plural mismatch.
Comment on lines 36 to 37
else
LOGGER.error("invalid export request. check the spreadsheet share configuration")
if export_request.has_empty_files?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tu peux utiliser un elsif ici, ça devrait donner le même résultat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bien vu, je corrige

Comment on lines 20 to 21
existing_plural_key = existing_key.label == current_key.label && existing_key.plural? && current_key.singular?
existing_singular_key = existing_key.label == current_key.label && existing_key.singular? && current_key.plural?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tu peux ajouter 2 méthodes dans key ? Une same_singular_as? et l'autre same_plural_as? Ça va compléter l'interface de la classe, qui a déjà same_as? et ça simplifiera le code ici.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum tu vois cette implémentation pour tes deux méthodes :

def same_singular_as?(key:)
  label == key.label && key.singular?
end

def same_plural_as?(key:)
  label == key.label && key.plural?
end

Parce que vu le naming, j'aurais plutôt tendance à faire cette implémentation :

def same_singular_as?(key:)
  label == key.label && singular? && key.singular?
end

def same_plural_as?(key:)
  label == key.label && plural? && key.plural?
end

Or du coup ca ne correspond plus à la condition de mon code, qui vérifie que bien que les labels soient égaux, les singuliers / pluriels ne correspondent pas.

Copy link
Contributor

@jvigne jvigne Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah bon point 🤔, je pensais à une implémentation équilivalente au code de existing_plural_key et existing_singular_key mais c'est vrai que same_**_as? évoque une égalité totale. Le naming ne convient pas.

Tu en as un autre en tête ? J'ai pensé à same_label_as_singular?|same_label_as_plural? ou singular_with_same_label_as_plural?|plural_with_same_label_as_singular? ou singular_with_same_plural_as?| plural_with_same_singular_as?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum pas évident le naming ici. Je vais réfléchir. D'autant qu'au final ca me fait bizarre d'avoir cette méthode sur Key directement. Peut être qu'il faudrait juste en faire une méthode privée de la classe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aussi, ça m'irait

@@ -10,6 +10,14 @@ def map(locale_wording:)
hash[translation.key.label] = {} unless hash.key? translation.key.label
hash[translation.key.label][translation.key.plural_key] = translation.value
else
unless hash.is_a?(Hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment hash pourrait ne pas être un hash ? Ça voudrait dire que le code fait une mutation non voulue puisque hash est initialisé comme un hash avec le each_with_object({})

Copy link
Contributor

@jvigne jvigne Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ce ne serait pas hash[inner_key.to_s] que tu veux vérifier ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hash peut être une string. Par exemple si tu veux inserer les deux paires suivantes dans cet ordre:

aaa.bbb => first
aaa.bbb.ccc => second

Une fois que tu as inséré la première clé et que tu passes dans la seconde, tu as

hash === { "aaa" => { "bbb" => "first } }

Donc quand tu arrives à inner_key === "ccc", ton hash vaut en fait "first" et pas {}.

Copy link
Contributor

@jvigne jvigne Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah oui je vois, c'est la L25 qui fait la mutation de hash hash = hash[inner_key]. C'est bon pour moi

Copy link
Contributor

@jvigne jvigne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merci pour cette PR, dans l'ensemble c'est bon. Je t'ai suggéré quelques améliorations liées au coding style et je t'ai posé une question sur LocaleWordingToHash

@felginep
Copy link
Contributor Author

Pour le problème de perf, je vois qu'on a un code similaire ici. Je vais voir pour le modifier également.

@jvigne
Copy link
Contributor

jvigne commented Aug 25, 2021

Super, cette modif devrait avoir beaucoup d'impact pour la majorité des utilisateurs car il est impliqué dans l'export depuis une spreadsheet

Comment on lines +49 to +53
def ==(o)
o.class == self.class &&
o.raw_label == raw_label
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent, c'est super d'y avoir pensé

end
translations.concat(new_translations)
current_key = row_translations.first.key
next if validator.has_warnings?(current_key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C'est beau 😄

@felginep felginep merged commit 0c5cef4 into main Aug 26, 2021
@felginep felginep deleted the feature/bugfix branch August 26, 2021 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants