ToWords should be localised #15

Closed
MehdiK opened this issue Aug 30, 2013 · 17 comments

@MehdiK
Member

MehdiK commented Aug 30, 2013

No description provided.

@hazzik
Member

hazzik commented Aug 30, 2013

  1. In Russian, the hundreds have their own names:
    • 100 - сто
    • 200 - двести
    • 300 - триста
    • 400 - четыреста
    • 500 - пятьсот
    • 600 - шестьсот
    • 700 - семьсот
    • 800 - восемьсот
    • 900 - девятьсот
  2. Numbers have declension and gender.

For example:

  • 1 234 000 = один (one) миллион (million) двести (two hundred) тридцать (thirty) четыре (four) тысячи (thousand)
  • 1 234 = одна (one) тысяча (thousand) двести (two hundred) тридцать (thirty) четыре (four)
  • 200 000 = двести (two hundred) тысяч (thousand)
  • 2 000 000 = дв_а_ (two) миллиона (million)
  • 2 000 = дв_е_ (two) тысячи (thousand)

So the keys really need to be customizable.
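
A rough sketch of the two points above, for illustration only. This is not Humanizer's code, and every name in it (RussianSketch, Hundreds, Choose, Thousand, Million, One, Two) is hypothetical. It covers just the hundreds table and the agreement between a count and the words for "thousand" and "million"; the 11-14 exception is standard Russian grammar rather than something stated in this thread.

```csharp
using System;

// Hypothetical sketch, not the library's implementation.
public static class RussianSketch
{
    // Hundreds have their own names; index 1 => 100, ..., 9 => 900.
    private static readonly string[] HundredsMap =
    {
        "", "сто", "двести", "триста", "четыреста", "пятьсот",
        "шестьсот", "семьсот", "восемьсот", "девятьсот"
    };

    public static string Hundreds(int digit) => HundredsMap[digit];

    // Picks one of three noun forms from the count in front of the noun:
    // ending in 1 => one, ending in 2-4 => few, anything else => many.
    // Counts ending in 11-14 always take the "many" form.
    public static string Choose(int count, string one, string few, string many)
    {
        int lastTwo = count % 100;
        int last = count % 10;
        if (lastTwo >= 11 && lastTwo <= 14) return many;
        if (last == 1) return one;
        if (last >= 2 && last <= 4) return few;
        return many;
    }

    // "тысяча" (thousand) is feminine, "миллион" (million) is masculine.
    public static string Thousand(int count) => Choose(count, "тысяча", "тысячи", "тысяч");
    public static string Million(int count) => Choose(count, "миллион", "миллиона", "миллионов");

    // "one" and "two" agree in gender with the noun that follows them.
    public static string One(bool feminine) => feminine ? "одна" : "один";
    public static string Two(bool feminine) => feminine ? "две" : "два";
}
```

With these helpers, 2 000 would be assembled from Two(feminine: true) and Thousand(2), and 2 000 000 from Two(feminine: false) and Million(2), which is exactly the difference shown in the last two examples.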

@MehdiK
Member Author

MehdiK commented Aug 30, 2013

Hmmm, you lost me there! You are right - this takes a flexible solution; but I am not sure what that is as I don't even know the requirements for it. So I guess you either have to teach me a bit of Russian or tackle this yourself ;)

@hazzik
Member

hazzik commented Sep 5, 2013

I've added some more explanations and I hope this is now clearer.

@MaximRouiller

Maybe have a provider that could, given a number, return the proper string that represents it.

Would look like this:

var provider = new FrenchNumberOrdinalizer();
var myString = 1025.ToWords(provider);

Just an idea.

@MehdiK
Member Author

MehdiK commented Oct 16, 2013

Thanks for the nice idea @MaximRouiller

I think the translation logic is so intertwined into the implementation and the variations on different locales are so diverse and different that any provider or localisation effort would require a re-implementation of the whole thing. That said it could be helpful for apps with globalization needs to have this implemented one way or another.

@hazzik
Member

hazzik commented Feb 4, 2014

What are we going to do if we do not have "Ordinalizer" for specific culture?

@MehdiK
Member Author

MehdiK commented Feb 4, 2014

There is an ongoing effort to add Arabic ToWords. There was absolutely nothing to be reused from the current ToWords implementation; but the implementation of the Arabic ToWords, although done from scratch, is clean and fits nicely into the existing API.

I think different languages have different enough rules around ToWords to warrant reimplementation. Reuse here seems to come with more pain than gain, and this is one of those cases where duplication, if needed, pays off.

@hazzik
Member

hazzik commented Feb 4, 2014

> There was absolutely nothing to be reused from the current ToWords implementation

But I see a lot of similarities.

I mean an IOrdinalizer interface with a single ToWords method, along the same lines as IFormatter for TimeSpans. So my question was: what are we going to do if ToWords is not implemented for the current locale?

@MehdiK
Member Author

MehdiK commented Feb 4, 2014

Ah, sorry. Yeah, we could create something similar to IFormatter when there are more localisations for ToWords.

Just like other methods, if the locale is not implemented, we'll just default to English. That will encourage more contribution to localisation :)
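
A minimal sketch of that fallback, assuming one converter per language behind a small interface. All names here (INumberToWords, EnglishStub, ArabicStub, ConverterRegistry) are made up for illustration and are not Humanizer's API; the stub converters only tag the number with a language code instead of spelling it out.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical interface: one implementation per language.
public interface INumberToWords
{
    string Convert(int number);
}

public sealed class EnglishStub : INumberToWords
{
    // Placeholder only: a real converter would spell the number out.
    public string Convert(int number) => "en:" + number;
}

public sealed class ArabicStub : INumberToWords
{
    // Placeholder only, as above.
    public string Convert(int number) => "ar:" + number;
}

public static class ConverterRegistry
{
    private static readonly INumberToWords Default = new EnglishStub();

    // Keyed by two-letter ISO language code, e.g. "ar".
    private static readonly Dictionary<string, INumberToWords> Converters =
        new Dictionary<string, INumberToWords>
        {
            { "ar", new ArabicStub() }
        };

    // Returns the converter registered for the language, or English if none is.
    public static INumberToWords For(string twoLetterLanguage)
    {
        INumberToWords converter;
        return Converters.TryGetValue(twoLetterLanguage, out converter)
            ? converter
            : Default;
    }

    // Convenience overload for a CultureInfo such as the current UI culture.
    public static INumberToWords For(CultureInfo culture) =>
        For(culture.TwoLetterISOLanguageName);
}
```

Adding a language then only means adding one dictionary entry, and any language without an entry silently gets the English converter.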

@hazzik
Member

hazzik commented Feb 4, 2014

Also I want to make GrammaticalCase, GrammaticalGender and GrammaticalNumber first class citizens in Humanizer

@MehdiK
Member Author

MehdiK commented Feb 4, 2014

Arabic support was added to ToWords in #73. This needs a fair bit of clean up both in the Arabic code and in where the code should live.

@akamud
Contributor

akamud commented Apr 10, 2014

I'm doing pt-BR and Portuguese also has a few particularities:

As in Russian, the hundreds in Portuguese have their own names:

  • 100 - cem
  • 200 - duzentos
  • 300 - trezentos
  • 400 - quatrocentos
  • 500 - quinhentos
  • 600 - seiscentos
  • 700 - setecentos
  • 800 - oitocentos
  • 900 - novecentos

"100" changes depending on whether the number is exactly 100:

  • 100 - cem
  • 101 - cento e um (one hundred and one)
  • 102 - cento e dois (one hundred and two)

Numbers sometimes have plurals:

  • 1000 - mil (one thousand)
  • 2000 - dois mil (two thousand) (doesn't change for thousands)
  • 1000000 - um milhão (one million)
  • 2000000 - dois milhões (two million)

Numbers sometimes accept two names:

  • 14 - quatorze OR catorze

The leading "um" (one) is sometimes omitted:

  • 1000000 - um milhão
  • 1000 - mil (instead of um mil). While "um mil" can also be used as it is also technically correct, it is never used anywhere.

They also sometimes have genders:

  • 1 car -> um carro
  • 1 apple -> uma maçã
  • 2 cars -> dois carros
  • 2 apples -> duas maçãs
  • 3 cars -> três carros
  • 3 apples -> três maçãs (same for both genders)

Since we still don't have a way of providing gender differences, I'll be doing it in the masculine gender, because it is the predominant gender in the Portuguese language: if there are two words in a sentence, one masculine and one feminine, the masculine form is used to describe them both at the same time.
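
The special cases listed above could be sketched like this. It is illustrative only, not the pt-BR implementation being described, and the names (PortugueseSketch, HundredsWord, Thousands, Millions, Small) are hypothetical. It deliberately leaves out the "e" conjunction rules and the full number tables.

```csharp
using System;

// Hypothetical sketch of the Portuguese special cases, not a full converter.
public static class PortugueseSketch
{
    // Index 1 is "cento", the form used for 101-199; exactly 100 is "cem".
    private static readonly string[] HundredsMap =
    {
        "", "cento", "duzentos", "trezentos", "quatrocentos", "quinhentos",
        "seiscentos", "setecentos", "oitocentos", "novecentos"
    };

    // Word for the hundreds part of a number in the range 0-999.
    public static string HundredsWord(int number)
    {
        if (number == 100) return "cem";
        return HundredsMap[number / 100];
    }

    // Thousands: "um" is dropped for 1000 and "mil" has no plural.
    public static string Thousands(int count, string countInWords) =>
        count == 1 ? "mil" : countInWords + " mil";

    // Millions: "um" is kept and the noun takes a plural.
    public static string Millions(int count, string countInWords) =>
        count == 1 ? "um milhão" : countInWords + " milhões";

    // The gender examples from the list: 1 and 2 change, 3 does not.
    public static string Small(int number, bool feminine)
    {
        switch (number)
        {
            case 1: return feminine ? "uma" : "um";
            case 2: return feminine ? "duas" : "dois";
            case 3: return "três";
            default: throw new ArgumentOutOfRangeException(nameof(number));
        }
    }
}
```

A masculine-only implementation, as described above, would amount to always passing feminine: false until there is a way to supply the gender.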

@thunsaker
Contributor

Take a look at my PR #151 for suggestions. Spanish has many similarities, obviously. :)

@akamud
Contributor

akamud commented Apr 10, 2014

@thunsaker thank you for the heads up, I used your ES commit to optimize some of my functions and also some of my tests.

@MehdiK
Member Author

MehdiK commented Apr 10, 2014

Thanks @akamud for your comment and @thunsaker for the response. @akamud - please check out the further discussion on #151. In #149 some changes were introduced which should be used as the foundation for ToWords localisations.

There is also some work under way on grammatical gender, number and case, as per #74, which might change the way you're implementing your ToWords.

@hazzik
Member

hazzik commented Apr 10, 2014

I think gender is the trickiest part, as the same word can have a different gender in different languages.

@MehdiK
Member Author

MehdiK commented Apr 11, 2014

Closing this now as there is proper infrastructure in place to deal with it (as of #149), and it's just a matter of adding more localisations, and many seem to be coming through!

MehdiK closed this as completed Apr 11, 2014