Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Locale-independent strtod #5988

Merged
merged 1 commit into from
Mar 13, 2014
Merged

Conversation

tknopp
Copy link
Contributor

@tknopp tknopp commented Feb 28, 2014

Implement and use locale-independent strtod function. fixes #5928. Should also work on windows but I cannot test.

@JeffBezanson
Copy link
Member

Nice! We also need this for strtof in builtins.c.

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

yep. please wait...

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

... done. But seems to not work on linux (according to travis)

@JeffBezanson
Copy link
Member

The functions are there, but not exposed by standard headers.

  1. To expose strtod_l, define __USE_GNU (based on reading the headers)
  2. To expose newlocale and LC_ALL_MASK, define __USE_XOPEN2K8.

This seems pretty sketchy and I don't know how portable it will be.

@JeffBezanson
Copy link
Member

Some sources seem to define _GNU_SOURCE for this.

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

newlocale is posix and after including locale.h (instead of xlocale.h on OSX) travis only has problems with strtod_l. I give _GNU_SOURCE a go...

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

hmm still no luck despite the usage of _GNU_SOURCE. On Windows and OSX, these functions are clearly documented by the way.

@JeffBezanson
Copy link
Member

On linux let's just manually add the prototypes for strtod_l and strtof_l.

@JeffBezanson
Copy link
Member

They are mentioned in headers, and I seem to be able to look up their addresses with dlsym.

@JeffBezanson
Copy link
Member

I experimented with your branch. This patch seems to fix it:

diff --git a/src/support/strtod.c b/src/support/strtod.c
index fdc99a4..abe005a 100644
--- a/src/support/strtod.c
+++ b/src/support/strtod.c
@@ -30,7 +30,7 @@ LOCALE_T get_c_locale()
 #if defined(_OS_WINDOWS_)
     c_locale = _create_locale(LC_ALL,"C");
 #else
-    c_locale = newlocale(LC_ALL_MASK, NULL, NULL);
+    c_locale = newlocale(LC_ALL_MASK, "C", NULL);
 #endif
   }
   return c_locale;

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

Great I think we got it. Just for reference: https://developer.apple.com/library/ios/documentation/System/Conceptual/ManPages_iPhoneOS/man3/newlocale.3.html#//apple_ref/doc/man/3/newlocale

"In order to create a C locale_t value, use newlocale(LC_ALL_MASK, NULL, NULL)"

Now we just need someone to check if this is working on Windows.

@JeffBezanson
Copy link
Member

We might want to use "C" on linux only then.

@tknopp
Copy link
Contributor Author

tknopp commented Feb 28, 2014

Not sure about that. We also use setlocale(LC_NUMERIC, "C") in libsupportinit.c using "C" is also valid on OSX. Its just that there is also an alternative way.
Further I can confirm that it works on OSX.

@JeffBezanson
Copy link
Member

Ok, good.

@JeffBezanson
Copy link
Member

@ihnorton @cmundi Could somebody test this branch on windows?

@ihnorton
Copy link
Member

ihnorton commented Mar 1, 2014

looks like the exports aren't quite right:

../support/libsupport.a(strtod.o):strtod.c:(.text+0x34): undefined reference to `_create_locale'
../support/libsupport.a(strtod.o):strtod.c:(.text+0x97): undefined reference to `_create_locale'
../support/libsupport.a(strtod.o):strtod.c:(.text+0xd4): undefined reference to `_strtof_l'
../support/libsupport.a(strtod.o):strtod.c:(.text+0xf7): undefined reference to `_create_locale'
/cmn/mgwx/cross-w64-20131001/bin/../lib/gcc/x86_64-w64-mingw32/4.8.1/../../../../x86_64-w64-mingw32/bin/ld: ../support/libsupport.a(strtod.o): bad reloc address 0x0 in section `.pdata'

@ihnorton
Copy link
Member

ihnorton commented Mar 1, 2014

Ah, I see those are library functions. Supposedly Wine crt has them but I'll have to dig around and see why they are not found in my cross compile.

@cmundi
Copy link
Contributor

cmundi commented Mar 1, 2014

Wow. I go to get a snack and before I get back you have something to test. :) Looks like you have it figured out, but I'll look it over anyway,

@cmundi
Copy link
Contributor

cmundi commented Mar 1, 2014

Friends -- in all locales -- I have to vigorously agree with @StefanKarpinski in his comment on #5928. And I was going to write about my commercial experience with this exact discussion, until I realized I would sound like a ranting old FORTR^H^H^H^H^H codger. Instead I'll just say that I prefer to live with the problem in exchange for unambiguous portability of code and data. GUIs are a different story of course.

But if I'm wrong and people really want this, I'll help test and file off any rough edges.

@ihnorton
Copy link
Member

ihnorton commented Mar 1, 2014

Yeah, this appears to be problematic. MinGW doesn't have xlocale.h, and the MinGW runtime does not include _create_locale. [edit: libintl is not applicable]

@ihnorton
Copy link
Member

ihnorton commented Mar 1, 2014

msvcr has everything except strtof... but we can't use that anyway due to GPL (well, we can, but we can't redistribute it). baarrgh....

@cmundi
Copy link
Contributor

cmundi commented Mar 2, 2014

@ihnorton Avoid MSVCR at all costs unless you're developing on Windows for Windows only. There is no return from that path. :) On this point, the GPL got it right.

@tknopp
Copy link
Contributor Author

tknopp commented Mar 2, 2014

So this means that we don't use the native C library on windows? Because http://msdn.microsoft.com/de-de/library/kxsfc1ab(v=vs.90).aspx indicates that the function should be there.

Anyway, maybe the next step is to try the C++ solution but it has a little different interface. In particular I don't see how it will be possible to maintain the second parameter to strtod. I suppose that an exception will be thrown when the conversion is not valid.

@tknopp
Copy link
Contributor Author

tknopp commented Mar 2, 2014

Just for the record. The C++ version would look something like the following. @JeffBezanson I have not looked in detail on the usages of strtod but would this be ok?

#include <sstream>  
double strtod_c(const char* str, int* failed)
{
  std::stringstream os;
  os.imbue(std::locale("C"));
  double ret = 0.0;
  os << str;
  os >> ret;
  (*failed) = stream.fail();
  return ret;
}

@ihnorton
Copy link
Member

ihnorton commented Mar 2, 2014

They have strod_l but not strtof_l (the deprecated equivalent is atof_l, but the signature is different and overflow behavior is undefined).

@JeffBezanson
Copy link
Member

In builtins.c we are already using strtod instead of strtof on windows. So let's use strtod_l and cast.

@ihnorton
Copy link
Member

ihnorton commented Mar 2, 2014

That makes sense, but AFAIK would still require linking against MSVC runtime.

@JeffBezanson
Copy link
Member

Is the MSVC runtime not available on all windows systems?

@jiahao
Copy link
Member

jiahao commented Mar 2, 2014

IIRC the MSVC runtime was removed from the Windows base OS some time ago, and its license prohibits redistribution in binary form.

@ihnorton
Copy link
Member

ihnorton commented Mar 2, 2014

http://support.microsoft.com/kb/326922

"The shared CRT DLL has been distributed by Microsoft in the past as a shared system component. This may cause problems when you run applications that are linked to a different version of the CRT on computers that do not have the correct versions of the CRT DLL installed. This is commonly referred to as the "DLL Conflict" problem. To address this issue, the CRT DLL is no longer considered a system file, therefore, distribute the CRT DLL with any application that relies on it."

@tknopp tknopp deleted the strtod branch March 13, 2014 19:39
@vtjnash
Copy link
Member

vtjnash commented Mar 14, 2014

wow, i had no idea this would be so crazy to implement

(PS. is it some deep secret that we have always linked against msvcrt.dll?)

static locale_t c_locale;

locale_t get_c_locale()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for BSD, it should be safe to always return NULL, according to the xlocale man page

@tknopp
Copy link
Contributor Author

tknopp commented Mar 14, 2014

@vtjnash If we link against msvcrt.dll we should be able to use _strtod_l. My initial version of this PR used it.

@ihnorton
Copy link
Member

@vtjnash maybe you can explain why msvcrt.dll has _strtod_l but not _create_locale. I sure can't. If there is another way to get a _locale_t so that we can actually call _strtod_l that would be great. Google tells me we are not the first project to run in to this problem.

On Windows 7:

image
image

(as per my first comment in this thread), MinGW's libmsvcrt.a (which we statically link) does not have it either.

ihnorton@julia:~/mgwx/mingw-w64-dgn/source/mingw-w64-v3.0.0/mingw-w64-crt/lib64$ grep create_locale *
msvcr100.def.in:__create_locale
msvcr100.def.in:_create_locale
msvcr110.def.in:__create_locale
msvcr110.def.in:_create_locale
msvcr110.def.in:_wcreate_locale
msvcr90d.def.in:__create_locale
msvcr90d.def.in:_create_locale
msvcr90.def.in:__create_locale
msvcr90.def.in:_create_locale
ihnorton@julia:~/mgwx/mingw-w64-dgn/source/mingw-w64-v3.0.0/mingw-w64-crt/lib64$ grep strtod_l *
msvcr100.def.in:_strtod_l
msvcr110.def.in:_strtod_l
msvcr90d.def.in:_strtod_l
msvcr90.def.in:_strtod_l
msvcrt.def.in:_strtod_l

@tkelman
Copy link
Contributor

tkelman commented Mar 14, 2014

(PS. is it some deep secret that we have always linked against msvcrt.dll?)

Why would it be? Quoting mingw.org,

[MinGW] does depend on a number of DLLs provided by Microsoft themselves, as components of the operating system; most notable among these is MSVCRT.DLL, the Microsoft C runtime library.

Or for mingw-w64, http://sourceforge.net/apps/trac/mingw-w64/wiki/download%20filename%20structure

The mingw-w64 project maintains a C interface to msvcrt.dll that allow GNU GCC to run on and build for native windows
GCC is the compiler used with the mingw-w64 C interface to msvcrt.dll

@tkelman
Copy link
Contributor

tkelman commented Mar 15, 2014

This was bound to happen sooner or later, but this change broke building on RHEL/CentOS 5, where the newest version of GCC available from the package manager is 4.4. Error messages are here, expected declaration specifiers or "..." before "locale_t" and others: https://gist.github.com/tkelman/9563102

That distribution is terrible and should really come with a warning label that all the software versions are already obsolete as soon as you buy the machine, and annoyingly difficult to upgrade. But it's unfortunately still fairly common in the wild and difficult to replace for users who didn't know any better to pick a different distribution.

Can anyone suggest a fix? If not, I'll try searching for a yum repo with newer GCC and document my findings in the relevant section of Readme.md.

@tknopp
Copy link
Contributor Author

tknopp commented Mar 15, 2014

@tkelman: This should be fixable. There is a big #if block in strtod.c and currently the decision is made on the operating system. For gcc versions that don't support locale_t and/or strtod_l, the #else block should be taken.
In Glib they have this USE_XLOCALE define. Not sure if this comes directly from gcc but having a macro that tells us whether the necessary functions are the is IMHO the way to go. Since you have access to RHEL/CentOS 5, could you have a look at this issue?

@tkelman
Copy link
Contributor

tkelman commented Mar 15, 2014

Yeah, that was fairly simple. Changing the #if !defined(_OS_WINDOWS_) to #ifdef USE_XLOCALE appears to have worked, tests are passing. It looks like USE_XLOCALE isn't defined with MinGW or MSVC, which is consistent. Can folks check on Mac and recent Linux, gcc and clang, that it'll go down the code path you want it to?

(Found a yum repo here http://superuser.com/questions/381160/install-gcc-4-7-on-centos that looks promising, haven't tried building julia with it yet though)

@tknopp
Copy link
Contributor Author

tknopp commented Mar 15, 2014

Does not work on OSX as it uses the wrong branch.
Could you try #if defined(__APPLE__) || defined( HAVE_XLOCALE_H)

Unfortunately I don't have access to linux and cannot test it

@tkelman
Copy link
Contributor

tkelman commented Mar 15, 2014

Works on old-GCC Linux, and neither define is set in MinGW or MSVC. Someone else will have to chime in about recent-GCC or Clang with Linux.

@tknopp
Copy link
Contributor Author

tknopp commented Mar 15, 2014

Cool. For testing on Linux it it is important to check that the right branch is taken!

To the _create_locale issue: We could make a binary dump of whats behind a C locale...

@tkelman
Copy link
Contributor

tkelman commented Mar 15, 2014

For testing on Linux it it is important to check that the right branch is taken!

Yeah, it may be more useful to just compile and report the output of something dumb like

#define _GNU_SOURCE
#include <stdlib.h>
#include <locale.h>
#include <stdio.h>

int main() {
#ifdef HAVE_XLOCALE_H
printf("HAVE_XLOCALE_H defined\n");
#else
printf("HAVE_XLOCALE_H not defined\n");
#endif
#ifdef __APPLE__
printf("__APPLE__ defined\n");
#else
printf("__APPLE__ not defined\n");
#endif
return 0;
}

@pao
Copy link
Member

pao commented Mar 15, 2014

If you use #warning instead of printf you don't even have to run it, just read the compiler output.

@tknopp tknopp mentioned this pull request Apr 1, 2014
@quinnj
Copy link
Member

quinnj commented Jun 24, 2014

I think issue is dead and done with, but I'll just mention that double-conversion has a strtod implementation. Not sure if there would be any advantages to doing a julia implementation (it would use some of my new grisu code machinery), but it wouldn't be too much work if we wanted it for some reason.

@quinnj quinnj mentioned this pull request Jun 24, 2014
@tknopp
Copy link
Contributor Author

tknopp commented Jun 24, 2014

@quinnj Does your julia implementation has hex support? This was/is one major issue here.

But much more fundamental, strtod is used in the parser (flisp) and flisp cannot call julia code.

@StefanKarpinski
Copy link
Member

The Grisu algorithm is serious overkill for base 2^k printing – that's very straightforward, I just never got around to implementing it. Having support for other non-power-of-two bases would be super cool though :-)

@quinnj
Copy link
Member

quinnj commented Jun 25, 2014

@tknopp, unfortunately it sounds like no hex support from the strtod.h

// The buffer must only contain digits in the range [0-9]. It must not
// contain a dot or a sign. It must not start with '0', and must not be empty.
double Strtod(Vector<const char> buffer, int exponent);

To your next point, have you seen JuliaParser? :)

@StefanKarpinski, I'm not sure I understand your point; should we special case certain floats to print faster than calling grisu?

@StefanKarpinski
Copy link
Member

@tknopp was asking about hex printing – my point was just that the Grisu algorithm is serious overkill for printing floats in base 16.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make parsing float literals locale-independent