Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strtod issue for MSVC #6349

Closed
tknopp opened this issue Apr 1, 2014 · 60 comments
Closed

strtod issue for MSVC #6349

tknopp opened this issue Apr 1, 2014 · 60 comments
Labels
system:windows Affects only Windows

Comments

@tknopp
Copy link
Contributor

tknopp commented Apr 1, 2014

In #6230 it came up that that the flisp unit test did not compile when using MSVC. The issue is that the strtod function does not support hex values when using the MSVC runtime. I have just tried this out on VS2010 and can confirm this.

Funnily this means that all the attempts in #5988 to use the MSVC runtime in order to use _strtod_l would have been useless.

To solve the issue I think we have to try
http://mirror.fsf.org/pmon2000/3.x/src/sdk/libc/stdlib/strtod.c
again. I once got it almost working but had an error which was actually unrelated (forgot to cleanall).

Would be intersting if the ability to parse hex values is really required for running julia.

(notify @tkelman out this issue)

@tkelman
Copy link
Contributor

tkelman commented Apr 1, 2014

Interesting, I missed that you had also seen an overflow from 0 while developing that PR. Digging through the Travis logs (I wish those were searchable in an easier way...) from 5988 I see you encountered the same flisp unit test failure too. That's a good sign that this should be fixable. Having to add more C code just for MSVC to work is annoying, but I guess that's the price you have to pay to try to use such a dramatically different runtime without MinGW's headers hiding away all the differences.

Hmm, looks like MinGW is using very closely-related code: https://github.com/mirror/mingw-w64/tree/master/trunk/mingw-w64-crt/gdtoa

@tknopp
Copy link
Contributor Author

tknopp commented Apr 1, 2014

Ah yes, now that you mention it I also remember the overflow when I typed 0.0.

I have looked a little bit closer to
http://mirror.fsf.org/pmon2000/3.x/src/sdk/libc/stdlib/strtod.c
but can't find 'x' in there so maybe this does not support hex either.

Another candidate is
http://svnweb.freebsd.org/base/stable/10/contrib/gdtoa/strtod.c?revision=256281
where I can find the 'x' but not the 'p' so I am not entirely sure if it supports hex. I just know that it was not so easy to integrate as it pulls in a lot of macros defined in other headers.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 1, 2014

For reference:
Here is a Microsoft issue telling us that this won't be fixed with VS2013:
https://connect.microsoft.com/VisualStudio/feedback/details/794104/strtod-does-not-implement-the-c99-c-11-standard

We are not alone:
http://lua-users.org/lists/lua-l/2007-01/msg00527.html

The later makes me wonder if parsing hex floats is really required to run Julia. Maybe could you comment on this Jeff (@JeffBezanson) ?

@tkelman
Copy link
Contributor

tkelman commented Apr 2, 2014

But it will be in the "next major release" whenever that is?

If hex parsing is actually required for Julia, maybe it could be created as an enhancement patch to the double-conversion library? The MinGW-w64 implementation for hex string-to-double appears to be here which calls gethex. Since double-conversion already has all the bignum machinery in place and is based on pretty much the same code for the fallback case, it might be easier to add the hex capability there.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

The MinGW implementation of that code actually looks cleaner than the freebsd one as all the #ifdefs for KR_headers are missing. Maybe stripping down the code from gdtoa is the easiest way to solve this issues. I am not sure how much (and which) files are required to get all the dependencies for strtod.c satisfied. The good thing is that we could throw away most of the code in our current strtod.c file as we can adapt this one to always use . as decimal delimiter

@tkelman
Copy link
Contributor

tkelman commented Apr 2, 2014

It looks like you might need the entire Bigint implementation, for Balloc (in misc.c) etc. That's why I suspect improving double-conversion (https://code.google.com/p/double-conversion/source/browse/#git%2Fsrc) to handle hex could end up requiring less additional code overall, since they have an alternate implementation of most of the same things the David Gay code does. You might be more concerned with code integration effort than code volume, so go for whichever approach you feel like taking.

The issue where hex-floats were added to Julia and flisp was #2863, and PR 2925 referenced therein. Hard to tell how much they're used in application code. I might try a cherry-pick reversal of that commit, see if it lets MSVC get any further.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

Ok but the double conversion library is currently not a build time dependency for libjulia, right? More problematic is that the interface of the c++ code is not the same as for strtod. More precisely, the flisp parser uses the endpoint parameter which seems to be not available in that implementation.

If this would not be the issue I would try the most simple C++ solution using stringstreams.

@tkelman
Copy link
Contributor

tkelman commented Apr 2, 2014

No, currently it is not. But adding new build time dependencies for libjulia, at least if they are small, does not seem to be too bad - for example utf8proc was added a few months ago in PR 5462.

Hm okay if the API is significantly different then that may be a non-starter.

If there's a really easy C++ solution, why not use it for now, but only in the case of MSVC where we know the runtime's strtod is deficient? err nevermind I guess you're saying there are aspects of flisp's usage of strtod that are more closely tied to the expected API of strtod than just the basic functionality? Yep, you can see the uses of the end pointer pretty clearly in builtins.c and flisp/read.c.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

The issue with the C++ solution is that it does not give us the endpoint character. I am not sure how bad this is. Jeff indicated in #5988 (comment) that it would be a problem.

The C++ solution definately solves the delimiter issue. I am not sure about the hex thing though.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

Tried it with VS2010 and using the stringstream method did not support hex floats...

@tkelman
Copy link
Contributor

tkelman commented Apr 2, 2014

Bah. Seems silly to me to have to re-implement a piece of the C runtime, but I guess it's either that or selectively disable the hex-parsing capability somehow?

I'm also wondering out loud whether the more serious syntax: overflow in numeric constant "0" problem might not have anything to do with hex parsing?

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

Indeed this is silly. And actually its as silly as one of the major C compiler vendor not implementing C99...

I am also not sure if the overflow error has anything to do with the strtod hex issue. One way to check this is to enable the fallback implementation of strtod_c in strtod.c and jump to some error when an x is detected (like in the original python implementation)

@tknopp
Copy link
Contributor Author

tknopp commented Apr 2, 2014

I have just tried what I suggested in my previous post and it worked (on OSX): no hex floats working but 0.0 did not overflow. @tkelman: Could you try the following: In strtod.c line 148 you could introduce an goto invalid_string; if we are compiling with MSVC. This will finish this function cleanly.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

You meant inside the if statement about hex floats, right? That's a few lines lower now.

But yeah, tried it, still overflow in numeric constant "0": https://ci.appveyor.com/project/tkelman/julia/build/1.0.334

How do we attempt to debug that?

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Yes I meant inside the if statement.

Hm, not sure how to go further. I am not entirely sure at which point the error happens. Is it when compiling the system image?

I am right that you can currently only run this through AppVeyor and thus have no possibilities to debug it direktly, right?

What comes to my mind is to get libjulia.dll and julia.exe out of AppVeyor (from a debug build) and debug this locally with WinDbg or Visual Studio. Maybe Jamson has an idea (@vtjnash)?

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

Yes it's while building the system image, but the error looks like it's coming from the parser.

I installed VS 2013, but I couldn't get your build of LLVM to work. Errors about conflicting stdlibs, libcmt (/MT) vs msvcrt (/MD). I'm not sure which the default LLVM Cmake setup uses, versus what was used in these VS 2012 builds that I've been using on AppVeyor.

What do we need for a debug build, besides /Zi flag on the compiler and/or /DEBUG flag on the linker? I also probably need to upload the pdb file (?), but I'm not sure where MSVC is creating those by default.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

I think the flags will be sufficiant. Then one has to put files inside of a julia development environement, execute the command to build the sys image and attach via a debugger. If you could get a backtrace from where the error happens, this will certainly help understanding whats going on.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

I can't get a backtrace because I have no idea how to set breakpoints for errors that happen in julia-parser.scm. If anyone has any better ideas, here is a debug build of libjulia.dll, julia.exe, and libgetopt.dll. The getopt port is LGPL, from here http://www.codeproject.com/Articles/157001/Full-getopt-Port-for-Unicode-and-Multibyte-Microso

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Could you have a look at this thread where Jameson gives some very good debugging intructions: JuliaGraphics/Gtk.jl#69
In the end we did a remote session where he did some debugging ninja action which only can be done when one has sufficient understanding of the Julia source.

Anyway. The trick is to set a breakpoint at jl_throw and then look at the backtrace. using lldb/gdb it helps a lot to execute jl_(ex) on expression (prints it in a nice way). Not sure if this is possible on windows.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

I've had some luck with debugging and breakpoints in the past with the assembly issues or other system image crashes, but with this one I can't seem to get any useful debugging done - it's not stopping at any of the breakpoints I've set. If I ask for a call stack after the error has already been shown, I get this:

Child-SP          RetAddr           Call Site
00000000`0017ed48 00000000`77a941bb ntdll!ZwTerminateProcess+0xa
00000000`0017ed50 000007fe`e3d91d78 ntdll!RtlExitUserProcess+0x9b
00000000`0017ed80 000007fe`e342216b libjulia!uv_disable_stdio_inheritance+0x90b0b8
00000000`0017edf0 000007fe`e3354342 libjulia!jl_uv_readcb+0xebaa9
00000000`0017ee20 000007fe`e340a999 libjulia!jl_uv_readcb+0x1dc80
00000000`0017ef10 000007fe`e3408e08 libjulia!jl_uv_readcb+0xd42d7
00000000`0017f360 000007fe`e341fde6 libjulia!jl_uv_readcb+0xd2746
00000000`0017f390 000007fe`e341fbf6 libjulia!jl_uv_readcb+0xe9724
00000000`0017f540 00000001`3f18178a libjulia!jl_uv_readcb+0xe9534
00000000`0017f610 00000001`3f18191d julia+0x178a
00000000`0017f780 000007fe`e34118e4 julia+0x191d
00000000`0017f900 00000001`3f181b35 libjulia!jl_uv_readcb+0xdb222
00000000`0017f970 00000001`3f1840d6 julia+0x1b35
00000000`0017f9b0 00000000`7796652d julia+0x40d6
00000000`0017f9f0 00000000`77a9c541 kernel32!BaseThreadInitThunk+0xd
00000000`0017fa20 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Don't know whether or not that's meaningful.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

Got LLVM built locally so now I'm getting more useful backtraces:

Child-SP          RetAddr           Call Site
00000000`0026ef58 00000000`778e41bb ntdll!ZwTerminateProcess+0xa
00000000`0026ef60 000007fe`ef0d29a7 ntdll!RtlExitUserProcess+0x9b
*** WARNING: Unable to verify checksum for D:\code\msys64\home\Tony\julia\usr\bin\libjulia.dll
00000000`0026ef90 000007fe`e5fb034c MSVCR120!doexit+0x1c7 [f:\dd\vctools\crt\crtw32\startup\crt0dat.c @ 678]
00000000`0026f000 000007fe`e5eec372 libjulia!jl_exit+0x1c [d:\code\msys64\home\tony\julia\src\jl_uv.c @ 705]
00000000`0026f030 000007fe`e5f9887a libjulia!jl_errorf+0x92 [d:\code\msys64\home\tony\julia\src\builtins.c @ 60]
00000000`0026f110 000007fe`e5f96ce8 libjulia!eval+0x18da [d:\code\msys64\home\tony\julia\src\interpreter.c @ 446]
00000000`0026f560 000007fe`e5fadf72 libjulia!jl_interpret_toplevel_expr+0x18 [d:\code\msys64\home\tony\julia\src\interpreter.c @ 26]
00000000`0026f590 000007fe`e5fadd76 libjulia!jl_parse_eval_all+0x192 [d:\code\msys64\home\tony\julia\src\toplevel.c @ 522]
*** WARNING: Unable to verify checksum for julia.exe
00000000`0026f740 00000001`3f85179a libjulia!jl_load+0xb6 [d:\code\msys64\home\tony\julia\src\toplevel.c @ 553]
00000000`0026f810 00000001`3f85192e julia!exec_program+0x10a [d:\code\msys64\home\tony\julia\ui\repl.c @ 217]
00000000`0026f980 000007fe`e5f9f8f4 julia!true_main+0x10e [d:\code\msys64\home\tony\julia\ui\repl.c @ 269]
00000000`0026fb00 00000001`3f851b55 libjulia!julia_trampoline+0xe4 [d:\code\msys64\home\tony\julia\src\init.c @ 875]
00000000`0026fb70 00000001`3f8521ff julia!main+0x75 [d:\code\msys64\home\tony\julia\ui\repl.c @ 315]
00000000`0026fbb0 00000000`776b652d julia!__tmainCRTStartup+0x10f [f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c @ 626]
00000000`0026fbe0 00000000`778ec541 kernel32!BaseThreadInitThunk+0xd
00000000`0026fc10 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

@vtjnash
Copy link
Member

vtjnash commented Apr 3, 2014

If you walk the stack back to the jl_error function, you may be able to print the error message Julia wanted to print there

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

From interpreter.c:446 one can see that ex->head == error_sym || ex->head == jl_incomplete_sym. @vtjnash do you remember where in flisp one has to set the brakepoint to see when the parser fails?

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

I know what the error is, it's syntax: overflow in numeric constant "0" (and AFAICT not related to this issue about hex floats...). Trying to see what's causing it is the hard part.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Ok, if I look at src/flisp/read.c:119 I can see that s similar error is triggered when the errno is set to ERANGE. This can happen in strtod.

One try: Could you replace all occurrences (should be 4) of strtod_c by strtod in flisp and julia?

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Another idea is to look at the while loop in jl_parse_eval_all in toplevel.c and do jl_(form) if form is an expression.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

Hm, that is interesting, but the error is saying syntax: overflow rather than read: overflow. Replacing strtod_c with strtod does not appear to have any effect.

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Ok, was just a try. The actual error message is in the parser code as you have already mentioned. I think the most important thing is to understand, what is actually parsed when the error happens. What is the exact command that you call julia.exe?

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2014

cd base && ..\usr\bin\julia --build D:/code/msys64/home/Tony/julia/usr/lib/julia/sys0 sysimg.jl

@tknopp
Copy link
Contributor Author

tknopp commented Apr 3, 2014

Ok, so it basically executes the sysimg.jl file. Now it would be helpful to understand at which line/expression the error occurs. This can be either done be some printf like debugging using jl_ or maybe also by stripping down sysimg.jl. As I have not a lot experience with the system image, I don't know if the later is a good recommendation.

@tkelman
Copy link
Contributor

tkelman commented Jun 8, 2014

As I predicted, looks like the resolution for hex strtod is just waiting for Microsoft to release a newer version of their compiler that is finally caught up with the 15-year-old C standard

C99 library features: Most of the remaining C99 library features are implemented. snprintf is implemented, the printf and scanf families of functions now support the new C99 format string improvements, the strtod and scanf families of functions now support hexadecimal floating-point and library conformance is better improved by software updates and adjustments.

I might give the CTP a try if anyone is still interested in seeing MSVC Julia happen.

@vtjnash
Copy link
Member

vtjnash commented Jun 9, 2014

@tkelman just looked at your last post, and it looks like it ran out of stack space. this is unsurprising, since llvm needs ~2.5MB during sysimg compile, and the windows default is 1MB. we add a flag to all binaries built by mingw to get around this, changing the value to the unix default of 8MB. for gcc/clang the flag is -Wl,--stack,8388608. I don't know what the equivalent flag on msvc is.

an MSVC julia would still be nice to have. getting an issue closed without doing any specific work(-arounds) for it is also awesome.

@tkelman
Copy link
Contributor

tkelman commented Jun 9, 2014

Googling tells me /F for compiling, /stack for linking. I'll have a look, if I can remember where I left the MSVC work.

@tknopp
Copy link
Contributor Author

tknopp commented Jun 17, 2014

@tkelman Might it be a good idea to open a dedicated issue where the state of MSVC compilation of Julia is documented? You have got so far that it would be really nice to see this in actual action.

@tkelman
Copy link
Contributor

tkelman commented Jun 17, 2014

Feels like it would be noise right before a release. I'll get back to this eventually.

@tknopp
Copy link
Contributor Author

tknopp commented Jun 18, 2014

I don't think that all post 0.3 issues have to be silent before a release but of course you are right that it is not a pressing issue currently. I just realized that there are some MSVC compatibility issues (by you and me) that are already closed because the bits have been merged. And thus it would be nice to have an issue that is open as long as full MSVC compilation is achieved.

@jakebolewski
Copy link
Member

Closing due to #7761.

@tkelman
Copy link
Contributor

tkelman commented Oct 28, 2014

No, #7761 does not close this at all. Using the next version of MSVC might, but that's not out for at least another few months.

@JeffBezanson JeffBezanson reopened this Oct 28, 2014
@simonbyrne
Copy link
Contributor

I realise this is probably insane, but would it be at all feasible to implement string to float conversion in julia itself? Or is this required too early in the build step?

@tknopp
Copy link
Contributor Author

tknopp commented Nov 17, 2014

No not insane but it is required quite early in the parser (flisp). So one would have a boostraping issue :-)
Actually I think that just waiting for MSVC to catch up is not the worst of all Options.

@tkelman
Copy link
Contributor

tkelman commented Nov 17, 2014

I think the preview edition of the next visual studio is available now actually, if anyone's feeling adventurous. You'll need to build your own LLVM with it first though.

@simonbyrne
Copy link
Contributor

One (hypothetical) advantage of implementing it ourselves: we could have non-standard literals, e.g. bit float (0b1.0001p3) or oct float (0o1.01p3). Though I admit I may be the only audience for such things.

@mikewl
Copy link
Contributor

mikewl commented Feb 4, 2015

I tested and strod works with hex floating point literals on VS2015.
Also to be noted is that inline assembler on x64 seems to be working now!

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2015

Hey, that is really good news. Is VS2015 officially released yet, or is it still a ctp?

(unrelated but pretty neat - the .NET JIT and GC were just open-sourced: https://github.com/dotnet/coreclr)

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2015

Anyone want to build LLVM (not sure which version - either our patched 3.3, or 3.5.1, or the latest 3.6 branch?) using the latest VS 2015 CTP and post the binaries? I don't have it installed, and it's a pretty huge download.

AppVeyor has a test image with the latest CTP installed, but my LLVM binaries from VS 2013 aren't going to work there.

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2015

@Mike43110 are you sure you were building in 64 bit mode? C or C++ code?

Looks like the VS 2015 CTP is finally able to compile C99 in C mode, only 15 years late. Though I am having some issues with the inline assembly still. (working branch at https://github.com/tkelman/julia/compare/appveyor2)

task.c(330): error C2036: 'void *': unknown size 
task.c(332): error C4235: nonstandard extension used: '__asm' keyword not supported on this architecture 

@mikewl
Copy link
Contributor

mikewl commented Feb 4, 2015

I checked again.
Somehow while having x64 selected it was compiling in 32-bit mode : /
Oh well.

Is an x64 llvm necessary? I have (I think, need to test) a working x86 llvm. It builds fine at least. However, an x64 llvm fails to build.

I built (or I hope I built) llvm-svn. It takes about an hour to build, debug takes even longer. If there are any requirements for how llvm must be built please list. Once I am sure it's correct I will happily post the binaries. I should also be able to build an msvc one whenever its needed. My wonderful internet and the fact that my country is undergoing load-shedding (power cuts) may mean that it will take a few hours to post but hopefully it will help.

Two unrelated notes: The .NET news is quite good! Quite interested to test it once they release build guides for Linux. Hoping they will release gui related things. I understood it that it is only console and asp.net for now. Still progress!
The other note is that don't post at 3am... You tend to miss things. Like a release x64 build compiling as release x86 (still have no idea how I got VS to do that!)

I will download the 3.6 branch and build it as well.

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2015

Actually we won't need llvm binaries until we can figure out a workaround for the inline assembly, so don't worry about it too much at this point.

From some experiments on appveyor, it looks like compiling in 64 bit, in C mode (which now works for C99, it appears) instead of C++, the asm syntax without leading underscores might work, but there are still some error messages about unexpected colons.

@KristofferC
Copy link
Member

KristofferC commented Oct 15, 2019

Think this can be closed since building Julia with MSVC is no longer officially supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
system:windows Affects only Windows
Projects
None yet
Development

No branches or pull requests

8 participants