Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing fails on MinGW #1206

Closed
fredrik-johansson opened this issue Jan 13, 2023 · 23 comments
Closed

Testing fails on MinGW #1206

fredrik-johansson opened this issue Jan 13, 2023 · 23 comments
Milestone

Comments

@fredrik-johansson
Copy link
Collaborator

fredrik-johansson commented Jan 13, 2023

Factoring fails randomly on MinGW in the 2.10 branch:

The following tests FAILED:
	1302 - fmpz_factor-test-t-factor_smooth (Exit code 0xc0000374
)
Error: Process completed with exit code 8.

This seems to be a consequence of #1201 and the subsequent fix d8fc2fd.

On trunk, MinGW currently fails completely, presumably due to missing FLINT_DLL declarations.

@fredrik-johansson fredrik-johansson changed the title Factoring fails on MinGW Testing fails on MinGW Mar 6, 2023
@fredrik-johansson fredrik-johansson added this to the flint-2.10 milestone Mar 6, 2023
@albinahlback
Copy link
Collaborator

As of my PR moving everything to src/, the job fails earlier. Perhaps that just because of a change of directory for libflint.so, but I'm not sure.

I've been trying to fix the complaints from the CI about plugin for lto objects, but I cannot seem to fix it.

@fredrik-johansson
Copy link
Collaborator Author

Some progress understanding the build issue: apparently there is a bug in CMake which means CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS does not work on MinGW: https://gitlab.kitware.com/cmake/cmake/-/issues/22092

One of the suggested workarounds is to manually pass -Wl,--export-all-symbols to the linker.

Actually, the reason why we are seeing the issue in the first place is that we use __declspec (dllexport) for some but not for all functions (missing for the Arb/Calcium/GR modules), making the build system think that only some things should be exported. Removing this entirely enables automatic export and makes the problem go away. This means that we could potentially delete FLINT_DLL entirely, though maybe it's still needed for the _declspec (dllimport)? (Which I don't understand either.)

You can see on this branch https://github.com/fredrik-johansson/flint2/tree/mingw that building almost succeeds with an empty FLINT_DLL definition.

Later on there is still this build failure:

C:/Strawberry/c/lib/libmpfr.a(exceptions.o):(.text+0x0): multiple definition of `mpfr_set_emin'
lib/libflint.dll.a(d015535.o):(.text+0x0): first defined here
C:/Strawberry/c/lib/libmpfr.a(exceptions.o):(.text.unlikely+0x17): multiple definition of `mpfr_get_emin_min'
lib/libflint.dll.a(d015449.o):(.text+0x0): first defined here
C:/Strawberry/c/lib/libmpfr.a(exceptions.o):(.text+0x40): multiple definition of `mpfr_set_emax'
lib/libflint.dll.a(d015534.o):(.text+0x0): first defined here
...

Not sure how to fix that...

@isuruf
Copy link
Member

isuruf commented Mar 16, 2023

FLINT_DLL is needed for MSVC builds. Maybe FLINT_DLL can be empty for mingw only?

@albinahlback
Copy link
Collaborator

ChatGPT tells me that you can do something like link /dump /exports mydll.dll > mydll.def and then manually edit all the symbols to get a proper definition file (I suppose this can be done with sed). Perhaps worth a try?

@albinahlback
Copy link
Collaborator

Had problems with getting a MinGW to work with the regular configure and Makefile (before it used CMake). What solved the problem was to remove all *_DLL (or, equally, #define FLINT_DLL). Then all tests should pass except for MSVC, but there CMake should work.

@albinahlback
Copy link
Collaborator

I am trying to debug with gdb on MinGW via Github CI as I do not have access to any Windows machine. See look at my actions under the branch autoconf_mingw in my repo.

@albinahlback
Copy link
Collaborator

Looks like there is a bug: https://stackoverflow.com/a/74620721/13527214

Perhaps related to this as I only am able to get MinGW to fail with multithreaded make.

@albinahlback
Copy link
Collaborator

I very much believe it's related to the management of files in qsieve under MinGW. Trying to work on it.

@fredrik-johansson
Copy link
Collaborator Author

Yes, as far as I can tell this was working before merging #1201.

@albinahlback
Copy link
Collaborator

So I do think that mkstemp is the better solution here. However, there seems to be multiple bugs with the function on MinGW. I will think of a solution of how to fix this.

In the mean time I will implement a check, done during configuration, to check whether mkstemp works on the system.

@thofma
Copy link
Contributor

thofma commented Mar 18, 2023

In the mean time I will implement a check, done during configuration, to check whether mkstemp works on the system.

Will this hinder cross-compilation? If so, that should be avoided.

@albinahlback
Copy link
Collaborator

albinahlback commented Mar 18, 2023

I believe cross-compilation is not supported (that's what I remember Bill saying [Correction: I cannot back this up, and think I was wrong here.]). However, I would like to support cross-compilation, so if HOST and BUILD are not the same, then that cannot be tested as such tests needs to be executions rather than simply linking or compiling.

@thofma
Copy link
Contributor

thofma commented Mar 18, 2023

We have been cross-compiling for years with the blessing of the flint maintainers. I believe that if flint 3.0 does not support cross-compilation, this would be a big step backwards.

@albinahlback
Copy link
Collaborator

I believe I found the problem... The problem is that on Github's CI servers, we do not have write access to /tmp... Compare these two: working and non-working.

@albinahlback
Copy link
Collaborator

Why do we even need files for this? Do we really require so much space for factoring?

@thofma
Copy link
Contributor

thofma commented Mar 18, 2023

Last time I tried figuring this out I only found #708 as a breadcrumb.

@fredrik-johansson
Copy link
Collaborator Author

No, it can certainly be done in memory. Someone just has to rewrite the code.

@albinahlback
Copy link
Collaborator

I can't even see a single fwrite or fread there...

@fredrik-johansson
Copy link
Collaborator Author

fredrik-johansson commented Mar 18, 2023

There's flint_fprintf.

@albinahlback
Copy link
Collaborator

Yeah, but then one has to parse the input, right? We cannot read and write directly into and into memory, we must parse the data both ways

@fredrik-johansson
Copy link
Collaborator Author

I suppose you'd rewrite the code to store arrays of integers instead of strings that need to be parsed.

@albinahlback
Copy link
Collaborator

Closed as of #1279.

@oscarbenjamin
Copy link
Contributor

See the discussion in gh-1334

The fix in gh-1279 did not really fix the problem but just avoids it in CI. The problem still shows up outside of CI e.g.: flintlib/python-flint#43 (comment)

The OP here suggests that the previous MinGW CI failure was random. A deterministic reproducer of what I think is the same problem from python_flint is:

fmpz(10**35 + 1).factor()

The PR gh-1279 introduces a TMPDIR build configuration option whose only purpose seems to be for the MinGW CI job. It should not be used for the MinGW CI job because it is hiding a bug that would still be seen outside of CI. I think that the TMPDIR option (which was added since 2.9.0 and is only in 3.0.0-alpha1) should be removed so probably gh-1279 should basically be reverted in full.

This code seems to have changed quite a few times since gh-1201 with various conditional preprocessor macros for different environments and different versions of the code. That makes it difficult to track exactly when it was or was not working on different platforms.

Apart from avoiding the use of temporary files altogether a better option might be to use tmpfile instead of tmpnam or mkstemp as discussed at #1334 (comment). It is possible that tmpfile could be used on all platforms without needing any #ifdef for Windows.

At least for the MinGW case something else needs to be used because the current code always (except in CI) tries to write to /tmp which does not exist on Windows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants