Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

meep build failure for ppc64 architecture #13

Open
michelmno opened this issue Feb 17, 2015 · 4 comments
Open

meep build failure for ppc64 architecture #13

michelmno opened this issue Feb 17, 2015 · 4 comments

Comments

@michelmno
Copy link
Contributor

  • as reported on osc server (1_) some tests for meep package are failing
    for ppc/ppc64 but not for ppc64le.
  • I started to work on the aniso_disp failing test for ppc64.
  • using gdb (2_) I identified a problem for ppc64
    in generated assembly code when calling harminv_get_amplitude from do_harminv
  • as detailed in (4_) the gcc version is 4.8.3 for two opensuse environments.

(1_) https://build.opensuse.org/package/show/openSUSE:Factory:PowerPC/meep
=== extract of tests/test-suite.log:
harminv: failure on line 853 of harminv.c: argument out of range in harminv_get_amplitude

(2_) what is abnormal is the assembly code when calling harminv_get_amplitude
that is passing 3 registers r3 r4 r5 (with hd in r4 and 0 in r5)
while the library is expecting only r3 r4 (with hd in r3 and 0 in r4)

Breakpoint 2, 0x00003fffb7276204 in .harminv_get_amplitude () from /usr/lib64/libharminv.so.2
(gdb) bt
#0 0x00003fffb7276204 in .harminv_get_amplitude () from /usr/lib64/libharminv.so.2
#1 0x00003fffb7ee97b4 in meep::do_harminv (data=, n=, dt=0.0025000000000000001, fmin=0, fmax=1, maxbands=, amps=0x3fffffffe730,
freq_re=0x3fffffffe6b0, freq_im=0x3fffffffe6f0, errors=0x0, spectral_density=, Q_thresh=50, rel_err_thresh=1e+20, err_thresh=0.01, rel_amp_thresh=-1,
amp_thresh=-1) at bands.cpp:472
#2 0x00000000100020a0 in main (argc=1, argv=0x3ffffffff8e8) at aniso_disp.cpp:131

0x00003fffb7ee97a0 meep::do_harminv()+784: ld r2,40(r1)
0x00003fffb7ee97a4 meep::do_harminv()+788: mr r3,r22
0x00003fffb7ee97a8 meep::do_harminv()+792: mr r4,r31
0x00003fffb7ee97ac meep::do_harminv()+796: li r5,0
0x00003fffb7ee97b0 meep::do_harminv()+800: bl 0x3fffb7edd028 <00000017.plt_call.harminv_get_amplitude>
0x00003fffb7ee97b4 meep::do_harminv()+804: ld r2,40(r1)

=> 0x00003fffb7276204 <.harminv_get_amplitude+20>: blt- cr7,0x3fffb727628c <.harminv_get_amplitude+156>
0x00003fffb7276208 <.harminv_get_amplitude+24>: lwz r9,20(r3)
0x00003fffb727620c <.harminv_get_amplitude+28>: mr r31,r3
0x00003fffb7276210 <.harminv_get_amplitude+32>: cmpw cr7,r9,r4
0x00003fffb7276214 <.harminv_get_amplitude+36>: ble- cr7,0x3fffb727628c <.harminv_get_amplitude+156>

@michelmno
Copy link
Contributor Author

I received the following comments that pointed to the reason of the failure, but no obvious solution:
(from Ulrich Weigand)

the problem seems to be in the harminv library:

#if defined(__cplusplus)

include

extern "C" {
typedef std::complex harminv_complex;

#elif defined(_Complex_I) && defined(complex) && defined(I)
/* C99 <complex.h> header was included before harminv.h */
typedef double _Complex harminv_complex;

#else
typedef double harminv_complex[2];
#endif
[...]
extern harminv_complex harminv_get_amplitude(harminv_data d, int k);

where the type "harminv_complex" is defined in three different ways.

Note that the third variant is obviously never used, since it would
lead to a compile-time error: you cannot use an array type as
function return type in C.

However, the harminv library itself it compiled in C, using the
second definition above:
typedef double _Complex harminv_complex;

while the meep application is compiled in C++, using the first
definition above:
typedef std::complex harminv_complex;

Note that those are two different types!

Now, on some platforms, the ABI happens to be defined in a way such
that these two types are passed in the same manner, so you do not
actually notice the type mismatch. For example, on ppc64le, both
"double _Complex" and "std::complex" are returned in a
pair of floating point registers.

However, on ppc64 (and ppc, I think), the ABI for those two types
is in fact different: double _Complex is still returned in a pair
of floating point registers, while "std::complex" is
returned on the stack (so that harminv_get_amplitude gets a
hidden first argument which is a pointer to the stack slot where
the return value is to be written).

This mismatch causes the crash you're seeing.

Not sure what the best way to fix this would be. On the one hand,
it is a bug in the harminv sources to assume that those two types
must be ABI-compatible -- nothing guarantees that. On the other
hand, there is no standard-compliant way to express the C99 type
double _Complex directly in C++ as far as I know -- G++ does
accept "double complex" as GNU extension however. Not sure
if that won't cause other issues, depending how harminv_complex
is used elsewhere in C++ code ...

@stevengj
Copy link
Collaborator

@michelmno, those definitions date from when I was using harminv_complex only for pointers to arrays, i.e. harminv_complex *array. In that case, the three definitions are equivalent. (The C++ standard was amended a few years ago to specify that std::complex<double> be laid out in memory equivalent C99 complex, which is defined in turn as being equivalent to double[2], and at the time it was amended all C++ implementations already used this format since it is the only sensible memory layout anyway.) For function return values, however, the data is wrapped in an ABI-dependent way as you point out, and double[2] does not work at all.

The solution is to modify harminv_get_omega and harminv_get_amplitude to be of the form:

extern void harminv_get_omega(harminv_complex *omega, harminv_data d, int k);

and then to modify Meep to call it as such.

@michelmno
Copy link
Contributor Author

as suggested I have two patches (1) for harminv (2) for meep,
that seems to allow the harminv and meep packages to build successfully for ppc64 archi.
(I verified that also working for ppc64le archi)

(1) https://build.opensuse.org/package/view_file/home:michel_mno:branches:science/harminv/harminv_change_protos_with_harminv_complex.patch?expand=1
(2) https://build.opensuse.org/package/view_file/home:michel_mno:branches:science/meep/meep_change_protos_with_harminv_complex.patch?expand=1

@stevengj
Copy link
Collaborator

Thanks @michelmno, can you please file git pull requests against the harminv repo and the meep repo?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants