Alpine build (musl) fails on determinism checks #707

Open
1 of 2 tasks
breznak opened this issue Oct 4, 2019 · 6 comments
Labels
bug (Something isn't working), code (code enhancement, optimization, cleanup..programmer stuff), platform

Comments

Member

breznak commented Oct 4, 2019

I briefly tested the Alpine build in Docker (alpine:amd64-latest-stable):

  • it builds fine
  • the tests fail the determinism checks in SpatialPoolerTest.ExactOutput

Related #659

breznak added the bug, code, and platform labels Oct 4, 2019
Member Author

breznak commented Oct 4, 2019

CC @pepedocs

Member Author

breznak commented Oct 4, 2019

An Alpine build is now provided by our Dockerfile.

Member Author

breznak commented Oct 4, 2019

ExactOutput also fails on ARM (on Alpine, using musl libc):

[ RUN      ] SpatialPoolerTest.ExactOutput
/usr/local/src/htm.core/src/test/unit/algorithms/SpatialPoolerTest.cpp:2100: Failure
Expected equality of these values:
columns
Which is: SDR( 200 ) 4, 6, 17, 85, 113, 125, 133, 153, 172, 173
gold_sdr
Which is: SDR( 200 ) 4, 64, 74, 78, 85, 113, 125, 126, 127, 153
[  FAILED  ] SpatialPoolerTest.ExactOutput (3270 ms)

So apparently our C++-fu is not yet 100% deterministic across platforms.
@dkeeney, how did you pinpoint the cause in your earlier C++ SP investigations?


dkeeney commented Oct 5, 2019

> how did you pinpoint the cause in your earlier C++ SP investigations?

With a great deal of difficulty...

I first located the point (the cycle number) at which the two runs started to diverge. Then I inserted debug trace statements to identify the function in which they diverged in that cycle (it actually turned out to be in the previous cycle). That meant writing lots of trace statements to a log file and diffing the logs. Then I had some good luck and hit on the cause.
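
The log-diffing step can be sketched roughly as follows. This is only an illustration, not the tooling actually used: the trace line format (`cycle=<n> <step> <value>`) and the step names `stepA` and `stepB` are hypothetical, and in practice the two lists would be read from the log files written by each build.

```python
from itertools import zip_longest


def first_divergence(lines_a, lines_b):
    """Return (index, line_a, line_b) for the first differing line, or None.

    If one trace is shorter, the missing side is reported as None.
    """
    for i, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        if a != b:
            return i, a, b
    return None


# Hypothetical trace lines, one per logged step.
trace_glibc = [
    "cycle=0 stepA 12",
    "cycle=1 stepA 9",
    "cycle=1 stepB 4,85,113",
]
trace_musl = [
    "cycle=0 stepA 12",
    "cycle=1 stepA 9",
    "cycle=1 stepB 4,6,113",
]

print(first_divergence(trace_glibc, trace_musl))
```

The first differing line names the cycle and step to look at next. As noted above, the real cause may sit one cycle earlier than the first visible difference, so tracing more intermediate values narrows it down faster.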

Member Author

breznak commented Oct 6, 2019

Hmm, this will be a pain; it could be a bug in libstdc++, the compiler, ... I'm not sure how far we want to pursue multiplatform deterministic builds (that is, identical results everywhere; we could settle for "deterministic results per platform"). I suspect the difference lies in glibc vs. musl libc, as the same error happens on both amd64 and arm64 with musl.
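
To illustrate why a libc difference could matter at all, here is a toy sketch of one possible mechanism. It is an assumption, not a diagnosis: the thread does not identify the cause. The idea is that if a math routine returns a result that differs in the last bit between two libcs, any "pick the k best" step downstream can select different winners. The `top_k` helper is hypothetical and is not the SpatialPooler inhibition code.

```python
import sys


def top_k(scores, k):
    """Indices of the k highest scores; ties go to the lower index."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(order[:k])


scores = [0.5, 0.25, 0.25, 0.125]

# Nudge one score up by a single unit in the last place. For a power of two x,
# x * (1 + epsilon) is exactly the next representable double above x.
nudged = list(scores)
nudged[2] = scores[2] * (1 + sys.float_info.epsilon)

print(top_k(scores, 2))  # [0, 1]: the tie between 1 and 2 goes to index 1
print(top_k(nudged, 2))  # [0, 2]: the one-bit nudge lets index 2 win
```

Each platform is still fully deterministic here; only the cross-platform comparison breaks, which matches the "deterministic results per platform" option.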

Member Author

breznak commented Oct 31, 2019

With #736, all tests (including determinism) pass on CI for musl, which now has its own expected results, but the results still differ between glibc and musl libc.
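
A minimal sketch of what "custom results per libc" could look like, written in Python for brevity. How #736 actually stores or selects its expected values is not shown in this thread, so the table layout and the `expected_columns` helper are assumptions. The two index lists are copied from the failing log above, assuming `gold_sdr` was recorded on a glibc platform and `columns` is the musl output.

```python
# Expected active columns per C library (values taken from the log above;
# the mapping itself is a hypothetical layout, not the code from #736).
GOLD_BY_LIBC = {
    "glibc": [4, 64, 74, 78, 85, 113, 125, 126, 127, 153],
    "musl": [4, 6, 17, 85, 113, 125, 133, 153, 172, 173],
}


def expected_columns(libc_name):
    """Return the recorded gold result for a libc, or fail loudly."""
    try:
        return GOLD_BY_LIBC[libc_name]
    except KeyError:
        raise ValueError(f"no gold result recorded for libc {libc_name!r}")


print(expected_columns("musl"))
```

Failing loudly on an unknown libc, rather than falling back to one of the known results, keeps a new platform from silently passing or failing against the wrong reference.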
