Reproducibility issues #46
After looking a bit more at |
For reproducibility you must also set

This is a consequence of the asynchronous SGD method used by UMAP (which is basically the same as LargeVis). The internal PRNGs should be (indirectly) seeded via |
@jlmelville Isn't this the default? I'm assuming this is related to the discussion we had about avoiding race conditions waaaay back. Anyway, @juba, you don't mention what machines you're using. For example, 32-bit Windows is notorious for giving numerically different answers due to some kind of difference in the precision of the calculations - I think it is something to do with the x87 floating point instruction set. Because UMAP is an iterative approach, even small differences propagate over iterations and grow in size. t-SNE was definitely the worst: a 16th-decimal-place difference in initialization values would yield completely different results after 1000 iterations. The other possibility is that we have UB or a memory leak somewhere, but I hope not. As an aside, I do not think it would be wise to hard-code expected UMAP return values in your tests. That makes your testing framework very fragile, liable to break upon changes to things that don't really matter to your end users (who just want to see a pretty plot, really). |
Thanks for your quick answer. When I launch

Am I doing something wrong? |
Well, what are these machines? Linux? Mac? Windows? BSD? ... Solaris? |
@LTLA Sorry, I didn't see your message. I've tested it with a total of 6 machines:

The three whose results I copy/pasted are all Linux boxes. Well, as I try to use umap inside a package function, it seems quite important to me that the results be reproducible from one run to the next; otherwise a user could re-run an analysis later and get different outputs... |
Hm. Well, that's interesting. What happens if you turn off multithreading with

Another possibility is processor intrinsics, but I don't think we use them. RcppAnnoy might use them but

Edit: I get something different as well (Mac):
... and the results don't change with |
Yes, I can confirm, with |
Sorry about the incorrect statement.

@juba, @LTLA, could either of you try running:

```r
set.seed(13); head(umap(iris, a = 1, b = 1), 5)
```

and seeing if that gives better results in terms of reproducibility? For the record, on both Windows and my Linux VM, I get the output:

```
          [,1]       [,2]
[1,] -8.185683 -2.3828293
[2,] -6.676588 -0.6949825
[3,] -6.222400 -1.0343811
[4,] -6.009416 -0.5351661
[5,] -8.006457 -2.1892876
```

If that checks out then it looks like there is a numerical issue. I haven't fully nailed it down yet, so here are my preliminary findings.
On Linux:
Unfortunately, manually setting those values still isn't enough to make the results agree. However, if you just use the first four decimal places:

```r
set.seed(13); head(umap(iris, approx_pow = TRUE, a = 1.8956, b = 0.8006), 5)
```

```
          [,1]      [,2]
[1,] -13.89810 -3.735377
[2,] -13.35056 -2.117875
[3,] -12.75033 -2.292292
[4,] -12.84131 -2.176536
[5,] -13.94507 -3.658790
```

So probably the error lies in the power calculation in the gradient. I've not taken a closer look there yet, but I hope to get enough time shortly. Perhaps the use of |
@jlmelville Both of those examples give the same results as posted above, though I am on my Linux box (Ubuntu 18.04) at home rather than my Mac at work. Incidentally, running
Those errors can amplify pretty quickly, if the t-SNE experience was anything to go by. |
Thanks for your detailed and quick replies! Here are the results on my machines.

**First example**

Linux box 1:

```r
> set.seed(13); head(uwot::umap(iris, a = 1, b = 1), 5)
          [,1]      [,2]
[1,] -12.32907 -7.980761
[2,] -14.15007 -6.869046
[3,] -14.62219 -7.096888
[4,] -14.76138 -7.277305
[5,] -12.37633 -7.737141
```

Linux box 2:

```r
> set.seed(13); head(uwot::umap(iris, a = 1, b = 1), 5)
          [,1]       [,2]
[1,] -8.185683 -2.3828293
[2,] -6.676588 -0.6949825
[3,] -6.222400 -1.0343811
[4,] -6.009416 -0.5351661
[5,] -8.006457 -2.1892876
```

Linux box 3:

```r
> set.seed(13); head(uwot::umap(iris, a = 1, b = 1), 5)
          [,1]         [,2]
[1,] -8.742520 -0.644353479
[2,] -6.608887 -1.108330991
[3,] -6.484410 -0.005383829
[4,] -6.325613 -0.405331981
[5,] -8.540133 -0.360579988
```

**Second example**

Linux box 1:

```r
> set.seed(13); head(uwot::umap(iris, approx_pow = TRUE, a = 1.8956, b = 0.8006), 5)
          [,1]        [,2]
[1,] -13.29704 -1.89562185
[2,] -13.77207 -0.16590410
[3,] -13.77290  0.14170374
[4,] -13.45787  0.09581583
[5,] -13.35800 -1.76931200
```

Linux box 2:

```r
> set.seed(13); head(uwot::umap(iris, approx_pow = TRUE, a = 1.8956, b = 0.8006), 5)
          [,1]      [,2]
[1,] -13.89810 -3.735377
[2,] -13.35056 -2.117875
[3,] -12.75033 -2.292292
[4,] -12.84131 -2.176536
[5,] -13.94507 -3.658790
```

Linux box 3:

```r
> set.seed(13); head(uwot::umap(iris, approx_pow = TRUE, a = 1.8956, b = 0.8006), 5)
          [,1]       [,2]
[1,] -4.225160 -10.458575
[2,] -2.987507  -9.383705
[3,] -3.448075  -9.188345
[4,] -3.296139  -9.051793
[5,] -4.168506 -10.302560
```

So this seems a bit strange to me: I get the same results as you on one of the boxes (which runs R 3.6.1, by the way), but they are quite different on the two others (which run R 3.6.2). |
@jlmelville I get the same results as you on my Windows 64 machine, R 3.6.2, uwot 0.1.5, in case it may help. |
A couple of things to try:
Some results based on the above on my Windows and Linux VM with R 3.6.2:

```r
set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, init = matrix(rnorm(300), ncol = 2), n_epochs = 1), 5)
```

```
           [,1]        [,2]
[1,]  0.6147206 -2.76264281
[2,] -0.2198782  1.79669863
[3,]  1.8355571 -1.14669579
[4,]  0.2477138 -0.23195193
[5,]  1.2029198  0.01220871
```

```r
set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, approx_pow = TRUE, init = matrix(rnorm(300), ncol = 2), n_epochs = 1), 5)
```

```
           [,1]        [,2]
[1,]  0.6147206 -2.76264281
[2,] -0.2198782  1.79669863
[3,]  1.8355571 -1.14669579
[4,]  0.2477138 -0.23195193
[5,]  1.2029198  0.01220871
```

At least early on then, the results agree. Also with:

```r
set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, approx_pow = TRUE, init = matrix(rnorm(300), ncol = 2)), 5)
```

```
          [,1]     [,2]
[1,]  8.840334 12.03719
[2,] 10.490212 13.65378
[3,] 10.532321 12.99068
[4,] 10.511846 13.09358
[5,]  8.911314 12.19013
```

But not when using:

```r
set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, init = matrix(rnorm(300), ncol = 2)), 5)
```

```
         [,1]     [,2]
[1,] 4.076168 8.341118
[2,] 2.102707 9.429637
[3,] 1.973279 8.806576
[4,] 2.076288 8.900060
[5,] 3.891764 8.247353
```

On my Linux VM, I get:

```r
set.seed(13); head(umap(iris, a = 1.8956, b = 0.8006, init = matrix(rnorm(300), ncol = 2)), 5)
```

```
          [,1]     [,2]
[1,] 1.8333098 6.916174
[2,] 0.2878910 5.478771
[3,] 0.1653603 6.152372
[4,] 0.2099545 6.045371
[5,] 1.7626021 6.757470
```

I haven't been very successful at finding out how

Thanks @SamGG for the extra testing. |
On your two first examples (first with

When using the last example, I still get the same results on the three machines, but not the same as what you get:

```r
> set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, init = matrix(rnorm(300), ncol = 2)), 5)
         [,1]     [,2]
[1,] 4.191379 8.028221
[2,] 3.445829 9.990005
[3,] 2.992829 9.464233
[4,] 3.100481 9.648516
[5,] 4.226606 8.193149
```
|
@jlmelville I still get the exact same results on Windows. |
I get the same results as @juba on my Ubuntu machine for all three scenarios (including the last). So, it is at least somewhat reproducible, at least between @juba and me! If it is a
For completeness, the glibc is:
|
I am using Pop OS on my VM:
@LTLA: I take it you are using the 18.04 LTS? If you confirm, I will see if I can reproduce results on the same OS. |
Yes, that's right. Nothing special with my set-up beyond keeping updated. |
The three Linux boxes I have access to run:

|
Ok, running Ubuntu 18.04 LTS on my VM, I have the same compiler setup as @LTLA:

and I also get the same results:

```r
> set.seed(13); head(uwot::umap(iris, a = 1.8956, b = 0.8006, init = matrix(rnorm(300), ncol = 2)), 5)
         [,1]     [,2]
[1,] 4.191379 8.028221
[2,] 3.445829 9.990005
[3,] 2.992829 9.464233
[4,] 3.100481 9.648516
[5,] 4.226606 8.193149
```
|
Maybe it's worth having a look at what this gives:

```cpp
#include <iostream>
#include <cmath>
#include <limits>

typedef std::numeric_limits<double> dbl;

int main() {
    std::cout.precision(dbl::max_digits10);
    std::cout << std::pow(1.8956, 0.8006) << std::endl;
}
```

Compiling and running this gives me

If we can confirm that the discrepancy is introduced by |
Running this on Ubuntu 18.04 (gcc 7.4.0) and Debian stable (gcc 8.3.0) gives me the same result as you. On the machine running CentOS 7 (gcc 4.8.5) |
@LTLA the number you get is the same as what I get on Windows and both my Pop OS VM and the Ubuntu 18.04 VMs. I have attempted to debug this further, but it is slow going. I can see discrepancies in the 12th decimal place of coordinate differences emerge pretty early, but I don't have a root cause yet. |
Thanks for your work on this. I can imagine how time-consuming tracking this down must be. |
Here are some examples of the differences, using:

```cpp
#include <iostream>
#include <cmath>
#include <limits>

// [[Rcpp::export]]
void stdpow(double a = 1.8956, double b = 0.8006) {
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << std::pow(a, b) << std::endl;
}
```

On Windows:

```r
> stdpow(62.906748749432943)
27.544634264263522
> stdpow(4.0950943003678653)
3.0915657259912637
```

On Ubuntu 18.04 LTS:

```r
> stdpow(62.906748749432943)
27.544634264263525
> stdpow(4.0950943003678653)
3.0915657259912637
```

On Pop OS:

```r
> stdpow(62.906748749432943)
27.544634264263525
> stdpow(4.0950943003678653)
3.0915657259912641
```

These sorts of differences seem sufficient to eventually accumulate and cause noticeable changes. I am open to any suggestions on remediation, in addition to @LTLA's suggestion to make |
Note that these are not operating system differences so much as differences in how the underlying implementation of the C++ or C standard library (e.g.,

Thus, a single (and perhaps an accurate) implementation of
|
Yes, I think we're all aware of that. Hence my comments on
Yes, if one such implementation exists that is well-tested and portable. Boost is my usual go-to for things where the standard library lets me down (cough I was wondering what R itself was doing, but it seems that it just calls into |
I've done some more debugging and the problem is not restricted to power calculations, sadly. Consider:

```cpp
// [[Rcpp::export]]
void vecprint(Rcpp::NumericVector v, std::size_t idx) {
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << v[idx] << std::endl;
}
```

When I run:

```r
set.seed(13); vecprint(rnorm(140000), 71718)
```

I get

In addition, there is some "clever" filtering of edges in the input graph based on the number of epochs, which also seems to result in differences between Windows and Ubuntu about when an edge is skipped in sampling. I have yet to see what effect turning this off has (it isn't going to solve the problem anyway).

Thank you @peteroupc for the link to the Bruce Dawson blog, which helps reframe the problem as being about floating-point determinism rather than accuracy.

Would doing these calculations in |
Truncating the input sounds like a sensible option, at least to resolve the input issue. Probably no one would notice a round-trip through

Internally, Boost has some arbitrary-precision types that could also be used to avoid throwing out too much precision when switching from

gives the same precision as a |
If feasible, an alternative is to use fixed-point numbers and arithmetic rather than floating-point arithmetic. One case of its use I am aware of is in the article "The Butterfly Effect: Deterministic Physics in The Incredible Machine and Contraption Maker". I say "if feasible", however, because switching this code repository to use fixed-point rather than floating-point numbers is anything but trivial, and I am not aware of any popular "well-tested and portable" fixed-point math libraries. On the other hand, |
I think that by using:

```r
uwot::umap(mnist, n_epochs = 500, init = matrix(rnorm(nrow(mnist) * 2), ncol = 2))
```

However, with default initialization I still see divergence. This might be because the spectral initialization is not currently deterministic. So more digging is needed on that.

Assuming some sort of resolution is reached, this will for sure be a breaking change to uwot's output. On the one hand, I could make this an option, but it would require some rewriting of the C++ code to make it more templated. On the other hand, now that I know uwot is being used by the likes of seurat and monocle, I don't want to further break reproducibility on a single machine. On an anatomically improbable third hand, I have yet to even get to uwot 0.2, and really want to be able to break things where necessary. On a now metaphor-shattering fourth hand, I have never actually documented a deprecation policy anywhere or followed semantic versioning or anything like that, so I should probably at least do that.

I welcome opinions on this one. |
Not a big deal. In fact, I wouldn't even say it's a breaking change to the output. Now, we might be breaking the end-user's expectation of reproducibility, but that was always much more difficult to achieve, and I'm willing to bet that a version bump in uwot is the least of the user's problems with respect to updates across the scRNA-seq analysis stack. If people want exact same results every time, any time, they had better be using pinned versions in a container. So I reckon: make the change and bump the version. Everyone's code will still run, it's just that the plots will be a bit different, and that's okay. Bona fides: I'm using uwot to power the UMAPs throughout https://osca.bioconductor.org (via scater wrappers), and even then, it literally does not bother me if the figures change. I'll give them a once-over to check that they still look pretty but that's about the limit of my concern. |
Note that projects still at version 0.x, such as this one, can theoretically include "breaking changes" between 0.x versions; as the semantic versioning spec says: "Anything MAY change at any time [between 0.x versions]. The public API SHOULD NOT be considered stable." Thus, changing the version from 0.1 to 0.2 should be unproblematic even if they differ by only one change to the code. But see the answer to "How do I know when to release 1.0.0?" in that spec's FAQ. |
uwot 0.1.8 has introduced the changes in this discussion. Closing for now. |
Hi,

Just to let you know that since upgrading to 0.1.8, just adding

So many thanks for your hard work on this, it is really appreciated. |
First of all, thanks for this package. It is very convenient to have a pure R implementation of UMAP which is fast and reliable!

I am running into a small and somewhat strange reproducibility problem. To be able to get the same UMAP results twice, I use `set.seed` before calling `umap`, and locally on my machine it works well:

The problem is, sometimes if I run the same code on another machine, for example during tests for a package CRAN check, the test fails because the results are different.

I've tried several things: testing that the `uwot` versions are the same, and even testing that `set.seed` gives the same sequence of random numbers on every machine. This is true, but when I compare `umap` results they are different.

I'm not sure I'm completely clear here... But if you have any idea why this could happen, I'd be glad to hear it :-)