Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New macOS errors at CRAN #128

Closed
eddelbuettel opened this issue Jun 20, 2024 · 13 comments
Closed

New macOS errors at CRAN #128

eddelbuettel opened this issue Jun 20, 2024 · 13 comments

Comments

@eddelbuettel
Copy link
Owner

eddelbuettel commented Jun 20, 2024

Following the 0.3.8 we now see results with errors on macOS arm64:

  ----- FAILED[data]: test_nanotime.R<143--143>
   call| expect_identical(p, as.POSIXct(as.nanotime(p)))
   diff| Attributes differ
  ----- FAILED[data]: test_nanotime.R<147--147>
   call| expect_identical(p, as.POSIXct(as.nanotime(p)))
   diff| Attributes differ
  Error: 2 out of 1250 tests failed
  Execution halted

We know arm64 has less precision that x86_64 but could something else be going on? The code is

## See: https://github.com/eddelbuettel/nanotime/issues/115
## This one should break if multiplying the double by 1E9 directly
## without considering the integer and fraction parts separately.
p <- as.POSIXct(as.numeric('0x1.91f18e4d0065p+30'), origin = '1970-01-01')
expect_identical(p, as.POSIXct(as.nanotime(p)))
## This one is negative, making sure that negative doubles are also consistent.
p <- as.POSIXct(as.numeric('-0x1.c6e8c4d077ae4p+30'), origin = '1970-01-01')
expect_identical(p, as.POSIXct(as.nanotime(p)))

and the reference in the comment to #115 is a hint as #116 may play a role here.

@DavorJ @lsilvest Any thoughts? Would accurate=FALSE help?

PS I guess not. The next test asserts that. So ... should we skip these two tests on arm64?

@DavorJ
Copy link

DavorJ commented Jun 20, 2024

   call| expect_identical(p, as.POSIXct(as.nanotime(p)))
   diff| Attributes differ

Hi @eddelbuettel. I don't have an ARM system to test, but the difference seems to be the attribute, not the double itself?

Printing dput(p) should show the attributes, and sprintf("%a", p) the binary representation of the double. This would allow better comparison.

Disabling this test on ARM is OK, or even disallowing it as you suggest in #129, if urgent. But a real solution would require some debugging.

@eddelbuettel
Copy link
Owner Author

I do not have arm64 access either besides the ones everybody has access to namely

  • the macOS-latest runners here (which switched a few weeks ago from x86_64 to arm64)
  • the macOS builder machine provided by Simon Urbanek on behalt of the R team

If you have debugging ideas, can you try either? I may put out a stop-gap release as per this PR sooner rather than later, we should then have time for a deeper look.

@DavorJ
Copy link

DavorJ commented Jun 20, 2024

From Win11 ARM64 VM with R4.4.1 and nanotime_0.3.8:

image

And second one:

image

From Win11 x64 with R4.1.3 and nanotime_0.3.8:

image

And second one:

image

So it seems the POSIXct -> nanotime -> POSIXct consistency idea doesn't hold on ARM systems, indeed...

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Jun 20, 2024

Hm, I thought it was known that arm64 has slightly less double precision but now I cannot find a good reference to back this up. But maybe some operations are different? Can you maybe run a series over powers of then similar to the following I something use to where the 'delta' between two POSIXct is no longer noticeable?

> now <- Sys.time()
> for (i in seq(-1,-8,by=-1)) cat("As 10^", i, " can see difference: ", now + 10^i != now, "\n", sep="")
As 10^-1 can see difference: TRUE
As 10^-2 can see difference: TRUE
As 10^-3 can see difference: TRUE
As 10^-4 can see difference: TRUE
As 10^-5 can see difference: TRUE
As 10^-6 can see difference: TRUE
As 10^-7 can see difference: FALSE
As 10^-8 can see difference: FALSE
> 

(Also, if I may, copy and pasting code rather than screenshots is much preferred.)

@DavorJ
Copy link

DavorJ commented Jun 20, 2024

From Win11 ARM64:

> now <- Sys.time()
> for (i in seq(-1,-8,by=-1)) cat("As 10^", i, " can see difference: ", now + 10^i != now, "\n", sep="")
As 10^-1 can see difference: TRUE
As 10^-2 can see difference: TRUE
As 10^-3 can see difference: TRUE
As 10^-4 can see difference: TRUE
As 10^-5 can see difference: TRUE
As 10^-6 can see difference: TRUE
As 10^-7 can see difference: FALSE
As 10^-8 can see difference: FALSE

(Sorry for screenshots: had some copy/paste RDP issues.)

@DavorJ
Copy link

DavorJ commented Jun 20, 2024

Why would ARM64 have any less precision? Eventually it is just bits and conformance to standards. See here for example, or here, here. Seems more likely a compiler discrepancy?

Or maybe you mean how interim calculations are done on x64 with extended precision, like explained here?

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Jun 20, 2024

I am sorry, I am a little busy and was brief. I was not asking you to re-run the example I used to 'see' the minimally supported diff on POSIXct.

You were the one introducing the round-robin on the 'unusual' character string. My thinking was that maybe you coould construct a similar test for arm64 to determine up to which precision we can possible round-turn.

(We do not in general have control over "unusual" compiler switches. CRAN barks when you use one they consider non-portable, see release notes for this 0.3.8 release for an example.)

(My casual googling suggested arm64 does in fact have standard double precision. Which makes sense as we do generally have to dumb tests down. It may just be something with this test so skipping still seem like a fair choice.)

@DavorJ
Copy link

DavorJ commented Jun 20, 2024

On x64 systems, the round robin works 100 % (except on very small numbers where double x 1E9 has a fraction and cannot be stored within the 64bit integer, and same with doubles exceeding the 64bit integer boundaries). That the round-robin works 100 % within these bounds I tested here by brute force.

On ARM64, if this round-robin doesn't hold for 2023-06-04 10:12:35.250385 UTC which is 1685873555.2503853, then why expect it to work consistently on any current date? And even if there was a valid range, it would be a 15-16 digit integer (as double) corresponding to dates almost no-one cares about.

Hence for now disabling the test and even accurate = TRUE (since it clearly isn't "accurate" in round-robin sense within the bounds where it should be) on ARM64 seems completely logical to me.

Note also that I tried my original implementation too (here and here), and I got the same discrepancy.

@lsilvest
Copy link
Collaborator

lsilvest commented Jun 20, 2024 via email

@eddelbuettel
Copy link
Owner Author

Well two (or three) things:

  • the new 'accurrate' flag and this test came after 0.3.7, and were never at CRAN so no arm64 there
  • plus I am usually impatient and only run Linux CI so macOS was off
  • but GH did us all a favor and swicthed from x86_64 macOS to arm64 macOS a few months ago

And as we did not have arm64 'on' we did not see this. So that one is on me, but it will be quick to address. The PR is out there, I plan to roll that up tomorrow morning.

@lsilvest
Copy link
Collaborator

lsilvest commented Jun 21, 2024 via email

@eddelbuettel
Copy link
Owner Author

Yeah the nature of the error message is also weird to me.

I was just out biking for a bit but overnight we also got the auto-nag:

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_nanotime.html.

Please correct before 2024-07-05 to safely retain your package on CRAN.

The CRAN Team

I will take care of it now.

@eddelbuettel
Copy link
Owner Author

Dealt with in #129

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants