Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IsPrime checker to math package #3738

Merged
merged 15 commits into from
Apr 5, 2021
Merged

Add IsPrime checker to math package #3738

merged 15 commits into from
Apr 5, 2021

Conversation

rhagenson
Copy link
Member

Due to some work on roaring-pony which proved to me above my head at this time, I implemented an IsPrime checker which might be useful for folks in the future. I am currently working within community members over on Zulip to learn enough LLVM IR to determine if the pre-allocated array of primes is being statically allocated as I intend.

@rhagenson
Copy link
Member Author

rhagenson commented Apr 3, 2021

Following work by @jemc this now uses a statically allocated table of primes, built via a match table.

I also fixed a bug where 1 was being called a prime, which it is not.

@SeanTAllen SeanTAllen added the changelog - added Automatically add "Added" CHANGELOG entry on merge label Apr 4, 2021
@ponylang-main
Copy link
Contributor

Hi @rhagenson,

The changelog - added label was added to this pull request; all PRs with a changelog label need to have release notes included as part of the PR. If you haven't added release notes already, please do.

Release notes are added by creating a uniquely named file in the .release-notes directory. We suggest you call the file 3738.md to match the number of this pull request.

The basic format of the release notes (using markdown) should be:

## Title

End user description of changes, why it's important,
problems it solves etc.

If a breaking change, make sure to include 1 or more
examples what code would look like prior to this change
and how to update it to work after this change.

Thanks.

Comment on lines +8 to +24
match index
| 0 => 2 | 1 => 3 | 2 => 5 | 3 => 7 | 4 => 11
| 5 => 13 | 6 => 17 | 7 => 19 | 8 => 23 | 9 => 29
| 10 => 31 | 11 => 37 | 12 => 41 | 13 => 43 | 14 => 47
| 15 => 53 | 16 => 59 | 17 => 61 | 18 => 67 | 19 => 71
| 20 => 73 | 21 => 79 | 22 => 83 | 23 => 89 | 24 => 97
| 25 => 101 | 26 => 103 | 27 => 107 | 28 => 109 | 29 => 113
| 30 => 127 | 31 => 131 | 32 => 137 | 33 => 139 | 34 => 149
| 35 => 151 | 36 => 157 | 37 => 163 | 38 => 167 | 39 => 173
| 40 => 179 | 41 => 181 | 42 => 191 | 43 => 193 | 44 => 197
| 45 => 199 | 46 => 211 | 47 => 223 | 48 => 227 | 49 => 229
| 50 => 233 | 51 => 239 | 52 => 241 | 53 => 251 | 54 => 257
| 55 => 263 | 56 => 269 | 57 => 271 | 58 => 277 | 59 => 281
| 60 => 283 | 61 => 293 | 62 => 307 | 63 => 311 | 64 => 313
| 65 => 317 | 66 => 331 | 67 => 337 | 68 => 347 | 69 => 349
| 70 => 353 | 71 => 359 | 72 => 367 | 73 => 373 | 74 => 379
| 75 => 383 | 76 => 389 | 77 => 397 | 78 => 401 | 79 => 409
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Theodus thoughts on this vis-a-vis the style guide?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each branch on a separate line would be consistent with the rest of the stdlib. But a match of this size is unusual, so I don't have strong opinions about it.

@SeanTAllen SeanTAllen changed the title Provide class-based IsPrime checker Add IsPrime checker to Math package Apr 4, 2021
// Using a match table like below provides information to LLVM
// to allocate static globals rather than stack or heap allocation.
fun _low_prime_table_size(): USize => 256
fun _low_prime_table(index: USize): A val =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a newline between these methods.

fun apply(n: A): Bool =>
var table_index: USize = 0
while table_index < _low_prime_table_size() do
let prime = _low_prime_table(table_index); table_index = table_index + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason for the ';' here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How Joe had it written. It separates two statements on the same line. I have made them two lines locally and will push with my next commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a dirty multi-statement line writer sometimes! I admit it!

if n == prime then return true end
if (n %% prime) == 0 then return false end
end
if n == 1 then return false end // 1 is not prime
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference is for comments like the one at the end of the line to be before the line (i.e. above it rather than trailing)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I blend both trailing and line before comments depending on length of comment. No strong opinion on "right" so I will make the comment be on the previous line.

@@ -0,0 +1,80 @@
primitive IsPrime[A: (Integer[A] val & Unsigned)]
"""Primality test using 6k+-1 optimization."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything that can be linked to that explains this optimization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to find a good source. My search landed me on Wikipedia, but I would prefer something more definitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relevant passage is this, but has no source to it.

We can improve this method further. Observe that all primes greater than 3 are of the form 6k ± 1, where k is any integer greater than 0. This is because all integers can be expressed as (6k + i), where i = −1, 0, 1, 2, 3, or 4. Note that 2 divides (6k + 0), (6k + 2), and (6k + 4) and 3 divides (6k + 3). So, a more efficient method is to test whether n is divisible by 2 or 3, then to check through all numbers of the form 6 k ± 1 ≤ n {\displaystyle \scriptstyle 6k\ \pm \ 1\leq {\sqrt {n}}} {\displaystyle \scriptstyle 6k\ \pm \ 1\leq {\sqrt {n}}}. This is 3 times faster than testing all numbers up to √n.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything is better than nothing! (Maybe) 😉

Comment on lines +4 to +5
// Using a match table like below provides information to LLVM
// to allocate static globals rather than stack or heap allocation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this comment be on the _low_primate_table rather than the size function here? Or perhaps right before the table itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It is why I had both methods without a newline as they were sort of acting as one method together. I moved the size method down below the table so the comment could be on top and be above the table method itself.

@SeanTAllen
Copy link
Member

This is an awesome addition. Thanks @rhagenson.

@rhagenson
Copy link
Member Author

@SeanTAllen I got all your comments addressed. I was not able to track down the source of the 6k±1 fact...each place I look simple states it as fact rather than stating where/how it was first formulated.

@rhagenson
Copy link
Member Author

I figured out the 6k±1 connection. All integers can be written in as either a multiple of six (i.e., 6k+0) or as being between multiples of six (6k + i, where i is -1, 1, 2, 3, or 4) and because all, but two of those i choices create numbers that are divisible by either 2 or 3 then only those remaining two i choices can encode prime numbers.

Comment on lines 1 to 5
## Add IsPrime to math package

Quickly determine if a given number is prime using the 6k ± 1 method.

All integers (excluding 2 and 3), can be expressed as (6k + i), where i = −1, 0, 1, 2, 3, or 4. Since 2 divides (6k + 0), (6k + 2), and (6k + 4), while 3 divides (6k + 3) that leaves only (6k - 1) and (6k + 1) for expressing primes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Given that these notes end up in markdown on GitHub. Feel free to do IsPrime and math is you want or to use other markdown formatting as well. (You don't have to though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@rhagenson
Copy link
Member Author

It does not have a source because it is (once you realize what it is saying) pretty obviously true. I had taken the 6k±1 to only apply to primes, but applies to all integers -- once I realized that it all fell into place on why it makes sense.

@rhagenson rhagenson marked this pull request as ready for review April 4, 2021 00:47
@rhagenson rhagenson changed the title Add IsPrime checker to Math package Add IsPrime checker to math package Apr 4, 2021
@rhagenson
Copy link
Member Author

rhagenson commented Apr 4, 2021

I am trying to think up a good way to add tests before this is merged. I could reasonable take the 6k+-i formula and verify any such k with i other than +1 and -1 is not prime. I need to put more thought into if that is a good way to test.

I also would want to check the various types are correct, ie the use of values inside the static table outside the range of U8 does not miscall when using IsPrime[U8].

@rhagenson
Copy link
Member Author

rhagenson commented Apr 4, 2021

As I suspected, the tests do not pass for all types. I will need to investigate further as to why. Failure on U8 I expected, but failing on U16 surprised me.

Edit: I saw these failures locally, but wanted to push the changes as I am done for the evening. I will see if they fail in CI as well.

@rhagenson
Copy link
Member Author

I think the local failures may be due to some oddities on my local machine. I was playing around with debug builds of ponyc the other day and suspect a make distclean and re-build from source may be in order.

@rhagenson
Copy link
Member Author

If the tests all pass, I am going to extend these tests so it is checking more than one value in each range. I will add it as a parameter at the test builder level so we can use it as _IsPrimeTestBuilder[U128]("U128", 100) to check 100 random U128 values.

@SeanTAllen
Copy link
Member

@rhagenson if your PR isn't ready to be merged, go ahead and add a "DO NOT MERGE" label to it so that no one accidentally merges early. And whenever you are ready, go ahead and do a "Squash and merge". Just remember to update the commit comment when squashing.

@rhagenson rhagenson added the do not merge This PR should not be merged at this time label Apr 4, 2021
@rhagenson
Copy link
Member Author

rhagenson commented Apr 4, 2021

If the tests all pass, I am going to extend these tests so it is checking more than one value in each range. I will add it as a parameter at the test builder level so we can use it as _IsPrimeTestBuilder[U128]("U128", 100) to check 100 random U128 values.

I figured out the reason tests were passing is because stdlib/_test.pony did not actually run the (prior to now absent) tests in math. I also decided against adding a parameter in favor of basing the number of values to check on two times the bitwidth so U8 checks 16 values, U16 checks 32 values, ..., U128 checks 256 values. Less arbitrary and makes the number of values checked scale with the number of values possible.

Edit: Added part about basing number of values to check on bitwidth.

@@ -63,6 +63,7 @@ actor Main is TestList
itertools.Main.make().tests(test)
json.Main.make().tests(test)
logger.Main.make().tests(test)
math.Main.make().tests(test)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh! I thought this already existed. dur. I should have realized that it didn't based on the rest of the PR.

Sorry about that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Before now, math had no tests so it made sense that this line was not present.

@rhagenson rhagenson removed the do not merge This PR should not be merged at this time label Apr 5, 2021
@rhagenson
Copy link
Member Author

@SeanTAllen Before I squash and merge this, as it is my first such commit to the ponyc repo, any advice/requirements to follow for a good commit message?

@SeanTAllen
Copy link
Member

No advice beyond what is in the contributing guide.

@rhagenson
Copy link
Member Author

Got it. I will give that another read then merge this.

@rhagenson rhagenson merged commit ae556fe into main Apr 5, 2021
@rhagenson rhagenson deleted the add-is-prime branch April 5, 2021 22:07
github-actions bot pushed a commit that referenced this pull request Apr 5, 2021
github-actions bot pushed a commit that referenced this pull request Apr 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog - added Automatically add "Added" CHANGELOG entry on merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants