Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use EllipticCurveMethod in CombinedFactorAlgorithm.factor() #2

Closed
axkr opened this issue Mar 11, 2019 · 35 comments
Closed

Use EllipticCurveMethod in CombinedFactorAlgorithm.factor() #2

axkr opened this issue Mar 11, 2019 · 35 comments

Comments

@axkr
Copy link

axkr commented Mar 11, 2019

Use de.tilman_neumann.jml.factor.ecm.EllipticCurveMethod in CombinedFactorAlgorithm.factor()

@TilmanNeumann
Copy link
Owner

Class EllipticCurveMethod does not look very reliable yet. I am working on tests.

@axkr
Copy link
Author

axkr commented Oct 30, 2019

@TilmanNeumann does it help doing tests, if I already update the Symja project or better waiting for an official release?

BTW: which method is now best suited to factor a Java int value?

@TilmanNeumann
Copy link
Owner

If you are happy with the stability of your current implementation, then I would wait, because my latest changes had hardly an influence on performance. The point is that ECM is quite tolerant against spurious bugs; if a bug only appears in 1% of curves then the overall performance will not suffer much.

The fastest algorithm to factor ints depends on the nature of the test numbers. For arbitrary numbers, TDiv31Barrett is the best choice for N up to 31 bits. For semiprimes, Hart_Squarefree is the fastest algorithm for N from 27 to 31 bit.

@axkr
Copy link
Author

axkr commented Nov 1, 2019

What function must I call to get the simple small prime factors?

This example returns 22 as a factor, how can I get 2 and 11 into the map?

	public static void main(String[] args) {
		TDiv31Barrett tdiv=new TDiv31Barrett();
		SortedMultiset<BigInteger> tdivFactors = tdiv.factor(BigInteger.valueOf(990));
		for (Map.Entry<BigInteger, Integer> entry : tdivFactors.entrySet()) {
			System.out.println("Key:"+entry.getKey()+" Value:"+ entry.getValue());
		}
	}

BTW shouldn't there be a method for int values to avoid the overhead of BigIntegers?

@axkr
Copy link
Author

axkr commented Nov 1, 2019

Ok, I think it should work like this?

	public static void main(String[] args) {
		TDiv31Barrett tdiv = new TDiv31Barrett();

		int value = 990;
		int prime = tdiv.findSingleFactor((int) value);
		while (true) {
			prime = tdiv.findSingleFactor((int) value);
			value /= prime;
			if (prime != 1) {
				System.out.println("Key:" + prime);
			} else {
				break;
			}
		}
		if (prime != 1) {
			SortedMultiset<BigInteger> tdivFactors = tdiv.factor(BigInteger.valueOf(value));
			for (Map.Entry<BigInteger, Integer> entry : tdivFactors.entrySet()) {
				System.out.println("Key:" + entry.getKey() + " Value:" + entry.getValue());
			}
		}
	}

@axkr
Copy link
Author

axkr commented Nov 1, 2019

From my experience this JAS algorithm is a quite fast algorithm for long values up to a BETA limit:

https://github.com/kredel/java-algebra-system/blob/8f040b29fc4faaa2ac6d12b993769aa6bb78fd66/src/edu/jas/arith/PrimeInteger.java#L377

Do you have a corresponding algorithm in your library?

@TilmanNeumann
Copy link
Owner

The findSingleFactor() method is mostly of interest if you know that the test numbers are semiprimes. In any other case I'ld use the factor() method. I admit that for some small range algorithms the factor() method is not well-implemented, but TDivBarrett31 overrides it and should be pretty fast in the mentioned number range.

I guess that BETA is something near to 2^61 or 2^62. My best algorithm in that range is TinyEcm64. PollardRhoBrentMontgomery64 is the second best algorithm in that range. Hart_Fast2Mult and Lehman_CustomKOrder are the best algorithms from something like 35 to 48 bits.

@axkr
Copy link
Author

axkr commented Nov 1, 2019

TDiv31Barrett#.factor(BigInteger.valueOf(990)) returns 22 as a factor?

@TilmanNeumann
Copy link
Owner

Good spot. Obviously my Barrett division does not work with p=2. Now I check the lsb which is faster anyway.

@TilmanNeumann
Copy link
Owner

Thanks for finding that one!

@TilmanNeumann
Copy link
Owner

My inital integration of Dario Alpern's ECM is finished. I took a very conservative (say, small) number of curves to run before invoking SIQS/PSIQS because the original settings were very slow for hard semi-primes.

This work required to introduce a more complex but more powerful interface to "find some factors". Feedback is very welcome.

There are a few minor topics I'll work on in the sequel, like passing the amount of trial division that has already been done.

@axkr
Copy link
Author

axkr commented Dec 25, 2020

Can you show some simple examples for factoring:

For simplicity let's say I want to factor the number 990.
How should I implement this now?

@axkr
Copy link
Author

axkr commented Dec 25, 2020

I'm using now this algorithm. It seems to be slower in my JUnit test suite.
Especially testMultiplicativeOrder() in

Does it make sense to search for the bottleneck or should I use some different coding?

public static void factorInteger(final BigInteger val, SortedMap<BigInteger, Integer> map) {
    CombinedFactorAlgorithm factorizer;
    final int cores = Runtime.getRuntime().availableProcessors();
    if (Config.JAVA_UNSAFE) {
      factorizer = new CombinedFactorAlgorithm(cores / 2 + 1, true);
    } else {
      factorizer = new CombinedFactorAlgorithm(1, false);
    }
    //    CombinedFactorAlgorithm cfa = new CombinedFactorAlgorithm(1);
    SortedMultiset<BigInteger> set = factorizer.factor(val);
    for (Map.Entry<BigInteger, Integer> entry : set.entrySet()) {
      map.put(entry.getKey(), entry.getValue());
    }
    return;
  }

@TilmanNeumann
Copy link
Owner

Your coding looks ok. This is a first shot, it is quite possible I overlooked somthing.

Much depends on the kind of test numbers. Which kind of numbers is your JUnit test suite testing?

@TilmanNeumann
Copy link
Owner

That looks too special purpose. Dont feel happy with it.

@axkr
Copy link
Author

axkr commented Dec 25, 2020

Much depends on the kind of test numbers. Which kind of numbers is your JUnit test suite testing?

most numbers are in the range of Java short values.

That looks too special purpose. Dont feel happy with it.

It would be enough, if I can define my own derived SortedMultiset<BigInteger> and use it as a second argument in the factor method and can override put(BigInteger key, Integer value)myself.

@TilmanNeumann
Copy link
Owner

Much depends on the kind of test numbers. Which kind of numbers is your JUnit test suite testing?
most numbers are in the range of Java short values.

This is a really small-valued range. My improvements were thought for much larger numbers (200 bit+), the tension field being "not to deterioate performance on semi-primes while improving performance on arbitrary inputs". For smaller inputs, ECM hardly makes sense because the (P)SIQS implementation is much faster.

Actually the CombinedFactorAlgorithm should provide you with a good choice of algorithm for your number range, but maybe the new searchFactors() interface is not bug-free. You could experiment with CombinedFactorAlgorithm.useLegacyFactoring and CombinedFactorAlgorithm.searchSmallFactors to see if there are some differences.

That looks too special purpose. Dont feel happy with it.

It would be enough, if I can define my own derived SortedMultiset<BigInteger> and use it as a second argument in the factor method and can override put(BigInteger key, Integer value)myself.

I'll think about it.

@axkr
Copy link
Author

axkr commented Dec 25, 2020

Yes it seems for big numbers there is a very good improvement.

  • FactorInteger(10^79+5923) is now approx. 113 seconds (vs. 147 seconds in the old version)
  • FactorInteger(10^71-1) is now approx. 18 seconds (vs. 32seconds in the old version)

@axkr
Copy link
Author

axkr commented Dec 26, 2020

The demo page is now updated:
http://matheclipse.org/input?i=FactorInteger(2%5E15-5)

But the time limit of 30 seconds of the Google Appengine limits the tests to something like FactorInteger(10^61-1)
grafik

@TilmanNeumann
Copy link
Owner

I had another look at #2 (comment) and noticed that your were using a constructor of CombinedFactorAlgorithm that sets useLegacyFactoring=true, thus running the old code I kept for comparison. Use a constructor like new CombinedFactorAlgorithm(6, null, true, false, true). But probably you figured that out already?

Additionally I added the new factor() signature you requested, fixed a bug in CombinedFactorAlgorithm I introduced yesterday evening and improved the performance of CombinedFactorAlgorithm for N<32 bit by about factor 3.

Is it not possible to increase the 30s limit?

@axkr
Copy link
Author

axkr commented Dec 26, 2020

Is it not possible to increase the 30s limit?

I created a new release, where you have no time limit, if you use it locally on your machine:
https://github.com/axkr/symja_android_library/releases

Example from Symja-browser-usage.
Press the question mark and insert FactorInteger to get the help page:
grafik

If someone likes to help, it would be nice to have a #209 docker image available

@TilmanNeumann
Copy link
Owner

I'ld be interested if according to your JUnit tests, CombinedFactorAlgorithm still looks slower for small N than the version before yesterday.

This is not unthinkable. Before, I let CombinedFactorAlgorithm factor the "last small factors" in a loop doing {find some factor; do primality test on that factor;}. Yesterday I switched that to {find all factors in a single trial division run}. But the latter may indeed perform worse for numbers having a very big factor, where the old algorithm would terminate trial division quickly because of the primality test.

@axkr
Copy link
Author

axkr commented Dec 27, 2020

If I place this old code before CombinedFactorAlgorithm#factor() it runs much faster (approx 7 times) for smaller values:

    // Do trial division by all primes < 131072.
    TDiv tdiv = new TDiv();
    n = tdiv.findSmallFactors(n, 131072, map);
    if (n.equals(BigInteger.ONE)) {
      return;
    } 

axkr added a commit to axkr/symja_android_library that referenced this issue Dec 27, 2020
@TilmanNeumann
Copy link
Owner

TilmanNeumann commented Dec 27, 2020

Ah, I see. My adjustment is still quite biased for hard semiprimes. But in the case of class CombinedFactorAlgorithm it would make sense to adjust it better for arbitrary random numbers.

@TilmanNeumann
Copy link
Owner

Now I tried to reproduce your performance improvement but I can't. I think now that what you are witnessing is the overhead of the CombinedFactorAlgorithm constructor. You should instantiate that class only once, not for each new N. If you do that then you'll see a completely different performance for small N.

@axkr
Copy link
Author

axkr commented Dec 27, 2020

The instantiation of a singleton CombinedFactorAlgorithm is indeed faster. But is this really thread safe?

I see non static attributes in some of the declared classes?

@TilmanNeumann
Copy link
Owner

TilmanNeumann commented Dec 27, 2020

My own classes should be thread-safe. E.g. the SIQS class is used inside multi-threaded PSIQS when it comes to factor large Q(x) and no problems there.

EllipticCurveMethod has several static attributes, though. So if you want to run several instances of CombinedFactorAlgorithm (which itself is possibly multi-threaded) at once, you may get problems with EllipticCurveMethod .

@TilmanNeumann
Copy link
Owner

Instantiating them as singletons was an essential idea in the design of my factoring algorithms. If EllipticCurveMethod makes problems there then it should be fixed.

@axkr
Copy link
Author

axkr commented Dec 27, 2020

These variables aren't static for example:

grafik

axkr added a commit to axkr/symja_android_library that referenced this issue Dec 27, 2020
@TilmanNeumann
Copy link
Owner

TilmanNeumann commented Dec 27, 2020

Referring to #2 (comment):
Yes, and that's good, isn't it? Only constants should be static if you want thread-safety.

@TilmanNeumann
Copy link
Owner

TilmanNeumann commented Dec 27, 2020

I looked up class EllipticCurveMethod and noticed that all non-constants have been made non-static already. So now I think that everything is thread-safe. Maybe not if you are using a singleton, but lets say if you have 2 threads, then each one should have its own instance. In that way, instantiating it in a ThreadLocal as you did may just be the right way.

@TilmanNeumann
Copy link
Owner

But if you really have several threads running CombinedFactorAlgorithm at once, you should think about the numberOfThreads you appoint to each instance.

@TilmanNeumann
Copy link
Owner

This issue has been addressed.

@TilmanNeumann
Copy link
Owner

TilmanNeumann commented Dec 30, 2020

Fixed another bug in EllipticCurveMethod: If maxCurvesForN resulted to 0, ECM would run until it found a factor. E.g. for N=856483652537814883803418179972154563054077

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants