EDES/FACTOR -- A Freeware Suite of Factorization Programs
Version: 2.00
Release Date: 6 June 2011
Author: Craig W. Schulenberg (cschulen@schulenberg.com)
Website: http://www.schulenberg.com
The EDES Factorization Suite is a collection of number factorization programs that share a common interface design and standardized input/output file formats. These programs are designed to operate under Windows, and it is most efficient to run them within a Command Prompt window (a 'DOS'-type window). You will need to use a text editor like WordPad to create the DEF (define) files that are used to specify the number to be factored, as well as to document the results of previous factorizations. DEF files are thus used as both inputs and outputs; any factors found by a factorization program are output as individual DEF files that can then be input to a successor program. Listing files (e.g., FACTOR1.LST) are also generated that contain more information, including performance statistics, for the factorization run.
1.0 Available Programs.
There are currently six programs in this suite (FACTOR1 - FACTOR6), and a further program (GNFS/FACTOR7) is in an early development stage:
FACTOR1 -- Trial-Division method. Use this FIRST to pull out all factors less than 9-10+ digits in size from numbers up to 350 digits in size. Start the program up and then kill it manually (e.g., Ctrl C) when it appears to have found the smaller factors, and then switch to ECM (FACTOR4) or MPQS (FACTOR3). FACTOR1 will always leave its latest outputs in a file named TEMP1.DEF. This file can then be input to any of the remaining programs.
FACTOR2 -- Pollard-Rho algorithm. *** Made Obsolete by ECM ***. This program will extract factors up to 20+ digits in size from numbers up to 350 digits in size. This method, however, has now been far outclassed by the much faster ECM (Elliptic Curve Method) program: FACTOR4.
FACTOR3 -- Multiple Polynomial Quadratic Sieve (MPQS). Use this LAST, and only if the remaining number to factor is <= 130 digits. However, factoring any general number with more than 90-95 digits is going to take a VERY long time (see estimated timings below). Still, this is the 'big gun' in our current suite of factorization methods.
FACTOR4 -- Elliptic Curve method (ECM). Use this 2nd to pull out all factors less than 40+ digits in size from numbers up to 350 digits in size. However, if the number to be factored is less than 70-75 digits, then it is usually more efficient to use MPQS (running it several times if necessary) to split the number into two factors. The true benefit of ECM is its ability to efficiently extract rather large factors from numbers that are far too large to be handled by MPQS directly -- or by any other known method (e.g., GNFS).
.
FACTOR5 -- Williams p+1 Method. *** A 'Long' Shot ***. Try this to see if you are 'lucky' (this will not work with RSA numbers). This program lacks the sophistication of Phase 2 processing, but it will sometimes work on very large numbers. It is worth a try, especially if you have other computers available.
FACTOR6 -- Pollard p-1 Method. *** A 'Long' Shot ***. Try this to see if you are 'lucky' (this will not work with RSA numbers). This program lacks the sophistication of Phase 2 processing, but it will sometimes work on very large numbers. It is worth a try, especially if you have other computers available.
FACTOR7 -- General Number Field Sieve (GNFS). *** Under Development ***.
My current plans call for the addition of a GNFS (General Number Field Sieve) factorization program, improvements to FACTOR5 and FACTOR 6 (addition of a true Phase 2 capability), and further speedups to the MPQS program (FACTOR3).
If you find problems with any of these programs, or have suggestions for improvements, please let me know.
2.0 Recommended Procedure for Factoring Large Numbers
When you have a number that you wish to factor, the first step is to specify the number within a DEF (define) file, e.g., MYNUM.DEF. A DEF file must be created with a text editor (like NotePad, WordPad, or TextPad) that does not utilize special word-processing codes like Microsoft WORD or OpenOffice WRITER. A DEF file consists of comment lines, indicated by an asterisk (*) in column 1, and a number to be factored (no asterisk in column 1). There must therefore be at most 1 line in a DEF file lacking an asterisk in column 1.
Step 1. Specify your number in a DEF file, and add comments as appropriate to document the number (its source, its significance, etc.).
The next step is to run FACTOR1 (Trial Division) to find all small factors, e.g., up to 9-10 digits or so. When FACTOR1 prompts you for a number, type your DEF file name instead, e.g., MYNUM.DEF. All of the factorization programs invoke the Rabin-Miller primality test algorithm to ensure that you are not trying to factor a prime number. The Rabin-Miller algorithm is also used to mark the output factors as being 'prime' or 'composite'. Rather than use a DEF file, of course, you can always type in random numbers of various sizes to see how fast FACTOR1 can find smaller factors. The DEF file REP63.DEF makes an interesting test case for FACTOR1. This is a number consisting of 63 1's.
Step 2. Run FACTOR1 and give it the name of your DEF file. if the program doesn't completely factor the number in a short time, then it will essentially run forever. Simply kill the program when you think it is appropriate (e.g., type Ctrl C). FACTOR1 will leave its latest results in a file named TEMP1.DEF. This new DEF file will document all factors found by trial division, and will contain the remaining composite number that is to be factored.
You started off with a DEF file that you named 'MYNUM.DEF', and after running FACTOR1 you now have a new DEF file named TEMP1.DEF. This file is suitable for input to the other factorization programs (e.g., FACTOR3 or FACTOR4), but since it will be overwritten if you run FACTOR1 again, it is best to rename it.
Step 3. Rename the TEMP1.DEF file to something more appropriate, e.g., MYNUM1.DEF. Of course, if FACTOR1 found no factors at all, then you can continue to use your original DEF file (MYNUM.DEF).
If the remaining number to be factored is less than 70-75 digits or so, then the simplest way to proceed is to use MPQS once, twice, or more in order to 'split' the number into two factors. If the number to be factored is larger, then ECM 'might' be more efficient, but if the number is greater than 100 digits or so then MPQS will take an incredibly long time and ECM will be the method of choice. In the current implementation, MPQS will only handle numbers of up to 130 digits in size (and such numbers will take YEARS to factor), and ECM is your only recourse for attacking larger numbers -- until our GNFS program is available.
Step 4a (number <=75 digits). Run MPQS (FACTOR3) and supply it with the name of your DEF file. When MPQS finishes, it will generate two (2) DEF files: TEMP3F1.DEF and TEMP3F2.DEF. Each DEF file will contain one of the factors obtained (with the other factor represented by a comment line), and the factor will be marked as 'prime' or 'composite'. If the factor is prime, then you are done with it; otherwise, rename the DEF files (e.g., MYNUM3A.DEF, MYNUM3B.DEF) and run MPQS again.
Step 4b (number >=75 digits). Run ECM (FACTOR4) and supply it with the name of your DEF file. When ECM finishes, it will generate two (2) DEF files: TEMP4F1.DEF and TEMP4F2.DEF. Each DEF file will contain one of the factors obtained (with the other factor represented by a comment line), and the factor will be marked as 'prime' or 'composite'. If the factor is prime, then you are done with it; otherwise, rename the DEF files (e.g., MYNUM4A.DEF, MYNUM4B.DEF) and run ECM again.
Step 4c (multiple machines). Run MPQS on one computer, and run ECM on another.
The Pollard p-1 and Williams p+1 method are also provided in this suite, and they are worth a try -- especially if you have a spare computer that can be devoted to running these programs. They WILL NOT WORK with RSA semi-prime numbers, but they 'may' be able to crack a very large arbitrary number ... if you are lucky.
Step 5. Experiment with Williams p+1 (FACTOR5) and Pollard p-1 (FACTOR6) on other available computers.
3.0 Timing Estimates
FACTOR3 (MPQS)
DEF file |
Runtime(secs) |
Runtime (hours) |
COMP10 |
0.03 |
0.000008 |
COMP15 |
0.02 |
0.000006 |
COMP20 |
0.13 |
0.000036 |
COMP25 |
0.06 |
0.000017 |
COMP30 |
0.17 |
0.000047 |
COMP35 |
0.17 |
0.000047 |
COMP40 |
0.44 |
0.000122 |
COMP45 |
1.28 |
0.000356 |
COMP50 |
3.11 |
0.000864 |
COMP55 |
14.25 |
0.003958 |
COMP60 |
78.66 |
0.021850 |
COMP65 |
116.44 |
0.032344 |
COMP70 |
788.45 |
0.219014 |
COMP75 |
3078.86 |
0.855239 |
COMP80 |
12029.12 |
3.341422 |
COMP85 |
37372.09 |
10.381136 |
COMP90 |
120951.37 |
33.597603 |
COMP95 |
262838.83 |
73.0108 (3.042 days) |
COMP100 |
|
1-2 weeks? |
COMP105 |
|
1-2 months? |
COMP110 |
|
3-7 months? |
COMP115 |
|
1-2.25 years? |
COMP120 |
|
4-9 years? |
COMP125 |
|
15-35 years? |
COMP130 |
|
60-130 years? |
These timings were made on a 3-year-old 2.33 GHz dual-core laptop. The DEF files shown above are included with the factorization suite so you can make timings on your own computer. The bottom line is that MPQS will crack a 90-digit 'hard' (semiprime) number into two 45 digit factors in a little over 33 hours on an obsolete laptop. Smaller numbers of up to 65 digits in size require from 0.02 seconds to just under 2 minutes.
On a fast computer these run times would be improved by at least a factor of 2, but this means that a 100 digit number would still take about a week to factor. The only effective solution to this problem is to run MPQS in parallel on multiple machines, as was done some years ago to factor the 129-digit RSA number. However, it makes more sense to switch to GNFS (General Number Field Sieve), and this is our current focus.
FACTOR4 (ECM)
The chief virtue of ECM (the Elliptic Curve Method) is its ability to extract factors of considerable size from much larger numbers. To estimate timings for ECM we therefore used composite numbers that were the product of a 100-digit prime and a second prime of 5-, 10-, 15-, 20-, 25-, 30-, 35-, and 40-digits in size. The following timings were made on the same dual-core laptop that was used for the MPQS timings:
CMP10005 (100 digit prime * 5 digit prime): 2.48 secs
CMP10010 (100 digit prime * 10 digit prime): 2.61 secs
CMP10015 (100 digit prime * 15 digit prime): 3.70 secs
CMP10020 (100 digit prime * 20 digit prime): 4.05 secs
CMP10025 (100 digit prime * 25 digit prime): 449.87 secs (0.125 hrs)
CMP10030 (100 digit prime * 30 digit prime): 9244.11 secs (2.57 hrs)
CMP10035 (100 digit prime * 35 digit prime): 58610.27 secs (16.28 hrs)
CMP10040 (100 digit prime * 40 digit prime): TBD (NOT YET TESTED)
An additional test was run to see how much longer it would take to extract a 30-digit factor from a 230 digit number (200 digit prime * 30 digit prime). This resulted in:
CMP20030 (200 digit prime * 30 digit prime): 11167.81 secs (3.10 hrs). In other words, it took only about 20% longer to extract a 30 digit prime from a 230 digit number than it did to extract a 30 digit prime from a 130 digit number.
CONTACT INFO:
Craig W. Schulenberg
Schulenberg and Associates, Inc.
P.O. BOX 14507
Huntsville, Alabama 35815
(256) 541-9946
email: cschulen@schulenberg.com
http://www.schulenberg.com (web site)