Forth & math: October 2015

Thursday, October 29, 2015

Integer factoring

Factoring of large integers is a great challenge in computational mathematics.

When the numbers are very large, no efficient, non-quantum integer factorization algorithm is known; an effort by several researchers concluded in 2009, factoring a 232-digit number (RSA-768), utilizing hundreds of machines over a span of two years. Wikipedia

However, there are efficient and simple algorithms for cell size numbers (about 19 digit numbers in 64 bit systems) that immediately delivers the prime factorization. One such algorithm is Pollard rho method.

: func ( numb n -- m ) dup um* 1 0 d+ rot um/mod drop ;

: pollard1 ( numb start -- pfac flag ) \ numb is an odd composite
dup dup loc{ numb start alpha beta } 0
begin drop numb alpha func to alpha
numb dup beta func func to beta
alpha beta - abs \ |alpha-beta|
numb ugcd dup numb = \ gcd flag
if false exit
then dup 1 = 0= \ gcd<>1
until true ;

: sqrtf \ m -- n \ floor
0 d>f fsqrt f>s ;

: sqrtfslow \ m -- n
dup 4 u< if dup
if drop 1 then exit then
dup 1- >r 1 rshift
begin r@ over 0 swap um/mod nip
over + 1 rshift tuck u> 0=
until r> drop ;

: sqrtc \ m -- n \ ceiling
1- sqrtf 1+ ;

: pollard2 \ numb -- prime numb>1
dup 1 and 0= if drop 2 exit then
dup prime if exit then
dup sqrtf dup * over =
if sqrtf exit then -1 2 \ numb 100 0
do dup i pollard1 \ numb pfac flag
if leave \ numb pfac
then drop \ numb
loop nip ; \ pfac

: pollard \ n -- p1 ... pk
dup pollard2 2dup = \ n q f=
if drop exit \ n
then dup >r
0 swap um/mod nip recurse \ q n/q
r> recurse ;

4267640728818788929 pollard .s 1234567891 3456789019 ok

12345678987654321 pollard cr .s
333667 37 3 3 333667 37 3 3 ok

Sometimes the greatest prime factor is of interest and I guess there is no other way to find it than factoring and sorting the stack.

: lower \ m1 ... mn m n+1 -- m1 ... m ... mi n+1
\ lower m on stack until next number not is greater
dup 2 =
if drop 2dup u>
if swap
then 2 exit
then >r 2dup u> 0=
if r> exit
then r> rot >r
1- recurse 1+ r> swap ;

: sort \ m1 ... mn n -- s1 ... sn n o(n²)
dup 3 <
if dup 2 =
if drop 2dup u>
if swap
then 2
then exit
then dup >r
1- recurse roll
r> lower ;

But to use sort one have to know the number of factors to sort.

: pollard# \ numb -- p1 ... pk k

depth >r

pollard depth r> - 1+ ;

12345678987654321 pollard# cr .s
333667 37 3 3 333667 37 3 3 8 ok
sort .s 3 3 3 3 37 37 333667 333667 8 ok
over . 333667 ok

Monday, October 26, 2015

Prime number tests

A simple pseudo prime number test is using Fermat's little theorem:

for any prime number p and integer a such that 0<a<p. To select a first there is a need for a pseudo random number generator.

variable rnd base @ hex

: reset_seed 0ABCDEF1 rnd ! ; reset_seed

: rand ( -- u )
rnd @ 08088405 * 1+ dup rnd ! ;

: random ( u1 -- u2 )
rand um* nip ;

base !

A good thing with fermat is that flag always is true if p is a prime.

: fermat ( numb -- flag )
dup 2 - random 2 +
over 1- rot u**mod 1 = ;

It could therefore be used as a rather fast first primality test to see whether it's necessary to take a closer look.

Rabin-Miller strong pseudoprime test

is an efficient probabilistic algorithm for determining if a given number is prime.

: 2/mod \ n -- r q
dup 1 and swap 1 rshift ;

: get-rs 0 loc{ numb r -- r s }
numb 1-
begin dup 2/mod swap 0= \ n qout rest=0
while nip r 1+ to r
repeat 2drop
r numb 1- r rshift ;

: get-a ( numb -- a )
2 - random 2 + ;

: rabin-miller1 ( numb -- flag )
dup dup get-rs rot get-a false loc{ numb r s a flag }
a s numb u**mod 1 =
if true exit
then false to flag r 0
?do a s i lshift numb u**mod numb 1- =
if true to flag leave then
loop flag ;

Both fermat and rabin-miller1 depends on pseudo random numbers that varies from test to test and are not even combined accurate enough. A repeated test with all odd numbers between 3 and 1000000 gave the series 12 12 7 17 13 of number of errors. A second test series with fermat plus 2 times rabin-miller gave the series 1 1 2 1 3.

Due to OEIS A014233 it is enough to repeat the Rabin-Miller test a few times, depending of the word length, to be sure. Instead of random numbers one use the first prime numbers:

2 3 5 7 11 13 17 19 23 29 31 37 ...

Repeated Rabin-Miller will reveal all composite numbers less than:

2027 if a=2
1373653 if a=2,3
25326001 if a=2,3,5

3215031751 if a=2,3,5,7

2152302898747 if a=2,3,5,7,11

Since 2^32 < 2152302898747 a=2,3,5,7,11 suffice for 32 bit integers. For an accurate test of all 64 bit integers it's enough to test for a=2,3,5,7,11,13,17,19,23,29,31,37.

create pnr
2 c, 3 c, 5 c, 7 c, 11 c, 13 c, 17 c, 19 c, 23 c, 29 c, 31 c, 37 c,

The word create defines the word pnr which leaves an address on tos, the address where the bytes 2 to 37 are allocated.

: rabin-miller2 loc{ numb a | s r flag -- flag } \ odd numb>a>1
numb get-rs to s to r
a s numb u**mod 1 =
if true exit then
false to flag r 0
?do a s i lshift numb u**mod numb 1- =
if true to flag leave then
loop flag ;

: rmx ( numb -- n ) \ 32 bit integers
dup 2047 u< if drop 1 exit then
dup 1373653 u< if drop 2 exit then
dup 25326001 u< if drop 3 exit then
3215031751 u< if 4 else 5 then ;

: rmx ( numb -- n ) \ 64 bit integers
dup 2047 u< if drop 1 exit then
dup 1373653 u< if drop 2 exit then
dup 25326001 u< if drop 3 exit then
dup 3215031751 u< if drop 4 exit then
dup 2152302898747 u< if drop 5 exit then
dup 3474749660383 u< if drop 6 exit then
dup 341550071728321 u< if drop 8 exit then
3825123056546413051 u< if 11 else 12 then ;

: rabin-miller ( numb -- flag )
dup rmx 0
do dup i pnr + c@ rabin-miller2 0=
if 0= leave then
loop 0= 0= ;

Or faster when composite:

: fermat-rabin-miller ( numb -- flag ) \ numb odd
dup fermat
if rabin-miller
else 0=
then ;

suitable to define

: nextprime ( numb -- prime ) \ numb unsigned integer
dup 3 u< if drop 2 exit then
1 or
begin dup fermat-rabin-miller 0=
while 2 +
repeat ;

: prime ( numb -- flag )
dup 3 u< if 2 = exit then
dup 1 and 0=
if drop false exit
then rabin-miller ;

Friday, October 23, 2015

Single cell computational arithmetic

To manage single cell computational arithmetic there is a need for words as:

ugcd ( a b -- c ) unsigned-greatest-common-divisor
where a and b are unsigned single cell numbers and c is their greatest common divisor.

u*mod ( a b m -- c ) unsigned-star-mod
where a b m are any unsigned single cell numbers and the result is the single cell number c=a*b(mod m).

u**mod ( b a m -- c ) unsigned-power-mod
as above but where c=b^a(mod m).

invmod ( a m -- c ) signed-invers-mod
here 0<a<m and m is a signed number relative prime to a. The result c is a number 0<c<m such that a*c(mod m)=1.

It should be possible to compute the numbers above for any single number arguments (in the case below in 64bit-systems):

1234567890123456789 987654321098765432 123123123123123123 u**mod ok
. 1321916066429949 ok

The usual single number arithmetic + - * / */ etc. aren't powerful enough but there are standard Forth words which are:

um* ( b a-- ud ) where a b are unsigned singles and ud is an unsigned double

um/mod ( ud u -- r q ) where r is the rest and q is the quote

m*/ ( d a b -- d' ) where d'=d*a/b, d d' dubble cell numbers

Greatest common divisor

: ugcd 0 loc{ a b t -- c } \ Algorithm from Wikipedia
a b u< if a b to a to b then \ a>=b as unsigned numbers
begin b                      \ while b<>0
while b to t                 \ t <- b
     a 0 b um/mod drop to b    \ b <- a(mod b)
     t to a                    \ a <- t
  repeat a ;

: ugcd ( a b -- c )     \ a version without local variables
2dup u< if swap then \ smallest of a b on top of stack (tos)
?dup 0= if exit then \ return second value if tos is zero
begin tuck            \ y x y first time b a b
     0 swap um/mod      \ y x 0 y --> y r q
     drop dup 0=        \ y r [r=0]
until drop ;          \ y

The word um/mod has the stackdiagram ( ud u -- r q ) and is used above to calculate a(mod b). The single unsigned number a is converted to a double by putting a zero on the top of a. The word ?dup duplicates the top of stack if it isn't zero.

Modular multiplication

: u*mod ( u1 u2 u3 -- u4 )
>r r@ umod swap r@ umod um*
r> um/mod drop ;

Here m is stored in the return stack:

>r ( a -- ) tos is popped from stack and pushed to r-stack
r> ( -- a ) r-stack is popped and a is pushed to the parameterstack
r@ ( -- a ) a copy of a is pushed on parameterstack without popping

Modular exponentiation

Below lshift ( m i -- n ) n=m*2^i and rshift ( m i -- n ) n=m/2^i

are standard words and log~ is defined (bits gives the maximal number of bits in one cell):

cell 8 * constant bits

: log~ ( n -- nr )     \ nr = 1+²log n
bits here !           \ bits is stored at the address 'here'
bits 0                \ do-loop from i=0,...,bits-1
do 1 rshift ?dup 0=   \ shift tos at right test if zero
     if i 1+ here !     \ if zero store i+1 at 'here'
        leave           \ and leave the loop
     then
loop here @ ;

: u**mod loc{ b a m -- x }
1 \ preparation for repeated multiplication
a log~ 0 \ n 0 n is the number of binary digits in a

?do a 1 i lshift and  \ flag the i-th digit of a is 0/1
     if b m u*mod       \ multiply if flag
     then b dup m       \ square b for each i: b b^2 b^4 b^8 ...

u*mod to b \ new squared number to b
loop ; \ result of the repeated multiplication

The algorithm is called square-and-multiply:

The ?do-loop differs from do-loop in that there is an initial check with ?do, while with do the calculation between do and loop always is done at least once.

Modular inverse

: invmod ( a m -- a' ) \ a m must be coprime
dup 1 0 loc{ a m v b c -- a' }
begin a
while v a / >r
c b s>d c s>d r@ 1 m*/ d- d>s to c to b \ old c become new b
a v s>d a s>d r> 1 m*/ d- d>s to a to v \ old a become new v
repeat b m mod dup to b 0<
if m b + else b then ;

The words s>d and d>s converts between single and double signed integers.

1234567890123456789 123123123123123133 2dup ugcd . 1 ok
2dup invmod dup . 94365978201029936 ok
swap u*mod . 1 ok

Pages