Forth & math: Prime tables and the prime counting function

The only possibility for a very fast prime counting function is to make tables. To make a straight forward table would take a lot of time and a lot of memory, but a table of primes is more compact, is loading faster and can be used for fast prime counting.

It's possible to store consecutive prime series in tables using only two bytes per prime number up to virtually any size (at least 100 decimals or so), with a technique using that the prime gaps for those numbers are less than 65536=2^16. What's stored in the main table are the prime numbers modulo 65536 - and in a minor table "break points" are stored: the number n when the next prime Pn+1 divided with 65536 is greater than the same for the previous prime. A fast search in the minor table gives the index number, that multiplied with 65536 should be added to the number in the main table to get the prime.

Such a table for all primes less than 2^24 is created in less than a second and can be used to compute π(x) for x<16777216 in a few microseconds. The memory needed for the main table is 2*π(16777216)=2*1077871, slightly more than 2 Mb, and for the minor table 1-2 kb if the break points are stored as single integers. The memory saved is near 50 % in this case, but for larger tables the memory saved will be 100 % or more.

Making a table for all 64 bit primes would need about 5 billion Gb. All 32 bit primes would need about 1 Gb. All primes less than 100,000,000 would need about 10 Mb and would take some time to load (about 5 seconds). For greater values it's more realistic to use other and slower methods, like files with records of primes and their order numbers, with one second computing time gap in between.

First some words that is used:

: prevprime ( numb -- prime )
dup 3 u< if drop 2 exit then
1- 1 or
begin dup fermat-rabin-miller 0=
while 2 -
repeat ;

: ?def ( -- flag ) bl word find nip 0= ;

?def under+ [if]
: under+ ( a b c -- a+c b ) rot + swap ; [then]

: 8/mod ( n -- r q ) dup 7 and swap 3 rshift ;
: 256/mod ( n -- r q ) dup 255 and swap 8 rshift ;
: 65536/mod ( n -- r q ) dup 65535 and swap 16 rshift ;

The idea is to use the sieve of Eratosthenes to create av very fast word prime_ ( n -- flag ) and bit arrays is used.

\ bit arrays

: adbit ( i ad -- j a ) swap 8/mod rot + ;

: setbit \ i ad --
adbit 1 rot lshift over c@ or swap c! ;

: tglbit \ i ad --
adbit 1 rot lshift over c@ xor swap c! ;

: clrbit \ i ad --
adbit 1 rot lshift 255 xor over c@ and swap c! ;

: bit@ \ i ad -- bit
adbit c@ swap rshift 1 and ;

: bit! \ bit i ad --
rot if setbit else clrbit then ;

: bit? ( i ad -- f ) bit@ negate ;

\ Defining named bitarrays i.e.
\ 1000000 bitarray megaset constant megasetbuffer
: bitarray \ bits -- ad | n -- bit
8/mod swap 0= 0= negate + allocate throw
create dup ,
does> @ swap 8/mod rot + bit@ ;

For storing numbers in i=2,3,... bytes spaces I use the words

mb! ( n ad i -- )

and

mb@ ( ad i -- n )

\ multiple byte read/write in arrays

variable ebuf
1 ebuf ! ebuf c@ negate
[if] \ little-endian, as in Windows and Android
: mb! ( n ad i -- ) 2>r ebuf ! ebuf 2r> cmove ;
: mb@ ( ad i -- n ) ebuf 0 over ! swap cmove ebuf @ ;
[else] \ big-endian, as in MacOS
: mb! ( n ad i -- ) 2>r ebuf ! ebuf cell + r@ - 2r> cmove ;
: mb@ ( ad i -- n ) ebuf 0 over ! cell + over - swap cmove ebuf @ ;
[then]

The tables

The method below can be used for any number of primes. Here plim=2^24 and pi_plim=π(plim)=1077871:

\ the sieve of Eratosthenes

0xfffff constant plim
82025 constant pi_plim

\ 16777215 constant plim \ Loads in
\ 1077871 constant pi_plim \ a few seconds

\ 100000000 constant plim \ 100000000 takes 6 times
\ 5761455 constant pi_plim \ longer time to load

plim bitarray prime_ constant pbuf
\ prime_ ( n -- flag ) n<plim

: resetsieve ( -- )
pbuf plim 8/mod swap 0= 0= - + pbuf
do true i ! cell +loop ;

: sieve ( -- )
resetsieve
0 pbuf clrbit
1 pbuf clrbit
plim sqrtf 1+ 2
do i pbuf bit@
if plim 1+ i dup *
do i pbuf clrbit j +loop
then
loop ; \ takes approximate 3 seconds or less to load

Next improvement of the prime functions:

plim prevprime constant pmax_

: nextprime_ ( numb -- prime ) \ numb<pmax_
dup 3 u< if drop 2 exit then
1 or
begin dup prime_ 0=
while 2 +
repeat ;

: prevprime_ ( numb -- prime )
dup 3 u< if drop 2 exit then
1- 1 or
begin dup prime_ 0=
while 2 -
repeat ;

: prime ( n -- flag )
dup plim u> if prime else prime_ negate then ;

: nextprime ( numb -- prime )
dup pmax_ u< if nextprime_ else nextprime then ;

: prevprime ( numb -- prime )
dup plim u< if prevprime_ else prevprime then ;

It takes less than a second to store all primes less than plim=2^24 in the buffer pnrbuf.

plim 65536/mod swap 0= 0= - constant breaknumbers
pi_plim 2* allocate throw constant pnrbuf
breaknumbers cells allocate throw constant breaks

: init_pnr ( -- )
0 pad ! 0 breaks !
1 pi_plim 1+ 1
do nextprime_ dup \ p p
65536/mod pad @ = 0= \ p r flag
if 1 pad +! \ p r
i 1- pad @ cells breaks + ! \ p r
then pnrbuf i 1- 2* + 2 mb! 1+ \ p+1
loop drop ; init_pnr

: newintbreaks ( i j x -- i' j' ) \ bisection search
>r 2dup + 2/ dup
cells breaks + @ r> u> \ i j k flag
if -rot nip else nip then ;

: pnr@ ( n -- p ) \ p is n:th prime
1- dup >r 2* pnrbuf + 2 mb@ \ n ≤ pi_plim
breaknumbers 0
begin r@ newintbreaks 2dup - 2 u<
until rdrop nip 16 lshift + ;

From n comes the modulo in pnrbuf and bisection search in breaks gives the index i. Then Pn=Pn(mod 2^16)+i*2^16, where 0≤i<breaknumbers.

: newintpnr ( i j x -- i' j' )
>r 2dup + 2/ dup pnr@ r> u> \ i j k flag
if -rot nip else nip then ;

: pi ( x -- n ) >r pi_plim 1+ 0 \ x ≤ plim
begin r@ newintpnr 2dup - 2 u< \ i j flag
until rdrop nip ;

A conjecture

A multiplicative function is such that f(ab)=f(a)f(b). All multiplicative functions on the positive integers can be defined by the action on the primes in the prime factorization and any function defined that way is multiplicative. The function that maps primes p on π(p), that is, the nth prime on n, defines a multiplicative function f that maps positive integers on positive integers.

Blue plot f(n) and red plot π(n)

The conjecture is that f(n)≤π(n) except for n=35,49 and there is a proof on mathematics stack exchange but it's unclear if the proof is verified. Testing the conjecture for all n≤plim takes a few minuts if plim=100000000:

\ conjecture: if f:p1^n1*...*pk^nk -> 1^n1*...*k^nk ==>
\ f(n)≤π(n) for n>49

: func_f ( n -- m )
oddpart nip pollard# sort 1 swap 0
do swap pi * loop ;

Since f(Pn)=π(Pn)=n it's enough to test non primes:

: conjecture ( N -- ) 1
do i prime_ 0=
     if i func_f i pi u>
        if cr i .
        then
     then
loop ;

100000000 conjecture
35
49 ok

7 comments:

Shaun GriffithDecember 31, 2015 at 11:32 AM
"...more than 100 %."

You can't save more than 100%.
Lars-ErikDecember 31, 2015 at 1:43 PM
No, you are right! But it could takes more than 100 % more memory NOT using any squeezing method. Happy new year!
DanaJanuary 1, 2016 at 3:19 PM
While the mechanics of doing table-based prime counts are very interesting, I don't agree with the opening thesis: "The only possibility for a very fast prime counting function is to make tables." There are very fast prime count methods that don't use tables.
DanaJanuary 3, 2016 at 10:54 AM
Speeding up the phi function is a critical part of making it fast -- you can't naively use the recursive formula for large values. There are a number of ways to do it, each with different tradeoffs. Mapes' method is one way to handle small values. Another good method for dealing with small a values is to use small tables, which can give you almost instant results for phi(x,a) when a < 10. Caching can also be done to vastly cut down on the number of recursions (basically doing efficient memoization). I made a very different method that creates a sorted list as it walks down the a's -- this has very few function calls but can be memory intensive.

Sage uses an implementation of Legendre's method, with interesting phi optimizations -- Mapes' type calcs for small values, some shortcuts for large values, and then tables. You can see the code here:
https://github.com/sagemath/sage/blob/2cc7f6ce9b270c1efcd07b541fe94e34a245918d/src/sage/functions/prime_pi.pyx

primecount now uses eLMO by default, but the other algorithms are still there. His Phi uses small tables with small a values, caching otherwise.

Perl/ntheory uses eLMO by default, but the other algorithms are still there (no longer compiled in by default). There are a few ways to do phi -- small a with small tables (~32k of shorts), caching (32MB cache), or my a-walker.

Forth & math

Pages

Monday, December 28, 2015

Prime tables and the prime counting function

The tables

A conjecture

7 comments: