Tuesday, May 30, 2017

Karatsuba multiplication

The time for direct multiplication is proportional to n² where n is the number of figures in the multiplicands x,y. When choosing a big base B=2ⁿ and and writing

xy=(x0+B*x1)(y0+B*y1)=x0y0 + (x0*y1+x1*y0)B + x1*y1*B²

there are four multiplications of smaller numbers, also here the calculation time is proportional to n². However

(x0*y1+x1*y0)=(x0+x1)(y0+y1)-x0*y0-x1*y1

why it's enough to calculate three smaller multiplications 

x0*y0, x1*y1 and (x0+x1)(y0+y1). 

The multiplication with B are fast left shifting and if the shifting and the addition where cost free, the recursive Karatsuba multiplication would be very efficient



but unfortunately the extra math (and in Forth also some stack juggling) takes a lot of time and the method is efficient only for rather big numbers. For very big numbers, however, it's very efficient.

Here is the way I implemented it in ANS Forth:

: bcells* \ big m -- big*C^m
  cells top$ locals| n ad mb |
  ad ad mb + n move
  ad mb erase
  mb bvp @ +! ;
\ C is the number of digits in a cell

: bcells/ \ big m -- big/C^m
  cells top$ locals| n ad mb |
  ad mb + ad n move
  mb negate bvp @ +! ;

: bsplit \ w ad -- u v 
  dup nextfree < 
  if bvp @ dup @ vp+ bvp @ ! ! 
  else drop bzero
  then ;
\ A big number is split on the big stack at address ad

: btransmul \ x y -- x0 x1 y0 y1 m     B=2^bits 
  len1 len2 max cell + lcell 1+ rshift     \ m
  dup >r cells 
  >bx first over + bsplit 
  bx> first + bsplit r> ; 
\ x=x0+x1*B^m  y=y0+y1*B^m 


0x84 value karalim \ break point byte length for termination.


: b* \ x y -- xy
  len1 len2 max karalim < 
  if b* exit then
  btransmul >r                   \ x0 x1 y0 y1
  3 bpick 2 bpick recurse >bx    \ bx: x0*y0
  2 bpick 1 bpick recurse >bx    \ bx: x0*y0 x1*y1
  b+ >bx b+ bx>   recurse        \ (x0+x1)(y0+y1)
  bx b- by b- r@ bcells*         \ z1*C^m
  bx> r> 2* bcells* bx> b+ b+ <top ;
\ Karatsuba multiplication


No comments:

Post a Comment