## Tuesday, May 30, 2017

### Karatsuba multiplication

The time for direct multiplication is proportional to n² where n is the number of figures in the multiplicands x,y. When choosing a big base B=2ⁿ and and writing

xy=(x0+B*x1)(y0+B*y1)=x0y0 + (x0*y1+x1*y0)B + x1*y1*B²

there are four multiplications of smaller numbers, also here the calculation time is proportional to n². However

(x0*y1+x1*y0)=(x0+x1)(y0+y1)-x0*y0-x1*y1

why it's enough to calculate three smaller multiplications

x0*y0, x1*y1 and (x0+x1)(y0+y1).

The multiplication with B are fast left shifting and if the shifting and the addition where cost free, the recursive Karatsuba multiplication would be very efficient

but unfortunately the extra math (and in Forth also some stack juggling) takes a lot of time and the method is efficient only for rather big numbers. For very big numbers, however, it's very efficient.

Here is the way I implemented it in ANS Forth:

: bcells* \ big m -- big*C^m
cells top\$ locals| n ad mb |
mb bvp @ +! ;
\ C is the number of digits in a cell

: bcells/ \ big m -- big/C^m
cells top\$ locals| n ad mb |
mb negate bvp @ +! ;

: bsplit \ w ad -- u v
dup nextfree <
if bvp @ dup @ vp+ bvp @ ! !
else drop bzero
then ;
\ A big number is split on the big stack at address ad

: btransmul \ x y -- x0 x1 y0 y1 m     B=2^bits
len1 len2 max cell + lcell 1+ rshift     \ m
dup >r cells
>bx first over + bsplit
bx> first + bsplit r> ;
\ x=x0+x1*B^m  y=y0+y1*B^m

0x84 value karalim \ break point byte length for termination.

: b* \ x y -- xy
len1 len2 max karalim <
if b* exit then
btransmul >r                   \ x0 x1 y0 y1
3 bpick 2 bpick recurse >bx    \ bx: x0*y0
2 bpick 1 bpick recurse >bx    \ bx: x0*y0 x1*y1
b+ >bx b+ bx>   recurse        \ (x0+x1)(y0+y1)
bx b- by b- r@ bcells*         \ z1*C^m
bx> r> 2* bcells* bx> b+ b+ <top ;
\ Karatsuba multiplication