Multiplication Using Addition and Bit Shifts

Devon · Oct 1, 2023

The MULU and MULS instructions are known to be quite slow. They require at least 38 cycles to execute, with the worst case being 70 cycles. If you are multiplying with rather basic and small numbers, this isn't ideal. So, with this guide, I can show you an alternative way to perform multiplication, by just using addition and bit shifts.

The best way I can visualize this is basically how you can take the number 1 and expand it in a way that matches the multiplier. The easiest way I can show this is in binary.

Let's say we want to multiply by 10 ($0A). In binary that is 1010. So, how can we take the number 1 and expand it into that? Lemme show you...

First, let's start off with our number 1, and let's bit shift it to the left 1 time.
Code:
0001 -> 0010
If you look closely, you can see that one of the bits in $0A now matches up.

0010
1010

So, how can we get the other bit set? Well, what we can do is save our current iteration for later, and keep going. Let's now bit shift to the left twice.
Code:
0010 -> 1000
Saved: 0010
Well, now the other bit is set, but the first bit we had is now clear, so what gives?

1000
1010

This is where the magic happens. Remember the 0010 we saved? Well, let's add it back!
Code:
1000 -> 1010
And hey look, it's our multiplier! This method basically consists of a series of bit shifts and saving and re-adding to get our product, thought of with the example of multiplying 1 by the multiplier. As you go on, you basically save an iteration whenever you match a bit in your multiplier, and then you add it back on later.

Let's do another example. Let's say we want to multiply by 13 ($0D). In binary that is 1101.

Well, from the get go, with our number 1, we can see that the lowest bit matches, so we'll put that to the side.

0001
1101

Now let's shift it to the left 2 times.
Code:
0001 -> 0100
Saved: 0001
We now get our next match.

0100
1101

What we can do in this instance is add this iteration to our previously saved iteration. So that makes our saved iteration 0101. Now let's do 1 more bit shift to the left.
Code:
0100 -> 1000
Saved: 0101
We get our final match.

1000
1101

Now let's add our saved iteration (0101) back in, and voila, it's 1101, our multiplier.

We can implement the first example in m68k assembly like so, assuming we want to multiply as a 16-bit value, with d0 as our multiplicand.
Code:
    add.w   d0,d0         ; 4 ; First bit shift
    move.w  d0,d1         ; 4 ; Save iteration
    add.w   d0,d0         ; 4 ; 2 more bit shifts
    add.w   d0,d0         ; 4
    add.w   d1,d0         ; 4 ; Add saved iteration
                          ; 20 cycles total
And the second.
Code:
    move.w  d0,d1         ; 4 ; Save iteration
    add.w   d0,d0         ; 4 ; 2 bit shifts
    add.w   d0,d0         ; 4
    add.w   d0,d1         ; 4 ; Add to saved iteration
    add.w   d0,d0         ; 4 ; 1 more bit shift
    add.w   d1,d0         ; 4 ; Add saved iteration
                          ; 24 cycles total
And as we can also see, these 2 processes are faster than the best case scenario for either MULU or MULS.

Disclaimers

It should be noted that this should be mainly reserved for more basic multiplications. The more bit shifts and additions you need to do, the slower it gets, and the more instructions you need to store. If you are multiplying with a rather large and complex number, then this method would be worse than just using MULU or MULS. Use this method wisely! This is not a catch-all solution for slow multiplication. If using MULU or MULS is faster, then use them.

If you are dealing with negative numbers, then this can work, but be aware of any sign extensions you might need to deal with, depending on the sizes you work with. It's also much much faster to multiply with the absolute value of the multiplier and then negate the result than it is to attempt to do a series of shifts and additions to match the negative multiplier.

I also do 2 additions for 2 consecutive bit shifts, because doing 2 separate additions on bytes and words is faster than a single bit shift instruction (8 cycles vs 10). However, if you are doing more than 2 shifts or are working with longwords, then using a bit shift instruction is the faster option (using bit shift instructions increases by 2 cycles per shift, vs. at least 4 for each addition).

Log in or Sign up

Multiplication Using Addition and Bit Shifts

Devon A̸ ̴S̴ ̵C̵ ̷E̶ ̸N̸ ̴D̶ ̵E̶ ̸D̶ Member

Log in or Sign up

Multiplication Using Addition and Bit Shifts

Devon A̸ ̴S̴ ̵C̵ ̷E̶ ̸N̸ ̴D̶ ̵E̶ ̸D̶ Member

Useful Searches