[hpsdr] Blackfin 32*32 bits multiply

Tue Aug 21 19:06:41 PDT 2007

On 8/21/07, Greg Overkamp <overkamp at yahoo.com> wrote:
> ***** High Performance Software Defined Radio Discussion List *****
>
>
> > On 21/08/2007, Chris Donzelot wrote:
> >
> > > There are some details in the Blackfin Hardware Reference.
> > > The 32-bit multiply is microcoded.
> > > The dual-MAC operation only allows doubling MAC throughput: you
> > > can input two 32-bit scalars into both MACs to double throughput.
> > > One of the two MACs has a shifter just after the accumulator, and
> > > the MAC output can go to a standard 32-bit register.
> > > The 32-bit MAC will be slower than 300 MMAC/s @ 600 MHz (four
> > > 16x16 MACs + shift & 32-bit addition).
> > >
> >
> > I am not an expert in the Blackfin, however I have been reading up
> > on it. Some Blackfin instructions can be performed in parallel,
> > thereby reducing the number of cycles.
> >
> > If you look at Analog Devices Engineer-to-Engineer Note EE-186
> > (http://www.analog.com/UploadedFiles/Application_Notes/52064380701163EE186.pdf),
> > you will see that it is possible to do a 32 x 32 multiply with
> > 31-bit accuracy in 2 instruction cycles.
> >
> > Also in the same document are examples of 32-bit and 31-bit
> > accurate FIR implementations with a calculation of the number of
> > cycles. This may help estimate the relative performance of the
> > Blackfin against a Pentium. I do not know the equivalent number of
> > cycles for a Pentium-type processor; perhaps someone else could
> > point to a source for these.
> >
> > Of course the time taken to carry out a 32-bit multiply is only
> > part of the story. Perhaps a better solution would be to choose a
> > particular DSP function or set of functions required and do a
> > direct comparison.
> >
> > Chris Down
> > G8MXW
> >
>
> A 32x32 bit multiply produces a 64-bit result. In the
> case of the Blackfin 32x32 bit multiply (I am
> referring to the built-in 32x32 bit multiply, not one
> that you would write yourself as a macro) the result
> can only be saved to a 32-bit register. This means
> that you would need to scale down either the
> coefficients or the data in order to prevent overflow
> on storing the multiplier result, which really defeats
> the purpose of having a 32x32 bit multiply.
>
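To make the width argument above concrete, here is a minimal Python sketch. It is not Blackfin code and says nothing about how the actual instruction behaves; it only models the arithmetic. It builds an unsigned 32x32 product from four 16x16 partial products (the "4 16x16 MAC + shift & addition" decomposition mentioned earlier in the thread) and checks whether the result fits a given register width. The function names are made up for illustration.

```python
MASK16 = 0xFFFF

def mul32_from_16bit_parts(a: int, b: int) -> int:
    """Unsigned 32x32 -> 64-bit product from four 16x16 partial products."""
    a_hi, a_lo = a >> 16, a & MASK16
    b_hi, b_lo = b >> 16, b & MASK16
    high = a_hi * b_hi                   # weight 2**32
    cross = a_hi * b_lo + a_lo * b_hi    # weight 2**16
    low = a_lo * b_lo                    # weight 2**0
    return (high << 32) + (cross << 16) + low

def fits_in_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

full_scale = 0x7FFFFFFF  # largest positive signed 32-bit operand
product = mul32_from_16bit_parts(full_scale, full_scale)
print(product.bit_length())          # magnitude bits needed by the product
print(fits_in_signed(product, 32))   # does it fit a 32-bit register?
print(fits_in_signed(product, 64))   # does it fit a 64-bit result?
```

With two full-scale operands the product does not fit in 32 bits but does fit in 64, which is why one operand has to be scaled down if only a 32-bit result can be stored.
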
> Most integer DSPs have accumulators that are even
> larger than the multiply result, which allows the
> intermediate sum-of-products in an FIR filter, for
> example, to grow larger than the size of the multiply
> result (64 bits in the case of a 32x32 bit multiply).
> The final FIR result may be within range of the size
> of the multiplier output width, while the intermediate
> sum could have grown larger than this width. The extra
> accumulator width prevents saturation from occurring
> during the sum-of-products. The Blackfin does indeed
> do this for its 16x16 bit multiply by providing a
> 40-bit accumulator rather than a 32-bit accumulator
> (recall the Motorola DSP56000 with its 24x24
> multiplier and 56-bit accumulator).
>
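The guard-bit point above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: plain signed integer 16x16 products (no fractional-mode shift), a saturating accumulator, and made-up names `saturate` and `mac_sum`. It is not a model of any particular DSP. The data are chosen so that the intermediate sum exceeds the 32-bit range while the final result is small.

```python
def saturate(value: int, bits: int) -> int:
    """Clamp value to the signed two's-complement range of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def mac_sum(data, coeffs, acc_bits: int) -> int:
    """Sum of products with a saturating accumulator of acc_bits bits."""
    acc = 0
    for x, c in zip(data, coeffs):
        acc = saturate(acc + x * c, acc_bits)
    return acc

# Three large positive products followed by three large negative ones.
data = [-32768, -32768, -32768, 32767, 32767, 32767]
coeffs = [-32768] * 6

exact = sum(x * c for x, c in zip(data, coeffs))
print(exact)                      # true sum of products
print(mac_sum(data, coeffs, 40))  # 40-bit accumulator
print(mac_sum(data, coeffs, 32))  # 32-bit accumulator
```

The 40-bit accumulator reproduces the exact sum, while the 32-bit one saturates partway through and never recovers, even though the final answer would have fit in 32 bits.
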
> A 32x32 bit multiply with a 32-bit result is only
> useful if you know ahead of time that either your data
> or your coefficients will not be 32 bits. When using
> 24-bit data converters, you would have to use 8-bit
> FIR coefficients in order to guarantee no overflow of
> the multiplier! I really think that application note
> EE-186 is wishful thinking on ADI's part in order to
> try to sell its Blackfin processor to audio folks.
>
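The 24-bit data / 8-bit coefficient figure above follows from worst-case bit widths, which a short Python check can confirm. This is a sketch assuming signed two's-complement operands and a single product (no accumulation); the helper names are hypothetical.

```python
def signed_range(bits: int):
    """(min, max) of a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def worst_case_product_fits(data_bits: int, coeff_bits: int,
                            result_bits: int = 32) -> bool:
    """True if every data*coeff product fits in result_bits bits."""
    d_lo, d_hi = signed_range(data_bits)
    c_lo, c_hi = signed_range(coeff_bits)
    r_lo, r_hi = signed_range(result_bits)
    # The extreme products occur at the corners of the operand ranges.
    extremes = [d * c for d in (d_lo, d_hi) for c in (c_lo, c_hi)]
    return r_lo <= min(extremes) and max(extremes) <= r_hi

def max_coeff_bits(data_bits: int, result_bits: int = 32) -> int:
    """Widest coefficient that can never overflow a result_bits product."""
    bits = 0
    while worst_case_product_fits(data_bits, bits + 1, result_bits):
        bits += 1
    return bits

print(max_coeff_bits(24))  # widest safe coefficient for 24-bit data
print(max_coeff_bits(16))  # widest safe coefficient for 16-bit data
```

For 24-bit data and a 32-bit result this gives 8-bit coefficients, and 9 bits fails only at the corner where both operands are at their most negative value. Summing several such products would need additional headroom on top of this.
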
> The Blackfin looks like a great 16-bit DSP, but for
> high resolution audio work (or baseband SDR using
> 24-bit converters) I think a better choice would be a
> DSP that has native 32-bit arithmetic.
>
> Greg
> WD9DEX

Thanks Greg! FINALLY someone talking some sense about the Blackfin (as
opposed to the armchair DSP engineers)!

73 Phil N8VB
