[hpsdr] [Flexradio] PS3 support in Linux 2.6 - Cool!

Robert McGwier rwmcgwier at gmail.com
Thu Dec 7 15:15:34 PST 2006




Eric Blossom wrote:
>> 200 GFLOPS on <<ONE>> of the 8 SPEs. As always, our needs will
>> advance to meet the available resources.
>>     
>
> Uhh, don't think so.
>
> I'm pretty sure it goes like this (spent most of the last week on this stuff):
>
>   3.2 GHz SPE clock
>   In a single clock it can issue either 1 single-precision SIMD mult or add
>       or
>   In a single clock it can issue a fused single-precision SIMD mult+add
>
>   The SIMD instruction operates on 4 single-prec floats.
>   [Latency is 6 clocks, but can issue every clock]
>
> Thus each SPU does 3.2e9 * 4 = 12.8 GFLOPS.  Or, if the only instruction
> you ever execute is fused SIMD mult+add you get twice this, 25.6 GFLOPS.
>
> FWIW, in addition to issuing the floating point simd op per clock, the
> SPU can also issue another kind of instruction in its other pipe
> (typically a memory op), but that's irrelevant to the FLOP calcs.
>
> Thus the peak rate (assuming that _all_ you do is single-prec
> mult+add -- love those FIRs) is 25.6 GFLOPS/SPU * 8 SPUs = 205 GFLOPS.
>
> A more reasonable starting point (but still a stretch) is to assume
> that you can issue a non-fused-mult-add SIMD instruction per
> cycle and that gives you about 100 GFLOPS.  Of course, as soon as
> you're not able to fully vectorize, you lose another factor of 4, so
> call it 25 GFLOPS.
>
> My gut sense is that we ought to be able to sustain something in the
> 25 - 50 GFLOPS range (per Cell chip, not per SPU) for typical SDR
> workloads.  The 2-way blade of course doubles this.
>
> Hope you didn't bet the farm on that big number ;)
>   
I haven't bet anything yet.  I did mean 200 GFLOPS sustained per Cell;
I got too enthused.  I can do all I need to with 50 GFLOPS on a single
Cell server, and if I get 100 with the dual Cell server, it will be
awesome.  As always, the problem in all these things is not how many
FLOPS we can do (we can do enough).  The problem is going to be I/O.
Period.  The Mercury Cell server (2-way blade) helps us work around
this, without really solving it, by having PCI Express on it.  We can
always use more and, as always, our needs will grow to meet our
resources.
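
For anyone who wants to replay the peak-rate arithmetic quoted above, here
is a minimal sketch.  It uses only the figures from Eric's message (3.2 GHz
clock, 4-wide single-precision SIMD, 8 SPEs, fused mult+add counted as 2
FLOPs per lane).  The function name peak_gflops and the constant names are
made up for illustration; they are not from any Cell SDK.  These are paper
peak rates, not measurements.

```python
# Hypothetical helper for replaying the quoted peak-FLOPS arithmetic.
# All figures come from the quoted message; the names are illustrative only.

CLOCK_HZ = 3.2e9    # SPE clock rate
SIMD_WIDTH = 4      # single-precision floats per SIMD instruction
NUM_SPES = 8        # SPEs per Cell chip


def peak_gflops(fused=True, vectorized=True, spes=NUM_SPES):
    """Return the theoretical peak single-precision GFLOPS.

    fused      -- every issued instruction is a fused mult+add (2 FLOPs/lane)
    vectorized -- every issued instruction uses all 4 SIMD lanes
    spes       -- number of SPEs counted
    """
    lanes = SIMD_WIDTH if vectorized else 1
    flops_per_lane = 2 if fused else 1
    return CLOCK_HZ * lanes * flops_per_lane * spes / 1e9


# One SPE, plain SIMD mult or add each clock
print(f"{peak_gflops(fused=False, spes=1):.1f}")        # 12.8
# One SPE, nothing but fused mult+add
print(f"{peak_gflops(fused=True, spes=1):.1f}")         # 25.6
# Whole chip, nothing but fused mult+add
print(f"{peak_gflops(fused=True):.1f}")                 # 204.8
# Whole chip, non-fused SIMD each clock
print(f"{peak_gflops(fused=False):.1f}")                # 102.4
# Whole chip, non-fused and not vectorized
print(f"{peak_gflops(fused=False, vectorized=False):.1f}")  # 25.6
```

The last two lines correspond to the "about 100 GFLOPS" and "call it 25
GFLOPS" starting points in the quoted text.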

> Eric
>
>   
Bob


-- 
AMSAT Director and VP Engineering. Member: ARRL, AMSAT-DL,
TAPR, Packrats, NJQRP, QRP ARCI, QCWA, FRC. ARRL SDR WG Chair
"If you board the wrong train, it is no use running along the
corridor in the other direction." - Dietrich Bonhoeffer

