[hpsdr] FIR filtering and HPSDR

Thu Sep 11 11:29:30 PDT 2014

Hi Erik,

First of all, upon reading your response, I immediately spotted a 
glaring error that I, in my haste, made in my message of yesterday. Let 
me first correct that.  The delay through a direct convolution linear 
phase filter is (1 / Sample_rate) * (N - 1)/2 where N is the number of 
taps.  This is perhaps more commonly expressed as (1 / Sample_rate) * 
M/2 where M is the order of the filter.  There are FIR designs with less 
signal delay, e.g., a "Minimum Phase" filter has somewhat less delay at 
the expense of, obviously, having non-linear phase.  While the algorithm 
in the paper appears to reduce the delay versus the "large block" 
approach, I don't believe it can reduce the delay below that of a direct 
convolution filter. When the author of the paper claims "zero delay," I 
can only assume that he is excluding the normal direct convolution FIR 
filter delay.  In any case, I don't believe he claims that his algorithm 
has less delay than direct convolution.
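
As a quick numeric illustration of the corrected delay formula, here is a minimal Python sketch. The function name and the example tap count and sample rate are hypothetical values chosen for illustration; they do not come from the paper or from any HPSDR software.

```python
def linear_phase_delay_seconds(num_taps, sample_rate_hz):
    """Delay of a direct-convolution linear-phase FIR filter.

    Implements (1 / Sample_rate) * (N - 1) / 2, where N is the number of taps.
    """
    return (num_taps - 1) / 2 / sample_rate_hz


# Illustrative (assumed) numbers: a 2049-tap filter at 48 kHz.
delay = linear_phase_delay_seconds(num_taps=2049, sample_rate_hz=48000)
print(f"delay = {delay * 1000:.3f} ms")
```

With these assumed numbers the delay is 1024 samples, so doubling the sample rate at a fixed tap count halves the delay in seconds.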

As shown in the paper (Figure 6), direct convolution filters are quite 
expensive computationally, prohibitively so in many cases. Therefore, 
with regard to things like bandpass filters in SDRs, FFT "fast 
convolution" filters are usually used because of their computational 
efficiency.  There are special cases where direct convolution filters 
are used.  For example, direct convolution may be used in an automatic 
notch filter where the filter coefficients are dynamically adjusted 
after each sample in response to the signal.  Direct convolution may 
also be used in efficient polyphase implementations for various purposes.
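
To make the comparison concrete, the sketch below computes the same convolution two ways: directly, and via an FFT "fast convolution". It is a minimal pure-Python illustration, not code from any SDR package; the function names and the test signal are assumptions, and the small recursive radix-2 FFT is only there to keep the example self-contained.

```python
import cmath


def direct_convolve(x, h):
    """Direct-form convolution: about len(x) * len(h) multiply-adds."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_convolve(x, h):
    """Fast convolution: zero-pad, multiply spectra, inverse transform."""
    out_len = len(x) + len(h) - 1
    n = 1
    while n < out_len:
        n *= 2
    X = fft(list(x) + [0.0] * (n - len(x)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    y = fft([a * b for a, b in zip(X, H)], inverse=True)
    # The inverse transform above is unscaled, so divide by n here.
    return [v.real / n for v in y[:out_len]]


# Assumed toy data, chosen only to show that both methods agree.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
h = [0.25, 0.5, 0.25]
y_direct = direct_convolve(x, h)
y_fast = fft_convolve(x, h)
```

Both functions return the same output to within rounding. The FFT route pays off only for long filters, where its roughly N log N cost beats the N-per-output-sample cost of the direct form.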

You also make a good point about the "block" nature of data transfers in 
SDRs: data typically does not arrive at the software in a 
sample-by-sample fashion.

It does appear that the author's approach can be used to reduce the 
delay, but per his data, that comes at the cost of considerable 
additional CPU cycles.
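
The decomposition described in the quoted message below (a short direct-convolution "head" summed with a later-computed remainder) rests on the linearity of convolution. The following is a minimal sketch of that identity only, not the paper's algorithm: the names and the `head_len` value are hypothetical, and the tail is convolved directly here for simplicity where a real implementation would presumably use FFT blocks.

```python
def convolve(x, h):
    """Plain direct-form convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def split_convolve(x, h, head_len):
    """Filter x with h by splitting h into a head and a tail.

    Requires 0 < head_len < len(h). The head's output is available
    immediately; the tail's contribution starts head_len samples later,
    which is the slack a block-based method could use to compute it.
    """
    head, tail = h[:head_len], h[head_len:]
    y = [0.0] * (len(x) + len(h) - 1)
    for n, v in enumerate(convolve(x, head)):
        y[n] += v
    for n, v in enumerate(convolve(x, tail)):
        y[n + head_len] += v  # tail taps sit head_len samples into h
    return y


# Assumed toy data, chosen only to check the identity.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
h = [0.5, 0.25, 0.125, 0.0625]
y_full = convolve(x, h)
y_split = split_convolve(x, h, head_len=2)
```

The two results match sample for sample, so the split changes when each part of the work must be finished, not the filter's response.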

If you reach any further conclusions or do any simulation, please let me 
know how things work out.

73,
Warren  NR0V

On 9/11/2014 2:32 AM, Erik Anderson wrote:
> I do appreciate your thoughts and analysis on the paper.  Let's see if 
> I can spend a bit more bandwidth and push this farther with some 
> additional trains of thought.  Apologies if this is a little too 
> long-winded.
>
> First of all, it does appear as though the author is discussing FIR 
> filters with >1e5 taps to them, but he is not suggesting use of 
> "Sample_rate*(N-1)/2 where N is the number of taps".  Instead he 
> appears to be picking N to be "the size at which block convolution 
> becomes more efficient than direct form filtering; for a typical DSP 
> this might be 32 or 64 samples" and interrupting the algorithm at that 
> point.  The truncated direct-convolution filter appears to be getting 
> summed with the results of later-computed FFT operations in order to 
> return the complete filter.  Unless I am missing something this would 
> permit a zero-latency filter without incurring the normal delay of a 
> direct-convolution filter.
>
> Processing a filter sample-by-sample may make sense for audio 
> applications, but I'm getting a strong message that it makes no sense 
> for HPSDR, which is just too darn fast.  It's impossible for PC 
> applications to receive data sample-by-sample anyhow, the current 
> protocol returns ethernet packets containing between 38 samples (@ 4 
> receivers) and 126 samples (@ 1 receiver).  Any implementation of this 
> approach on PC for HPSDR should not use direct-convolution.  The 
> highest curve I would be looking at on Fig 6 would be the one labeled 
> "1.5 ms delay", which is absent any direct-convolution.
>
> The smallest FFT block size he is calculating is one of size 32 or 64, 
> and he is then scheduling a tree of block sizes of powers of two until 
> he gets up to the size required to satisfy the filter.  The choice of 
> starting with such a small number is because of his parallel use of a 
> direct-convolution algorithm.  As we would not be using direct 
> convolution we are then free to choose the starting FFT block size to 
> be an arbitrary amount balancing the amount of latency vs additional 
> complexity and performance.
>
> One possible use for this is in overlapped FFT calculations.  Assuming 
> the overlap is a (likely small) power of two, and assuming that all 
> windows are applied _after_ the FFT calculations are done (via 
> replacement with an equivalent filter), then this approach could be 
> used to _decrease_ the number of calculations necessary, by setting 
> FFTs first at the GCD of the overlapped frames and reusing the 
> calculations.
>
> At this point my focus is on Eq 24 (picture version) or Eq 22 
> (non-picture version), which seems to be the "magic" in tying two FFT 
> operations together.  I can't use any further derivations past this 
> point as the author is assuming real inputs, so I need to pore over 
> the definition of FFT again to see what this equation represents.
>
> 73 and thx as always,
> Erik KM2G
