Re: [vsipl++] [patch] FIR Filter bank benchmark


  • Subject: Re: [vsipl++] [patch] FIR Filter bank benchmark
  • From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
  • Date: Fri, 31 Mar 2006 14:59:03 -0500

Don McCoy wrote:
The attached patch adds one of the MIT Lincoln Labs' PCA Kernel-Level Benchmarks to VSIPL++ -- the FIR Filter Bank. It also has a minor re-organization of some support functions, moving them from the tests/ directory to the src/vsip_csl/ directory. Actually, copies have been made as I didn't think it would be good to delete the ones in tests/ until all other references to them have been cleaned up.

This benchmark defines two sets of parameters for performing a series of convolutions on the input data. In each case, M input vectors of length N are convolved with filters of length K. The two sets of parameters are given as follows:

    Set       1       2
    M        64      20
    N      4096    1024
    K       128      12

The benchmark framework defined for VSIPL++ sweeps N over a range of values, so the point of interest for each set may be extracted according to the table above.

Refer to the end of benchmarks/firbank.cpp to see the options used to select various tests. Note: the last digit of the option value is always 1 or 2, corresponding to the data set chosen.

In order to use external data files with the benchmark, they must be located in benchmarks/data/set1 and benchmarks/data/set2. The filenames must be as follows: inputs_X.matrix, filter.matrix and outputs_X.matrix, where X denotes the size as a power of two [log2(N)]. The default starting and ending values for N are 7 and 16, so files corresponding to those vector sizes must be provided.
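
As a rough illustration of the naming rules above, the expected paths could be built as follows. The helper names data_file and filter_file are hypothetical and do not come from the patch; only the directory and file-name patterns are taken from the description.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the patch): builds the path of one
// benchmark data file from the naming rules described above.
//   set   : 1 or 2, the parameter set
//   kind  : "inputs" or "outputs"
//   log2n : the X in the file name, i.e. log2(N)
std::string data_file(int set, std::string const& kind, unsigned log2n)
{
  assert(set == 1 || set == 2);
  return "benchmarks/data/set" + std::to_string(set) + "/" + kind + "_"
         + std::to_string(log2n) + ".matrix";
}

// The filter file carries no size suffix.
std::string filter_file(int set)
{
  assert(set == 1 || set == 2);
  return "benchmarks/data/set" + std::to_string(set) + "/filter.matrix";
}
```

With the default sweep, log2n would run from 7 through 16 for each set, so data_file(1, "inputs", 7) names the first input file of set 1.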



Validation is performed with external data. For full convolution, all values are checked. The FFT-based algorithm computes a circular rather than a linear convolution, though, so values near the beginning and end are not checked. The number of values that are checked is N - 2 * (K - 2).
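
The wrap-around can be seen in a small sketch that uses direct (non-FFT) convolution in plain C++ rather than VSIPL++. All names here are made up for illustration. How the benchmark aligns its output relative to these indices is not stated above, so the margin in checked_values is simply the formula as given, not something derived here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;

// Linear convolution truncated to the first x.size() outputs;
// samples before the start of x are taken as zero.
vec linear_conv(vec const& x, vec const& h)
{
  vec y(x.size(), 0.0);
  for (std::size_t n = 0; n < x.size(); ++n)
    for (std::size_t k = 0; k < h.size() && k <= n; ++k)
      y[n] += h[k] * x[n - k];
  return y;
}

// Circular convolution with period x.size(), which is what multiplying
// two length-N FFTs computes (done directly here to avoid an FFT dependency).
vec circular_conv(vec const& x, vec const& h)
{
  std::size_t const N = x.size();
  assert(h.size() <= N);
  vec y(N, 0.0);
  for (std::size_t n = 0; n < N; ++n)
    for (std::size_t k = 0; k < h.size(); ++k)
      y[n] += h[k] * x[(n + N - k) % N];  // index wraps around the end
  return y;
}

// Number of values compared in the FFT-based case, per the formula above.
std::size_t checked_values(std::size_t N, std::size_t K)
{
  assert(K >= 2 && N >= 2 * (K - 2));
  return N - 2 * (K - 2);
}
```

In this sketch the two results differ only in the first K - 1 outputs, where the circular version folds in samples from the end of the input. For the two parameter sets, checked_values gives 3844 (N = 4096, K = 128) and 1004 (N = 1024, K = 12).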


Lastly, I had some difficulty getting the right answers to come out, because the convolutions are done repeatedly on the same vector in order to take a more accurate measurement. With the Fir class, the state_save/state_no_save template parameter *must* be set to 'state_no_save', or the filter state is retained between successive convolutions, thereby corrupting the results. Not what is desired in this case!
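
A minimal sketch of that effect, using a toy filter class in plain C++. ToyFir and its save_state flag are invented for illustration; this is not the VSIPL++ Fir class or its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy FIR filter written only to illustrate state carry-over.
class ToyFir
{
public:
  ToyFir(std::vector<double> const& kernel, bool save_state)
    : kernel_(kernel), save_state_(save_state)
  {
    assert(!kernel_.empty());
    history_.assign(kernel_.size() - 1, 0.0);  // starts out as zeros
  }

  // Filter one block. With save_state, the last K-1 input samples are
  // remembered and treated as the samples preceding the next block.
  std::vector<double> operator()(std::vector<double> const& in)
  {
    std::size_t const K = kernel_.size();
    std::size_t const H = K - 1;

    // ext = [history | in]
    std::vector<double> ext(history_);
    ext.insert(ext.end(), in.begin(), in.end());

    std::vector<double> out(in.size(), 0.0);
    for (std::size_t n = 0; n < in.size(); ++n)
      for (std::size_t k = 0; k < K; ++k)
        out[n] += kernel_[k] * ext[n + H - k];

    if (save_state_)
      history_.assign(ext.end() - H, ext.end());
    return out;
  }

private:
  std::vector<double> kernel_;
  std::vector<double> history_;
  bool save_state_;
};
```

With kernel {1, 1} and input {1, 2, 3}, the non-saving filter returns {1, 3, 5} on every call. The saving filter returns {1, 3, 5} the first time and {4, 3, 5} the second time, because the trailing 3 from the first call leaks into the start of the second. That is the kind of corruption a timing loop over the same vector would see.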

Actually, using state_no_save isn't all that bad. For radar systems in particular, data is usually not collected continuously. Pulses are transmitted at regular intervals, and the received signal is collected between pulses. This received data is not continuous because most systems cannot transmit and receive simultaneously (radar signals fall off with the 4th power of distance, so picking up the transmitted signal would blow out the receive amplifiers), and because each new pulse "resets" the distance corresponding to the received data.

A system might look something like:

transmit:   *          *           *
receive:        ......    .......      .......

                     ^    ^
                     |    +- the beginning of this pulse is near
                     |
                     +- the end of this pulse is far


In a cheapo system, each pulse might have the same waveform (which would simplify the FIRbank into only needing a single set of coefficients). However, systems often use "waveform diversity" where each pulse is slightly different. This makes it harder to jam and may increase the sensitivity of the system. This diversity would require multiple sets of filter kernels.




Similarly with fast convolution, a temporary is used.  I.e.:

    for (index_type l=0; l<loop; ++l)
    {
      // Perform FIR convolutions
      for ( length_type i = 0; i < local_M; ++i )
      {
        Vector<T> tmp(N, T());
        fwd_fft(l_inputs.row(i), tmp);
        tmp *= response.row(0);    // assume fft already done on response
        inv_fft(tmp, test.row(i));
      }
    }

It should be OK to move the declaration of tmp entirely outside the loop. If fwd_fft's size is N, it will completely overwrite the values in 'tmp'.


Moving the declaration and initialization of 'tmp' outside the loop has the same effect as with 'state_save' because the contents of tmp are not zeroed between rows. With it inside the loop (as it should be), performance does not appear to be affected noticeably, though it should have a slight impact.

Comments and feedback appreciated.


Reviewing the patch now ...


--
Jules Bergmann
CodeSourcery
jules@xxxxxxxxxxxxxxxx
(650) 331-3385 x705