[
Date Prev][
Date Next][
Thread Prev][
Thread Next][
Date Index][
Thread Index]
Re: [vsipl++] [patch] Some enhancements to the FFT code.
- To: VSIPL++ Developers List <vsipl++@xxxxxxxxxxxxxxxx>
- Subject: Re: [vsipl++] [patch] Some enhancements to the FFT code.
- From: Stefan Seefeld <stefan@xxxxxxxxxxxxxxxx>
- Date: Mon, 27 Feb 2006 10:10:00 -0500
Jules Bergmann wrote:
I'm not sure if the mercury FFTs have any stated alignment requirements
for their temporary buffer, but to be safe we should allocate with
either a 16-byte (altivec) or 32-byte (cache line) alignment using the
alloc_aligned function.
Ok, I use alloc_align(32, ...) for now. (Various backends may have their
own optimized memory management routines, so this has to be revisited later.)
I took the occasion to apply a patch to the alloc_align function which
we had discussed many moons ago. It is now parametrized, i.e. instead of
double *array = static_cast<double *>(alloc_align(32, 1024 * sizeof(double)));
you simply write
double *array = alloc_align<double>(32, 1024);
I adjusted all the code that uses alloc_align accordingly (and fixed a
long-standing issue with paths in some sarsim-related scripts). The attached
patch is checked in.
Regards,
Stefan
Index: ChangeLog
===================================================================
RCS file: /home/cvs/Repository/vpp/ChangeLog,v
retrieving revision 1.398
diff -u -r1.398 ChangeLog
--- ChangeLog 23 Feb 2006 08:21:17 -0000 1.398
+++ ChangeLog 27 Feb 2006 15:05:24 -0000
@@ -1,3 +1,17 @@
+2006-02-27 Stefan Seefeld <stefan@xxxxxxxxxxxxxxxx>
+
+ * tests/fft.cpp: Add tests for complex split format.
+ * src/vsip/impl/allocation.hpp: Make alloc_align type-safe.
+ * src/vsip/impl/aligned_allocator.hpp: Adjust accordingly.
+ * apps/sarsim/sarsim.hpp: Likewise.
+ * apps/sarsim/mit-sarsim.cpp: Likewise.
+ * apps/sarsim/chk-simd-48-4: Fix path.
+ * apps/sarsim/chk-simd-8-4: Likewise.
+ * apps/sarsim/chk-sims-48-4: Likewise.
+ * apps/sarsim/chk-sims-8-4: Likewise.
+ * src/vsip/impl/signal-fft.hpp: Use temporary buffer in SAL backend.
+ * src/vsip/impl/sal/fft.hpp: Likewise.
+
2006-02-23 Don McCoy <don@xxxxxxxxxxxxxxxx>
* src/vsip/profile.cpp: corrected cases where 'stamp_type'
Index: apps/sarsim/chk-simd-48-4
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/chk-simd-48-4,v
retrieving revision 1.1
diff -u -r1.1 chk-simd-48-4
--- apps/sarsim/chk-simd-48-4 10 Aug 2005 18:26:36 -0000 1.1
+++ apps/sarsim/chk-simd-48-4 27 Feb 2006 15:05:24 -0000
@@ -2,6 +2,8 @@
# Check result of single-precision run
+DIR="."
+
NRANGE=2048
NPULSE=512
NTAPS=48
@@ -12,22 +14,22 @@
PREC=d
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hh-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/hh-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hv-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/hv-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vh-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/vh-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vv-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/vv-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
Index: apps/sarsim/chk-simd-8-4
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/chk-simd-8-4,v
retrieving revision 1.1
diff -u -r1.1 chk-simd-8-4
--- apps/sarsim/chk-simd-8-4 16 Jun 2005 18:01:20 -0000 1.1
+++ apps/sarsim/chk-simd-8-4 27 Feb 2006 15:05:24 -0000
@@ -2,6 +2,8 @@
# Check result of single-precision run
+DIR="."
+
NRANGE=256
NPULSE=64
NTAPS=8
@@ -9,22 +11,22 @@
THRESH=-190
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i test-8/hh-d-$NTAPS-$NFRAME.bin \
-ref test-8/ref-plain/hh-d-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i test-8/hv-d-$NTAPS-$NFRAME.bin \
-ref test-8/ref-plain/hv-d-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i test-8/vh-d-$NTAPS-$NFRAME.bin \
-ref test-8/ref-plain/vh-d-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i test-8/vv-d-$NTAPS-$NFRAME.bin \
-ref test-8/ref-plain/vv-d-$NTAPS-$NFRAME.bin \
-chk $THRESH
Index: apps/sarsim/chk-sims-48-4
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/chk-sims-48-4,v
retrieving revision 1.1
diff -u -r1.1 chk-sims-48-4
--- apps/sarsim/chk-sims-48-4 10 Aug 2005 18:26:36 -0000 1.1
+++ apps/sarsim/chk-sims-48-4 27 Feb 2006 15:05:24 -0000
@@ -2,6 +2,8 @@
# Check result of single-precision run
+DIR="."
+
NRANGE=2048
NPULSE=512
NTAPS=48
@@ -12,22 +14,22 @@
PREC=s
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hh-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/hh-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hv-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/hv-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vh-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/vh-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vv-$PREC-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-$ATTR/vv-$PREC-$NTAPS-$NFRAME.bin \
-chk $THRESH
Index: apps/sarsim/chk-sims-8-4
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/chk-sims-8-4,v
retrieving revision 1.2
diff -u -r1.2 chk-sims-8-4
--- apps/sarsim/chk-sims-8-4 5 Aug 2005 20:20:42 -0000 1.2
+++ apps/sarsim/chk-sims-8-4 27 Feb 2006 15:05:24 -0000
@@ -2,6 +2,8 @@
# Check result of single-precision run
+DIR="."
+
TDIR=test-8
NRANGE=256
NPULSE=64
@@ -10,22 +12,22 @@
THRESH=-190
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hh-s-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-plain/hh-s-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/hv-s-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-plain/hv-s-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vh-s-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-plain/vh-s-$NTAPS-$NFRAME.bin \
-chk $THRESH
-histcmp -nrange $NRANGE -npulse $NPULSE \
+$DIR/histcmp -nrange $NRANGE -npulse $NPULSE \
-i $TDIR/vv-s-$NTAPS-$NFRAME.bin \
-ref $TDIR/ref-plain/vv-s-$NTAPS-$NFRAME.bin \
-chk $THRESH
Index: apps/sarsim/mit-sarsim.cpp
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/mit-sarsim.cpp,v
retrieving revision 1.5
diff -u -r1.5 mit-sarsim.cpp
--- apps/sarsim/mit-sarsim.cpp 10 Sep 2005 17:59:24 -0000 1.5
+++ apps/sarsim/mit-sarsim.cpp 27 Feb 2006 15:05:24 -0000
@@ -116,8 +116,7 @@
Vector<cval_type> v(npulse);
v = this->azbuf_ (Domain<1>(npulse, 1, npulse));
- io_type* io_buf =
- (io_type *)vsip::impl::alloc_align(32, 2 * npulse * sizeof (io_type));
+ io_type* io_buf = vsip::impl::alloc_align<io_type>(32, 2 * npulse);
vsip::Dense<1, vsip::complex<io_type> > io_block(Domain<1>(npulse), io_buf);
vsip::Vector<vsip::complex<io_type> > io_vec(io_block);
Index: apps/sarsim/sarsim.hpp
===================================================================
RCS file: /home/cvs/Repository/vpp/apps/sarsim/sarsim.hpp,v
retrieving revision 1.9
diff -u -r1.9 sarsim.hpp
--- apps/sarsim/sarsim.hpp 26 Sep 2005 20:11:05 -0000 1.9
+++ apps/sarsim/sarsim.hpp 27 Feb 2006 15:05:24 -0000
@@ -251,10 +251,10 @@
assert(cube_in_.block().admitted() == false);
- for (index_type frame = 0; frame < nframe; ++frame) {
+ for (index_type frame = 0; frame < nframe; ++frame)
+ {
input_frame_buffer_[frame] =
- static_cast<cval_type*>(alloc_align(align,
- cube_in_.size() * sizeof(cval_type)));
+ alloc_align<cval_type>(align, cube_in_.size());
cube_in_.block().rebind(input_frame_buffer_[frame]);
cube_in_.block().admit(false);
@@ -262,8 +262,7 @@
cube_in_.block().release(true);
output_frame_buffer_[frame] =
- static_cast<cval_type*>(alloc_align(align,
- cube_out_.size() * sizeof(cval_type)));
+ alloc_align<cval_type>(align, cube_out_.size());
}
}
Index: src/vsip/impl/aligned_allocator.hpp
===================================================================
RCS file: /home/cvs/Repository/vpp/src/vsip/impl/aligned_allocator.hpp,v
retrieving revision 1.4
diff -u -r1.4 aligned_allocator.hpp
--- src/vsip/impl/aligned_allocator.hpp 16 Sep 2005 22:03:20 -0000 1.4
+++ src/vsip/impl/aligned_allocator.hpp 27 Feb 2006 15:05:24 -0000
@@ -89,7 +89,7 @@
pointer allocate(size_type num, const void* = 0)
{
// allocate aligned memory
- pointer p = static_cast<pointer>(alloc_align(align, num*sizeof(T)));
+ pointer p = alloc_align<value_type>(align, num);
if (p == 0)
{
printf("failed to allocate(%lu)\n", static_cast<unsigned long>(num));
Index: src/vsip/impl/allocation.hpp
===================================================================
RCS file: /home/cvs/Repository/vpp/src/vsip/impl/allocation.hpp,v
retrieving revision 1.5
diff -u -r1.5 allocation.hpp
--- src/vsip/impl/allocation.hpp 24 Jul 2005 04:58:29 -0000 1.5
+++ src/vsip/impl/allocation.hpp 27 Feb 2006 15:05:24 -0000
@@ -73,16 +73,19 @@
/// Allocate aligned memory.
-inline void*
+template <typename T>
+inline T*
alloc_align(size_t align, size_t size)
{
#if HAVE_POSIX_MEMALIGN && !VSIP_IMPL_AVOID_POSIX_MEMALIGN
void* ptr;
- return (posix_memalign(&ptr, align, size) == 0) ? ptr : 0;
+ return (posix_memalign(&ptr, align, size*sizeof(T)) == 0)
+ ? static_cast<T*>(ptr)
+ : 0;
#elif HAVE_MEMALIGN
- return memalign(align, size);
+ return static_cast<T*>(memalign(align, size*sizeof(T)));
#else
- return alloc::impl_alloc_align(align, size);
+ return static_cast<T*>(alloc::impl_alloc_align(align, size*sizeof(T)));
#endif
}
Index: src/vsip/impl/signal-fft.hpp
===================================================================
RCS file: /home/cvs/Repository/vpp/src/vsip/impl/signal-fft.hpp,v
retrieving revision 1.30
diff -u -r1.30 signal-fft.hpp
--- src/vsip/impl/signal-fft.hpp 17 Feb 2006 20:23:44 -0000 1.30
+++ src/vsip/impl/signal-fft.hpp 27 Feb 2006 15:05:25 -0000
@@ -92,6 +92,7 @@
int stride_; // 1 for sd_ == 0, length of row for sd_ == 1.
int dist_; // 1 for sd_ == 1, length of column for sd_ == 0.
+ void *buffer_;
// used only for Fftm
int sd_; // 0: compute FFTs of rows; 1: of columns
int runs_; // number of 1D FFTs to perform; varies by map
Index: src/vsip/impl/sal/fft.hpp
===================================================================
RCS file: /home/cvs/Repository/vpp/src/vsip/impl/sal/fft.hpp,v
retrieving revision 1.1
diff -u -r1.1 fft.hpp
--- src/vsip/impl/sal/fft.hpp 17 Feb 2006 20:23:44 -0000 1.1
+++ src/vsip/impl/sal/fft.hpp 27 Feb 2006 15:05:25 -0000
@@ -193,6 +193,7 @@
VSIP_THROW((std::bad_alloc))
{
self.is_forward_ = (expn == -1);
+ self.buffer_ = alloc_align<typename Complex_of<inT>::type>(32, dom.size());
unsigned long max = sal::log2n<D>::translate(dom, sd, self.size_);
sal::fft_planner<D, inT, outT>::create(self.plan_, max);
}
@@ -202,6 +203,7 @@
destroy(Fft_core<D, inT, outT, doFftm>& self) VSIP_THROW((std::bad_alloc))
{
sal::fft_planner<D, inT, outT>::destroy(self.plan_);
+ free_align(self.buffer_);
}
inline void
@@ -266,8 +268,9 @@
{
FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
float *out = reinterpret_cast<float*>(out_arg);
- fft_ropx(&setup, const_cast<float*>(in), 1, out, 1,
- self.size_[0], FFT_FORWARD, sal::ESAL);
+ fft_roptx(&setup, const_cast<float*>(in), 1, out, 1,
+ reinterpret_cast<float*>(self.buffer_),
+ self.size_[0], FFT_FORWARD, sal::ESAL);
// unpack the data (see SAL reference for details).
int const N = (1 << self.size_[0]) + 2;
out[N - 2] = out[1];
@@ -306,8 +309,9 @@
{
FFT_setupd setup = reinterpret_cast<FFT_setupd>(self.plan_);
double *out = reinterpret_cast<double*>(out_arg);
- fft_ropdx(&setup, const_cast<double*>(in), 1, out, 1,
- self.size_[0], FFT_FORWARD, sal::ESAL);
+ fft_ropdtx(&setup, const_cast<double*>(in), 1, out, 1,
+ reinterpret_cast<double*>(self.buffer_),
+ self.size_[0], FFT_FORWARD, sal::ESAL);
// unpack the data (see SAL reference for details).
int const N = (1 << self.size_[0]) + 2;
out[N - 2] = out[1];
@@ -346,8 +350,9 @@
COMPLEX *out = reinterpret_cast<COMPLEX *>(out_arg);
long stride = 2;
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fft_copx(&setup, in_, stride, out, stride, self.size_[0],
- direction, sal::ESAL);
+ fft_coptx(&setup, in_, stride, out, stride,
+ reinterpret_cast<COMPLEX*>(self.buffer_),
+ self.size_[0], direction, sal::ESAL);
}
inline void
@@ -360,8 +365,9 @@
DOUBLE_COMPLEX *out = reinterpret_cast<DOUBLE_COMPLEX *>(out_arg);
long stride = 2;
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fft_copdx(&setup, in_, stride, out, stride, self.size_[0],
- direction, sal::ESAL);
+ fft_copdtx(&setup, in_, stride, out, stride,
+ reinterpret_cast<DOUBLE_COMPLEX*>(self.buffer_),
+ self.size_[0], direction, sal::ESAL);
}
// 2D real -> complex forward fft
@@ -418,9 +424,10 @@
// The size of the output array is (N/2) x M (if measured in std::complex<float>)
unsigned long const N = (1 << self.size_[1]) + 2;
unsigned long const M = (1 << self.size_[0]);
- fft2d_ropx(&setup, const_cast<float*>(in), self.stride_, self.dist_,
- out, self.stride_, N,
- self.size_[1], self.size_[0], FFT_FORWARD, sal::ESAL);
+ fft2d_roptx(&setup, const_cast<float*>(in), self.stride_, self.dist_,
+ out, self.stride_, N,
+ reinterpret_cast<float*>(self.buffer_),
+ self.size_[1], self.size_[0], FFT_FORWARD, sal::ESAL);
// unpack the data (see SAL reference, figure 3.6, for details).
unpack(out, N, M, self.stride_);
@@ -440,9 +447,10 @@
// The size of the output array is (N/2) x M (if measured in std::complex<float>)
unsigned long const N = (1 << self.size_[1]) + 2;
unsigned long const M = (1 << self.size_[0]);
- fft2d_ropdx(&setup, const_cast<double*>(in), self.stride_, self.dist_,
- out, self.stride_, N,
- self.size_[1], self.size_[0], FFT_FORWARD, sal::ESAL);
+ fft2d_ropdtx(&setup, const_cast<double*>(in), self.stride_, self.dist_,
+ out, self.stride_, N,
+ reinterpret_cast<double*>(self.buffer_),
+ self.size_[1], self.size_[0], FFT_FORWARD, sal::ESAL);
// unpack the data (see SAL reference, figure 3.6, for details).
unpack(out, N, M, self.stride_);
@@ -527,10 +535,11 @@
COMPLEX *out = reinterpret_cast<COMPLEX *>(out_arg);
long stride = 2;
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fft2d_copx(&setup, in_, stride, 2 << self.size_[1],
- out, stride, 2 << self.size_[1],
- self.size_[1], self.size_[0],
- direction, sal::ESAL);
+ fft2d_coptx(&setup, in_, stride, 2 << self.size_[1],
+ out, stride, 2 << self.size_[1],
+ reinterpret_cast<COMPLEX*>(self.buffer_),
+ self.size_[1], self.size_[0],
+ direction, sal::ESAL);
}
inline void
@@ -545,6 +554,7 @@
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
fft2d_copdx(&setup, in_, stride, stride << self.size_[1],
out, stride, stride << self.size_[1],
+ reinterpret_cast<LONG_COMPLEX*>(self.buffer_),
self.size_[1], self.size_[0],
direction, sal::ESAL);
}
@@ -571,9 +581,11 @@
reinterpret_cast<COMPLEX *>(const_cast<std::complex<float>*>(in));
COMPLEX *out = reinterpret_cast<COMPLEX *>(out_arg);
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fftm_copx(&setup, in_, self.stride_, self.dist_,
- out, 2, 2 << self.size_[1], self.size_[1], self.runs_,
- direction, sal::ESAL);
+ fftm_coptx(&setup, in_, self.stride_, self.dist_,
+ out, 2, 2 << self.size_[1],
+ reinterpret_cast<COMPLEX*>(self.buffer_),
+ self.size_[1], self.runs_,
+ direction, sal::ESAL);
}
inline void
@@ -583,10 +595,11 @@
FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
float *out = reinterpret_cast<float*>(out_arg);
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fftm_ropx(&setup, const_cast<float *>(in), self.stride_, self.dist_,
- out, self.stride_, self.dist_ + 2,
- self.size_[1], self.runs_,
- direction, sal::ESAL);
+ fftm_roptx(&setup, const_cast<float *>(in), self.stride_, self.dist_,
+ out, self.stride_, self.dist_ + 2,
+ reinterpret_cast<float*>(self.buffer_),
+ self.size_[1], self.runs_,
+ direction, sal::ESAL);
// Unpack the data (see SAL reference for details), and scale back by 1/2.
int const N = (1 << self.size_[1]) + 2;
float scale = 0.5f;
@@ -609,10 +622,11 @@
DOUBLE_COMPLEX *out = reinterpret_cast<DOUBLE_COMPLEX *>(out_arg);
long stride = 2;
long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
- fftm_copdx(&setup, in_, stride, stride << self.size_[1],
- out, stride, stride << self.size_[1],
- self.size_[1], 1 << self.size_[0],
- direction, sal::ESAL);
+ fftm_copdtx(&setup, in_, stride, stride << self.size_[1],
+ out, stride, stride << self.size_[1],
+ reinterpret_cast<DOUBLE_COMPLEX*>(self.buffer_),
+ self.size_[1], 1 << self.size_[0],
+ direction, sal::ESAL);
}
} // namespace vsip::impl
Index: tests/fft.cpp
===================================================================
RCS file: /home/cvs/Repository/vpp/tests/fft.cpp,v
retrieving revision 1.11
diff -u -r1.11 fft.cpp
--- tests/fft.cpp 17 Feb 2006 20:23:44 -0000 1.11
+++ tests/fft.cpp 27 Feb 2006 15:05:25 -0000
@@ -219,15 +219,16 @@
-/// Test by-reference Fft (out-of-place and in-place).
+/// Test complex by-reference Fft (out-of-place and in-place).
-template <typename T>
+template <typename T, typename Complex_format>
void
-test_by_ref(int set, length_type N)
+test_complex_by_ref(int set, length_type N)
{
- typedef Fft<const_Vector, T, T, fft_fwd, by_reference, 1, alg_space>
+ typedef std::complex<T> CT;
+ typedef Fft<const_Vector, CT, CT, fft_fwd, by_reference, 1, alg_space>
f_fft_type;
- typedef Fft<const_Vector, T, T, fft_inv, by_reference, 1, alg_space>
+ typedef Fft<const_Vector, CT, CT, fft_inv, by_reference, 1, alg_space>
i_fft_type;
f_fft_type f_fft(Domain<1>(N), 1.0);
@@ -239,10 +240,14 @@
test_assert(i_fft.input_size().size() == N);
test_assert(i_fft.output_size().size() == N);
- Vector<T> in(N, T());
- Vector<T> out(N);
- Vector<T> ref(N);
- Vector<T> inv(N);
+ typedef impl::Fast_block<1, CT,
+ impl::Layout<1, row1_type, impl::Stride_unit_dense, Complex_format> >
+ block_type;
+
+ Vector<CT, block_type> in(N, CT());
+ Vector<CT, block_type> out(N);
+ Vector<CT, block_type> ref(N);
+ Vector<CT, block_type> inv(N);
setup_data(set, in);
@@ -262,15 +267,16 @@
-/// Test by-value Fft.
+/// Test complex by-value Fft.
-template <typename T>
+template <typename T, typename Complex_format>
void
-test_by_val(int set, length_type N)
+test_complex_by_val(int set, length_type N)
{
- typedef Fft<const_Vector, T, T, fft_fwd, by_value, 1, alg_space>
+ typedef std::complex<T> CT;
+ typedef Fft<const_Vector, CT, CT, fft_fwd, by_value, 1, alg_space>
f_fft_type;
- typedef Fft<const_Vector, T, T, fft_inv, by_value, 1, alg_space>
+ typedef Fft<const_Vector, CT, CT, fft_inv, by_value, 1, alg_space>
i_fft_type;
f_fft_type f_fft(Domain<1>(N), 1.0);
@@ -282,10 +288,14 @@
test_assert(i_fft.input_size().size() == N);
test_assert(i_fft.output_size().size() == N);
- Vector<T> in(N, T());
- Vector<T> out(N);
- Vector<T> ref(N);
- Vector<T> inv(N);
+ typedef impl::Fast_block<1, CT,
+ impl::Layout<1, row1_type, impl::Stride_unit_dense, Complex_format> >
+ block_type;
+
+ Vector<CT, block_type> in(N, CT());
+ Vector<CT, block_type> out(N);
+ Vector<CT, block_type> ref(N);
+ Vector<CT, block_type> inv(N);
setup_data(set, in);
@@ -347,7 +357,6 @@
ref = out;
inv = i_fft(out);
-
test_assert(error_db(inv, in) < -100);
// make sure out has not been scribbled in during the conversion.
@@ -929,19 +938,27 @@
//
#if defined(VSIP_IMPL_FFT_USE_FLOAT)
- test_by_ref<complex<float> >(2, 64);
+ test_complex_by_ref<float, impl::Cmplx_inter_fmt>(2, 64);
+ test_complex_by_ref<float, impl::Cmplx_split_fmt>(2, 64);
#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<float> >(1, 68);
+ test_complex_by_ref<float, impl::Cmplx_inter_fmt>(1, 68);
+ test_complex_by_ref<float, impl::Cmplx_split_fmt>(1, 68);
#endif
- test_by_ref<complex<float> >(2, 256);
+ test_complex_by_ref<float, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_ref<float, impl::Cmplx_split_fmt>(2, 256);
#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<float> >(2, 252);
- test_by_ref<complex<float> >(3, 17);
-#endif
-
- test_by_val<complex<float> >(1, 128);
- test_by_val<complex<float> >(2, 256);
- test_by_val<complex<float> >(3, 512);
+ test_complex_by_ref<float, impl::Cmplx_inter_fmt>(2, 252);
+ test_complex_by_ref<float, impl::Cmplx_split_fmt>(2, 252);
+ test_complex_by_ref<float, impl::Cmplx_inter_fmt>(3, 17);
+ test_complex_by_ref<float, impl::Cmplx_split_fmt>(3, 17);
+#endif
+
+ test_complex_by_val<float, impl::Cmplx_inter_fmt>(1, 128);
+ test_complex_by_val<float, impl::Cmplx_split_fmt>(1, 128);
+ test_complex_by_val<float, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_val<float, impl::Cmplx_split_fmt>(2, 256);
+ test_complex_by_val<float, impl::Cmplx_inter_fmt>(3, 512);
+ test_complex_by_val<float, impl::Cmplx_split_fmt>(3, 512);
test_real<float>(1, 128);
#if !defined(VSIP_IMPL_SAL_FFT)
@@ -953,19 +970,27 @@
#if defined(VSIP_IMPL_FFT_USE_DOUBLE)
- test_by_ref<complex<double> >(2, 64);
+ test_complex_by_ref<double, impl::Cmplx_inter_fmt>(2, 64);
+ test_complex_by_ref<double, impl::Cmplx_split_fmt>(2, 64);
#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<double> >(1, 68);
+ test_complex_by_ref<double, impl::Cmplx_inter_fmt>(1, 68);
+ test_complex_by_ref<double, impl::Cmplx_split_fmt>(1, 68);
#endif
- test_by_ref<complex<double> >(2, 256);
+ test_complex_by_ref<double, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_ref<double, impl::Cmplx_split_fmt>(2, 256);
#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<double> >(2, 252);
- test_by_ref<complex<double> >(3, 17);
-#endif
-
- test_by_val<complex<double> >(1, 128);
- test_by_val<complex<double> >(2, 256);
- test_by_val<complex<double> >(3, 512);
+ test_complex_by_ref<double, impl::Cmplx_inter_fmt>(2, 252);
+ test_complex_by_ref<double, impl::Cmplx_split_fmt>(2, 252);
+ test_complex_by_ref<double, impl::Cmplx_inter_fmt>(3, 17);
+ test_complex_by_ref<double, impl::Cmplx_split_fmt>(3, 17);
+#endif
+
+ test_complex_by_val<double, impl::Cmplx_inter_fmt>(1, 128);
+ test_complex_by_val<double, impl::Cmplx_split_fmt>(1, 128);
+ test_complex_by_val<double, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_val<double, impl::Cmplx_split_fmt>(2, 256);
+ test_complex_by_val<double, impl::Cmplx_inter_fmt>(3, 512);
+ test_complex_by_val<double, impl::Cmplx_split_fmt>(3, 512);
test_real<double>(1, 128);
#if !defined(VSIP_IMPL_SAL_FFT)
@@ -979,18 +1004,26 @@
#if ! defined(VSIP_IMPL_IPP_FFT)
#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<long double> >(2, 64);
+ test_complex_by_ref<long double, impl::Cmplx_inter_fmt>(2, 64);
+ test_complex_by_ref<long double, impl::Cmplx_split_fmt>(2, 64);
#endif
- test_by_ref<complex<long double> >(1, 68);
- test_by_ref<complex<long double> >(2, 256);
-#if !defined(VSIP_IMPL_SAL_FFT)
- test_by_ref<complex<long double> >(2, 252);
- test_by_ref<complex<long double> >(3, 17);
+ test_complex_by_ref<long double, impl::Cmplx_inter_fmt>(1, 68);
+ test_complex_by_ref<long double, impl::Cmplx_split_fmt>(1, 68);
+ test_complex_by_ref<long double, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_ref<long double, impl::Cmplx_split_fmt>(2, 256);
+#if !defined(VSIP_IMPL_SAL_FFT)
+ test_complex_by_ref<long double, impl::Cmplx_inter_fmt>(2, 252);
+ test_complex_by_ref<long double, impl::Cmplx_split_fmt>(2, 252);
+ test_complex_by_ref<long double, impl::Cmplx_inter_fmt>(3, 17);
+ test_complex_by_ref<long double, impl::Cmplx_split_fmt>(3, 17);
#endif
- test_by_val<complex<long double> >(1, 128);
- test_by_val<complex<long double> >(2, 256);
- test_by_val<complex<long double> >(3, 512);
+ test_complex_by_val<long double, impl::Cmplx_inter_fmt>(1, 128);
+ test_complex_by_val<long double, impl::Cmplx_split_fmt>(1, 128);
+ test_complex_by_val<long double, impl::Cmplx_inter_fmt>(2, 256);
+ test_complex_by_val<long double, impl::Cmplx_split_fmt>(2, 256);
+ test_complex_by_val<long double, impl::Cmplx_inter_fmt>(3, 512);
+ test_complex_by_val<long double, impl::Cmplx_split_fmt>(3, 512);
test_real<long double>(1, 128);
#if !defined(VSIP_IMPL_SAL_FFT)