Actions

icon Post
text/html Subscribe
text/html Unsubscribe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vsipl++] patch: support sal-fft


  • Subject: Re: [vsipl++] patch: support sal-fft
  • From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
  • Date: Fri, 17 Feb 2006 12:26:36 -0500

Stefan Seefeld wrote:
The attached patch adds support for SAL as a new backend for our FFT engine. This new backend does not support 3D ffts, and block sizes that are not powers of two. I therefor (conditionally) masked those tests in tests/fft.cpp that SAL does not support. This can be reverted as soon as the fft_dispatch framework is in place,
i.e. multiple fft backends are supported in parallel.

That sounds fine.


I believe the code in src/vsip/impl/sal/fft.h can be tidied up quite a bit,
though I'd prefer to do that only once the code is refactored, i.e. for now
focus on feature completeness.

Stefan,

This looks good.

We need to avoid copying data, unless it is necessary to pack/unpack it, align it, or transpose it. In particular,for complex to complex FFTs we should use fft_coptx() instead of fft_copx().

				-- Jules


Regards,
        Stefan


------------------------------------------------------------------------

+#elif defined(VSIP_IMPL_SAL_FFT)
+    // In some contexts, SAL destroys the input data itself, and sometimes
+    // we have to modify it to 'pack' data into the format SAL expects
+    // (see SAL Tutorial for details).
+    // Therefor, we always copy the input.
+    static const bool  must_copy = true;

For complex-to-complex FFTs, there are non-clobbering variants of some of the functions (fft_coptx instead of fft_copx) that we should use instead of copying the data. For real-to-complex and complex-to-real it still might be necessary to copy.

+    // SAL cannot handle non-unit strides properly as 'complex' isn't
+    // a real (packed) datatype, so the stride would be applied to the real/imag
+    // offset, too.
+    static const vsip::dimension_type  transpose_target = 1;
 #else
       static const bool  must_copy = false;
       static const vsip::dimension_type  transpose_target = axis;
 #endif
-
       typename impl::Maybe_force_copy<
 	  must_copy,typename local_type::block_type,
           typename impl::Maybe_transpose<2,axis,transpose_target>::type>::type
@@ -833,7 +890,7 @@
 	  impl::Ext_data<in_block_type>(this->in_temp_).data());
const bool native_order = (axis == transpose_target); - +
       this->core_->scale_ = this->scale_;
       this->core_->runs_ = local_inout.size(1-axis);
       this->core_->stride_ = 1;
Index: src/vsip/impl/sal/fft.hpp

+
+template <vsip::dimension_type D, typename inT, typename outT, bool doFftm>
+inline void
+from_to(Fft_core<D, inT, outT, doFftm>& self, inT const* in, outT* out)
+  VSIP_NOTHROW
+{
+  assert(0 && "Sorry, operation not yet supported for this type !");
+  // TBD

This shouldn't make it out into a release, but to be on the safe size, you should really say:

	VSIP_IMPL_THROW(vsip::impl::unimplemented("..."));

+}
+
+// 1D real -> complex forward fft
+
+inline void
+from_to(Fft_core<1, float, std::complex<float>, false>& self,
+	float const* in, std::complex<float>* out)
+  VSIP_NOTHROW
+{
+  FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
+  float *out_ = reinterpret_cast<float*>(out);

We reserve the "_" suffix for member variables. Perhaps you could call the parameter "arg_out" and then use "out" for the cast?

+  fft_ropx(&setup, const_cast<float*>(in), 1, out_, 1,
+	   self.size_[0], FFT_FORWARD, sal::ESAL);
+  // unpack the data (see SAL reference for details).
+  int const N = (1 << self.size_[0]) + 2;
+  out_[N - 2] = out_[1];
+  out_[1] = 0.f;
+  out_[N - 1] = 0.f;
+  // forward fft_ropx is scaled up by 2.
+  float scale = 0.5f;
+  vsmulx(out_, 1, &scale, out_, 1, N, sal::ESAL);
+}

+inline void
+from_to(Fft_core<1, std::complex<float>, std::complex<float>, false>& self,
+	std::complex<float> const *in, std::complex<float> *out) VSIP_NOTHROW
+{
+  FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
+  COMPLEX *in_ = reinterpret_cast<COMPLEX *>(const_cast<std::complex<float>*>(in));
+  COMPLEX *out_ = reinterpret_cast<COMPLEX *>(out);
+  long stride = 2;
+  long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
+  fft_copx(&setup, in_, stride, out_, stride, self.size_[0],
+	   direction, sal::ESAL);

We should use fft_coptx() instead to avoid clobbering the input. Likewise for fft_copdtx.