Actions

icon Post
text/html Subscribe
text/html Unsubscribe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vsipl++] SIMD all unaligned dispatch


  • To: Assem Salama <assem@xxxxxxxxxxxxxxxx>
  • Subject: Re: [vsipl++] SIMD all unaligned dispatch
  • From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
  • Date: Tue, 19 Jun 2007 08:34:52 -0400

Assem Salama wrote:
> Everyone,
>  This patch includes some missing pieces not included in previous patch.
> This should make a fresh checkout compile ok :) I apologize for last
> patch's incompleteness.

Assem,

What is the reason for extending the length of type_list?  Is that
needed for this patch?

Rather than add a new evaluator ("all unaligned"), I would like to
have a single evaluator handle the cases where views have the same
alignment (whether it is 0 or N).  The only difference between the two
is cleanup code before SIMD processing.  Can you make that change and
repost a patch?

Also, did you have a chance to benchmark the iterator change (#4)
below?

				-- Jules


>
> Thanks,
> Assem
>
>
> ------------------------------------------------------------------------
>

> Index: src/vsip/opt/simd/expr_evaluator.hpp
> ===================================================================

> +  static bool rt_valid(LB& lhs, RB const& rhs)
> +  {
> +    Ext_data<LB, layout_type> dda(lhs, SYNC_OUT);
> +    int lhs_a = simd::Proxy_factory<LB,       true>::alignment(lhs);

[1] Instead of calling Proxy_factory::alignment (which internally
creates another Ext_data object -- which is both extra overhead and
potentially undefined), use Simd_traits::alignment_of directly.

> +    return (dda.stride(0) == 1 &&
> +            simd::Proxy_factory<RB, true>::rt_valid(rhs, lhs_a));
> +
> +
> +  }
> +

> +    // First, deal with unaligned pointers
> + typename Ext_data<LB, layout_type>::raw_ptr_type raw_ptr = dda.data(); > + while(simd::Simd_traits<typename LB::value_type>::alignment_of(raw_ptr) &&
> +          n > 0)
> +    {
> +      lhs.put(size-n, rhs.get(size-n));
> +      n--;
> +      raw_ptr++;
> +    }

[2] What updates the pointers held by lp and rp?  They are still
unaligned, right?

Ah, I see.  You've changed Proxy::Proxy to force alignment below.


> Index: src/vsip/opt/simd/eval_generic.hpp
> ===================================================================
> --- src/vsip/opt/simd/eval_generic.hpp	(revision 174261)
> +++ src/vsip/opt/simd/eval_generic.hpp	(working copy)
> @@ -664,6 +664,8 @@
>
>    static bool rt_valid(DstBlock& dst, SrcBlock const& src)
>    {
> +    typedef simd::Simd_traits<typename SrcBlock::value_type> simd;
> +
>      // check if all data is unit stride
>      Ext_data<DstBlock, dst_lp>     ext_dst(dst,              SYNC_OUT);
>      Ext_data<Block1,   a_lp>       ext_a(src.first().left(), SYNC_IN);
> @@ -672,7 +674,11 @@
>             ext_a.stride(0) == 1 &&
>  	   ext_b.stride(0) == 1 &&
>  	   // make sure (A op B, A, k)
> -	   (&(src.first().left()) == &(src.second())));
> +	   (&(src.first().left()) == &(src.second())) &&
> +	   // make sure everyting is aligned!
> +	   !simd::alignment_of(ext_dst.data()) &&
> +	   !simd::alignment_of(ext_a.data()) &&
> +	   !simd::alignment_of(ext_b.data()));

[3] Doesn't threshold handle initial unaligned values?  If so, it is
sufficient to check that dst, a, and b all have the same alignment.



>    static void exec(DstBlock& dst, SrcBlock const& src)
> Index: src/vsip/opt/simd/expr_iterator.hpp
> ===================================================================
> --- src/vsip/opt/simd/expr_iterator.hpp	(revision 174261)
> +++ src/vsip/opt/simd/expr_iterator.hpp	(working copy)
> @@ -268,13 +268,14 @@
>    simd_type load() const
>    { return simd::perm(x0_, x1_, sh_); }
>
> -  void increment(length_type n = 1)
> +  //void increment(length_type n = 1)
> +  void increment()
>    {
> -    ptr_unaligned_ += n * Simd_traits<value_type>::vec_size;
> -    ptr_aligned_   += n;
> +    ptr_unaligned_ += Simd_traits<value_type>::vec_size;
> +    ptr_aligned_++;
>
>      // update x0
> -    x0_ = (n == 1)? x1_:simd::load((value_type*)ptr_aligned_);
> +    x0_ = x1_;

[4] Did you ever benchmark the difference between these two?

>
> -  Proxy(value_type *ptr) : ptr_(ptr) {}
> +  Proxy(value_type *ptr) : ptr_(ptr)
> +  {
> +    // Force alignment of pointer.
> +    intptr_t int_ptr = (intptr_t)ptr_;
> +    int_ptr &= ~(Simd_traits<value_type>::alignment-1);
> +    ptr_ = (value_type*) int_ptr;
> +  }
> +

[5] For LValue_access_traits, this ignores the IsAligned template
parameter.  since we appear to only handle the case where the LHS
is aligned, we should specialize this for IsAligned = true.




--
Jules Bergmann
CodeSourcery
jules@xxxxxxxxxxxxxxxx
(650) 331-3385 x705