Actions

icon Post
text/html Subscribe
text/html Unsubscribe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vsipl++] [patch] matvec: outer, gem, cumsum


  • Subject: Re: [vsipl++] [patch] matvec: outer, gem, cumsum
  • From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
  • Date: Tue, 27 Sep 2005 18:56:48 -0400



Don McCoy wrote:
The attached patch rounds out the functionality of [math.matvec] with the exception of a few of the matrix-vector product functions. Since those are implemented in a separate file, this patch stands by itself pretty well.

Don,

gemp and gems need to support the mat_conj and mat_herm mat_op_types as well. (The spec is a bit confusing. [math.matvec.gem]/3 defines the 4 mat_op_types: mat_ntrans, mat_trans, mat_herm, and mat_conj. gemp's requirements than say that OpA and OpB must be mat_ntrans or mat_trans unless T is complex. The implication is that if T is complex, OpA and OpB can be mat_herm and mat_conj as well).

The approach you've taken for gemp is fine, it is definitely possible to plug those additional cases in. However, since the number of cases is multiplicative (size(OpA) x size(OpB)), you might want to separate the handling of OpA and OpB to simplify things.

One way to do this is to define a class that applies a mat_op to a single matrix:

template <mat_op_type OpT,
          typename    T,
          typename    Block>
struct Apply_mat_op;

template <typename T,
          typename Block>
struct Apply_mat_op<mat_ntrans, T, Block>
{
  typedef typename const_Matrix<T, Block> result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m;
  }
};

template <typename T,
          typename Block>
struct Apply_mat_op<mat_trans, T, Block>
{
  typedef typename const_Matrix<T, Block>::transpose_type result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m.transpose();
  }
};

template <typename T,
          typename Block>
struct Apply_mat_op<mat_herm, complex<T>, Block>
// this definition only makes mat_herm only valid for complex<T>
{
...
};


You could optionaly provide a convenience function to use Apply_mat_op:

template <mat_op_type OpT,
          typename    T,
          typename    Block>
typename Apply_mat_op<OpT, T, Block>::result_type
apply_mat_op(...)
{
  return Apply_mat_op<OpT, T, Block>::exec(m);
}

Now, you could implement the top-level gemp as:

void
gemp(
  T0 alpha,
  const_Matrix<T1, Block1> A,
  const_Matrix<T2, Block2> B,
  T3 beta,
  Matrix<T4, Block4> C)
     VSIP_NOTHROW
{
  // equivalent to C = alpha * OpA(A) * OpB(B) + beta * C
  impl::gemp(alpha, apply_mat_op<OpA>(A), apply_mat_op<OpB>(B),
              beta, C);
}





------------------------------------------------------------------------

+ + + template <dimension_type d,
+           typename T0,
+           typename T1,
+           typename Block0,
+           typename Block1>
+ void
+ cumsum(
+   const_Vector<T0, Block0> v,
+ Vector<T1, Block1> w) + VSIP_NOTHROW
+ {
+ // Effects: w has values equaling the cumulative sum of values in v. + // + // If View is Vector, d is ignored and, for + // 0 <= i < v.size(), + // w.get(i) equals the sum over 0 <= j <= i of v.get(j)
+   assert( v.size() == w.size() );
+ + for ( index_type i = 0; i < v.size(); ++i )
+   {
+     T1 sum = T0();
+     for ( index_type j = 0; j <= i; ++j )
+       sum += v.get(j);
+     w.put(i, sum);
+   }

You could avoid recomputing the sum each time by keeping a running total:

	T1 sum = T0();
	for (index_type i=0; ...)
	{
	  sum += v.get(i);
	  w.put(i, sum);
	}

You should be able to something similar for matrix cumsum.