Re: [vsipl++] [patch] BLAS dispatch
- Subject: Re: [vsipl++] [patch] BLAS dispatch
- From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
- Date: Mon, 07 Nov 2005 22:19:31 -0500
Don McCoy wrote:
> The attached patch adds dispatch support for certain BLAS functions.
> Two things that are worth drawing attention to are: 1) The row-major
> cases for outer() with complex values and 2) The various run-time and
> compile-time checks used in the blas evaluator functions.
This patch looks good. I have two small comments below; please check it
in once they're addressed.
> For 1), my concern is that the BLAS 'ger' variant used can only
> conjugate the second vector argument. I'm using the non-conj version
> and performing the conjugation on the first vector argument manually.
> It involves memory allocation and an extra loop through one of the
> vectors. I'd like to know if there is a more efficient way to do this.
The choices here seem to be:
1) use generic implementation
2) compute result in wrong dimension order, then transpose in place.
3) conjugate in-place, compute outer, reverse conjugate
4) allocate temporary storage to store conjugate (as you currently do)
Let's go with either (1) or (3). (2) would be good, but although we
have a stub in Ext_data to reorganize data, it's not implemented yet. We
can't do (4), the current approach, because of the memory allocation.
We need to avoid memory allocation that would occur during the inner
loop of an application.
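
Option (3) might look roughly like the sketch below. This is only an illustration, not the patch's code: `geru_colmajor` is a hypothetical stand-in for the non-conjugating BLAS 'ger' routine, the vectors are plain `std::vector`s rather than VSIPL++ views, and it assumes the complex outer product is defined as r(i,j) = alpha * a(i) * conj(b(j)). The result buffer must be zeroed by the caller because 'ger' accumulates.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Hypothetical stand-in for a non-conjugating BLAS 'ger':
// A += alpha * x * y^T, with A stored column-major as m x n.
void geru_colmajor(std::size_t m, std::size_t n, cplx alpha,
                   cplx const* x, cplx const* y, cplx* A)
{
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < m; ++i)
      A[i + j * m] += alpha * x[i] * y[j];
}

void conj_in_place(std::vector<cplx>& v)
{
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = std::conj(v[i]);
}

// Row-major r(i,j) += alpha * a(i) * conj(b(j)), with no temporary storage.
// b is modified during the call and restored before returning.
void outer_rowmajor(cplx alpha, std::vector<cplx> const& a,
                    std::vector<cplx>& b, std::vector<cplx>& r)
{
  std::size_t const rows = a.size();
  std::size_t const cols = b.size();
  assert(r.size() == rows * cols);

  // A row-major rows x cols buffer is the same memory as a column-major
  // cols x rows buffer holding the transpose, so compute
  // R^T = alpha * conj(b) * a^T. The conjugate lands on the first
  // argument, which is why it is applied by hand.
  conj_in_place(b);                                   // conjugate in place
  geru_colmajor(cols, rows, alpha, &b[0], &a[0], &r[0]);
  conj_in_place(b);                                   // reverse the conjugate
}
```

The cost is two extra passes over b instead of an allocation. In a real evaluator the input is logically const, so this approach only works if temporarily modifying the operand is acceptable.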
> For 2), just want to make sure I didn't omit any checks that would
> dispatch a call to BLAS that it cannot handle. I was careful to verify
> that BLAS was only called when it should be, but it would be easy to
> overlook something if there is not a corresponding test case for it. In
> cases like outer, it is not tested with a column-major result matrix if
> only the vsip::outer() is called (because it allocates the matrix with
> the default block). So, for the test, I added the col-major cases
> explicitly by calling vsip::impl::outer() directly. There may be other
> cases where we should add specific tests for col-major layouts.
This sounds good.
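
For reference, the split between compile-time and run-time checks can be pictured with the simplified sketch below. The names (`Matrix_ref`, `Is_blas_type`, `Blas_outer_evaluator`, `can_use_blas`) are hypothetical and not taken from the patch; the sketch assumes a BLAS backend that only accepts float, double and their complex counterparts, and only dense row-major or column-major results.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical, simplified description of a result matrix's layout.
template <typename T>
struct Matrix_ref
{
  T*             data;
  std::size_t    rows;
  std::size_t    cols;
  std::ptrdiff_t row_stride;  // elements between consecutive rows
  std::ptrdiff_t col_stride;  // elements between consecutive columns
};

// Compile-time check: is T a value type BLAS provides routines for?
template <typename T> struct Is_blas_type                        { static bool const value = false; };
template <> struct Is_blas_type<float>                           { static bool const value = true; };
template <> struct Is_blas_type<double>                          { static bool const value = true; };
template <> struct Is_blas_type<std::complex<float> >            { static bool const value = true; };
template <> struct Is_blas_type<std::complex<double> >           { static bool const value = true; };

template <typename T>
struct Blas_outer_evaluator
{
  // Known at compile time: depends only on the value type.
  static bool const ct_valid = Is_blas_type<T>::value;

  // Known only at run time: one dimension must be unit-stride and the
  // other must be at least as long as a full row (or column).
  static bool rt_valid(Matrix_ref<T> const& r)
  {
    bool const row_major =
      r.col_stride == 1 && r.row_stride >= static_cast<std::ptrdiff_t>(r.cols);
    bool const col_major =
      r.row_stride == 1 && r.col_stride >= static_cast<std::ptrdiff_t>(r.rows);
    return row_major || col_major;
  }
};

// BLAS is used only when both checks pass; otherwise fall back to generic code.
template <typename T>
bool can_use_blas(Matrix_ref<T> const& r)
{
  return Blas_outer_evaluator<T>::ct_valid &&
         Blas_outer_evaluator<T>::rt_valid(r);
}
```

A test that builds the result with each layout explicitly, as described above for the col-major case, exercises both branches of `rt_valid`.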
-- Jules
===================================================================
RCS file: /home/cvs/Repository/vpp/src/vsip/impl/general_dispatch.hpp,v
retrieving revision 1.1
diff -c -p -r1.1 general_dispatch.hpp
*** src/vsip/impl/general_dispatch.hpp 12 Oct 2005 12:45:05 -0000 1.1
--- src/vsip/impl/general_dispatch.hpp 4 Nov 2005 19:32:27 -0000
*************** namespace impl
*** 35,40 ****
--- 35,42 ----
struct Op_prod_vv; // vector-vector dot-product
Let's create separate tags for dot-product and outer-product to avoid
confusion. Perhaps Op_prod_vv_dot and Op_prod_vv_outer?
struct Op_prod_mm; // matrix-matrix product
+ struct Op_prod_mv; // matrix-vector product
+ struct Op_prod_vm; // vector-matrix product
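
To illustrate why separate tags help, here is a minimal sketch of tag-based selection using the two proposed names. Only the tag names come from the suggestion above; `Vv_product` and its `exec` functions are hypothetical and much simpler than the real dispatch machinery (real-valued `std::vector`s, no backend selection).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Op_prod_vv_dot;    // vector-vector dot-product
struct Op_prod_vv_outer;  // vector-vector outer-product

// Hypothetical operation template: the tag alone selects the
// specialization, so the two vector-vector products cannot be confused.
template <typename OpTag> struct Vv_product;

template <>
struct Vv_product<Op_prod_vv_dot>
{
  static double exec(std::vector<double> const& a,
                     std::vector<double> const& b)
  {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += a[i] * b[i];
    return sum;
  }
};

template <>
struct Vv_product<Op_prod_vv_outer>
{
  // Returns a row-major a.size() x b.size() matrix.
  static std::vector<double> exec(std::vector<double> const& a,
                                  std::vector<double> const& b)
  {
    std::vector<double> r(a.size() * b.size());
    for (std::size_t i = 0; i < a.size(); ++i)
      for (std::size_t j = 0; j < b.size(); ++j)
        r[i * b.size() + j] = a[i] * b[j];
    return r;
  }
};
```

With a single Op_prod_vv tag, both operations would have to share one specialization even though their result types differ (scalar versus matrix).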