[vsipl++] [patch] Fix SSE2 mag()
- To: VSIPL++ Developers List <vsipl++@xxxxxxxxxxxxxxxx>
- Subject: [vsipl++] [patch] Fix SSE2 mag()
- From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
- Date: Fri, 01 Feb 2008 15:32:21 -0500
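The mag mask had the wrong width for each element: each 32-bit word was
written with seven hex digits (28 bits) instead of eight (32 bits), so
the mask also cleared the top bits of every word. This was causing
coverage_unary to fail.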
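To illustrate the effect on a single double lane, here is a minimal scalar sketch. It is not code from the patch: the names and_mask, mag_old, and mag_new are hypothetical, a plain 64-bit AND stands in for _mm_and_pd, and it assumes a 64-bit IEEE 754 double. The two masks are assembled from the same 32-bit words that appear in the old and new _mm_set_epi32 calls.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// AND the bit pattern of a double with a 64-bit mask
// (scalar stand-in for one lane of _mm_and_pd; hypothetical helper).
inline double and_mask(double x, std::uint64_t mask)
{
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits &= mask;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Per-lane mask from the old code: each word is one hex digit short,
// so the upper exponent bits are cleared along with the sign bit.
std::uint64_t const old_mask = (std::uint64_t(0x7ffffff) << 32) | 0xfffffff;

// Per-lane mask from the fixed code: clears only the sign bit.
std::uint64_t const new_mask = (std::uint64_t(0x7fffffff) << 32) | 0xffffffff;

inline double mag_old(double x) { return and_mask(x, old_mask); }
inline double mag_new(double x) { return and_mask(x, new_mask); }
```

With these definitions, mag_new(-2.5) yields 2.5, while mag_old(-2.5) does not, because the short mask also strips exponent bits from the value.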
Not sure why this wasn't showing up with buildbot. Perhaps the buildbot
configs are not using --enable-simd-loop-fusion? Also, this wouldn't
have shown up when IPP was enabled because IPP mag takes precedence over
SIMD loop fusion.
Patch applied to trunk and branches/1.4.
-- Jules
--
Jules Bergmann
CodeSourcery
jules@xxxxxxxxxxxxxxxx
(650) 331-3385 x705
Index: ChangeLog
===================================================================
--- ChangeLog (revision 192344)
+++ ChangeLog (working copy)
@@ -1,5 +1,9 @@
2008-01-31 Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
+ * src/vsip/opt/simd/simd.hpp (SSE2 mag): Fix bug in mask width.
+
+2008-01-31 Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
+
* scripts/config: Add missing SIMD configure flags in Mondo package.
2008-01-30 Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
Index: src/vsip/opt/simd/simd.hpp
===================================================================
--- src/vsip/opt/simd/simd.hpp (revision 191870)
+++ src/vsip/opt/simd/simd.hpp (working copy)
@@ -1409,6 +1409,17 @@
#endif
}
+ static value_type extract(simd_type const& v, int pos)
+ {
+ union
+ {
+ simd_type vec;
+ value_type val[vec_size];
+ } u;
+ u.vec = v;
+ return u.val[pos];
+ }
+
static simd_type add(simd_type const& v1, simd_type const& v2)
{ return _mm_add_pd(v1, v2); }
@@ -1427,9 +1438,9 @@
static simd_type mag(simd_type const& v1)
{
- simd_type mask = (simd_type)_mm_set_epi32(0x7ffffff, 0xfffffff,
- 0x7ffffff, 0xfffffff);
- return _mm_and_pd((simd_type)mask, v1);
+ simd_type mask = (simd_type)_mm_set_epi32(0x7fffffff, 0xffffffff,
+ 0x7fffffff, 0xffffffff);
+ return _mm_and_pd(mask, v1);
}
static simd_type min(simd_type const& v1, simd_type const& v2)