[vsipl++] [patch] Fix SSE2 mag()


  • To: VSIPL++ Developers List <vsipl++@xxxxxxxxxxxxxxxx>
  • Subject: [vsipl++] [patch] Fix SSE2 mag()
  • From: Jules Bergmann <jules@xxxxxxxxxxxxxxxx>
  • Date: Fri, 01 Feb 2008 15:32:21 -0500

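The mag mask had the wrong width for each element: every 32-bit word of the mask was written with 7 hex digits (28 bits) instead of 8 (32 bits), so the AND cleared the top exponent and mantissa bits along with the sign bit. This was causing coverage_unary to fail.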
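To make the effect concrete, here is a minimal scalar sketch of what one double lane of the mask does. It is not code from simd.hpp: the names and_bits, mag_broken and mag_fixed are hypothetical, and the 64-bit constants assume the usual _mm_set_epi32 argument order (most significant word first), so each lane's mask is the 0x7ff... word on top of the 0xfff... word.

```cpp
#include <cstdint>
#include <cstring>

// Scalar stand-in for one lane of _mm_and_pd: AND the bit pattern of a
// double with a 64-bit mask. (Hypothetical helper, for illustration only.)
inline double and_bits(double x, std::uint64_t mask)
{
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);   // reinterpret without aliasing issues
  bits &= mask;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Per-lane mask before the patch: 7 hex digits per 32-bit word, so the top
// four bits of each word are zero.
std::uint64_t const broken_mask = 0x07ffffff0fffffffULL;

// Per-lane mask after the patch: only the sign bit (bit 63) is cleared.
std::uint64_t const fixed_mask  = 0x7fffffffffffffffULL;

inline double mag_broken(double x) { return and_bits(x, broken_mask); }
inline double mag_fixed(double x)  { return and_bits(x, fixed_mask); }
```

With the fixed mask, mag_fixed(-1.0) gives back 1.0. With the old mask, the top exponent bits are cleared as well, so mag_broken(-1.0) comes out as a tiny positive value rather than 1.0.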

Not sure why this wasn't showing up with buildbot. Perhaps the buildbot configs are not using --enable-simd-loop-fusion? Also, this wouldn't have shown up when IPP was enabled, because the IPP mag takes precedence over SIMD loop fusion.

Patch applied to trunk and branches/1.4.

			-- Jules

--
Jules Bergmann
CodeSourcery
jules@xxxxxxxxxxxxxxxx
(650) 331-3385 x705
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 192344)
+++ ChangeLog	(working copy)
@@ -1,5 +1,9 @@
 2008-01-31  Jules Bergmann  <jules@xxxxxxxxxxxxxxxx>
 
+	* src/vsip/opt/simd/simd.hpp (SSE2 mag): Fix bug in mask width.
+
+2008-01-31  Jules Bergmann  <jules@xxxxxxxxxxxxxxxx>
+
 	* scripts/config: Add missing SIMD configure flags in Mondo package.
 
 2008-01-30  Jules Bergmann  <jules@xxxxxxxxxxxxxxxx>
Index: src/vsip/opt/simd/simd.hpp
===================================================================
--- src/vsip/opt/simd/simd.hpp	(revision 191870)
+++ src/vsip/opt/simd/simd.hpp	(working copy)
@@ -1409,6 +1409,17 @@
 #endif
   }
 
+  static value_type extract(simd_type const& v, int pos)
+  {
+    union
+    {
+      simd_type  vec;
+      value_type val[vec_size];
+    } u;
+    u.vec             = v;
+    return u.val[pos];
+  }
+
   static simd_type add(simd_type const& v1, simd_type const& v2)
   { return _mm_add_pd(v1, v2); }
 
@@ -1427,9 +1438,9 @@
 
   static simd_type mag(simd_type const& v1)
   {
-    simd_type mask = (simd_type)_mm_set_epi32(0x7ffffff, 0xfffffff,
-					      0x7ffffff, 0xfffffff);
-    return _mm_and_pd((simd_type)mask, v1);
+    simd_type mask = (simd_type)_mm_set_epi32(0x7fffffff, 0xffffffff,
+					      0x7fffffff, 0xffffffff);
+    return _mm_and_pd(mask, v1);
   }
 
   static simd_type min(simd_type const& v1, simd_type const& v2)