[arm-gnu] Using ARM Neon Intrinsics to load a constant
- To: arm-gnu@xxxxxxxxxxxxxxxx
- Subject: [arm-gnu] Using ARM Neon Intrinsics to load a constant
- From: Bob Feretich <bob.feretich@xxxxxxxxxxxxxxx>
- Date: Thu, 10 Dec 2009 17:08:55 -0800
I want to set all of the elements of a quadword vector to a legal
floating-point constant (usually 0.0).
Using the intrinsics, it seems that I need to code...
float32x4_t vec4;
...
vec4 = vdupq_n_f32 (0.0);
When I do that, the 2009q3 compiler generates:
mov r7,#0
vdupq.32 qx,r7
The above uses an ARM core register unnecessarily. How do I get the
compiler to generate:
vmov.32 qx,#0
I wasn't sure that I was using the intrinsic call properly, so I
posed the question to the ARM software support engineers. They confirmed
my understanding. This was their response...
RVCT 4.0 build 650 generates the code you expect. When I compile this
function:
float32x4_t Foo(void)
{
    float32x4_t vec4 = vdupq_n_f32(0.0);
    return vec4;
}
It generates:
Foo
0x00000000: f2800050 P... VMOV.I32 q0,#0
0x00000004: e12fff1e ../. BX lr
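For context on why an integer VMOV.I32 can stand in for a floating-point load here: in IEEE 754 single precision, 0.0 is the all-zero bit pattern, so writing integer 0 into every 32-bit lane leaves the same register contents as duplicating 0.0f. The sketch below is only an illustration in portable C, not anything the compiler or RVCT does internally. The helper names float_bits and fits_single_byte_imm are hypothetical, and the second helper checks just one of the immediate forms VMOV.I32 accepts (an 8-bit value placed in a single byte of each 32-bit lane), so a "no" from it does not rule out other encodings.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a single-precision
 * float. memcpy is used so the type pun is well defined in C. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: return 1 if at most one byte of the 32-bit pattern
 * is non-zero, i.e. the pattern is an 8-bit value shifted into one byte
 * position. This covers only the simple shifted-byte immediate form and
 * is an assumption-laden sketch, not a full encoder check. */
static int fits_single_byte_imm(uint32_t bits)
{
    int nonzero_bytes = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        if ((bits >> shift) & 0xFFu) {
            nonzero_bytes++;
        }
    }
    return nonzero_bytes <= 1;
}
```

With these helpers, float_bits(0.0f) is 0 and passes the check, while 1.0f (0x3F800000) has two non-zero bytes and does not fit this particular form.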
What do I code to have the CodeSourcery compiler generate similar code?
Regards,
Bob