gcc-patches@gcc.gnu.org

Re: ARM: Improve NEON vector creation

Subject: Re: ARM: Improve NEON vector creation
From: Daniel Jacobowitz
Date: Wed, 11 Nov 2009 09:20:39 -0500
On Wed, Nov 11, 2009 at 10:41:42AM +0000, Richard Earnshaw wrote:
> 
> On Tue, 2009-11-10 at 16:24 -0500, Daniel Jacobowitz wrote:
> > This isn't obviously a win, even for -mvectorize-with-neon-quad.
> > Should I limit vdup to the non-constant case instead?  I had hoped to
> > avoid the constant pool entry by using movw / movt / vdup, but GCC
> > doesn't realize (or doesn't agree) that such a sequence is cheaper
> > than a constant pool load.
> > 
> movw/movt will need a core register, so that's going to generate a
> 4-instruction sequence in most cases (with an instruction to move the
> result to the Neon reg bank); I would have thought that was unlikely to
> be a win.

No, just three (I hope):

   0:   e3000001        movw    r0, #1  ; 0x1
   4:   e3400002        movt    r0, #2  ; 0x2
   8:   eea00b10        vdup.32 q0, r0

I've checked this in as-is.  If someone wants to rip out that
particular part of the patch, I won't feel in the least offended :-)

-- 
Daniel Jacobowitz
CodeSourcery
