On 29/12/2007, David Collier-Brown <davec-b@xxxxxxxxxx> wrote:
> > On 28/12/2007, Volker Lendecke <vl@xxxxxxxxx> wrote:
> >>On my Laptop with some limited netbench runs this gains about 1.5% of
> >>performance. When looking at the assembler output I would suspect the
> >>gain comes from the fact that with this in place the calls to the debug
> >>functions are moved to the function end, out of the way of the normal
> >>code paths. Valgrind tests are pending, but I would suspect this to be
> >>much more cache friendly.
> James Peach wrote:
> > ISTR doing this for IRIX giving between 2% and 5%. Compiler and
> > architecture dependent, but definitely worthwhile.
> Fred Weigel and I saw similar results on SPARC many moons ago,
> but we didn't have an unlikely() to fix it with.
> Specifically, we found setting the branch-taken prediction bit
> for the branch around the debug code had no measurable effect!
> A colleague with a hardware analyzer found that the slowdown
> was from branching to an address which required filling a
> new cache line, rather than the next one in a straight-line
Exactly. On MIPS/IRIX, the speedup was due to the positive instruction
cache effects of the compiler moving all the debug code out of the
main code path.
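
For anyone following along, here is a minimal sketch of the kind of
annotation being discussed. It assumes GCC or Clang, where
__builtin_expect() is available. The names DBG, debug_level, debug_calls
and sum_bytes are made up for illustration and are not the actual macros
or variables from the patch.

```c
#include <stdio.h>

/* GCC/Clang provide __builtin_expect(); other compilers get a plain test. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical global debug level and call counter (illustration only). */
static int debug_level = 0;
static int debug_calls = 0;

/*
 * Hypothetical debug macro. Marking the level check as unlikely() tells
 * the compiler the body is rarely executed, so it is free to place that
 * code away from the straight-line path.
 */
#define DBG(level, ...) \
    do { \
        if (unlikely(debug_level >= (level))) { \
            debug_calls++; \
            fprintf(stderr, __VA_ARGS__); \
        } \
    } while (0)

/* Hypothetical hot-path function with a debug statement in its loop. */
static int sum_bytes(const unsigned char *buf, int len)
{
    int total = 0;
    for (int i = 0; i < len; i++) {
        DBG(10, "byte %d = %u\n", i, (unsigned)buf[i]);
        total += buf[i];
    }
    return total;
}
```

The hint does not change what the code computes, only how the compiler
may lay it out. Whether the debug block really ends up at the function
end depends on the compiler and optimization level, so it is worth
checking the assembler output as Volker did.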
James Peach | jorgar@xxxxxxxxx