> >>>>> Zdenek writes:
> Zdenek> IMO this is the only way that can work well on an architecture
> Zdenek> that has a small number of registers and, at the same time, rich
> Zdenek> addressing modes (i.e., x86).
> x86 has a small number of user-ISA-defined registers, but the
> microarchitecture of current implementations has many more internal
> registers. Targeting the microarchitecture has proven to be very
> effective for other high-performance x86 compilers. Internally, current
> x86 processors are RISC-like.
> Zdenek> I find it more sensible to make a simple pass (SR) more clever than
> Zdenek> to make a task that is already almost impossible to solve even
> Zdenek> more complex.
> SR occurs too early to have an accurate model of register pressure
> and to infer the final instructions that will be produced.
ivopts runs fairly late among the tree-level optimizations, after which
few significant changes are made to the code; thus, quite realistic
estimates are possible. And, at the very least, we currently cannot
avoid taking register pressure and addressing modes into account
without changes to the register allocator and/or other passes. (To see
the effects, just remove the few lines of code in ivopts that handle
them; I did not implement them for fun, but to avoid performance
regressions on the order of 10%.)
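
To make the trade-off concrete, here is a toy sketch of the kind of
estimate I mean. It is not the actual ivopts code: the struct, the
function names, and all the constants are invented for illustration,
and a real cost model has to consider far more than this.

```c
/* Hypothetical target description; every number here is made up. */
struct target {
    int avail_regs;        /* registers usable inside the loop */
    int spill_cost;        /* cost charged per value over that limit */
    int has_scaled_index;  /* nonzero if base + index*scale is one address */
};

/* Extra cost of forming one array address from the chosen kind of IV. */
int address_cost(const struct target *t, int uses_index_iv)
{
    if (!uses_index_iv)
        return 0;                        /* pointer IV: the IV is the address */
    return t->has_scaled_index ? 0 : 2;  /* otherwise pay for a shift and an add */
}

/* Penalty for keeping more values live than there are registers. */
int pressure_cost(const struct target *t, int n_ivs, int n_other_live)
{
    int over = n_ivs + n_other_live - t->avail_regs;
    return over > 0 ? over * t->spill_cost : 0;
}

/* Estimated per-iteration cost of one candidate set of IVs:
   address computations, one increment per IV, and the spill penalty. */
int iv_set_cost(const struct target *t, int n_ivs, int n_other_live,
                int n_uses, int uses_index_iv)
{
    return n_uses * address_cost(t, uses_index_iv)
           + n_ivs
           + pressure_cost(t, n_ivs, n_other_live);
}
```

Take a loop that walks three arrays. With a hypothetical x86-like
target {6, 10, 1}, a single index IV (1 IV, 5 other live values, 3 uses)
costs 1, while three pointer IVs (3 IVs, 3 other live values) cost 3.
With a hypothetical target {24, 10, 0} that lacks scaled indexing, the
same two choices cost 7 and 3, so the preference flips. Drop either
term from the estimate and one of the two targets gets the wrong answer.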
> Tuning and
> maintaining the heuristics and the register pressure information for every
> future change to GCC's optimization information is an unnecessary and
> unproductive burden.
Note that quite a few people believe that some of the optimizations
currently implemented in the RTL backend should be moved to the tree
level, and that more of the target details should be exposed to allow
this (by gradually lowering the level of the IR). I would hardly call
the work that makes this possible "unproductive".