Jamie Lokier wrote:
> Kevin O'Connor wrote:
>>> - All segment bases are zero, and all limits are LIMIT
>>> (3GiB for old Linux in user mode).
>>> - When filling the MMU TLB, if it's for an address >= LIMIT,
>>> treat as MMU exception.
>> That's interesting.
>> If I understand correctly, you're suggesting using segment checks in
>> translated code if any segment has a non-zero base or a different size
>> limit. Thus limiting the optimization to those (common) cases where
>> the limits and bases are all the same. Presumably this would also
>> require LIMIT to be page aligned.
> Yes. I hadn't thought of non-zero bases, but that would work too.
> Base and limit have to be page aligned. The only code where that
> isn't true is likely 16 bit MS-DOS and Windows code, which being
> designed for much slower processors, should be fine with fully inlined
> segment checks anyway.
>> One could probably allow ES, FS, and GS to have different bases/limits
>> as long as their ranges all fit within the primary range (0..LIMIT of
>> CS, DS, and SS).
> You could also translate just the base offset adding, in cases where
> the limit covers the whole 4GiB. But that adds more translation key
> It's probably worth including ES as a "primary" segment like DS.
>> It should be okay to always emit segment checks for
>> accesses that use these segments as they should be pretty rare.
> In Windows, and modern Linux, %fs or %gs are used in userspace code
> for thread-specific variables. In older Linux kernels, %fs is used to
> copy data to/from userspace memory. Maybe the translated checks for
> these are not enough overhead to matter?
>> Would this still work in 32bit flat mode?
> It will if QEMU's MMU TLB is always used, and uses an identity
> mapping. I'm not familiar with that part of QEMU, but I had the
> impression it's always used as it also traps memory-mapped I/O.
> In which case, it should work in 16 bit flat mode too :)
>>> - Flush MMU TLB on any interesting segment change (limit gets
>>> smaller, etc.).
>>> - Count rate of interesting segment changes. When it's high,
>>> switch to including segment checks in translated code (same as
>>> non-zero bases) and not flushing TLB. When it's low, don't put
>>> segment checks into translated code, and use TLB flushes on
>>> segment changes.
>>> - Keep separate count for ring 0 and ring 3, or for
>>> "code which uses segment prefixes" vs "code which doesn't".
>> Why are the heuristics needed? I wonder if the tlb flush could just
>> be optimized.
> Even if TLB flush itself is fast, you need to refill the TLB entries
> on subsequent memory accesses. It's good to avoid TLB flushes for
> that reason.
> I'm thinking of code like this from Linux which does
> movl %fs:(%eax),%ebx
> movl %ebx,(%ecx)
> I.e. rapidly switching between segments with different limits, and the
> %ds accesses are to addresses forbidden by %fs. If you're inlining
> %fs segment checks, though, then no TLB flush will be needed.
>> One would only need to flush the tlb when transitioning from "segment
>> checks in translated code" mode to "segment checks in mmu" mode, or
>> when directly going to a new LIMIT. In these cases one could just
>> flush NEWLIMIT..OLDLIMIT.
> That's true, you could optimise the flush in other ways too, such as
> when changing protection ring, just flush certain types of TLB entry.
> Or even keep multiple TLBs on the go, hashed on mostly-constant values
> like the translation cache, and the TLB choice being a translation
> cache key so it can be inlined into translated code. I didn't want to
> overcomplicate the suggestion, but you seem to like funky optimisations :-)
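
For reference, the scheme quoted above (limit check when filling the TLB, plus a partial flush when the limit shrinks) could be sketched roughly as below. This is a toy model and not QEMU's softmmu code: `TlbEntry`, `tlb_fill`, `tlb_hit` and `set_limit` are hypothetical names, the TLB is a small direct-mapped array, all segment bases are assumed to be zero, and a `false` return stands in for raising an MMU exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define TLB_SIZE  256

/* Hypothetical, simplified TLB entry: only the guest virtual page. */
typedef struct {
    uint32_t vpage;
    bool     valid;
} TlbEntry;

static TlbEntry tlb[TLB_SIZE];

/* Common LIMIT shared by all segments (3GiB, page aligned). */
static uint32_t seg_limit = 0xC0000000u;

/* Slow path: refuse to create an entry at or above LIMIT.
 * Returning false models "treat as MMU exception". */
static bool tlb_fill(uint32_t addr)
{
    if (addr >= seg_limit) {
        return false;
    }
    uint32_t vpage = addr >> PAGE_BITS;
    TlbEntry *e = &tlb[vpage % TLB_SIZE];
    e->vpage = vpage;
    e->valid = true;
    return true;
}

/* Fast path: no segment check at all, only the TLB lookup. */
static bool tlb_hit(uint32_t addr)
{
    uint32_t vpage = addr >> PAGE_BITS;
    const TlbEntry *e = &tlb[vpage % TLB_SIZE];
    return e->valid && e->vpage == vpage;
}

/* Changing LIMIT: when it shrinks, drop every cached page at or above
 * the new limit. No entry can exist above the old limit, so this is
 * effectively the NEWLIMIT..OLDLIMIT flush. Growing needs no flush. */
static void set_limit(uint32_t new_limit)
{
    assert((new_limit & (PAGE_SIZE - 1)) == 0); /* must be page aligned */
    if (new_limit < seg_limit) {
        for (int i = 0; i < TLB_SIZE; i++) {
            uint32_t page_addr = tlb[i].vpage << PAGE_BITS;
            if (tlb[i].valid && page_addr >= new_limit) {
                tlb[i].valid = false;
            }
        }
    }
    seg_limit = new_limit;
}
```

In this model, shrinking the limit only costs refills for pages in the dropped range; everything below the new limit stays cached.
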
Don't want to stop all your creativity, but just like Paul I'm also a
bit skeptical about the TLB way of achieving range and type safety.
My major concern once was that the TLB works on a global scope so that
you cannot tell the original segments behind some address apart. And
extending the virtual address space for this is a no-go on 32-bit hosts
(which I unfortunately had and still have to support here :->).
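
To make that concern concrete, here is a minimal sketch under the same toy assumptions (hypothetical names `guest_access` and `slow_path_fill`, direct-mapped TLB, zero bases, two segments with different limits). The TLB is keyed by address only, so once a page has been filled through one segment, the fast path also accepts it for any other segment unless the TLB is flushed in between or the check is inlined into the translated code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12
#define TLB_SIZE  256

enum { SEG_DS, SEG_FS, SEG_COUNT };

/* Exclusive upper bounds; 64-bit so that "whole 4GiB" is representable.
 * DS covers 4GiB, FS only 3GiB (illustrative values). */
static const uint64_t seg_limit[SEG_COUNT] = {
    [SEG_DS] = 0x100000000ull,
    [SEG_FS] = 0x0C0000000ull,
};

/* The entry carries no record of which segment created it. */
typedef struct {
    uint32_t vpage;
    bool     valid;
} TlbEntry;

static TlbEntry tlb[TLB_SIZE];

/* Slow path: the only place where the segment limit is checked. */
static bool slow_path_fill(int seg, uint32_t addr)
{
    assert(seg >= 0 && seg < SEG_COUNT);
    if (addr >= seg_limit[seg]) {
        return false; /* would raise an exception in the guest */
    }
    uint32_t vpage = addr >> PAGE_BITS;
    TlbEntry *e = &tlb[vpage % TLB_SIZE];
    e->vpage = vpage;
    e->valid = true;
    return true;
}

/* Returns true if the access is allowed to proceed. */
static bool guest_access(int seg, uint32_t addr)
{
    uint32_t vpage = addr >> PAGE_BITS;
    const TlbEntry *e = &tlb[vpage % TLB_SIZE];
    if (e->valid && e->vpage == vpage) {
        return true; /* fast path: 'seg' is never consulted */
    }
    return slow_path_fill(seg, addr);
}
```

With a cold TLB, `guest_access(SEG_FS, 0xD0000000u)` is rejected. After `guest_access(SEG_DS, 0xD0000000u)` has filled the entry, the same %fs access is accepted. Telling the two apart would need a segment tag in the lookup key, which is the address space extension I cannot afford on 32-bit hosts.
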
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux