On Jun 17, 2010, at 3:52 PM, Garrett D'Amore wrote:
> Anyway, I'm happy to share the code, and even go through the
> request-sponsor process to push this upstream. I would like the
> opinions of the ZFS and FMA teams though... is the approach I'm using
> sane, or have I missed some important design principle? Certainly it
> *seems* to work well on the systems I've tested, and we (Nexenta) think
> that it fixes what appears to us to be a critical deficiency in the ZFS
> error detection and handling. But I'd like to hear other thoughts.
I don't think this is the right approach. You'll end up faulting drives that
should be marked removed, among other things. The correct answer is for
drivers to use the new LDI events (LDI_EV_DEVICE_REMOVE) to indicate device
removal. This is already in ON but not hooked up to ZFS. It's easy to do, but
just hasn't been pushed yet.
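For reference, hooking a kernel consumer up to that event looks roughly like the following. This is only a sketch against the illumos/ON LDI interfaces (ldi_ev_get_cookie(9F), ldi_ev_register_callbacks(9F)); the callback and function names are hypothetical, and since removal is asynchronous only the finalize callback is meaningful:

```c
#include <sys/sunldi.h>

/*
 * Sketch only: register for LDI_EV_DEVICE_REMOVE on an open LDI
 * handle.  Names here are hypothetical, not actual ZFS code.
 */
static void
my_remove_finalize(ldi_handle_t lh, ldi_ev_cookie_t cookie,
    int ldi_result, void *arg, void *ev_data)
{
	/*
	 * The device is gone.  A ZFS consumer would mark the vdev
	 * REMOVED here instead of letting I/O errors accumulate
	 * into a FAULTED diagnosis.
	 */
}

static ldi_ev_callback_t my_remove_callb = {
	.cb_vers = LDI_EV_CB_VERS,
	.cb_notify = NULL,	/* removal cannot be vetoed */
	.cb_finalize = my_remove_finalize
};

static int
my_register_remove_event(ldi_handle_t lh, void *arg,
    ldi_callback_id_t *idp)
{
	ldi_ev_cookie_t cookie;

	if (ldi_ev_get_cookie(lh, LDI_EV_DEVICE_REMOVE, &cookie) !=
	    LDI_EV_SUCCESS)
		return (LDI_EV_FAILURE);

	return (ldi_ev_register_callbacks(lh, cookie,
	    &my_remove_callb, arg, idp));
}
```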
Note that for legacy drivers, the DKIOCGETSTATE ioctl() is supposed to handle
this for you. However, there is a race condition: if the vdev probe happens
before the driver has transitioned to a state where it returns DEV_GONE, we
can miss the event entirely (the state is only probed in reaction to I/O
failure, and we won't try again). We spent some time looking at ways to
eliminate this window, but it ultimately got quite ugly and doesn't support hot
spares, so the better answer was to just properly support the LDI events.
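Concretely, the legacy check boils down to something like this sketch, assuming the DKIOCSTATE ioctl and dkio_state enum from <sys/dkio.h> (the function name is hypothetical; passing DKIO_NONE asks for the current state immediately rather than blocking until it changes):

```c
#include <sys/dkio.h>
#include <sys/file.h>
#include <sys/sunldi.h>

/*
 * Sketch only: ask a legacy driver whether the device behind an
 * LDI handle is still present.  DKIO_DEV_GONE means the device
 * was removed -- i.e. the vdev should be REMOVED, not FAULTED.
 */
static boolean_t
legacy_dev_gone(ldi_handle_t lh)
{
	enum dkio_state state = DKIO_NONE;

	if (ldi_ioctl(lh, DKIOCSTATE, (intptr_t)&state, FKIOCTL,
	    kcred, NULL) == 0 && state == DKIO_DEV_GONE)
		return (B_TRUE);

	return (B_FALSE);
}
```

The catch, as described above, is that this is only consulted in reaction to an I/O failure, which is what opens the race window.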
If you wanted to expand support for legacy drivers, you should expand use of
the DKIOCGETSTATE ioctl(), perhaps with an async task that probes spares, as
well as a delayed timer (within the bounds of the zfs-diagnosis
resource.removed horizon) to close the associated window for normal vdevs.
However, a better solution would be to update the drivers that matter to use
LDI_EV_DEVICE_REMOVE, which provides much crisper semantics and will be used in
the future to hook into other subsystems.
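The async-task-plus-timer idea for legacy drivers might be sketched as follows. Every name and the interval here are hypothetical; the shape is a timeout(9F) callout that re-arms itself and pushes the actual (potentially blocking) DKIOCSTATE probes onto a taskq:

```c
#include <sys/sunddi.h>
#include <sys/taskq.h>
#include <sys/time.h>

/*
 * Sketch only: periodically re-probe legacy spares and vdevs so a
 * missed DEV_GONE transition is caught within the zfs-diagnosis
 * resource.removed horizon.  The 10-second interval, the function
 * names, and the stop condition are all hypothetical.
 */
#define	RECHECK_SECS	10

static timeout_id_t recheck_id;

/* Probes each spare/vdev via DKIOCSTATE (body not shown). */
static void legacy_probe_all(void *arg);

static void
legacy_recheck(void *arg)
{
	/* The ioctl path may block, so probe from taskq context. */
	(void) taskq_dispatch(system_taskq, legacy_probe_all, arg,
	    TQ_SLEEP);

	/* Re-arm until the removal horizon has passed. */
	recheck_id = timeout(legacy_recheck, arg,
	    drv_usectohz(RECHECK_SECS * MICROSEC));
}
```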
In order for anything to be accepted upstream, it's key that it be able to
distinguish between REMOVED and FAULTED devices. Mis-diagnosing a removed
drive as faulted is very bad (fault = broken hardware = service call = $$$).
P.S. The bug in the ZFS scheme module is legit; we just haven't fixed it yet.
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
zfs-discuss mailing list