On Jun 12, 2011, at 4:18 PM, Jim Klimov wrote:
> 2011-06-12 23:57, Richard Elling wrote:
>> How long should it wait? Before you answer, read through the thread:
>> Then add your comments :-)
>> -- richard
> Interesting thread. I did not quite get the resentment against
> a tunable value instead of a hard-coded #define, though.
Tunables are evil. They increase complexity and lead to local optimizations
that interfere with systemic optimizations.
> Especially if we might want to somehow tune it per-device,
> i.e. CDROM, enterprise SAS and some commodity drive or a
> USB stick (or a VMware-emulated HDD, as Ceri pointed out)
> might all be plugged into the same box and require different
> timeouts only the sysadmin might know about (the numeric
> values per-device). So I'd rather go with some hardcoded
> default and many tuned lines in sd.conf, probably.
yuck. I'd rather have my eye poked out with a sharp stick.
> But the point of my previous comment was that, according
> to the original poster, after a while his disk did get
> marked as "faulted" or "offlined". IF this happened
> during the system's initial uptime, but it froze anyway,
> it is a problem.
> What I do not know is if he rebooted the box within the
> 5 minutes set aside for the timeout, or if some other
> processes gave up during the 5 minutes of no IO and
> effectively hung the system.
Not likely. Much more likely that that which you were expecting
> If it is somehow the latter - that the inaccessible drive
> did (lead to) hang(ing) the system past any set IO retry
> timeouts - that is a bug, I think.
> But maybe I'm just too annoyed with my box hanging with
> a more-or-less reproducible scenario, and now I'm barking
> up any tree that looks like system freeze related to IO ;)
Yep, a common reaction.
I think we can be more creative...
zfs-discuss mailing list