zfs-discuss@opensolaris.org

[zfs-discuss] Debugging filesystem lock-ups

Subject: [zfs-discuss] Debugging filesystem lock-ups
From: "Peter Bortas"
Date: Sun, 4 May 2008 04:53:53 +0200

Hello,

I'm using snv_81 x86 as a file server and occasional CPU server at
home. It consists of one system disk with normal UFS/swap and one pool
of six disks in raidz1 configuration.

Every now and again the raidz file systems will lock up hard. Any
access to them will block in IO-wait. Trying to reboot will lock up
the system, so pressing the reset button is the only option. After a
reboot everything works fine again.

I can usually trigger the problem within 12 hours by doing lots of
compilations in parallel, but just leaving it alone serving files
via Samba and NFS will trigger it within a couple of weeks. The
problem has been there ever since I installed snv_55 on it way back,
so my guess is that it's not a systematic error in ZFS, but rather a
driver problem or a hardware glitch. The trick is figuring out which
of those two it is so I can correct it.

I should mention that we are talking about disks, whose natural state
is of course full:
% df -h /famine
Filesystem             size   used  avail capacity  Mounted on
famine                 2.7T    72K    14G     1%    /famine

So...

1. The root shell still works. How do I go about trying to debug
things when the filesystems lock up?

2. This is a pretty well ventilated chassis, but trying to determine if
things get too hot or pull too much power is always prudent. Where
should I look for information on how to set up MB sensors and SMART
access?

3. The pool is divided between two SIL-3114 cards flashed to the
non-RAID BIOS version running on a single P4 CPU. Any known problems
with that configuration?

TIA,
-- 
Peter Bortas