Subject: port-i386/43252: SiI 3512A SATA disks unreliable if used for software raid
From:
Date: Tue, 4 May 2010 20:25:01 +0000 UTC
>Number:         43252
>Category:       port-i386
>Synopsis:       SiI 3512A SATA disks unreliable if used for software raid
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          open
>Class:          support
>Submitter-Id:   net
>Arrival-Date:   Tue May 04 20:25:00 +0000 2010
>Originator:     Rainer Glaschick
>Release:        netbsd 5.1_RC1
>Organization:
>Environment:
NetBSD 5.1_RC1 (GENERIC) #0:Sat Apr 24 23:26:09 UTC 2010 buildse7.netbsd.org 
(typed and not copied)
>Description:
If a SiI 3512A controller with two identical 1.5 TB SATA disks is used, with 
the two disks as the components of a software RAID, the system is not 
installed reliably. It looks like data gets corrupted.

On the first two installation trials, the system did install, but could not 
boot (it always rebooted without visible error messages).
A single-disk installation went fine without problems.
The next trial (the current one) boots, but some programs fail; in particular, 
the checksum for vi differs from that of another, successful installation. 
Other failing programs (fsck and ls) have the same checksum, but a library 
might be corrupted.
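
The checksum comparison described above could be extended to the whole 
installation, to see which files (including libraries) differ from a 
known-good one. The following is only a minimal sketch in Python and was not 
run on the affected machine. The function names file_digest and compare_trees 
are hypothetical, and the two directory arguments are assumed to be the 
mounted roots of a good and a suspect installation.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_root, suspect_root):
    """Yield (relative_path, status) for files that are missing or differ.

    `reference_root` is assumed to hold a known-good installation and
    `suspect_root` the possibly corrupted one (both names are placeholders).
    """
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        for name in sorted(filenames):
            ref_path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.islink(ref_path) or not os.path.isfile(ref_path):
                continue
            rel_path = os.path.relpath(ref_path, reference_root)
            sus_path = os.path.join(suspect_root, rel_path)
            if not os.path.isfile(sus_path):
                yield rel_path, "missing"
            elif file_digest(ref_path) != file_digest(sus_path):
                yield rel_path, "differs"


# Example use (paths are placeholders):
#   for rel_path, status in compare_trees("/mnt/good", "/mnt/suspect"):
#       print(status, rel_path)
```

Files that are expected to differ between installations (for example 
host-specific files under /etc) would show up as well and would have to be 
ignored by hand.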

Raid installation was similar to the guide from the wiki 
(http://wiki.netbsd.se/How_to_install_NetBSD_on_RAID1_using_RAIDframe),
but I left the setup of the first disk to sysinst, then copied the disklabels 
and did the raid setup, before restarting the installation with the raid0 
device. Also, the RAID partitions are wd[01]a instead of wd[01]e. wd[01]b were 
set aside as swap or /tmp, but not included in the raid.

A trial installation in the same manner on a VMware virtual machine worked 
without problems. 

In the previous trials, I could boot the kernel via the network and everything 
went OK, except booting from disk. So the kernel on disk may have been 
corrupted in the first trials. Now, in the current state, the result is the 
same whether the kernel is booted from disk or from the network.

No problems were observed with the installation kernel, booted via network. 

Debian XFCE 5.04 installed and ran without problems on the same hardware 
setup.

I currently have no spare SATA controller to test with. 

I also observed that the RAID setup took very long (approx. 12 hours) to 
initialize; as far as I remember, the earlier installations did not take so 
much time, but I am not really sure. newfs was significantly quicker (about 
1 hour), both on a single disk and on the RAID.
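
For reference, the initialization time can be turned into an approximate 
throughput. This is only a back-of-the-envelope sketch in Python; it assumes 
that "1.5 TB" means 1.5 * 10**12 bytes and that the whole disk was 
initialized, which overstates the size slightly because wd[01]b is not part 
of the RAID.

```python
# Assumed: the disk size is 1.5 * 10**12 bytes (decimal terabytes).
disk_bytes = 1.5e12
# Roughly 12 hours, as observed for the RAID initialization.
init_seconds = 12 * 3600

# Average rate over the whole initialization, in decimal megabytes per second.
rate_mb_per_s = disk_bytes / init_seconds / 1e6
print(f"{rate_mb_per_s:.1f} MB/s")  # prints "34.7 MB/s"
```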

Any hint how to isolate the problem? 
>How-To-Repeat:
Install as described above.
>Fix:
