zfs-discuss@opensolaris.org

[zfs-discuss] UC Davis Cyrus Incident September 2007

Subject: [zfs-discuss] UC Davis Cyrus Incident September 2007
From: Gary Mills
Date: Thu, 18 Oct 2007 08:04:58 -0500

Does anyone on this mailing list have an idea what went wrong with
ZFS and Cyrus IMAP?  Here's an excerpt that explains the problem:

  About a week before classes actually start is when all the kids start
  moving back into town and mailing all their buds.  We saw process
  numbers go from 500-ish to as high as 5,000.  Load would climb
  radically after passing 2,000 processes and systems became slow to
  respond.

Here's a suggestion on the cause:

  The root problem seems to be an interaction between Solaris' concept
  of global memory consistency and the fact that Cyrus spawns many
  processes that all memory map (mmap) the same file.  Whenever any
  process updates any part of a memory mapped file, Solaris freezes all
  of the processes that have that file mmaped, updates their memory
  tables, and then re-schedules the processes to run.  When we have
  problems we see the load average go extremely high and no useful work
  gets done by Cyrus.
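
For anyone who wants to experiment with the access pattern described
in that excerpt, here is a minimal sketch in Python. It is not Cyrus
code and it does not reproduce the stall; it only shows several
processes mapping the same file while one process updates it. The
function name run_demo, the reader count, the scratch file, and the
byte value 42 are all made up for illustration, and it assumes a
Unix-like system where os.fork is available.

```python
import mmap
import os
import tempfile


def run_demo(n_readers=4):
    """Fork n_readers processes that each mmap one file, update the file
    once from the parent, and return the byte each reader then sees."""
    fd, path = tempfile.mkstemp()          # hypothetical scratch file
    os.write(fd, b"\0" * mmap.PAGESIZE)
    os.close(fd)

    ready_r, ready_w = os.pipe()   # readers -> parent: "my mapping exists"
    go_r, go_w = os.pipe()         # parent -> readers: "update is done"
    out_r, out_w = os.pipe()       # readers -> parent: the byte they observed

    pids = []
    for _ in range(n_readers):
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                # Each reader creates its own shared mapping of the file.
                with open(path, "r+b") as f:
                    m = mmap.mmap(f.fileno(), 0)
                os.write(ready_w, b"r")
                os.read(go_r, 1)                 # wait for the writer
                os.write(out_w, bytes([m[0]]))   # report what we see
                status = 0
            finally:
                os._exit(status)                 # never return into the caller
        pids.append(pid)

    # Wait until every reader has the file mapped.
    got = 0
    while got < n_readers:
        got += len(os.read(ready_r, n_readers - got))

    # One process updates part of the mapped file.
    with open(path, "r+b") as f:
        w = mmap.mmap(f.fileno(), 0)
        w[0] = 42
        w.close()

    os.write(go_w, b"g" * n_readers)   # one wake-up byte per reader

    seen = b""
    while len(seen) < n_readers:
        seen += os.read(out_r, n_readers - len(seen))

    for pid in pids:
        os.waitpid(pid, 0)
    for p in (ready_r, ready_w, go_r, go_w, out_r, out_w):
        os.close(p)
    os.unlink(path)
    return list(seen)


if __name__ == "__main__":
    print(run_demo())
```

Every reader should report the updated byte, since all of the mappings
are shared views of the same file. Timing the update step while raising
n_readers would be one way to look for the kind of scaling problem the
excerpt suggests, though whether it shows up will depend on the
operating system and filesystem.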

I'm concerned because I'm also using Cyrus IMAP with ZFS.  So far,
it's been extremely well behaved.  Snapshots are one of the best parts of
this system.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-