> if those servers are on physical boxes right now i'd do some perfmon
> caps and add up the iops.
Using perfmon to get a sense of what is required is a good idea. Use the 95th
percentile to be conservative. The counters I have used are in the PhysicalDisk
object. Don't ignore the latency counters either. In my book, anything
consistently over 20ms or so is excessive.
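
As a rough sketch of the sizing arithmetic described above: take each server's perfmon IOPS samples, compute the 95th percentile per server, and add them up. The function names and the nearest-rank percentile method are my own choices for illustration, not anything perfmon provides. The samples would come from a counter export (for example Disk Transfers/sec in the PhysicalDisk object, though the exact counters to log are up to you). Summing per-server percentiles assumes the busy periods can coincide, which errs on the conservative side.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def required_iops(per_server_samples, pct=95):
    """Sum each server's pct-th percentile IOPS (per_server_samples: name -> list of samples)."""
    return sum(percentile(samples, pct) for samples in per_server_samples.values())

def fraction_over(latencies_ms, threshold_ms=20.0):
    """Fraction of latency samples above the threshold (20 ms is the rule of thumb from the post)."""
    if not latencies_ms:
        raise ValueError("no samples")
    return sum(1 for value in latencies_ms if value > threshold_ms) / len(latencies_ms)
```

If `fraction_over` comes back high for a server, the disk is already struggling under its current load, so its measured IOPS probably understate what it would do on faster storage.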
I run 30+ VMs on an EqualLogic array with 14 SATA disks, broken up as two
striped 6-disk RAID 5 sets (RAID 50) with 2 hot spares. That array is, on
average, about 25% loaded from an IO standpoint. Obviously my VMs are pretty
light. And the EQL gear is *fast*, which makes me feel better about spending
all of that money :).
>> Regarding ZIL usage, from what I have read you will only see
>> benefits if you are using NFS backed storage, but that it can be
>> significant.
>
> link?
From the ZFS Evil Tuning Guide
(http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide):
"ZIL stands for ZFS Intent Log. It is used during synchronous writes
operations."
further down:
"If you've noticed terrible NFS or database performance on SAN storage array,
the problem is not with ZFS, but with the way the disk drivers interact with
the storage devices.
ZFS is designed to work with storage devices that manage a disk-level cache.
ZFS commonly asks the storage device to ensure that data is safely placed on
stable storage by requesting a cache flush. For JBOD storage, this works as
designed and without problems. For many NVRAM-based storage arrays, a problem
might come up if the array takes the cache flush request and actually does
something rather than ignoring it. Some storage will flush their caches despite
the fact that the NVRAM protection makes those caches as good as stable storage.
ZFS issues infrequent flushes (every 5 second or so) after the uberblock
updates. The problem here is fairly inconsequential. No tuning is warranted
here.
ZFS also issues a flush every time an application requests a synchronous write
(O_DSYNC, fsync, NFS commit, and so on). The completion of this type of flush
is waited upon by the application and impacts performance. Greatly so, in fact.
From a performance standpoint, this neutralizes the benefits of having an
NVRAM-based storage."
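
To make the "synchronous write" part concrete, here is a minimal sketch that times the same series of 8k writes with and without an fsync after each one. The function name, block count, and file name are made up for illustration. It only shows the application-side cost of waiting on stable storage. It is not a ZFS benchmark, and the numbers will depend entirely on the filesystem and hardware underneath.

```python
import os
import tempfile
import time

def timed_writes(path, count=200, size=8192, sync=False):
    """Write `count` blocks of `size` bytes; fsync after each block if sync=True.

    Returns the elapsed time in seconds.
    """
    block = b"\0" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, block)
            if sync:
                # Block until the data is reported to be on stable storage.
                os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "syncprobe.dat")
        buffered = timed_writes(target, sync=False)
        synced = timed_writes(target, sync=True)
        print(f"buffered: {buffered:.3f}s  fsync per write: {synced:.3f}s")
```

Pointing `target` at a file on the pool under test (instead of a temporary directory) would give a quick feel for how expensive each flush is on that storage.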
When I was testing iSCSI vs. NFS, it was clear that iSCSI was not doing
synchronous writes and NFS was. Here are some zpool iostat numbers:
iSCSI testing using iometer with the RealLife workload (65% read, 60% random,
8k transfers - see the link in my previous post). It is clear that writes are
being cached in RAM and then flushed to disk in bursts.
# zpool iostat data01 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      55.5G  20.4T    691      0  4.21M      0
data01      55.5G  20.4T    632      0  3.80M      0
data01      55.5G  20.4T    657      0  3.93M      0
data01      55.5G  20.4T    669      0  4.12M      0
data01      55.5G  20.4T    689      0  4.09M      0
data01      55.5G  20.4T    488  1.77K  2.94M  9.56M
data01      55.5G  20.4T     29  4.28K   176K  23.5M
data01      55.5G  20.4T     25  4.26K   165K  23.7M
data01      55.5G  20.4T     20  3.97K   133K  22.0M
data01      55.6G  20.4T    170  2.26K  1.01M  11.8M
data01      55.6G  20.4T    678      0  4.05M      0
data01      55.6G  20.4T    625      0  3.74M      0
data01      55.6G  20.4T    685      0  4.17M      0
data01      55.6G  20.4T    690      0  4.04M      0
data01      55.6G  20.4T    679      0  4.02M      0
data01      55.6G  20.4T    664      0  4.03M      0
data01      55.6G  20.4T    699      0  4.27M      0
data01      55.6G  20.4T    423  1.73K  2.66M  9.32M
data01      55.6G  20.4T     26  3.97K   151K  21.8M
data01      55.6G  20.4T     34  4.23K   223K  23.2M
data01      55.6G  20.4T     13  4.37K  87.1K  23.9M
data01      55.6G  20.4T     21  3.33K   136K  18.6M
data01      55.6G  20.4T    468    496  2.89M  1.82M
data01      55.6G  20.4T    687      0  4.13M      0
Testing against NFS shows writes to disk continuously.
NFS Testing
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      59.6G  20.4T     57    216   352K  1.74M
data01      59.6G  20.4T     41     21   660K  2.74M
data01      59.6G  20.4T     44     24   655K  3.09M
data01      59.6G  20.4T     41     23   598K  2.97M
data01      59.6G  20.4T     34     33   552K  4.21M
data01      59.6G  20.4T     46     24   757K  3.09M
data01      59.6G  20.4T     39     24   593K  3.09M
data01      59.6G  20.4T     45     25   687K  3.22M
data01      59.6G  20.4T     45     23   683K  2.97M
data01      59.6G  20.4T     33     23   492K  2.97M
data01      59.6G  20.4T     16     41   214K  1.71M
data01      59.6G  20.4T      3  2.36K  53.4K  30.4M
data01      59.6G  20.4T      1  2.23K  20.3K  29.2M
data01      59.6G  20.4T      0  2.24K  30.2K  28.9M
data01      59.6G  20.4T      0  1.93K  30.2K  25.1M
data01      59.6G  20.4T      0  2.22K      0  28.4M
data01      59.7G  20.4T     21    295   317K  4.48M
data01      59.7G  20.4T     32     12   495K  1.61M
data01      59.7G  20.4T     35     25   515K  3.22M
data01      59.7G  20.4T     36     11   522K  1.49M
data01      59.7G  20.4T     33     24   508K  3.09M
data01      59.7G  20.4T     35     23   536K  2.97M
data01      59.7G  20.4T     32     23   483K  2.97M
data01      59.7G  20.4T     37     37   538K  4.70M
Note that the ZIL is being used, just not on a separate device. The periodic
bursts of writes show it being flushed. You can also see reads stall to nearly
zero while the ZIL is dumping. Not good. This thread discusses the behavior:
http://www.opensolaris.org/jive/thread.jspa?threadID=106453
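
The read stall during write bursts in the NFS capture above can be picked out with a short script. This is only a sketch: the function names and the 1000 write-ops threshold are arbitrary choices of mine, the sample rows are copied from the NFS capture, and the K/M/G/T suffixes are assumed to be powers of 1024.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

SAMPLE = """\
data01 59.6G 20.4T 33 23 492K 2.97M
data01 59.6G 20.4T 16 41 214K 1.71M
data01 59.6G 20.4T 3 2.36K 53.4K 30.4M
data01 59.6G 20.4T 1 2.23K 20.3K 29.2M
data01 59.6G 20.4T 0 2.24K 30.2K 28.9M
data01 59.7G 20.4T 21 295 317K 4.48M
"""

def parse_value(text):
    """Convert a value such as '2.36K' or '295' to a float."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)

def parse_iostat(output, pool):
    """Return (read_ops, write_ops) for every data row belonging to `pool`."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 7 fields: pool, used, avail, read/write ops, read/write bandwidth.
        if len(fields) == 7 and fields[0] == pool:
            rows.append((parse_value(fields[3]), parse_value(fields[4])))
    return rows

def burst_indices(rows, threshold=1000.0):
    """Indices of samples whose write ops meet or exceed the threshold."""
    return [i for i, (_, write_ops) in enumerate(rows) if write_ops >= threshold]

def mean_reads(rows, bursts):
    """Mean read ops inside and outside the burst samples, as (inside, outside)."""
    burst_set = set(bursts)
    inside = [r for i, (r, _) in enumerate(rows) if i in burst_set]
    outside = [r for i, (r, _) in enumerate(rows) if i not in burst_set]
    average = lambda values: sum(values) / len(values) if values else 0.0
    return average(inside), average(outside)

if __name__ == "__main__":
    rows = parse_iostat(SAMPLE, "data01")
    bursts = burst_indices(rows)
    inside, outside = mean_reads(rows, bursts)
    print(f"burst samples: {bursts}")
    print(f"mean read ops during bursts: {inside:.1f}, otherwise: {outside:.1f}")
```

Feeding it a longer capture (for example the output of zpool iostat data01 1 saved to a file) would show how often the bursts occur and how far reads drop each time.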
Coming from a mostly Windows world, I really like the tools that you get on
OpenSolaris to see this kind of stuff.
-Scott
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@xxxxxxxxxxxxxxx
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss