I manage several systems with nearly a billion objects each (the largest currently holds 800M) and also saw performance degrade over time. These are X4540 systems with an average file size of ~5KB. In our environment the following sped things up significantly:
- Do not use RAID-Z. Use as many mirrored disks as you can. This has been discussed before.
- Nest data in directories as deeply as possible. Although ZFS doesn't really care, client utilities certainly do, and operations in large directories cause needless overhead.
- Make sure you do not fill the filesystem past 80% capacity. As available space decreases, the overhead of allocating new files increases.
- Do not keep snapshots around forever (although we now keep them for months without issue).
- Use ZFS compression (gzip worked best for us).
- Record size did not make a significant change with our data, so we left it at 128K.
- You need lots of memory for a big ARC.
- Do not use the system for anything else other than serving files. Don't put pressure on system memory and let ARC do its thing.
- We now use the F20 cache cards as a huge L2ARC in each server, which makes a large impact once the cache is primed. Caching all that file metadata really helps.
- I found using SSDs over iSCSI as an L2ARC was just as effective, so you don't necessarily need expensive PCIe flash.
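To illustrate the directory-nesting and 80% capacity points above, here is a minimal Python sketch. It is not what we run in production: the hash-based layout, the function names (`nested_path`, `exceeds_limit`, `mount_over_limit`) and the default depth and width are all assumptions made for the example, and it assumes object names contain no path separators.

```python
import hashlib
import shutil
from pathlib import PurePosixPath


def nested_path(name: str, depth: int = 3, width: int = 2) -> PurePosixPath:
    """Map an object name to a nested relative path like 'ab/cd/ef/name'.

    Hypothetical layout: the directory levels are taken from the leading
    hex digits of the SHA-1 of the name, so files spread evenly.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return PurePosixPath(*parts, name)


def exceeds_limit(used: int, total: int, limit: float = 0.80) -> bool:
    """Return True when used/total is above the limit (80% by default)."""
    return used / total > limit


def mount_over_limit(path: str, limit: float = 0.80) -> bool:
    """Check a mounted filesystem against the limit.

    Uses what the OS reports for that mount point, which may differ
    from the pool-level figures shown by `zpool list`.
    """
    usage = shutil.disk_usage(path)
    return exceeds_limit(usage.used, usage.total, limit)
```

With two hex digits per level and three levels there are 256^3 (about 16.7 million) leaf directories, so 800M objects would average roughly 48 files per leaf. Adjust `depth` and `width` to suit your own object counts.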
After these tweaks the systems are blazingly quick, able to do many thousands of ops/second and deliver full gigE line speed even on fully random workloads. Your mileage may vary, but for now I am finally very happy with the systems (and rightfully so, given their performance potential!)
-- Adam Serediuk