StorNext File System Tuning
The Metadata Controller System
The buffer cache I/O size is adjusted using the cachebufsize setting. The
default setting is usually optimal; however, sometimes performance can
be improved by increasing this setting to match the RAID5 stripe size.
Unfortunately, this is often not possible on Linux due to kernel memory
fragmentation. In this case performance may degrade severely because
the full amount of buffer cache cannot be allocated. Using a large
cachebufsize setting also decreases random I/O performance when the
amount of data being read is smaller than the cache buffer size.
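The random I/O penalty described above can be made concrete with a rough calculation. The sketch below assumes a simplified model in which every cache buffer touched by a read is filled in full from disk; the function names and the buffer sizes are illustrative and are not taken from StorNext.

```python
KIB = 1024


def bytes_fetched(offset: int, length: int, cachebufsize: int) -> int:
    """Bytes read from disk if every touched cache buffer is filled whole.

    Simplified model (an assumption, not StorNext's actual behavior).
    """
    if length <= 0:
        return 0
    first = offset // cachebufsize                # index of first buffer touched
    last = (offset + length - 1) // cachebufsize  # index of last buffer touched
    return (last - first + 1) * cachebufsize


def amplification(offset: int, length: int, cachebufsize: int) -> float:
    """Ratio of bytes fetched from disk to bytes the application asked for."""
    return bytes_fetched(offset, length, cachebufsize) / length


if __name__ == "__main__":
    # Illustrative buffer sizes only; a 4K random read against each of them.
    for bufsize in (64 * KIB, 256 * KIB, 1024 * KIB):
        ratio = amplification(0, 4 * KIB, bufsize)
        print(f"cachebufsize {bufsize // KIB}K: {ratio:.0f}x data read per 4K request")
```

Under this model, the cost of a small random read grows linearly with the cache buffer size, which is why a larger cachebufsize can help streaming I/O while hurting small random reads.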
Buffer cache read-ahead can be adjusted with the buffercache_readahead
setting. When the system detects that a file is being read in its entirety,
several buffer cache I/O daemons pre-fetch data from the file in the
background for improved performance. The default setting is optimal in
most scenarios.
The buffer flusher can be tuned with the buffercache_iods setting. A
single flusher daemon is responsible for flushing dirty buffers. Instead of
synchronously writing each buffer, the flusher places the buffers in an
I/O queue that is processed by multiple daemons. By default there are eight
I/O daemons that simultaneously perform disk I/O. Provided that the
system supports SCSI Command Tagged Queuing, this concurrency
dramatically improves throughput.
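The flusher design described above (one producer feeding an I/O queue that several daemons drain) can be sketched in user space as follows. This is only an illustration of the pattern, not the StorNext implementation; the names flush, io_daemon, and write_buffer are hypothetical, and the worker count of eight mirrors the default stated in the text.

```python
import queue
import threading

NUM_IODS = 8  # default number of I/O daemons, per the text above


def flush(dirty_buffers, write_buffer, num_iods=NUM_IODS):
    """Queue every dirty buffer and let num_iods workers write them concurrently.

    write_buffer is a caller-supplied function standing in for the disk write.
    """
    io_queue = queue.Queue()

    def io_daemon():
        while True:
            buf = io_queue.get()
            if buf is None:  # sentinel: no more work for this worker
                break
            write_buffer(buf)

    workers = [threading.Thread(target=io_daemon) for _ in range(num_iods)]
    for worker in workers:
        worker.start()

    # The flusher only enqueues; it never waits on an individual write.
    for buf in dirty_buffers:
        io_queue.put(buf)

    # One sentinel per worker so that every worker exits.
    for _ in workers:
        io_queue.put(None)
    for worker in workers:
        worker.join()


if __name__ == "__main__":
    written = []
    lock = threading.Lock()

    def fake_write(buf):
        with lock:  # protect the shared list across worker threads
            written.append(buf)

    flush(list(range(100)), fake_write)
    print(len(written), "buffers written")
```

Because writes complete in whatever order the workers finish, the storage must be able to accept several outstanding commands at once, which is the role the text assigns to SCSI Command Tagged Queuing.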
RAID systems such as the EMC CX series and Engenio that provide
excellent concurrent small I/O performance can usually benefit from
increasing the buffercache_iods setting.
The auto_dma_read_length and auto_dma_write_length settings determine
the minimum transfer size at which direct DMA I/O is performed instead
of using the buffer cache for well-formed I/O. These settings can be
useful when small DMA I/O sizes perform worse than the buffer cache.
For example, if buffer cache I/O throughput is 200 MB/sec but DMA I/O
at a 512K transfer size achieves only 100 MB/sec, it would be useful to
determine which DMA I/O size matches the buffer cache performance and
adjust auto_dma_read_length and auto_dma_write_length accordingly. The
lmdd utility is handy for taking these measurements.
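Once throughput has been measured at several DMA I/O sizes, choosing the threshold is a simple lookup. The sketch below assumes throughput rises with I/O size and picks the smallest size at which DMA matches the buffer cache. The function name is hypothetical, and apart from the 512K at 100 MB/sec point and the 200 MB/sec buffer cache figure, the measurements are invented for illustration.

```python
KIB = 1024


def pick_auto_dma_length(dma_mb_per_sec, buffer_cache_mb_per_sec):
    """Return the smallest I/O size (bytes) whose DMA throughput is at least
    the buffer cache throughput, or None if no measured size qualifies."""
    for size in sorted(dma_mb_per_sec):
        if dma_mb_per_sec[size] >= buffer_cache_mb_per_sec:
            return size
    return None


if __name__ == "__main__":
    # Hypothetical measurements (I/O size in bytes -> MB/sec). Only the 512K
    # point comes from the example in the text; the rest are made up.
    measured = {
        512 * KIB: 100,
        1024 * KIB: 160,
        2048 * KIB: 210,
        4096 * KIB: 260,
    }
    threshold = pick_auto_dma_length(measured, buffer_cache_mb_per_sec=200)
    print(threshold)  # 2097152, i.e. 2048K with these made-up numbers
```

Reads and writes may cross over at different sizes, so the same selection would be run separately on read and write measurements before setting auto_dma_read_length and auto_dma_write_length.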
The dircachesize option sets the size of the directory information cache on
the client. This cache can dramatically improve the speed of readdir
operations by reducing metadata network message traffic between the
SNFS client and FSM. Increasing this value improves performance in
scenarios where very large directories are not benefiting from the
client directory cache.