4285ch01.fm Draft Document for Review May 4, 2007 11:35 am
22 Linux Performance and Tuning Guidelines
Figure 1-20 Locality of reference
Linux makes use of this principle in many components, such as the page cache, the
file object caches (inode cache, directory entry cache, and so on), and the read-ahead buffer.
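The effect of a cache with read-ahead can be sketched with a toy model. This is an illustration only, not how the kernel implements the page cache: the class name ReadAheadCache, the read_ahead parameter, and the list standing in for a disk are all assumptions made for this example.

```python
class ReadAheadCache:
    """Toy block cache: keeps blocks already read and prefetches the next ones."""

    def __init__(self, disk, read_ahead=1):
        self.disk = disk              # a list standing in for disk blocks (assumption)
        self.read_ahead = read_ahead  # how many extra blocks to fetch on a miss
        self.cache = {}               # block number -> cached data
        self.disk_reads = 0           # number of trips to the "disk"

    def read(self, block):
        if block not in self.cache:
            # Miss: fetch the requested block plus the blocks that follow it.
            last = min(block + self.read_ahead, len(self.disk) - 1)
            for n in range(block, last + 1):
                self.cache[n] = self.disk[n]
            self.disk_reads += 1
        return self.cache[block]


toy_disk = ["data0", "data1", "data2", "data3"]
ra_cache = ReadAheadCache(toy_disk, read_ahead=1)
ra_cache.read(0)   # first access: goes to disk and also prefetches block 1
ra_cache.read(0)   # temporal locality: the same block is served from the cache
ra_cache.read(1)   # spatial locality: the neighboring block was already prefetched
print(ra_cache.disk_reads)   # prints 1
```

Three reads cost a single disk access here: the repeated read benefits from temporal locality and the read of the adjacent block benefits from spatial locality.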
Flushing dirty buffers
When a process reads data from disk, the data is copied into memory. That process and
other processes can then retrieve the same data from the copy cached in memory.
When a process changes the data, it changes the copy in memory first. At
this point, the data on disk and the data in memory are no longer identical, and the data in
memory is referred to as a dirty buffer. A dirty buffer should be synchronized with the data on
disk as soon as possible; otherwise, the changes held only in memory are lost if the system
crashes suddenly.
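The life cycle of a dirty buffer can be sketched with a small write-back model. This is a simplified illustration under stated assumptions, not kernel code: the BufferCache class, its method names, and the dictionary standing in for the disk are hypothetical.

```python
class BufferCache:
    """Toy write-back cache that tracks which blocks are dirty."""

    def __init__(self, disk):
        self.disk = disk     # dict standing in for on-disk blocks (assumption)
        self.memory = {}     # in-memory copies of blocks
        self.dirty = set()   # blocks whose in-memory copy differs from the disk

    def read(self, block):
        # Copy the block into memory on first access, then serve it from memory.
        if block not in self.memory:
            self.memory[block] = self.disk[block]
        return self.memory[block]

    def write(self, block, data):
        # A write changes only the in-memory copy and marks the block dirty.
        self.memory[block] = data
        self.dirty.add(block)

    def flush(self):
        # Synchronize every dirty block to disk, then mark them all clean.
        for block in sorted(self.dirty):
            self.disk[block] = self.memory[block]
        self.dirty.clear()


blocks = {0: "old"}
buf = BufferCache(blocks)
buf.read(0)
buf.write(0, "new")
print(blocks[0], sorted(buf.dirty))   # prints: old [0]
buf.flush()
print(blocks[0], sorted(buf.dirty))   # prints: new []
```

Between the write and the flush, the disk still holds the old contents. A crash in that window would lose the change, which is why dirty buffers should not stay in memory for long.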
Synchronizing a dirty buffer with the disk is called a flush. In the Linux kernel 2.6
implementation, the pdflush kernel thread is responsible for flushing data to disk. A flush
occurs at regular intervals (kupdate) and when the proportion of dirty buffers in memory
exceeds a certain threshold (bdflush). The threshold is configurable in the
/proc/sys/vm/dirty_background_ratio file. For more information, refer to 4.5.1, “Setting
kernel swap and pdflush behavior” on page 110.
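As a rough way to inspect the threshold described above, the following sketch reads /proc/sys/vm/dirty_background_ratio and compares it with the amount of dirty memory reported in /proc/meminfo. The helper names are hypothetical. Expressing dirty memory as a percentage of MemTotal is a simplifying assumption made for this example; the kernel's own accounting differs in detail, so treat the result as an approximation only.

```python
from pathlib import Path

RATIO_FILE = "/proc/sys/vm/dirty_background_ratio"
MEMINFO_FILE = "/proc/meminfo"


def parse_ratio(text):
    """Parse the contents of the ratio file into an integer percentage."""
    return int(text.strip())


def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of field name -> numeric value."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_percent(meminfo):
    """Dirty memory as a percentage of MemTotal (an approximation, see above)."""
    return 100.0 * meminfo["Dirty"] / meminfo["MemTotal"]


if __name__ == "__main__":
    try:
        ratio = parse_ratio(Path(RATIO_FILE).read_text())
        percent = dirty_percent(parse_meminfo(Path(MEMINFO_FILE).read_text()))
    except (OSError, ValueError, KeyError) as exc:
        # Not on Linux, or the files are missing or unreadable.
        print(f"could not read kernel values: {exc}")
    else:
        print(f"dirty_background_ratio = {ratio}%")
        print(f"dirty memory is about {percent:.2f}% of MemTotal")
```

Writing a new value to the file requires root privileges and changes system-wide behavior, so this sketch only reads the current settings.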
Figure 1-20 panels: Temporal locality (first access to Data; second access in a few seconds) and Spatial locality (first access to Data1; second access to Data2 in a few seconds). Each panel shows the levels CPU, Register, Cache, Memory, and Disk.