IBM REDP-4285-00 Page frame allocation, Page frame reclaiming

4285ch01.fm Draft Document for Review May 4, 2007 11:35 am

14 Linux Performance and Tuning Guidelines

Page frame allocation

A page is a group of contiguous linear addresses in physical memory (a page frame) or in virtual
memory. The Linux kernel manages memory in units of pages. A page is usually 4 KB in
size. When a process requests a certain number of pages and enough free pages are available, the
Linux kernel can allocate them to the process immediately. Otherwise, pages have to be taken
from some other process or from the page cache. The kernel knows how many memory pages are
available and where they are located.
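As a rough illustration of the page unit, the following sketch rounds a request in bytes up to whole pages. The helper name pages_needed is hypothetical, and the 4096-byte page size in the example is only the usual value mentioned above; on a real system the size is queried with os.sysconf.

```python
import os

def pages_needed(request_bytes, page_size=None):
    """Return how many whole pages are needed to cover request_bytes."""
    if page_size is None:
        # Ask the running system for its page size (commonly 4096 on x86 Linux).
        page_size = os.sysconf("SC_PAGE_SIZE")
    # Round up: a partially used page still occupies a full page.
    return (request_bytes + page_size - 1) // page_size

print(pages_needed(10000, page_size=4096))  # 3
print(pages_needed(4096, page_size=4096))   # 1
```

A 10000-byte request therefore consumes three 4 KB pages, even though the third page is only partly used.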

Buddy system

The Linux kernel maintains its free pages by using a mechanism called the buddy system. The
buddy system keeps track of free pages and tries to satisfy page allocation requests while
keeping free memory as contiguous as possible. If small pages were scattered without consideration,
memory would become fragmented, and it would be more difficult to allocate a large number of pages
as one contiguous area. This could lead to inefficient memory use and reduced performance.

Figure 1-13 illustrates how the buddy system allocates pages.

Figure 1-13 Buddy System
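The splitting and merging behavior can be sketched with a toy allocator. This is a simplified model for illustration only, not the kernel's implementation: the class BuddyAllocator and its methods are hypothetical names, the pool is assumed to be a power of two pages, and requests are rounded up to a power of two.

```python
class BuddyAllocator:
    """Toy buddy allocator over page frames 0..total_pages-1 (illustrative only)."""

    def __init__(self, total_pages):
        # This sketch assumes the pool size is a power of two.
        assert total_pages > 0 and total_pages & (total_pages - 1) == 0
        self.max_order = total_pages.bit_length() - 1
        # free_lists[k] holds the start frames of free chunks of 2**k pages.
        self.free_lists = [set() for _ in range(self.max_order + 1)]
        self.free_lists[self.max_order].add(0)

    @staticmethod
    def _order(pages):
        # Smallest k such that 2**k >= pages.
        return max(pages - 1, 0).bit_length()

    def alloc(self, pages):
        """Return the start frame of a chunk covering `pages`, or None."""
        order = self._order(pages)
        k = order
        # Find the smallest free chunk that is large enough.
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            return None  # in the kernel, this is where reclaiming would start
        start = min(self.free_lists[k])
        self.free_lists[k].remove(start)
        # Split the chunk in halves until it matches the requested order;
        # the upper half of each split goes back on a free list.
        while k > order:
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        return start

    def free(self, start, pages):
        """Release a chunk and merge it with its buddy while the buddy is free."""
        order = self._order(pages)
        while order < self.max_order:
            buddy = start ^ (1 << order)  # buddy differs only in bit `order`
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)

    def free_counts(self):
        """Number of free chunks per order (order 0 first)."""
        return [len(s) for s in self.free_lists]


buddy = BuddyAllocator(8)
a = buddy.alloc(2)          # splits the 8-page chunk into 4 + 2 + 2
b = buddy.alloc(2)          # takes the remaining free 2-page chunk
print(buddy.free_counts())  # [0, 0, 1, 0]
buddy.free(a, 2)
buddy.free(b, 2)            # buddies merge back into one 8-page chunk
print(buddy.free_counts())  # [0, 0, 0, 1]
```

The XOR in free() is what makes the scheme cheap: the buddy of a chunk is found by arithmetic on its start frame, so merging needs no search.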

When an attempt to allocate pages fails, page reclaiming is activated. Refer to
“Page frame reclaiming” on page 14.

You can find information about the buddy system in /proc/buddyinfo. For details,
refer to “Memory used in a zone” on page 47.
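A minimal sketch of reading this information follows. It assumes the common /proc/buddyinfo layout, in which each line names a node and a zone and the k-th count column is the number of free chunks of 2**k pages. The function names and the SAMPLE text are made up for illustration and are not taken from a real system.

```python
def parse_buddyinfo(text):
    """Map (node, zone) to a list of free-chunk counts, one entry per order."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape (assumption): Node <n>, zone <name> <count> <count> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones

def free_pages(counts):
    """Total free pages: each chunk at order k contributes 2**k pages."""
    return sum(n << order for order, n in enumerate(counts))

# Illustrative sample only; real values differ from system to system.
SAMPLE = (
    "Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3\n"
    "Node 0, zone   Normal     10      5      2      0      0      0      0      0      0      0      0\n"
)

info = parse_buddyinfo(SAMPLE)
print(free_pages(info[(0, "DMA")]))     # 3971
print(free_pages(info[(0, "Normal")]))  # 28
```

On a Linux system the same functions can be fed the contents of /proc/buddyinfo instead of SAMPLE. A zone like Normal in the sample, with free memory only in the low orders, is fragmented: it has 28 free pages but no contiguous chunk larger than 4 pages.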

Page frame reclaiming

If pages are not available when a process requests to map a certain number of pages, the
Linux kernel tries to obtain pages for the new request by releasing pages that were used
earlier and are no longer in use but are still marked as active, selecting them according to
certain principles, and then allocating that memory to the new process. This process is called
page reclaiming. The kswapd kernel thread and the try_to_free_page() kernel function are
responsible for page reclaiming.

While kswapd usually sleeps in the task interruptible state, it is woken by the buddy system
when the free pages in a zone fall below a certain threshold. It then tries to find candidate
pages to take out of the active pages based on the Least Recently Used (LRU) principle.
This is relatively simple: the pages least recently used should be released first. The active list
and the inactive list are used to maintain the candidate pages. kswapd scans part of the
active list and checks how recently the pages were used; pages that were not used recently are
moved to the inactive list. You can see how much memory is considered active and
inactive by using the vmstat -a command. For details, refer to 2.3.2, “vmstat”.
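The scan of the active list can be sketched as a toy model. This is not kernel code: the Page class, the referenced flag (standing in for "how recently the page was used"), and the scan_active function are assumptions made for illustration.

```python
from collections import deque

class Page:
    def __init__(self, name):
        self.name = name
        self.referenced = False  # set when the page is touched (assumed flag)

def scan_active(active, inactive, nr_to_scan):
    """Scan up to nr_to_scan pages from the oldest end of the active list."""
    for _ in range(min(nr_to_scan, len(active))):
        page = active.pop()              # tail of the deque = oldest page
        if page.referenced:
            page.referenced = False      # give it another round as active
            active.appendleft(page)
        else:
            inactive.appendleft(page)    # not used recently: reclaim candidate

# Head (left) is the newest page, tail (right) is the oldest.
active = deque(Page(n) for n in "ABCD")
inactive = deque()
active[2].referenced = True              # page C was used recently

scan_active(active, inactive, nr_to_scan=2)
print([p.name for p in active])    # ['C', 'A', 'B']
print([p.name for p in inactive])  # ['D']
```

Only part of the active list is scanned on each pass, which matches the description above: page D, old and untouched, becomes a candidate for release, while page C is kept because it was used since the last scan.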

kswapd also follows another principle. Pages are used mainly for two purposes: page
cache and process address space. The page cache consists of pages mapped to a file on disk. The
pages that belong to a process address space are used for heap and stack (called anonymous
memory because it's not mapped to any file and has no name) (refer to 1.1.8, “Process

[Figure 1-13 labels: Used; Request for 2 pages; Release 2 pages; 2 pages chunk; 4 pages chunk; 8 pages chunk]