Linux swap discussion

Alan Dayley plug-devel@lists.PLUG.phoenix.az.us
Fri Nov 5 20:21:02 2004


Yesterday's Devel Meeting did not go off quite as planned.  The CD from the
reference book would not work and that killed much of the presentation
content.

To help make up for that, this email provides many of the points I discussed
about how the Linux kernel virtual memory manager uses swap space.  It is not
a complete definition or a conclusion, by any means, but it does make some
interesting points with references to some of the sources.

Note that in some places I make reference to the archives of this very PLUG
email list.  Thanks to all who contributed back when the discussion was live
in the list.

Alan
----------------------------------
Linux Swap Investigation

     * Introduction
     * Investigation Tools
     * Knowledge Points
           o Swapping vs. Paging
           o Swap Page Size
           o Running Without Swap
           o Executables Don't Use Swap
                 + References
           o Swap Area Layout
                 + Reference
           o Preventing Process Killing
     * General References

Introduction
The question was raised about using a flash-based ATA drive as the sole drive
in a Linux system.  Linux, like most modern operating systems, usually
utilizes a file or partition on the local hard drive for virtual memory swap
space. Concern about how the swap space would affect the life expectancy of
the flash media was paramount.

Investigation Tools
These are Linux-based tools that can be used to investigate answers to memory
and swap usage questions. Mention of these tools was found in various places
during the research. I have not fully evaluated them all.

     * free - Command line program that reports the amount of free memory,
including swap. Many options.

     * top - Command line program that reports the amount of free memory,
including swap, along with all running processes and the resources consumed
by each. Many options.

     * sar - Command line program to get individual data points from the same
data provided by the top tool.

     * vmstat - "Virtual Memory Statistics." Command line tool that reports
specifically on virtual memory usage.

     * rmfpm - Command line tool mentioned in the research. Have not used it.

     * xosview - Similar to the top application but X Window based, so it
requires an X server on the target system. Pretty pictures.

     * /proc/meminfo - virtual file for memory information

     * /proc/sys/vm/* - various virtual files reporting or setting virtual
memory information
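
As a minimal sketch of how the /proc/meminfo file listed above can be read
from a script, the following Python parses its "Name: value kB" lines into a
dictionary and derives the swap in use. The function names and the sample
text are illustrative assumptions, not part of any tool named here; SwapTotal
and SwapFree are the field names the kernel reports in /proc/meminfo.

```python
import re

# Matches lines such as "SwapTotal:     524280 kB"; any other line is skipped.
_LINE = re.compile(r"^([^:\s]+):\s+(\d+)(?:\s+kB)?\s*$")


def parse_meminfo(text):
    """Return {field name: integer value} for every "Name: value" line."""
    fields = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def swap_used_kb(fields):
    """Swap in use, in kB, computed as SwapTotal - SwapFree."""
    return fields["SwapTotal"] - fields["SwapFree"]


# Hypothetical sample so the sketch runs anywhere. On a Linux system,
# replace it with open("/proc/meminfo").read().
SAMPLE = """\
MemTotal:         255476 kB
MemFree:           10240 kB
SwapTotal:        524280 kB
SwapFree:         500000 kB
"""

info = parse_meminfo(SAMPLE)
print("swap used: %d kB of %d kB" % (swap_used_kb(info), info["SwapTotal"]))
```

Sampling this value periodically gives a rough picture of how often the
system actually touches its swap area, which is the figure of interest for
flash wear.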

Knowledge Points

Swapping vs. Paging
In the "old" days of computing, computers handled virtual memory by moving
entire processes, including code and data, out of RAM and onto hard drive
space. This was called swapping, and the hard drive space reserved for this
virtual memory use, as a file or partition, was the swap area. Current
processors and operating system software are capable of finer-grained
control. Current "swap" operations are actually "paging": pages of memory,
instead of entire processes, are moved out of RAM to the hard drive. Most
documentation and discussion about paging in current operating systems,
including Linux, still calls it swapping even though memory pages are what is
being swapped.

Swap Page Size
The Linux memory page size is 4096 bytes (0x1000), assuming an x86 based
platform.

linux2.4.20-8/include/asm/page.h, lines 4-7

/* PAGE_SHIFT determines the page size */
#define PAGE_SHIFT   12
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PAGE_MASK   (~(PAGE_SIZE-1))
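
The arithmetic behind those three macros can be worked through in a few lines
of Python. This is only an illustration of the definitions quoted above: the
helper names are hypothetical, and the mask is truncated to 32 bits on the
assumption of a 32-bit x86 unsigned long.

```python
import mmap

PAGE_SHIFT = 12                              # x86 value from page.h
PAGE_SIZE = 1 << PAGE_SHIFT                  # 4096 bytes
PAGE_MASK = ~(PAGE_SIZE - 1) & 0xFFFFFFFF    # 0xFFFFF000 in 32 bits


def page_base(addr):
    """Address of the first byte of the page containing addr."""
    return addr & PAGE_MASK


def page_offset(addr):
    """Offset of addr within its page."""
    return addr & (PAGE_SIZE - 1)


def pages_needed(nbytes):
    """Number of whole pages required to hold nbytes."""
    return (nbytes + PAGE_SIZE - 1) >> PAGE_SHIFT


print(hex(PAGE_SIZE), hex(PAGE_MASK))
print(hex(page_base(0x08049ABC)), hex(page_offset(0x08049ABC)))
# The page size of the machine running this script, for comparison.
print("page size here:", mmap.PAGESIZE)
```

Because swap is managed in these units, even a one-byte change to a page
costs a full 4096-byte write if that page is later swapped out.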

Running Without Swap
There was already much discussion about running the system without a swap
area. Discussions on the Linux kernel email list describe some benefits and
pitfalls of this approach, but it does appear to be a feasible option worth
investigation. (See http://kerneltrap.org/node/view/3202#comment-9161.)

Executables Don't Use Swap
The Linux kernel does not swap executable code to the hard drive. Pages
containing inactive code are simply abandoned. If the code is needed again,
it is retrieved again from the executable file.
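
One way to see which parts of a process fall into this category is to look at
/proc/<pid>/maps, where each line gives an address range, permissions,
offset, device, inode and (for file-backed mappings) a path. The sketch below
is a simplification under that assumption: the function name, the labels and
the sample lines are hypothetical, and it ignores finer points such as shared
mappings.

```python
def classify_mapping(line):
    """Label one /proc/<pid>/maps line by what backs its pages."""
    parts = line.split(None, 5)
    perms = parts[1]
    inode = int(parts[4])
    if inode == 0:
        # No file behind it (heap, stack): evicting it requires swap.
        return "anonymous"
    if "w" in perms:
        # Private writable file mapping: modified pages no longer match
        # the file, so they behave like anonymous memory.
        return "file-backed, writable"
    # Read-only file mapping such as program text: can be dropped and
    # re-read from the file.
    return "file-backed, read-only"


# Hypothetical lines in the maps format. On a Linux system, iterate over
# open("/proc/self/maps") instead.
SAMPLE_MAPS = [
    "08048000-0804c000 r-xp 00000000 03:01 12345      /bin/cat",
    "0804c000-0804d000 rw-p 00003000 03:01 12345      /bin/cat",
    "0804d000-0806e000 rwxp 00000000 00:00 0",
]

for entry in SAMPLE_MAPS:
    print(classify_mapping(entry), "<-", entry.split()[0])
```

Only the anonymous and modified pages depend on swap to leave RAM, so a
process made up mostly of read-only, file-backed pages puts little pressure
on a swap area.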

References

     * Information from a PLUG member (See
http://lists.plug.phoenix.az.us/pipermail/plug-devel/2004-June/001126.html).

     * See try_to_release_page() in linux2.4.20-8/fs/buffer.c and functions
that it calls. In that function, the several tests of page->mapping lead to
execution of the drop_page() function for pages containing executable code
(i.e. text).

     * "Understanding the Linux Virtual Memory Manager," page 172, item 4
           o "If, on the other hand, it is backed by a file or device, the
reference is simply dropped, and the page will be freed as usual when the
count reaches 0."

     * "Running Linux, Third Edition", page 175, first paragraph states:
           o "Also, text pages (those pages containing program code, not data)
are usually read-only, and therefore not written to disk when swapped out.
Those pages are instead freed directly from main memory and read from the
original executable file when they are accessed again."

     * "Running Linux, Third Edition", page 555, item 9
           o Pages with a ->mapping pointer not null (i.e. executable
code) are simply released.

Swap Area Layout

     * The swap area, be it a partition or file, is divided into page-sized
slots on the disk.
     * The first slot is reserved; it holds information about the swap area
and should not be overwritten.
     * Within that first slot:
           o The first 1KiB is reserved for a disk label of the partition,
for use by user-space tools.
           o The remaining space holds information about the swap area,
written when it was created with mkswap.
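
The layout can be made concrete with a short Python sketch that builds a
synthetic first slot in memory and decodes it again. The assumptions here go
beyond what is stated above and should be checked against
include/linux/swap.h: that the mkswap information begins with three 32-bit
fields (version, last_page, nr_badpages) immediately after the 1KiB label
area, that the "SWAPSPACE2" signature occupies the last 10 bytes of the slot,
and that the values are little-endian as on x86. All function names are
hypothetical.

```python
import struct

PAGE_SIZE = 4096        # x86 page size, so also the slot size
LABEL_BYTES = 1024      # first 1KiB of slot 0: disk label area
MAGIC = b"SWAPSPACE2"   # assumed signature in the last 10 bytes of slot 0


def slot_offset(slot):
    """Byte offset of a page-sized slot within the swap area."""
    return slot * PAGE_SIZE


def usable_slots(area_bytes):
    """Whole slots available for pages: all of them except reserved slot 0."""
    return max(area_bytes // PAGE_SIZE - 1, 0)


def parse_header(first_slot):
    """Decode the assumed mkswap fields that follow the label area."""
    version, last_page, nr_badpages = struct.unpack_from(
        "<III", first_slot, LABEL_BYTES)
    return {
        "version": version,
        "last_page": last_page,
        "nr_badpages": nr_badpages,
        "magic_ok": first_slot[PAGE_SIZE - len(MAGIC):PAGE_SIZE] == MAGIC,
    }


# Synthetic header for a 64MiB swap area; no real device is touched.
area_bytes = 64 * 1024 * 1024
header = bytearray(PAGE_SIZE)
struct.pack_into("<III", header, LABEL_BYTES,
                 1, area_bytes // PAGE_SIZE - 1, 0)
header[PAGE_SIZE - len(MAGIC):] = MAGIC

print(usable_slots(area_bytes), slot_offset(1))
print(parse_header(bytes(header)))
```

For a 64MiB area this gives 16384 slots in total, of which 16383 can hold
pages, with the first usable slot starting at byte offset 4096.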

Reference

     * "Understanding the Linux Virtual Memory Manager," Section 11.1
"Describing the Swap Area," pages 180-183.

Preventing Process Killing
(This is a side issue to the main focus of swap partitions on flash media. It
is, however, an interesting point that could be useful to us and the
customer.)

The kernel, when faced with a low or no memory situation, will attempt to
kill one or more processes that are consuming resources. The start of the
kill process is the out_of_memory() function in oom_kill.c. Careful study of
how the kernel determines which process to kill could indicate a method to
shield a process from being killed. This could be important in forcing the
kernel to kill other processes and leave a mission critical process alone
during an effort to recover memory.
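
To give a feel for the kind of scoring involved, here is a rough Python model
of a "badness" heuristic of the sort used by the 2.4 series in oom_kill.c.
It is written from memory and simplified, so every weight and parameter
below is an assumption to be verified against the actual source, not a
statement of what the kernel does: the score starts from the process's
virtual memory size, is reduced for accumulated CPU time and run time,
doubled for niced processes, and quartered for root and again for raw I/O
capability. The time arguments are in coarse, unspecified units.

```python
from math import isqrt


def badness(total_vm_pages, cpu_time, run_time, nice=0,
            is_root=False, has_rawio=False):
    """Hypothetical OOM score: the highest-scoring process is killed first."""
    points = total_vm_pages
    divisor = isqrt(cpu_time)
    if divisor:
        points //= divisor          # long-running CPU work lowers the score
    divisor = isqrt(isqrt(run_time))
    if divisor:
        points //= divisor          # so does a long lifetime, more weakly
    if nice > 0:
        points *= 2                 # niced processes are assumed less important
    if is_root:
        points //= 4                # root-owned processes are favored
    if has_rawio:
        points //= 4                # so are processes with direct hardware access
    return points


# Three hypothetical processes of the same size and age.
ordinary = badness(50000, cpu_time=4, run_time=16)
niced = badness(50000, cpu_time=4, run_time=16, nice=5)
shielded = badness(50000, cpu_time=4, run_time=16,
                   is_root=True, has_rawio=True)
print(ordinary, niced, shielded)
```

If the real heuristic behaves along these lines, running the mission
critical process with elevated privileges and keeping its memory footprint
small would push the kernel toward other victims first.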

General References

     * "Understanding the Linux Virtual Memory Manager," Mel Gorman (This book
is a study of kernel version 2.4.22 with some commentary on 2.6.0-test4.)
       - The web site that originated this book is available online at:
http://www.csn.ul.ie/~mel/projects/vm/guide/html/understand/
       - The PDF version of the book is downloadable now from here:
http://www.phptr.com/promotion/1484?redir=1.  Just scroll down into the list.

     * Linux source code, version 2.4.20-8 as provided in Red Hat Linux 9.

     * Members of the Phoenix Linux User Group "Devel" mailing list.
Discussion thread currently starts with my message archived at:
http://lists.plug.phoenix.az.us/pipermail/plug-devel/2004-June/001123.html

     * Linux Kernel email list discussion on swap archived at:
http://kerneltrap.org/node/view/3202#comment-9161