Swap usage discussion

Ed Skinner plug-devel@lists.PLUG.phoenix.az.us
Fri Jun 18 09:25:02 2004


On Friday 18 June 2004 08:35, Alan Dayley wrote:
> Ted Gould said:
> > I understand what you're saying, but I think that there is an important
> > thing to note here.  I'm pretty sure that Linux won't swap code pages
> > into the swap partition, it 'swaps' them back to the local partition.
> > This is because the program shouldn't have changed, so the one on disk is
> > just as good of a swap.  I'm not 100% sure on that one, but someone told
> > me that once ;)
>
> That would be interesting to confirm.  I had not heard that before.
>
> > Alan, I think probably the moral of the story here is that any model you
> > make of the filesystem is going to have to be amazingly complex.  You
> > might be better gathering some data on a live system, and then using
> > that.  I think that there are lots of people that can tell you what
> > various things do at different levels, but I'm not sure anyone really
> > understands: 'How does RedHat 7.3 running workload X use the swap?'  The
> > filesystem guys are going to say it's a daemon issue and the daemon guys
> > a kernel issue.
>
> I am not sure how accurate the answer has to be.  Obviously you are
> correct in that the real answer could only be found by sampling the
> activities of a working system.  But, it may be good enough to point to a
> few documented and expected behaviors and reach an educated guess on what
> will happen to the media.
>
> On the other hand, my research has found a recent discussion in the kernel
> list and subsequently on slashdot about the merits of running a system
> with no swap.  The opinion of many seems to be that as long as you are
> careful with the amount of RAM you need, and you have enough for your
> needs, a swap area on the hard drive is not needed.  You do risk an
> immediate out of memory crash if a process gets out of hand but many
> people accept that risk and have found they don't need it.
>
> I'll keep reading.  Thanks for the discussion.
>
> Alan

      A couple of comments:
* Code pages can be "dropped" and not swapped out, just abandoned, because the 
content is unchanged and, if needed, Linux can simply go back to the 
executable file to reload any needed pages. Hence, unmodified code pages 
never end up in the so-called "swap partition". Indeed, when a program is 
started, the code area is "memory-mapped" to the executable file. (See "man 
mmap" for a similar user-space operation.) Different versions of Linux then 
pre-load different amounts of that program into memory and then start 
executing at the program's entry point. Any of the remainder of the 
executable code that is needed but that was not initially pre-loaded is then 
allowed to "fault in" to memory on first use. This results in what might be 
considered a bumpy start that may cause problems in systems trying to do 
real-time work but which is usually not noticeable to a user.
* For the Linux boxes I'm using (as a "user" -- I'd call these "workstations" 
and not embedded systems [obviously]), my general rule has always been to 
watch paging activity (some will call this "swapping" but, technically, there 
is a difference--more below) and, if *ANY* paging occurs, then I look to see 
if I can add more RAM to that system. IMHO, *ANY* paging is bad because the 
performance degradation is enormous. Compare RAM access time (measured in 
nanoseconds) to disk access time (measured in milliseconds) and, although the 
actual performance isn't quite that bad, at least one of the elements of the 
equation is a multiplier of 1,000,000 (nano- to milli-). Adding RAM to avoid 
paging is the ideal, although buying more isn't always possible or feasible. 
In practice, I usually 
max-out my machines with RAM and then, if paging occurs, I try to slow down 
in what I'm doing to allow it to quiesce. (The xosview program is in the 
RedHat distributions [and others?] and will let you watch paging and other 
busy-ness indicators in real-time.)
* Most embedded systems have no swap. It's turned off and the system has to 
have enough RAM to do all the assigned work. In earlier kernels (before 2.6, 
I think, but this was also fixed in some 2.4 versions maybe?), the kernel 
*could* overcommit on the available RAM and later discover it didn't have 
enough. This led to the dreaded "Out of memory" problem in which the kernel 
takes out a big gun and starts killing fat (RAM-rich) processes. [See the 
oom_kill() function in the kernel source file of a similar name.] This led 
to a kernel-configuration option to avoid the over-commit which, in turn, 
may lead some to conclude that RAM is under-utilized. It's a trade-off 
between robustness and cost effectiveness.
* Finally, you may read some descriptions that confuse "swap" and "paging" and 
others that use the terms as they were originally intended. Here's the 
difference. "Swapping" was originally done to an entire process. When a 
process was "swapped", all of its code, data, stack, heap, etc. were 
"swapped" out to disk. Once it had been "swapped out" the only thing 
remaining in RAM (or core -- this is going way back) was the process control 
structure and some information about where the code, etc. are located on 
disk. Swapping came first, paging came later. Later systems used paging 
where, instead of swapping out the entire process, only little pieces, pages, 
would be paged out. You probably know all about paging so I won't go any 
farther with this except to add that, in some Linux kernels (and I don't know 
if this is still true in 2.6 or not) and under some *extreme* circumstances, 
Linux will *still* do a full "swap" of a process over and above the "paging" 
that applies to that process. In at least some parts of the kernel sources, 
therefore, the distinction between paging and swapping is relevant, and this 
may be important to you in your research. And, of course, many writers use 
the two terms interchangeably because, for many purposes, they have 
essentially the same effect.
* And if you're still reading at this point, you're probably a "hacker" in the 
original sense of the word (like me). Fun stuff!
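
To make the "memory-mapped" and "fault in" idea from the first comment a bit 
more concrete, here is a minimal user-space sketch in Python. It is only an 
analogy to what the kernel does for program code, not the kernel's own 
mechanism. The function name touch_pages is my own invention for 
illustration. It maps a file read-only and reads one byte from each page; the 
first touch of any page that is not already resident is what causes the 
kernel to fault it in from the file.

```python
import mmap
import os
import tempfile


def touch_pages(path):
    """Map `path` read-only and read one byte from every page.

    Returns the number of pages touched. Nothing is read from the file
    when the mapping is created; a page is brought in only when it is
    first touched (unless it is already in the page cache).
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # mmap cannot map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            touched = 0
            for offset in range(0, size, mmap.PAGESIZE):
                _ = m[offset]  # first access to a page may fault it in
                touched += 1
            return touched


if __name__ == "__main__":
    # Hypothetical demo file: a bit more than three pages of data.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (3 * mmap.PAGESIZE + 1))
        name = tmp.name
    try:
        print("pages touched:", touch_pages(name))
    finally:
        os.unlink(name)
```

Because the mapping is read-only and backed by the file, the kernel is free 
to discard those pages under memory pressure and re-read them later, which is 
the same reasoning that lets it drop code pages instead of writing them to 
swap.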

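The nanoseconds-versus-milliseconds comparison in the second comment can be 
turned into rough arithmetic. The sketch below uses the usual weighted 
average for "effective access time". The two timing constants are round 
numbers I am assuming purely for illustration (about 100 ns for RAM, about 
10 ms for disk), not measurements, and the function names are my own.

```python
RAM_ACCESS_S = 100e-9   # assumed: roughly 100 nanoseconds per RAM access
DISK_ACCESS_S = 10e-3   # assumed: roughly 10 milliseconds per disk access


def effective_access_time(fault_rate, ram_s=RAM_ACCESS_S, disk_s=DISK_ACCESS_S):
    """Average seconds per memory access when a fraction `fault_rate`
    of accesses has to be satisfied from disk instead of RAM."""
    if not 0.0 <= fault_rate <= 1.0:
        raise ValueError("fault_rate must be between 0 and 1")
    return (1.0 - fault_rate) * ram_s + fault_rate * disk_s


def slowdown(fault_rate, ram_s=RAM_ACCESS_S, disk_s=DISK_ACCESS_S):
    """How many times slower the average access is than pure RAM."""
    return effective_access_time(fault_rate, ram_s, disk_s) / ram_s


if __name__ == "__main__":
    for rate in (0.0, 1e-6, 1e-4, 1e-2):
        print(f"fault rate {rate:g}: about {slowdown(rate):.1f}x slower")
```

With these assumed figures, even one access in ten thousand going to disk 
makes the average access roughly eleven times slower, which is why I treat 
any sustained paging as a signal to add RAM.
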
-- 
Ed Skinner, ed@flat5.net, http://www.flat5.net/