Kevin,
This is one of the most informative things I have read in a long
time. I have never really worked in the kernel beyond well-documented,
howto-style bug fixes, but I was able to understand most of what you
were talking about, and I think I could almost do the same now. I
don't know if I would have the confidence to go changing things, but
this is a very interesting read on troubleshooting. Please post more
like this if you have a chance.
Thanks
Bill Warner
> I learned about /proc/sys/fs/file-nr and /proc/sys/fs/file-max
> earlier tonight.
>
> I was sending some mail, cc'd myself and was running fetchmail to
> watch for replies. (Sure enough, the person I wanted to correspond
> with was working late too...) I noticed that even though fetchmail
> said it was fetching some mail (my cc), it wasn't being delivered via
> sendmail (which fetchmail invokes).
>
> Diagnosing this problem simply consisted of reading /var/log/maillog.
> (Apparently I had had a number of such incidents throughout the day.)
> The mail log told me that the system had run out of open files. The
> exact message was:
>
> Too many open files in system
>
> This led me to wonder how one determines how many open files the
> system is allowed to have and how you find out how many it is
> currently using. (Remember that we're *not* talking about
> per-process limits here, but rather system-wide limits.)
>
> I chose to use the kernel sources as my documentation. ;-)
>
> In fs/file_table.c, I noticed the following declaration early
> in the file:
>
> int max_files = NR_FILE;/* tunable */
>
> NR_FILE is defined to be 4096 in include/linux/fs.h. Near the
> limit, a non-root process won't be allowed to open a new file once
> the system gets within NR_RESERVED_FILES (10) of the limit.
>
> In addition to declaring/defining `max_files', file_table.c also
> declares `nr_files' and `nr_free_files'.
>
> `nr_files' is the total number of file structs that the kernel
> has allocated thus far. This value is not allowed to exceed
> `max_files'. `nr_free_files' is self-descriptive: it simply
> tells the kernel how many (of the `nr_files' that are allocated)
> are free.
>
> Searching for `max_files' in the kernel sources turned up an
> occurrence in a data structure in sysctl.c which controls how part of
> the structure of the /proc file system gets set up.
>
> It turns out that /proc/sys/fs/file-max contains the value of
> `max_files'. You may set this value (as root) via, e.g.,
>
> echo 5200 >/proc/sys/fs/file-max
>
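For anyone who would rather script that step than type the echo by hand, here is a minimal Python sketch of the same write. The names `set_file_max` and `get_file_max` and the `path` parameter are made up for illustration; they are not part of any kernel interface. On a real system the write needs root, just like the echo.

```python
def set_file_max(value: int, path: str = "/proc/sys/fs/file-max") -> None:
    """Write a new system-wide file limit, like `echo N >/proc/sys/fs/file-max`."""
    if value <= 0:
        raise ValueError("file-max must be a positive integer")
    with open(path, "w") as fh:
        fh.write(f"{value}\n")


def get_file_max(path: str = "/proc/sys/fs/file-max") -> int:
    """Read the current limit back as an integer."""
    with open(path) as fh:
        return int(fh.read().split()[0])
```

The `path` argument is only there so the functions can be tried against an ordinary scratch file before pointing them at /proc.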
> The file /proc/sys/fs/file-nr is also of interest. It is read-only,
> and contains three values (on one line). These are as follows:
>
> - the total number of file structs allocated by the system,
>   i.e., the value given by `nr_files' in the kernel sources.
> - the number of free file structs, i.e., the value given by
>   `nr_free_files'.
> - the maximum number of file structs which may be allocated
>   by the system. This will be the value of `max_files' in
>   the kernel sources and is also available by examining
>   /proc/sys/fs/file-max.
>
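The three fields listed above can be pulled apart with a few lines of Python. This is only a sketch: the `FileNr` type and the function names are hypothetical, and it assumes the one-line, three-number layout described in this message.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    allocated: int  # nr_files: file structs allocated so far
    free: int       # nr_free_files: allocated structs that are free
    maximum: int    # max_files: same value as /proc/sys/fs/file-max


def parse_file_nr(text: str) -> FileNr:
    """Parse the single line of three numbers found in file-nr."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    return FileNr(*(int(f) for f in fields))


def read_file_nr(path: str = "/proc/sys/fs/file-nr") -> FileNr:
    """Read and parse file-nr from the running system."""
    with open(path) as fh:
        return parse_file_nr(fh.read())
```

Parsing is kept separate from reading so that `parse_file_nr` can be tried on a pasted line without touching /proc.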
> So, for example, after my fiasco with my mail not getting delivered
> (which also entailed me frantically closing down a lot of applications
> that I wasn't using), I had the following situation:
>
> ocotillo:linux$ cat /proc/sys/fs/file-nr
> 4096 1323 4096
> ocotillo:linux$ cat /proc/sys/fs/file-max
> 4096
>
> Note that the first and third numbers being equal indicates that I
> actually hit the limit. (Actually, if these numbers even get close,
> you're probably in trouble - remember that non-root processes will
> fail when the system gets within 10 files of the maximum.)
>
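Going only by the definitions given earlier in this message, the arithmetic on those three numbers can be sketched in Python as below. The function name `summarize` and the dictionary keys are invented for illustration, and this is a reading of the description here, not the kernel's actual check.

```python
def summarize(allocated: int, free: int, maximum: int) -> dict:
    """Derive a few figures from the three numbers in file-nr."""
    return {
        # structs allocated and not on the free list
        "in_use": allocated - free,
        # first and third numbers equal: the limit was reached
        "hit_limit": allocated >= maximum,
        # how many more structs could still be allocated
        "can_still_allocate": max(maximum - allocated, 0),
    }


# The ocotillo numbers quoted above.
print(summarize(4096, 1323, 4096))
```

For the ocotillo line this gives 2773 structs in use, with no room left to allocate more.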
> Then, as root, I did the following:
>
> [root@ocotillo kev]# echo 4608 >/proc/sys/fs/file-max
>
> Running the same two `cat' commands now yields the following
> results:
>
> ocotillo:linux$ cat /proc/sys/fs/file-nr
> 4096 1323 4608
> ocotillo:linux$ cat /proc/sys/fs/file-max
> 4608
>
> Hopefully, this will give me a bit of a cushion for the days to
> come.
>
> I was curious to see how many file structs the kernel was using on
> a different (less heavily used) system. Here's what I saw:
>
> saguaro:kev$ cat /proc/sys/fs/file-nr
> 1969 461 4096
> saguaro:kev$ cat /proc/sys/fs/file-max
> 4096
>
> Comparing the first and third numbers indicates that this other system
> had come nowhere close to hitting the limit of 4096.
>
> I have a hunch that those of you who run big web servers and other
> sorts of things which may need lots of open files already knew how to
> increase the maximum number of open files on your system.
> But even if you did, I hope that my description of the methodology
> that I used to ferret the information out of the kernel was of some
> interest.
>
> Kevin
>
> ________________________________________________
> See http://PLUG.phoenix.az.us/navigator-mail.shtml if your mail doesn't post to the list quickly and you use Netscape to write mail.
>
> Plug-discuss mailing list - Plug-discuss@lists.PLUG.phoenix.az.us
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>