kernel modules and performance
Darrin Chandler
dwchandler at stilyagin.com
Sat Nov 10 13:20:29 MST 2007
On Sat, Nov 10, 2007 at 12:55:44PM -0700, der.hans wrote:
>> Surely we have some kernel folks that could be more authoritative,
>> but since you have had no answer in five hours, I stepped in.
>
> Thanks :).
I was hoping someone with good current knowledge would jump in, too. My
kernel internals knowledge is sketchy, and very out of date.
>> But the design also provided that once a module was loaded, all of
>> its fx() were available without further loading. Possibly that has
>> changed but I suspect not.
>
> Yeah, I think they're still available without further loading,
> provided the module wasn't unloaded, but I'm just wondering if there's
> other overhead such as multiple lookups to get to the fx().
Not sure how they're doing it, but a straightforward approach taken by
several other operating systems for various things involves a jump
table: basically an array of function pointers. There are two ways I
remember offhand of populating such an array. First, you can fill the
array completely when you load the module. Second, you prefill the
array with pointers to a "fix up this entry" function, and each entry
gets patched the first time it's called. In both cases, once a given
element points to the module function, it's not fixed up again. The
post-fixup overhead is a single additional indirect jump (cheap, unless
it's a cache miss).
The kernel may be doing something completely different. Dunno.
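Just to make the lazy-fixup idea concrete, here's a toy sketch in C.
It's not how Linux actually wires up module symbols (the names and
structure here are made up for illustration); it only shows the
"prefill with a fixup stub" trick:

    #include <stdio.h>

    static int real_fn(int x);
    static int fixup_stub(int x);

    /* The slot starts out pointing at the stub.  The first call through
     * the table patches the slot to the real function, so every later
     * call pays only the one indirect jump. */
    static int (*jump_table[1])(int) = { fixup_stub };

    static int real_fn(int x)
    {
        return x * 2;
    }

    static int fixup_stub(int x)
    {
        jump_table[0] = real_fn;   /* fix up this entry, once */
        return real_fn(x);         /* and complete the current call */
    }

    int main(void)
    {
        printf("%d\n", jump_table[0](21));  /* goes through the stub */
        printf("%d\n", jump_table[0](21));  /* straight to real_fn now */
        return 0;
    }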
>>> Does having a whole bunch of loaded modules cause a performance hit
>>> because some module lookup table gets huge or for some other reason?
See above. *IF* that's how it's done, then after fixup it's quick from
then on, no matter how many modules are loaded.
>> Again, I am not sure, but I believe there were a couple of reasons one
>> preferred the compiled-in method rather than loading separate
>> modules. Surely one was the loading overhead, but I have no clue
>> whether one of the reasons was lookup table sizes. Maybe more likely,
>> symbol lookup could become an issue not just of size but of frequency
>> of lookup. Just my guess, though.
Ok, so you're building a web server. If it's not on the network, it's
completely useless to anyone. You're building it with specific
components, including the NIC. In fact, you're building two dozen
identical machines. Why would you NOT compile in the NIC driver?[1]
There's no way it should be slower, and the driver will *always*,
*always* be needed.
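For what it's worth, and taking an Intel e1000 card purely as an
example, that choice is a single line in the kernel config; everything
else about the build stays the same:

    # driver built into the kernel image:
    CONFIG_E1000=y

    # ...or built as a loadable module:
    CONFIG_E1000=m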
[1] If it's important enough to worry about, then it's important enough
to measure performance under some analog of real usage. What I suspect
you'd find is that the differences won't matter much once the system is
up and running and the modules are loaded. By the time those few
machine cycles become critical, you're already at the ragged edge of
being overloaded and should be looking for another solution, such as
better hardware or a cluster.
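Even something crude would do as a first pass: hammer one box with
ApacheBench, once with the driver compiled in and once with it loaded
as a module, and compare requests per second (the hostname below is
made up):

    ab -n 100000 -c 50 http://web01.example.com/

If the two numbers are within the noise of each other, you have your
answer.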
--
Darrin Chandler | Phoenix BSD User Group | MetaBUG
dwchandler at stilyagin.com | http://phxbug.org/ | http://metabug.org/
http://www.stilyagin.com/ | Daemons in the Desert | Global BUG Federation