kernel modules and performance

Sat Nov 10 20:55:45 MST 2007

On Sat, Nov 10, 2007 at 06:39:30PM -0700, Matt Graham wrote:
> FWIW, I did a short real-world test (copying 690M of data multiple times)
> with e1000 compiled in and as a module.  Results:
> 
> module        compiled in
> 0m27.004s     0m27.172s
> 0m26.192s     0m26.089s
> 0m25.957s     0m26.055s
> 0m25.952s     0m25.971s
> 0m25.982s     0m25.963s
> 
> ...if anything, the module was faster.  This doesn't make much sense to me,
> but it could be that the module overhead is lost in the noise of other
> junk that was going on.  Not that there was much--gettys, the KDM greeter,
> sshd, and the tiny script copying and timing things.

Even my lame, off-the-cuff stab at how module calls *might* be
implemented would be cheap enough that the context switch of a system
call dwarfs the cost of module vs. compiled-in. The more I think about
it, the more I believe the Linux kernel guys have got module calls so
close to compiled-in calls that the difference is almost unmeasurable.

If someone were to look long enough, they could probably find a nice
case where you get a ton of module calls out of one system call. Perhaps
something like sendfile() on a huge file.
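
A rough sketch of what timing such a call might look like, assuming
Linux and Python's os.sendfile() wrapper. The function name and sizes
are made up for illustration, and it copies file to file only so that it
runs standalone. That never touches a NIC driver; to exercise one, the
output descriptor would have to be a connected TCP socket to another
machine.

```python
import os
import tempfile
import time


def time_sendfile(size_bytes=8 * 1024 * 1024):
    """Copy size_bytes between two temp files with os.sendfile().

    Returns (bytes_copied, elapsed_seconds). Hypothetical helper, not
    anything from the thread.
    """
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        src.write(b"\0" * size_bytes)
        src.flush()

        offset = 0
        start = time.perf_counter()
        # Ask for everything that is left; the kernel may still return
        # a short count, so loop until the whole file has been sent.
        while offset < size_bytes:
            sent = os.sendfile(dst.fileno(), src.fileno(),
                               offset, size_bytes - offset)
            if sent == 0:
                break
            offset += sent
        elapsed = time.perf_counter() - start

    return offset, elapsed


copied, seconds = time_sendfile()
print(f"copied {copied} bytes in {seconds:.4f}s")
```

Run it several times against each kernel build and compare the spread,
not just a single number.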

> > Why would you NOT compile in the NIC driver?[1]
> > There's no way it should be slower,
> 
> No, but it might be!  I could try some more tests later and see if there's
> any sort of pattern at all or the thing's just messing with my head.

In the numbers above there's as much variance between runs of one type
as there is between the two types. I suspect this is going to be the
case for almost all real-world scenarios.
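
To put a rough number on that, here is a quick check using only the ten
timings quoted above. The variable names are mine, and five runs per
side is far too few for real statistics, so treat it as a sanity check
and nothing more.

```python
import statistics

# Wall-clock seconds from the quoted test, in the order given.
module_runs = [27.004, 26.192, 25.957, 25.952, 25.982]
builtin_runs = [27.172, 26.089, 26.055, 25.971, 25.963]

mean_module = statistics.mean(module_runs)
mean_builtin = statistics.mean(builtin_runs)

# Sample standard deviation: the spread between runs of one type.
sd_module = statistics.stdev(module_runs)
sd_builtin = statistics.stdev(builtin_runs)

# Difference between the two types.
diff = mean_builtin - mean_module

print(f"module:      mean {mean_module:.3f}s  stdev {sd_module:.3f}s")
print(f"compiled in: mean {mean_builtin:.3f}s  stdev {sd_builtin:.3f}s")
print(f"difference of means: {diff:.3f}s")
```

The gap between the two means comes out much smaller than the spread
within either column, which is what "lost in the noise" looks like.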

If you find a good way to measure (something like sendfile() as above)
then go for it! It's always good to know these things. But realize
you'll be diverging from something a given server of yours may encounter
in common operation.

If Hans' question revolves around the right way to build a good, fast
server, then the answer will probably be to find something else to
optimize (buy a NIC with a better driver, or a NIC that doesn't generate
interrupt storms, or buy faster memory, or buy the 15krpm SAS instead of
the 4.5krpm IDE, or find an O(N^2) algo in your app and see if Knuth
knows an O(log N) solution).

> Aye.  You actually might be better off moving to baselayout2 (init scripts
> written in C instead of sh) to save cycles instead of messing with modules
> or the lack thereof.

I think we can agree to agree :)

-- 
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
dwchandler at stilyagin.com   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation