Linux is not reliable enough because ... Was: Linux Kernel Developer job
Alexander Henry
plug-devel@lists.PLUG.phoenix.az.us
Fri May 14 23:41:02 2004
On Thu, 13 May 2004 18:52:21 -0700, Austin Godber <godber@uberhip.com>
wrote:
> Ed Skinner wrote:
>> Linux does not belong in systems where people's lives rely on it.
>> (Nor do most other commercial operating systems, for that matter.)
>
> Hello Ed,
> I am in no position to agree or disagree as I don't know enough about
> the subject. But perhaps you will give us a few examples of what
> abilities reliable systems have and how Linux is lacking in these areas.
> So we could all be better informed.
> I simply don't want you to get away with the "I am an expert and I say
> no!" defense.
>
I know a bit about what he's talking about, as I programmed aviation
systems for Honeywell's small jet autopilots.
As a whole, these systems did one thing: read inputs from global
variables (fed by internal processes, external sensors, or other
processors), process them, and write the results back to globals, at
EXACTLY 100ms intervals. Any sooner or later, and all the formulas inside
the chip would be completely off, working from either the wrong sensor
data or a random set of values from variables fed by other processors.
Also, other processors that used the values on your chip would be fed
wrong values. There was a lot of madness in the software development
process about measuring the processor budget; you wanted each cycle to
finish calculating in under 85ms to make the window and get the outputs
where they needed to be.
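
For anyone who wants to see the shape of that loop, here is a minimal
Python sketch of a fixed-rate frame with a compute budget. The 100ms
period and 85ms budget are the numbers from above; run_frames, step, and
SimClock are names made up for illustration, and real avionics code would
hang off a hardware timer or RTOS scheduler rather than sleep in Python.
It only shows the bookkeeping: schedule against absolute frame boundaries
so the timing doesn't drift, and flag any frame that blows its budget.

```python
import time

PERIOD_S = 0.100  # frame length: outputs are due every 100 ms
BUDGET_S = 0.085  # compute budget: calculations should finish in under 85 ms


def run_frames(step, n_frames, period=PERIOD_S, budget=BUDGET_S,
               clock=time.monotonic, sleep=time.sleep):
    """Call step(i) once per frame; return the frames that blew the budget."""
    overruns = []
    start = clock()
    for i in range(n_frames):
        t0 = clock()
        step(i)  # read globals, compute, write globals
        if clock() - t0 >= budget:
            overruns.append(i)
        # Wait for the absolute end of frame i, so errors don't accumulate.
        remaining = (start + (i + 1) * period) - clock()
        if remaining > 0:
            sleep(remaining)
    return overruns


class SimClock:
    """Simulated clock so the demo runs instantly (hypothetical helper)."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


if __name__ == "__main__":
    sim = SimClock()
    costs = [0.020, 0.090, 0.050]  # made-up per-frame compute times
    late = run_frames(lambda i: sim.sleep(costs[i]), len(costs),
                      clock=sim.now, sleep=sim.sleep)
    print(late)  # frame 1 took 90 ms, over the 85 ms budget
```

Passing the clock and sleep functions in as parameters is just so the
sketch can be exercised with simulated time instead of waiting on the
wall clock.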
The engineering way of thinking is almost 180 degrees from the hacker
way. To a hacker, automation is king and redoing work is evil. To an
engineer, redundancy is king and single points of failure are evil. IOW
more work and less automation is good. You want Murphy's Law to need a
LOT of cooperation from people and physical systems before it can kick
in.
I've heard of some systems with three different processors, programmed by
three deliberately isolated teams, each using a different processor and
compiler; that way all three sets of programmers, or chip burns, or
compilers have to mess up at EXACTLY the same point for a failure to
happen. Upstream, there will be redundant sensors measuring everything,
and the program will 'vote' on each sensor value based on the combined
readings. Downstream, the pilot has buttons that override the chips.
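
The sensor 'vote' could look something like the sketch below. It assumes
a mid-value select (take the median of the redundant readings and flag
any sensor that strays too far from it), which is one common voting
scheme and not necessarily what any particular autopilot used. The names
vote and tolerance and the sample readings are hypothetical.

```python
from statistics import median


def vote(readings, tolerance):
    """Mid-value select over redundant sensor readings.

    Returns (voted_value, suspects), where suspects lists the indices of
    readings that differ from the voted value by more than tolerance.
    """
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings)
                if abs(r - voted) > tolerance]
    return voted, suspects


if __name__ == "__main__":
    # Three made-up altitude readings; the third sensor has failed high.
    print(vote([101.0, 100.5, 250.0], tolerance=5.0))  # (101.0, [2])
```

With three sensors, one wild reading never becomes the voted value,
because the median of three is always one of the other two.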
--
--Alexander