Ok, pardon my dumbness, but let me back up a bit.
Not dumb, I just jumped over a few things, like root!
First, let's look at the switch/router things:
(And I should first ask what flash controller is being used – does
it automatically do ‘read scrub’ or other techniques to get around this
whole issue? If so, you don’t need to do anything…)
They set this up using Yaffs, version 1 I believe. It retires a block after 2 or 3 ECC hits and does nothing to scrub the pages or the block before marking it bad. That is why I wanted to change nandwrite: if I can clear the bad-block byte, the existing erase, read, and write functions would work again, and I could cycle the block a few times to see whether it was a soft error from retention or a hard error. Yaffs was built for short-lifetime hardware and large flash memories. We are past 10 years and have only 64 MBytes, with 60-70% used.
Can you log in (as root, or become root) on the switches/routers?
Yes, in fact there is not really any other user. If you have access to the serial port, you are master of the system.
If so, can you ‘see’ all the files you’d have to change?
Yes, we can list all the files, and in fact have done a complete copy of the whole file system to an external computer using ssh. But that may not help us with the flight hardware, as they won't have the same computer, and there will be resistance to both the risk and the astronaut time of copying the whole file system back.
If so, why not copy ‘in place’. That is, copy the whole thing (or
parts of it) to a new directory, then rename everything to the new
place.
Yes, copy in place is what I would like to do. But you can't copy any open files, so the system files the script needs in order to run, plus any others Linux has open, can't be copied. And those are the very files that are most important to refresh. That is why I wanted busybox on the ramdrive and chroot, so that the critical system files could be refreshed. It would also avoid the embarrassing case of deleting the original copy of mv so that you can move the refreshed copy back into that file name, only to have the system no longer able to find mv!
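One way to avoid the missing-mv window, at least for files that can be copied, is to copy to a temporary name in the same directory and then rename over the original, so the real name never disappears. This is only a sketch: the function name refresh_file and the ".refresh" suffix are made up for illustration, it assumes cp, cmp, and mv are available (busybox has all three), and it assumes that writing the copy lands the data on freshly programmed pages, which I have not verified on Yaffs.

```sh
#!/bin/sh
# refresh_file PATH
# Rewrite PATH's data by copying it to a temporary name in the same
# directory, checking the copy, then renaming it over the original.
# (refresh_file and the ".refresh" suffix are hypothetical names.)
refresh_file() {
    src=$1
    tmp="$src.refresh.$$"            # same directory, so mv is a plain rename
    cp -p "$src" "$tmp" || return 1  # -p keeps mode, owner, and timestamps
    if cmp -s "$src" "$tmp"; then
        mv -f "$tmp" "$src"          # the old name is replaced, never removed
    else
        rm -f "$tmp"                 # copy did not match, leave original alone
        return 1
    fi
}
```

Usage would be something like: refresh_file /bin/busybox. If the copy does not compare equal, the original is left untouched and the function returns non-zero.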
So, for example, let's say you had /bin, /etc, and /var you wanted to
do this to. /bin has 200M, /etc has 10M, and /var has 1G. Your flash
has 2G free.
2G! Don't I wish! The embedded flash chip is only 64 MBytes, and it is usually at least 60% used. Still, we could do smaller chunks, but we would still have the open system files that we couldn't delete and replace, and the bin and sbin programs the script needs can't be left in an in-between state where the file exists only under the temporary name and not under the correct one.
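If we do go in smaller chunks, each chunk would need a check that it fits in the free space before copying. A rough sketch, assuming du and df accept -k and -P as the busybox versions normally do; the function names are mine, and du's size is only an estimate of what the copy will really take on flash.

```sh
#!/bin/sh
# Hypothetical helpers: sizes are in 1K blocks as reported by du and df.
dir_kb()  { du -sk "$1" | awk '{print $1}'; }            # space used by a tree
free_kb() { df -Pk "$1" | awk 'NR==2 {print $4}'; }      # free space where $1 lives

# fits SRC DEST: succeed only if SRC is smaller than the free space at DEST
fits() {
    [ "$(dir_kb "$1")" -lt "$(free_kb "$2")" ]
}
```

For example, "fits /etc / && echo ok" before copying /etc to /etc.new. With 64 MBytes at 60-70% used, I would expect this to fail for the bigger directories, which is the whole problem.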
So, ‘mkdir /bin.new /etc.new /var.new ; cd /bin ; tar cf - . | ( cd
/bin.new && tar xf - ) ; cd /etc ; tar cf - . | ( cd /etc.new
&& tar xf - ) ; cd /var ; tar cf - . | ( cd /var.new &&
tar xf - )’
I don't think even with compression we could fit some of the directories. And will tar overwrite a file currently used by the system? Can it overwrite itself?
That gets you 3 new dirs., /bin.new /etc.new and /var.new. Do your MD5 sums across the world and compare.
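The MD5 comparison could be sketched roughly as below. The function names are invented for illustration, and it assumes find, sort, and md5sum are present on the box (busybox normally provides them). It only compares the contents of regular files, not permissions, owners, or symlinks.

```sh
#!/bin/sh
# sums DIR: print one "md5  ./relative/path" line per regular file, sorted by path
sums() {
    ( cd "$1" && find . -type f -exec md5sum {} \; | sort -k 2 )
}

# same_tree OLD NEW: succeed if both trees hold the same files with the same sums
same_tree() {
    left=$(sums "$1") || return 2
    right=$(sums "$2") || return 2
    [ "$left" = "$right" ]
}
```

For example: same_tree /etc /etc.new && echo "etc matches". Only do the rename once every directory compares clean.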
Now, rename everything: ‘mv /bin /bin.old ; /bin.old/mv /bin.new
/bin ; mv /etc /etc.old ; mv /etc.new /etc ; mv /var /var.old ; mv
/var.new /var’.
Ok, so now new is new and old is old. Delete old as you see fit, but probably after a reboot ;-)
Beware that when you move /bin suddenly you won’t have mv available
any more, so you have to say /bin.old/mv to be able to use it.
Yes, that would work for some. But last time I tried something similar we ended up with an unbootable switch that had to be sent back to the vendor to be reflashed!
No new kernel needed.
I was hoping that we would not need to touch the kernel at all. Just my clumsy way of saying that I think that the cross compiler needs to know something about the kernel functions to do the cross compile. But since I have never succeeded in doing a cross-compile I don't know yet.
On the other hand, if you have no choice but to make a kernel,
‘everything you need’ should be there when you get the kernel source.
Other than special hardware that requires special drivers, just ‘build
it and they will come’. Or rather, build it and it will run. All the
magic of linking and all that is taken care of for you. The only place
where absolute addresses are needed is (as far as I can remember right
now) just hardware.
Good luck, Mr Phelps!
The smoke you see is not the tape destroying itself, but my boss after hearing that I 'bricked' another switch!
Rusty
--
"Creativity is intelligence having fun." — Albert Einstein