moin moin,
I've got a machine experiencing a lot of IO wait.
Power went down at one of our datacenters last week. Since then IO wait
has been over 35%. At first we thought it was due to the 3ware RAID verify
that ran after the crash. That took a few days, then the weekly verify
started. We stopped that one and IO wait stayed high. The array is 8 disks
in RAID 10.
Load avg is also very high, presumably due to the IO wait.
smartctl short tests didn't turn up any issues.
We're not swapping at all.
Disk reads and writes are fairly low.
Network traffic is down, as are the total number of processes and the
number of running processes. No evidence of network errors on the box or
at the switch.
Not much going on in the logs. We've stopped several reporting processes
in order to reduce disk access.
On the positive side, entropy has been staying high :).
Am I right that IO wait is not necessarily disk? Could it be network,
serial, USB, etc.? How do I determine which resource is causing the IO
wait? Is there a way to track it to a specific process?
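
As a rough sketch of one way to narrow this down: on Linux, tasks blocked
in uninterruptible sleep show up with state D in /proc/<pid>/stat, and
they count toward the load average. The Python below is an illustration
only, not something verified on this box. The function names
parse_stat_line and blocked_processes are made up for this example, and it
assumes a standard Linux /proc layout.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for tasks currently in uninterruptible sleep (D)."""
    blocked = []
    if not os.path.isdir(proc_root):
        return blocked
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or its stat file was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    # A single snapshot is noisy; running this repeatedly shows which
    # processes keep turning up in state D.
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Run a few times in a row, the names that keep appearing are the first
candidates to look at. This only reports the process-level state, not which
device each task is waiting on.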
vmstat, iostat, top and lots of other tools have been great at showing
that there's overall IO wait (I've been able to show that almost all
processors have high wait; one was only at 5%), but I haven't yet
determined what is causing it or how.
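
For reference, the per-processor wait numbers can be reproduced straight
from /proc/stat. This is a minimal sketch, assuming the usual Linux layout
where the fifth value on each cpuN line is iowait in clock ticks. The
helper names parse_cpu_times and iowait_percent are hypothetical, chosen
just for this example.

```python
import time


def parse_cpu_times(stat_text):
    """Map 'cpuN' -> (iowait_ticks, total_ticks) from /proc/stat contents."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 user nice system idle iowait ...".
        # Skip the aggregate "cpu" line and everything that is not a CPU.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        if len(values) < 5:
            continue  # too old or malformed to carry an iowait column
        # Sum user..steal only; guest time is already counted in user.
        times[fields[0]] = (values[4], sum(values[:8]))
    return times


def iowait_percent(before, after):
    """Percent of elapsed ticks each CPU spent in iowait between samples."""
    result = {}
    for cpu, (wait_after, total_after) in after.items():
        if cpu not in before:
            continue
        wait_before, total_before = before[cpu]
        total_delta = total_after - total_before
        if total_delta <= 0:
            continue  # no time elapsed for this CPU
        result[cpu] = 100.0 * (wait_after - wait_before) / total_delta
    return result


def read_stat(path="/proc/stat"):
    with open(path) as fh:
        return fh.read()


if __name__ == "__main__":
    first = parse_cpu_times(read_stat())
    time.sleep(1)  # sampling interval; an arbitrary choice
    second = parse_cpu_times(read_stat())
    for cpu, pct in sorted(iowait_percent(first, second).items()):
        print(f"{cpu}: {pct:.1f}% iowait")
```

This gives the same kind of per-CPU figure the tools already report, so it
confirms the symptom rather than identifying the cause.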
The server is running CentOS in case that matters.
ciao,
der.hans
--
# http://www.LuftHans.com/ http://www.LuftHans.com/Classes/
# Hope has two beautiful daughters: Anger and Courage. Anger at the way
# things are, and Courage to struggle to create things as they should be.
# -- St. Augustine