Hi Hans:
On Mon, Jun 27, 2011 at 5:07 PM, der.hans <PLUGd@lufthans.com> wrote:
> moin moin,
>
> I've got a machine experiencing a lot of IO wait.
>
> We had power at a datacenter go down last week. Since then IO wait has
> been over 35%. At first we thought it was due to 3ware RAID verify taking
> place due to the crash. That took a few days, then the weekly verify
> started. We stopped that and IO wait stayed high. 8 disks in a RAID 10.
>
> Load avg is also very high, presumably due to the IO wait.
>
> smartctl short tests didn't turn up any issues.
>
> We're not swapping at all.
>
> Disk read and write are fairly low.
>
> Network traffic is down, as is the total number of processes and the
> number of running processes. No evidence of network errors on the box or
> at the switch.
>
> Not much going on in the logs. We've stopped several reporting processes
> in order to reduce disk access.
>
> On the positive side, entropy has been staying high :).
>
> IO wait is not explicitly disk? It could be network, serial, USB, etc.?
>
> How do I determine what resource is causing the IO wait? Is there a way to
> track to a specific process?
>
> vmstat, iostat, top and lots of other tools have been great at showing
> that there's overall IO wait ( I've been able to show that almost all
> processors have high wait, one was only at 5% ), but I haven't yet
> determined what and how.
Which 3ware firmware version are you running? It matters quite a bit here.
> The server is running CentOS in case that matters.
Please see this Red Hat Bugzilla entry about a known bug in the RHEL
kernel that affects 3ware products:
https://bugzilla.redhat.com/show_bug.cgi?id=121434
It also covers troubleshooting commands to confirm the problem, some
kernel /proc tuning, and fixes that worked for some people.
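
Separately from the commands in that bug report, here is a rough sketch
of one way to tie IO wait to specific processes: list everything sitting
in uninterruptible sleep (state "D" in /proc/<pid>/stat), since those
tasks count toward both IO wait and load average. The function names are
my own, and it assumes Python 3 and a Linux /proc, so treat it as a
starting point rather than a finished tool.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]

def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for processes in uninterruptible sleep (state D)."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or stat was unreadable) mid-scan
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Calling blocked_processes() a few times, a second or so apart, and
noting which names keep showing up should point at the processes that
are actually stuck waiting on IO.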
I don't see your kernel or distro version listed. Is this CentOS on
a 2.4 kernel? CentOS 5.6?
The bug report has many suggestions that will give you a place to start.
For instance, try reducing the queue depth of the 3ware driver:
  can_queue from 254 to 30
  cmd_per_lun from 254 to 4
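
Before changing anything, it helps to record what the box reports now.
The sketch below only reads values. It assumes a 2.6-style kernel where
sysfs exposes can_queue and cmd_per_lun (the kernel's name for the
per-LUN command limit) under /sys/class/scsi_host/hostN, and a per-device
queue_depth under /sys/block/sdX/device. Those paths are my assumption
about your system, the function names are made up for illustration, and
on a 2.4 kernel there is no sysfs at all.

```python
import glob
import os

def read_text(path):
    """Return the stripped contents of a file, or None if it is unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None

def scsi_queue_settings(sys_root="/sys"):
    """Collect host-level and per-disk queue settings as strings."""
    settings = {}
    # Host-wide limits advertised by each SCSI host adapter driver.
    for host in sorted(glob.glob(os.path.join(sys_root, "class/scsi_host/host*"))):
        settings[os.path.basename(host)] = {
            attr: read_text(os.path.join(host, attr))
            for attr in ("can_queue", "cmd_per_lun")
        }
    # Queue depth currently in effect for each sd* block device.
    for dev in sorted(glob.glob(os.path.join(sys_root, "block/sd*"))):
        settings[os.path.basename(dev)] = {
            "queue_depth": read_text(os.path.join(dev, "device/queue_depth"))
        }
    return settings
```

Printing scsi_queue_settings() before and after a tuning change gives
you something concrete to compare against the IO wait numbers.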
There is a good deal of material in that bug report that will give you
ideas for high-performance kernel tuning and troubleshooting.
But first, I would search on your firmware version and kernel
version/distro to find all the known issues before UPGRADING. You
certainly can't expect CURRENT performance without current kernel
sources.
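
To get the kernel and distro strings for that search, something like
the following would do. It assumes Python 3 and that the release file
lives at /etc/redhat-release, as it normally does on CentOS. The
function names are just for illustration. The firmware version is not
covered here, since that has to come from the 3ware tools or BIOS.

```python
import platform

def release_string(path="/etc/redhat-release"):
    """Return the first line of the distro release file, or "unknown"."""
    try:
        with open(path) as handle:
            return handle.readline().strip()
    except OSError:
        return "unknown"

def search_terms(release_path="/etc/redhat-release"):
    """Gather the kernel release and distro release as search keywords."""
    return {
        "kernel": platform.release(),  # same string that `uname -r` prints
        "distro": release_string(release_path),
    }
```

Paste the two values from search_terms() next to "3ware" and your
firmware version when you search.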
> ciao,
>
> der.hans
> --
> # http://www.LuftHans.com/ http://www.LuftHans.com/Classes/
> # Hope has two beautiful daughters: Anger and Courage. Anger at the way
> # things are, and Courage to struggle to create things as they should be.
> # -- St. Augustine
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>
--
(602) 791-8002 Android
(623) 239-3392 Skype
(623) 688-3392 Google Voice
HomeSmartInternational.com