CPU idle% "is zero"


  • #52999
    Mitchell Rawlins
    Participant

    We’re running Cloverleaf 5.7 on RHEL 5.x (I can’t remember which point release). After about 90 days of uptime we get an alert that idle% is zero, which also tends to crash the monitor daemon.

    /opt/quovadx/qdx5.7/integrator/bin/top lists (for example)

    CPU states:  9.6% user,  0.0% nice, 15.5% system,  0.0% idle, 74.9% iowait

    while

    /usr/bin/top lists

    Cpu(s):  1.3%us,  1.7%sy,  0.0%ni, 86.1%id, 10.8%wa,  0.0%hi,  0.0%si,  0.0%st
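To put the two readings side by side, here is a minimal Python sketch that parses both summary lines quoted above into comparable fields. The function names and the label mapping are illustrative assumptions, not part of either version of top.

```python
import re

# Assumed mapping from the short procps labels to the long names
# printed by the bundled top; labels not listed are left unchanged.
ALIASES = {"us": "user", "sy": "system", "ni": "nice", "id": "idle", "wa": "iowait"}


def parse_cpu_summary(line):
    """Return {label: percent} from a top CPU summary line."""
    # Drop the leading "CPU states:" or "Cpu(s):" label.
    _, _, fields = line.partition(":")
    parsed = {}
    # Matches both "9.6% user" and "1.3%us".
    for value, label in re.findall(r"([\d.]+)%\s*([a-z]+)", fields):
        parsed[ALIASES.get(label, label)] = float(value)
    return parsed


cloverleaf = parse_cpu_summary(
    "CPU states:  9.6% user,  0.0% nice, 15.5% system,  0.0% idle, 74.9% iowait"
)
redhat = parse_cpu_summary(
    "Cpu(s):  1.3%us,  1.7%sy,  0.0%ni, 86.1%id, 10.8%wa,  0.0%hi,  0.0%si,  0.0%st"
)

for key in ("user", "system", "idle", "iowait"):
    print(f"{key:>7}: bundled={cloverleaf[key]:5.1f}  procps={redhat[key]:5.1f}")
```

The two samples were not necessarily taken at the same instant, so small differences are expected; the point is only that the idle and iowait figures disagree by far more than sampling noise would explain.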

    Cloverleaf top is version 3.6, and ldd outputs:

    linux-gate.so.1 =>  (0x009fa000)
    libm.so.6 => /lib/libm.so.6 (0x009b1000)
    libtermcap.so.2 => /lib/libtermcap.so.2 (0x00c3d000)
    libelf.so.1 => /usr/lib/libelf.so.1 (0x00b06000)
    libc.so.6 => /lib/libc.so.6 (0x00856000)
    /lib/ld-linux.so.2 (0x00837000)

    RedHat top is procps version 3.2.7, and ldd outputs:

    linux-gate.so.1 =>  (0x00cb2000)
    libproc-3.2.7.so => /lib/libproc-3.2.7.so (0x009b1000)
    libncurses.so.5 => /usr/lib/libncurses.so.5 (0x04e3f000)
    libc.so.6 => /lib/libc.so.6 (0x00856000)
    libdl.so.2 => /lib/libdl.so.2 (0x009dc000)
    /lib/ld-linux.so.2 (0x00837000)

    Has anyone else seen this before, or does anyone have an idea what causes it? Other than rebooting the server, does anybody know what could fix it?

    • #76204
      John Parker
      Participant

      I haven’t seen it on our Cloverleaf systems, but I have seen it on other Linux boxes.

      You have an IO issue going on, and you need to isolate it so you can find a resolution. There are tools to help you diagnose IO issues: iostat, iotop and strace can get you started.

      Also, check your memory utilization and make sure you have enough memory for Linux to cache properly. You could have a memory leak that takes time to become critical.
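As a rough way to check this, the sketch below summarizes /proc/meminfo text in Python. It is only an illustration: the function names and the output keys are made up for this example, and it deliberately sticks to the MemTotal, MemFree, Buffers, Cached, SwapTotal and SwapFree fields because older kernels do not report MemAvailable.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} from the text of /proc/meminfo."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines with a numeric value.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Split memory into program use, page cache and free (all in kB)."""
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    free = info["MemFree"]
    return {
        "total_kb": info["MemTotal"],
        "free_kb": free,
        "cache_kb": cache,
        # Memory held by programs, excluding what the kernel uses as cache.
        "used_kb": info["MemTotal"] - free - cache,
        "swap_used_kb": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
    }


def read_memory_summary(path="/proc/meminfo"):
    """Read the live file; only meaningful on a Linux host."""
    with open(path) as handle:
        return memory_summary(parse_meminfo(handle.read()))
```

Running `read_memory_summary()` periodically and logging the result would show whether `used_kb` or `swap_used_kb` creeps upward over weeks, which is the pattern a slow leak would produce.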

      Also check your disk storage subsystem and verify nothing is happening to decrease the throughput.

      For an overview and explanation of the terms, run man iostat and read the manpage. It explains what iowait and the other top values actually mean.

      Hope this helps.

      John Parker

      Oconee Medical Center

    • #76205
      Mitchell Rawlins
      Participant

      iostat shows the average CPU at 3% iowait and 96% idle.

      iotop fluctuates a lot, with multiple threads vying for the top spot. It doesn’t look like any single process is taking up very much IO.

      Every tool from RedHat agrees there’s no IO problem.  The only tool that thinks anything’s wrong is the version of top that came with Cloverleaf.  We’re currently working under the hypothesis that the OS-bundled tools are more accurate.
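One way to test that hypothesis without relying on either top is to compute the percentages straight from the kernel counters. The Python sketch below takes two snapshots of the aggregate `cpu` line in /proc/stat and turns the differences into percentages. The function names and the 5-second default interval are assumptions made for this example; the field order follows the layout documented in proc(5).

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat; older kernels
# may report fewer columns, in which case the trailing names are skipped.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter snapshots into percentages of elapsed CPU time."""
    deltas = {key: after[key] - before[key] for key in before if key in after}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {key: 100.0 * delta / total for key, delta in deltas.items()}


def sample(interval=5.0, path="/proc/stat"):
    """Read the live counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        first = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_cpu_line(handle.read())
    return cpu_percentages(first, second)
```

Calling `sample()` on the server while the bundled top reports 0.0% idle gives a third reading to compare against. If it agrees with iostat and /usr/bin/top, that supports the view that the problem is in how the bundled top computes its figures rather than in the system itself.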
