Hello,
I have created a custom sensor to monitor CPU usage. It runs fine on CentOS 6.x, but on CentOS Linux release 7.4.1708 (Core) we are seeing errors like: 0 % (Idle Percent) is below the error limit of 5 % in Idle Percent
If I refresh the sensor manually, it works. The code is given below:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
#!/bin/bash
cpuinfo="/usr/bin/top"
xmlresult=`cat <<EOF
<?xml version="1.0" encoding='UTF-8'?>
<prtg>
EOF
`
if [ -f $cpuinfo ]; then
result=`/usr/bin/top -b -n 1 | grep 'Cpu(s)'`
if [[ $result == %Cpu* ]]; then
user=`echo $result | awk '{print $2}'| awk '{print int($0)}'`
system=`echo $result | awk '{print $4}'| awk '{print int($0)}'`
nice=`echo $result | awk '{print $6}'| awk '{print int($0)}'`
idle=`echo $result | awk '{print $8}'| awk '{print int($0)}'`
wait=`echo $result | awk '{print $10}'| awk '{print int($0)}'`
fi
xmlresult=$xmlresult`cat <<EOF
<result>
<channel>User Percent</channel>
<float>1</float>
<unit>Percent</unit>
<value>$user</value>
</result>
EOF
`
xmlresult=$xmlresult`cat <<EOF
<result>
<channel>System Percent</channel>
<float>1</float>
<unit>Percent</unit>
<value>$system</value>
</result>
EOF
`
xmlresult=$xmlresult`cat <<EOF
<result>
<channel>Nice Percent</channel>
<float>1</float>
<unit>Percent</unit>
<value>$nice</value>
</result>
EOF
`
xmlresult=$xmlresult`cat <<EOF
<result>
<channel>Idle Percent</channel>
<float>1</float>
<unit>Percent</unit>
<value>$idle</value>
<LimitMinWarning>10</LimitMinWarning>
<LimitMinError>5</LimitMinError>
<LimitMode>1</LimitMode>
</result>
EOF
`
xmlresult=$xmlresult`cat <<EOF
<result>
<channel>Wait Percent</channel>
<float>1</float>
<unit>Percent</unit>
<value>$wait</value>
</result>
EOF
`
xmlresult=$xmlresult`cat <<EOF
<text>OK</text>
</prtg>
EOF
`
else
xmlresult=$xmlresult`cat <<EOF
<error>1</error>
<text>This sensor is not supported by your system, missing $cpuinfo</text>
</prtg>
EOF
`
fi
echo "$xmlresult"
exit 0
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Please let me know what is wrong.
Article Comments
Hello,
Yes, these limits work perfectly on CentOS 6.x, but I am not sure why we are getting issues on CentOS 7.x servers.
Thanks
Nov, 2017 - Permalink
Hello,
I think I found the issue. It comes from the awk '{print $8}' | awk '{print int($0)}' step:
[root@test~]# /usr/bin/top -b -n 1 | grep 'Cpu(s)'
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
[root@test~]# /usr/bin/top -b -n 1 | grep 'Cpu(s)'| awk '{print $8}'| awk '{print int($0)}'
0
[root@SLYNLUBAOS01 ~]# /usr/bin/top -b -n 1 | grep 'Cpu(s)'
%Cpu(s): 0.0 us, 3.1 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
[root@SLYNLUBAOS01 ~]# /usr/bin/top -b -n 1 | grep 'Cpu(s)'| awk '{print $8}'| awk '{print int($0)}'
96
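When idle reaches 100.0, top prints it with no space after the preceding comma (ni,100.0), so field 8 becomes the text id, instead of the number, and int() converts that text to 0. That is why we are getting the alerts.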
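One possible way around this is to stop relying on fixed field positions and look each value up by its label instead. The sketch below is only an illustration, not a tested replacement for the sensor: cpu_field is a hypothetical helper name, and it assumes the %Cpu(s) line format shown in the two samples above. It turns every comma into a space so that ni,100.0 splits into two fields, then prints the integer part of the number that precedes the requested label.

```shell
#!/bin/bash
# cpu_field is a hypothetical helper, not part of the original sensor.
# Usage: cpu_field "<top Cpu(s) line>" <label>   (label: us, sy, ni, id, wa)
cpu_field() {
    printf '%s\n' "$1" | awk -v label="$2" '{
        gsub(/,/, " ")                 # "ni,100.0" becomes "ni 100.0"
        for (i = 2; i <= NF; i++)      # find the label, take the field before it
            if ($i == label) { print int($(i - 1)); exit }
    }'
}

# The two sample lines from this thread
full_idle='%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st'
busy='%Cpu(s): 0.0 us, 3.1 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st'

cpu_field "$full_idle" id   # prints 100
cpu_field "$busy" id        # prints 96
```

In the sensor script, a line such as idle=`echo $result | awk '{print $8}'| awk '{print int($0)}'` could then become idle=$(cpu_field "$result" id), and likewise for the other channels.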
Nov, 2017 - Permalink
You have these two entries for the Idle Percentage.
<LimitMinWarning>10</LimitMinWarning> <LimitMinError>5</LimitMinError>
They are establishing lower thresholds. So when it hits 0, it's triggering the Error threshold.
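To make the effect of those two limits concrete, here is a rough sketch of the comparison they describe. It is not PRTG's own code: classify_idle is a hypothetical name, the limits 10 and 5 are copied from the script, and the strict "below" comparison is an assumption based on the wording of the error message ("is below the error limit of 5 %").

```shell
#!/bin/bash
# classify_idle is a hypothetical helper that mimics lower-limit checks.
# Assumption: a value strictly below a limit triggers that state.
classify_idle() {
    local value=$1
    local warn_limit=10   # from <LimitMinWarning>10</LimitMinWarning>
    local error_limit=5   # from <LimitMinError>5</LimitMinError>
    if [ "$value" -lt "$error_limit" ]; then
        echo "error"
    elif [ "$value" -lt "$warn_limit" ]; then
        echo "warning"
    else
        echo "ok"
    fi
}

classify_idle 0     # prints error
classify_idle 7     # prints warning
classify_idle 96    # prints ok
```

Under these assumptions, an idle value reported as 0 always lands in the error state, regardless of how idle the machine actually is.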
Nov, 2017 - Permalink