I have a custom sensor returning a couple of channels.
For clarity, each channel has the following structure:
<result>
<channel>server0</channel>
<value>100</value>
<Unit>Percent</Unit>
<LimitMinWarning>80</LimitMinWarning>
<LimitMinError>60</LimitMinError>
<LimitWarningMsg>Process is in maintenance or starting.</LimitWarningMsg>
<LimitErrorMsg>Process is stopped or in an unknown state</LimitErrorMsg>
<LimitMode>1</LimitMode>
</result>
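For readers who want to reproduce this, here is a minimal sketch of how a script could emit a channel of this shape. It assumes Python with only the standard library; the helper names `build_result` and `build_output` are hypothetical and not part of PRTG, and the `<prtg>` wrapper is taken from the full outputs quoted later in this thread.

```python
import xml.etree.ElementTree as ET


def build_result(channel, value, limit_min_warning=80, limit_min_error=60):
    """Build one <result> element with the same tags as the channel above."""
    result = ET.Element("result")
    fields = [
        ("channel", channel),
        ("value", value),
        ("Unit", "Percent"),
        ("LimitMinWarning", limit_min_warning),
        ("LimitMinError", limit_min_error),
        ("LimitWarningMsg", "Process is in maintenance or starting."),
        ("LimitErrorMsg", "Process is stopped or in an unknown state"),
        ("LimitMode", 1),
    ]
    for tag, text in fields:
        ET.SubElement(result, tag).text = str(text)
    return result


def build_output(values):
    """Wrap one <result> per channel in a <prtg> root and return it as text."""
    root = ET.Element("prtg")
    for channel, value in values.items():
        root.append(build_result(channel, value))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_output({"server0": 100}))
```

Only channels passed to `build_output` appear in the output, so a process that disappears simply stops being reported, which matches the "no data" situation described in this thread.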
At some point, one of the channels returned a value of 0, which put the whole sensor into a red status (as expected). The message was the following: "(bootstrap) is below the error limit of 60% in bootstrap. Process is stopped or in an unknown state."
The issue starts when routine changes in the monitored environment cause some channels to stop receiving data from a certain point onward. This is expected behavior, and no data for a couple of channels is fine. However, we then saw another channel receive a value of 0 while the error message stayed identical: it still named "bootstrap" instead of the channel that is actually down.
All the channels have a static structure, and the thresholds are not changed at any point. Restarting the sensor leads to no improvement. Recreating the sensor works correctly, but it is not a solution: given the size of our client's monitored environment, it would mean a lot of unnecessary manual work. Moreover, the loss of historical data is a deal-breaker from their point of view.
Any clarification would be of great help.
Article Comments
Hello, I know that thresholds cannot be changed after they are initially set. To summarise the issue at hand: I have a channel that receives no data, yet the sensor's error message states that this very channel is below the threshold, when it is actually another channel whose value is below the threshold. Neither channel is set as the primary channel of the sensor.
Dec, 2017 - Permalink
Hi there,
Unfortunately, we are unable to reproduce this. In a test with a Custom Script Sensor with multiple channels and limits, when one of the channels gets no data, the value 0 is used. But we get the message for both channels whose values are below the limits.
What version of PRTG are you currently using?
Best regards.
Dec, 2017 - Permalink
Hi, the version is 17.2.31.2153. The channel simply has "no data", not 0.
Just to make sure that I have described all the details correctly:
After creating the sensor, the output is the following (the rest of the channels are removed for clarity):
Day 1:
<?xml version="1.0" encoding="UTF-8" ?>
<prtg>
<result>
<channel>gwrd</channel>
<value>0</value>
<Unit>Percent</Unit>
<LimitMinWarning>80</LimitMinWarning>
<LimitWarningMsg>Process has a yellow status code</LimitWarningMsg>
<LimitErrorMsg>Process has a status of gray/red</LimitErrorMsg>
<LimitMode>1</LimitMode>
<ShowTable>0</ShowTable>
</result>
<result>
<channel>icman</channel>
<value>100</value>
<Unit>Percent</Unit>
<LimitMinWarning>80</LimitMinWarning>
<LimitWarningMsg>Process has a yellow status code</LimitWarningMsg>
<LimitErrorMsg>Process has a status of gray/red</LimitErrorMsg>
<LimitMode>1</LimitMode>
<ShowTable>0</ShowTable>
</result>
</prtg>
The sensor is in an error state with the message: [...] (gwrd) is below the error limit of 60% in gwrd. Process is stopped or in an unknown state.
This behaviour is correct so far.
Day 2:
<?xml version="1.0" encoding="UTF-8" ?>
<prtg>
<result>
<channel>icman</channel>
<value>0</value>
<Unit>Percent</Unit>
<LimitMinWarning>80</LimitMinWarning>
<LimitWarningMsg>Process has a yellow status code</LimitWarningMsg>
<LimitErrorMsg>Process has a status of gray/red</LimitErrorMsg>
<LimitMode>1</LimitMode>
<ShowTable>0</ShowTable>
</result>
</prtg>
The sensor no longer returns any data for gwrd, as the process no longer exists on the monitored server. (The gwrd channel still exists but shows "no data", which is correct.)
The remaining channel (icman) has a value below the minimum error threshold, so the sensor should be in an error state.
However, the error message is still: [...] (gwrd) is below the error limit of 60% in gwrd. Process is stopped or in an unknown state.
To summarize: on day 2, the error message should not mention gwrd, as the channel is orphaned.
Dec, 2017 - Permalink
Hi there,
That's where the confusion came from. I assumed that you were missing the sensor message of another channel due to the missing channel.
However, this is intended as PRTG uses the last value of the sensor when the channel does not offer any new data. This is why the message of the missing channel is still appearing.
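To illustrate the effect, here is a toy model in Python of a sensor that keeps each channel's last value when a scan omits that channel. This is only a sketch of the behavior described in this thread, not PRTG's actual implementation; the names `update` and `channels_in_error` are hypothetical, and the limit of 60 is taken from the error message quoted above.

```python
ERROR_LIMIT = 60  # lower error limit, as quoted in the sensor message


def update(last_values, scan):
    """Merge a new scan into the retained values.

    Channels missing from the scan keep their previous value.
    """
    merged = dict(last_values)
    merged.update(scan)
    return merged


def channels_in_error(values, limit=ERROR_LIMIT):
    """Return the names of channels whose retained value is below the limit."""
    return sorted(name for name, value in values.items() if value < limit)


# Day 1: both channels report; gwrd is down.
day1 = update({}, {"gwrd": 0, "icman": 100})
# Day 2: gwrd is no longer reported; icman is down.
day2 = update(day1, {"icman": 0})

print(channels_in_error(day1))  # ['gwrd']
print(channels_in_error(day2))  # ['gwrd', 'icman']
```

In this model gwrd stays below the limit on day 2 because its last value of 0 is retained, which is why a message about the missing channel can persist. Which of the failing channels ends up in the sensor message is up to PRTG and is not modeled here.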
Best regards.
Dec, 2017 - Permalink
Hi there,
Unfortunately, it is not possible to change the thresholds that are set for a Custom Sensor. We have many reasons for this, but the main one is that allowing it could cause strange errors when the values change randomly. This is why we decided to keep them static after the initial scan.
Best regards.
Dec, 2017 - Permalink