I want to switch from WMI sensors to SNMP for monitoring/recording metrics where a shorter refresh interval is useful for us. Specifically CPU and Network I/O, to start. The currently recommended 60 second interval for WMI does not give us the resolution we need to troubleshoot some issues, so I feel SNMP is the way to go.
I have everything configured and working on our Windows servers with SNMP, but I noticed some strange behaviour in the recorded (historical) data. My SNMP sensor refresh interval is set to 10 seconds, but the sensors don't seem to receive updated data from Windows on every scan. For the CPU sensor, the values only change every 2-3 scans; for the network I/O sensor, the values fall to 0 every 2-3 scans (even though there is constant network use on the monitored interface).
This can be better described by pictures...
CPU: [graph image]

Network: [graph image]
I feel this is more of a Windows SNMP issue than a PRTG one, since I do not see this problem when monitoring non-Windows systems via SNMP, even with a 10 second scanning interval. It seems Windows SNMP counters do not refresh on each request. There must be some other mechanism that refreshes them, and I don't know how it can be modified or tweaked. I haven't had any luck finding information about SNMP counter refresh times in Windows, so I thought I would ask here!
Thanks,
-Alex
Article Comments
Hello Alex,
we appreciate your contact.
You're absolutely correct in your findings:
All operating systems have some delay when updating their SNMP counters. This is understandable: SNMP is designed to be lightweight, and updating counters in real time would certainly have a larger performance impact than updating them every 5, 10 or even 15 seconds. Only Microsoft would be able to confirm the refresh interval, or whether there is a tweak/hack to change it, but we can certainly observe (in the same way as you) that the delay exists.
You can confirm this by querying a known OID from the list below with our SNMP Tester at an interval of 1 or 2 seconds.
1.3.6.1.2.1.25.3.3.1.2 | hrProcessorLoad |
1.3.6.1.2.1.2.2.1.10 | ifInOctets |
If you perform a Walk from the OIDs above, you will get the indexes of your interfaces or CPUs and can then query them directly using the Custom OID test. This will confirm that they don't update in real time. (Oddly enough, the exception is Uptime, which appears to update in "real time".)
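To read the refresh interval off such a test, you could log each poll with its timestamp and look at when the value actually changes. The following is only a minimal sketch in Python: the sample readings are made up for illustration (they are not from a real Windows host), and the function names are hypothetical helpers, not part of PRTG or the SNMP Tester.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, int]  # (poll time in seconds, value returned for the OID)


def change_times(samples: List[Sample]) -> List[float]:
    """Return the poll times at which the value differs from the previous poll."""
    times = []
    for (_, prev), (t, cur) in zip(samples, samples[1:]):
        if cur != prev:
            times.append(t)
    return times


def estimate_refresh_interval(samples: List[Sample]) -> Optional[float]:
    """Average time between observed value changes, or None if fewer than two changes."""
    times = change_times(samples)
    if len(times) < 2:
        return None
    gaps = [b - a for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical ifInOctets readings, polled once per second (illustrative data only).
samples = [
    (0, 1000), (1, 1000), (2, 1000),
    (3, 4000), (4, 4000), (5, 4000),
    (6, 7000), (7, 7000), (8, 7000),
    (9, 10000),
]

print(change_times(samples))               # [3, 6, 9]
print(estimate_refresh_interval(samples))  # 3.0
```

This only works as an estimate for a counter that is known to be increasing steadily (for example ifInOctets on a busy interface). A gauge such as hrProcessorLoad can legitimately return the same value twice in a row, so unchanged readings alone do not prove a stale value.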
You may also experience similar behavior when polling Cisco Devices with Silicon Switching enabled:
What this means for PRTG (and your monitoring)
We officially don't support scanning intervals below 10 seconds and don't recommend intervals below 30 seconds. Staying at or above 30 seconds avoids situations like this one, reduces the storage used for monitoring data, and reduces the load on PRTG and on the monitored systems.
With short scanning intervals, however, this behavior does show up, and the results differ depending on the sensor type you're using:
Delta Sensor
A Delta Sensor (SNMP Traffic, for example) calculates the difference between two readings to give you the speed over a given time (a difference of 6000 bytes in 30 seconds means 200 bytes per second, and so on). This means that if a counter hasn't been updated between two scans, the current and previous readings are identical and the calculated difference is 0, which can lead to ugly graphs with a lot of zeros (valleys) in them. The total volume measured by the sensor at the end of the day will still be correct, and the averaged graphs should also look better.
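The arithmetic can be illustrated with a short sketch. It assumes a steady 200 bytes per second of traffic, a scan every 10 seconds, and a counter that the device only refreshes every 30 seconds; these numbers and the `delta_rates` helper are illustrative assumptions, not PRTG's actual implementation, and counter wrap-around is ignored.

```python
from typing import List


def delta_rates(readings: List[int], interval_s: float) -> List[float]:
    """Bytes per second between each pair of consecutive counter readings."""
    return [(cur - prev) / interval_s for prev, cur in zip(readings, readings[1:])]


SCAN_INTERVAL_S = 10

# Counter readings at t = 0, 10, ..., 60 s. The device only refreshes the
# counter every 30 s, so two out of three scans return a stale value.
readings = [0, 0, 0, 6000, 6000, 6000, 12000]

rates = delta_rates(readings, SCAN_INTERVAL_S)
print(rates)  # [0.0, 0.0, 600.0, 0.0, 0.0, 600.0]

# The total volume is still correct: it equals last reading minus first reading.
total_bytes = sum(rate * SCAN_INTERVAL_S for rate in rates)
print(total_bytes)  # 12000.0

# Averaged over the whole period, the true rate reappears.
average_rate = total_bytes / (SCAN_INTERVAL_S * len(rates))
print(average_rate)  # 200.0
```

The individual samples alternate between zeros and spikes of three times the real rate, while the average over the full 60 seconds comes back to the real 200 bytes per second.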
Gauge Sensor
A Gauge Sensor (like the CPU sensor above) behaves differently, as PRTG always queries and records the current value. You won't notice the latency as much in this case, as the only effect is a "squarish" graph instead of a smooth one.
Best Regards,
Luciano Lingnau [Paessler Support]
Dec, 2015