Hi

I have a remote probe installed on a customer site, which monitors itself and 2 other servers.

Every now and then I get the same problem, with every sensor showing the same error: WMI Connection could not be established, RPC Server unavailable, PE015.

This only ever comes up on the non-probe servers. Restarting the Probe service clears the errors.

The question is, why is this happening? Is it a resource problem? Is there a tweak I can make?

Granted, the servers in question (ML110/115) aren't exactly powerful, but they are essentially doing nothing apart from hosting a couple of shares and running a low-impact PBX.

Is there anything I can do to stop it from happening?


Article Comments

Hello,

The first thing you should do is check the scanning intervals, especially on the WMI Vital System Data sensors, and set them to at least 5 minutes. You could also remove or pause any WMI sensors that are not really necessary, and only add them back in specific debugging situations. These two actions should already help. Please also make sure that the WMI sensors inherit the scanning interval set at a higher level. You can check this in the "Sensors"->"Cross Reference" tables, which can be sorted by the Interval column.

The tricky thing with WMI "load" is that you usually do not see it in the CPU load of the Probe process, because other bottlenecks are involved in WMI monitoring. Where exactly these bottlenecks are, we sadly do not know, but we suspect them in the area of RPC calls towards domain controllers (etc.), because each WMI request needs to be authenticated with the domain controller first.

You could also consider installing additional Remote Probes on some of the problematic hosts, or monitoring them via SNMP with something like SNMP Informant (which would have to be installed on each target, though).
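To make the interval check concrete, here is a minimal sketch in Python that flags WMI sensors scanning more often than the 5 minutes suggested above. It does not talk to PRTG at all: the `Sensor` class, the `kind` labels and the sample data are all made up for illustration, and you would fill the list in by hand from the Interval column of the Cross Reference table.

```python
from dataclasses import dataclass

MIN_INTERVAL_SECONDS = 5 * 60  # the "at least 5 minutes" suggested in the reply


@dataclass
class Sensor:
    name: str
    kind: str              # e.g. "wmi" or "ping"; labels invented for this sketch
    interval_seconds: int  # scanning interval as shown in the Interval column


def wmi_sensors_scanning_too_often(sensors, minimum=MIN_INTERVAL_SECONDS):
    """Return WMI sensors whose interval is below the minimum, shortest first."""
    too_fast = [
        s for s in sensors
        if s.kind == "wmi" and s.interval_seconds < minimum
    ]
    return sorted(too_fast, key=lambda s: s.interval_seconds)


# Hypothetical sample data, typed in by hand; not real PRTG output.
sensors = [
    Sensor("WMI Vital System Data", "wmi", 60),
    Sensor("WMI Free Disk Space", "wmi", 300),
    Sensor("WMI Memory", "wmi", 30),
    Sensor("Ping", "ping", 30),
]

for s in wmi_sensors_scanning_too_often(sensors):
    print(f"{s.name}: {s.interval_seconds}s, raise to at least {MIN_INTERVAL_SECONDS}s")
```

Non-WMI sensors are left out on purpose, since the advice above only concerns WMI load. Any sensor the sketch lists is a candidate for a longer interval, or for inheriting the interval from its parent device or group.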

Best regards


Jun, 2014