What are the most common errors when monitoring WMI and what can I do about them?


The Most Common WMI Errors


Note: This is only a brief overview. We cannot guarantee that it contains the solution to your specific problem, but it is a start, and we update this article regularly.


WMI Overload

Probe Health Sensor Showing a WMI Delay

The delay value shows how many WMI requests had to be globally postponed from their intended scanning intervals. This indicates an overload problem. A delay of 0% is the most favorable value. If you keep seeing a higher value over a significant period of time, reduce the total number of WMI requests on this probe by increasing the scanning intervals of the sensors. Alternatively, you can distribute the sensors among one or more additional remote probes.

Note: On Windows 7, you can run about 10,000 WMI sensors with a 1-minute scanning interval under optimal conditions (such as exclusively running the core and the target systems under Windows 2008 R2 and being located within the same LAN segment). The actual performance can be significantly lower depending on the network topology and the WMI health of the target systems. We have seen configurations that could not go beyond 500 sensors (and even fewer).
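
To get a feeling for how the scanning interval affects the load, the following minimal Python sketch estimates the average request rate on a probe. It assumes that each sensor issues exactly one WMI request per scanning interval, which is a simplification and not a statement about how PRTG schedules requests internally. The function name is hypothetical.

```python
def wmi_requests_per_second(sensor_count: int, interval_seconds: float) -> float:
    """Average WMI requests per second on one probe.

    Assumption (not from PRTG documentation): one request per sensor
    per scanning interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return sensor_count / interval_seconds


# 10,000 sensors at a 1-minute interval versus a 5-minute interval
print(round(wmi_requests_per_second(10_000, 60), 1))
print(round(wmi_requests_per_second(10_000, 300), 1))
```

Under this assumption, increasing the interval from 1 minute to 5 minutes reduces the average request rate to one fifth, which is why longer scanning intervals are the first thing to try when the delay value stays above 0%.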

Tip: The bottlenecks for WMI monitoring are these two services:

  • WmiPrvSE.exe
  • lsass.exe

These services do not support the usage of multiple processors. If you encounter WMI delays and one or both of these services are running at maximum load (100% divided by the number of processors) on the PRTG probe system and/or on one of the target computers, you know where to reduce the number of WMI monitoring requests.

WMI Timeouts

WMI timeouts can have several causes. For an overview, see My WMI sensors show errors with a PE code. What does that mean?

Errors in the 800xxxxx Range

The underlying DCOM or WMI system of Windows often throws an error code that starts with "800". As there are lots of different error codes in this context, we recommend that you extend your search to Google or Bing with the specific code you encounter.
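
Before you search for a code, it can help to split it into its parts. The following Python sketch decodes a code according to the standard Windows HRESULT layout (failure bit, 11-bit facility, 16-bit code). The function name is hypothetical and the sketch is only an aid for searching, not part of PRTG.

```python
def decode_hresult(code: str) -> dict:
    """Split a hexadecimal HRESULT such as '800706BA' into its fields."""
    value = int(code, 16)
    return {
        "failure": bool((value >> 31) & 1),   # severity bit: 1 means error
        "facility": (value >> 16) & 0x7FF,    # 7 = Win32, 4 = interface-specific (ITF)
        "code": value & 0xFFFF,               # for facility 7: a plain Win32 error number
    }


print(decode_hresult("800706BA"))  # {'failure': True, 'facility': 7, 'code': 1722}
print(decode_hresult("80041003"))  # {'failure': True, 'facility': 4, 'code': 4099}
```

For codes with facility 7, the code field is an ordinary Win32 error number that you can look up on a Windows command line, for example with net helpmsg 1722.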

WMI Connection-Based Errors - "Connection could not be established"

Very often, PRTG is blocked from monitoring WMI counters. As these errors are on a very low communication level, no WMI sensor in the device will run and all of them will show one of the following errors:

Port Error 135: RPC Server Not Accessible

If you see this precise error message, the port that all DCOM communication protocols are routed over is blocked. This is very likely the case because the RPC server on the monitored machine:

  • is blocked by a local firewall.
  • is blocked by domain policies.
  • is not running.
  • is running on a different port than specified in the PRTG settings for this computer.

A possible solution is to add the device name and IP address to the hosts file on the probe system and to use the IP address in the device settings.
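
To check from the probe system whether TCP port 135 on the target is reachable at all, you can use a simple connection test. The following Python sketch is one way to do this; the function name and the example host name are hypothetical. A successful connection only shows that the port is open. It does not prove that WMI works, because DCOM typically uses additional dynamic ports after the initial connection.

```python
import socket


def tcp_port_open(host: str, port: int = 135, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical host name):
# tcp_port_open("target-server.example.local")
```

If the function returns False, check the local firewall, the domain policies, and the state of the RPC service on the target as listed above.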

800706BA - RPC Server Is Unavailable

This is quite an ambiguous message, as there can be different causes for it, among which are:

  • The RPC service is not running on the monitored machine.
  • Sometimes, this error occurs when using an IP address to connect to a device. Try to use the hostname or FQDN (Fully Qualified Domain Name) instead. In PRTG, enter this information in the device's settings, section Credentials for Windows Systems.
  • If the server on which PRTG is installed is part of a domain, whereas you are trying to monitor a target machine that is not part of the domain, see How can I monitor WMI sensors if the target machine is not part of a domain? for more information.
  • The monitored machine is not able to connect to the Primary Domain Controller and is thus unable to verify the Windows credentials provided for the WMI sensor. Check your Domain Controller settings.
  • Either the computer running the PRTG probe or the monitored machine is provided with wrong DNS entries. This might be the case when the machine has opened one or more VPN connections in addition to the normal network connection. You can test this with a simple ping to the machine in question. Do you see the correct IP address?
  • Allow Remote Administration Exception: Enabling this option is a fix that helps in some cases (even if the Windows Firewall is turned off). However, the WMI connection then only works with the target's name, not with its IP address. Read the Microsoft TechNet article Help: Enable or disable the remote administration exception and the Microsoft article https://support.microsoft.com/kb/947709 for how to do this.
  • Time difference: One customer reported that this error vanished when they synchronized the time on the target device, which was off by several hours, with the official time of the domain.
  • The ISA Server blocks all RPC traffic by default, so you have to explicitly configure the server to use WMI sensors. Read the following articles on external sites:
  • UAC blocks root access to disk drives, as one of our customers found out (see What can I do about "Connection could not be established" errors on my WMI sensors?). You can add the following registry value to disable this feature of UAC.

    Path:
    HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
    Add a new DWORD value:
    Name: LocalAccountTokenFilterPolicy
    Value: 1

    Note: This disables some of the protection provided by UAC. Specifically, any remote access to the server using an administrator security token is automatically elevated with full administrator rights, including access to the root folder. More information can be found in the Microsoft article https://support.microsoft.com/kb/951016.
  • A couple of customers found out that opening TCP port 1091 helped them.
  • One customer reported that they had two NICs running in teaming mode for a long time without any problems when suddenly this error came up. The only solution found here was to disable one of the adapters.
  • Another customer was able to get their WMI sensors running again on their machine with two network adapters after they enabled the Windows Management Instrumentation (WMI-In) rule in the Windows Firewall.
  • Another customer reported that opening the X11 port range (6001-6032) fixed this issue.
    Background: RPC is categorized as the X11 protocol and is in the 6001 to 6032 port range. Commonly 6007 is what is being blocked in this case.
    Certain firewalls like Checkpoint do NOT allow X11 traffic even when set to Allow All, and require an explicit allow rule.
    Source: Troubleshooting ‘RPC server unavailable’ 0x800706BA
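
The LocalAccountTokenFilterPolicy registry change from the UAC item in the list above can also be applied with the Windows reg command instead of the registry editor. The following Python sketch only builds the command line and does not run it; the function name is hypothetical. Path, value name, and data are taken from the description above.

```python
def build_reg_add_command(value: int = 1) -> list:
    """Build the 'reg add' call that sets LocalAccountTokenFilterPolicy."""
    return [
        "reg", "add",
        r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System",
        "/v", "LocalAccountTokenFilterPolicy",
        "/t", "REG_DWORD",
        "/d", str(value),
        "/f",  # overwrite an existing value without prompting
    ]


print(" ".join(build_reg_add_command()))
```

Run the printed command in an elevated command prompt on the target computer, or pass the list to subprocess.run there. Keep in mind that this weakens UAC as described in the note above.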

80070005 - Access is denied and 80041003 - Access Denied

There is a fine distinction between these two errors.

  • 80070005 means that the Domain Controller or the local Windows system could not verify the credentials for the target computer.
    • Perhaps wrong credentials were provided, so you might want to check your entries in the Credentials for Windows Systems section of your device, group, probe, or even root group settings.
    • DCOM needs to be enabled on probe and target computer. Check the respective registry entry.
    • If the server on which PRTG is installed is part of a domain, whereas you are trying to monitor a target machine that is not part of the domain, see How can I monitor WMI sensors if the target machine is not part of a domain? for more information.
    • We have also seen trouble with DNS/DHCP entries that directed the host name to the wrong IP address, resulting in this error. Try to use the explicit IP address as host setting if possible.
    • If the target hosts are accessible with WMI Tester, but PRTG still insists on showing the 80070005 error, try to use "localhost" as Domain or Computer Name in the Credentials for Windows Systems section of the device's settings.
    • Check the access rights of the user account under which PRTG is running. We recommend that this user account is part of the local "Administrators" group. If the user account is part of a different group, make sure that this group is part of "Administrators".
    • If this error sporadically shows up and vanishes again, we have no explanation at the moment because PRTG only reports the error the Windows System has encountered.
  • 80041003 means that the user does not have sufficient rights to use WMI, so you might want to check your access rights or the respective policies.
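
The DNS/DHCP problem mentioned in the list above (a host name that points to the wrong IP address) can be checked from the probe system with a short script. The following Python sketch compares the addresses that a name resolves to with the address you expect. The function name, host name, and IP address in the example are hypothetical.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if hostname resolves to expected_ip (IPv4)."""
    try:
        # gethostbyname_ex returns (canonical name, alias list, IPv4 address list)
        _, _, addresses = resolver(hostname)
    except OSError:
        # The name could not be resolved at all.
        return False
    return expected_ip in addresses


# Example call (hypothetical values):
# resolves_to("target-server.example.local", "192.0.2.10")
```

If the function returns False although the name can be resolved, the DNS or DHCP entry is stale or wrong, and using the explicit IP address as host setting is the quicker workaround.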

80041002: The object could not be found

This means that the probe is able to connect to the host's WMI system, but for some reason, it is not able to see the objects that the sensor needs. Most likely, this is caused by configuration problems regarding the access rights. One of our customers was able to avoid this error by moving the affected device to a different probe.

80004002: No such interface supported

Unfortunately, this is a very vague error. One customer reported that it appeared when the Primary Domain Controller was offline and the attempts of PRTG to monitor remote Windows computers failed because PRTG was not able to verify the respective credentials.

80070553: Cannot start a new logon session with an ID that is already in use.

Perhaps this Microsoft article might help in this case: https://support.microsoft.com/kb/2283089

80040155: Interface not registered

Perhaps this Microsoft article might help in this case: https://support.microsoft.com/kb/318956

Note:

If any one of these errors occurs, make sure that your systems meet all of the basic requirements as listed in the respective section of our main WMI article.


Sensor-Based Errors

The following errors might not affect all sensors in a device but are widespread nonetheless:

80041010 – The specified class is not valid

This is quite an ambiguous message as there can be different causes for it, among which are:

  • If certain services (for example, Exchange) are not started when the WMI AutoDiscovery/AutoPurge (ADAP) process is started, the performance counters are not transferred to WMI because WMI uses ADAP to build its internal performance counter table. Follow the steps below to fix this. For more details, see this Microsoft article (deprecated, but still available via archive.org; the information is still valid).
    • Open the command console.
    • Execute wmiadap.exe /f.
    • Start WMI.
  • Performance counters are disabled for a specific service. There is a registry entry for WMI counters for each service. To check and fix this, you have to edit the registry. For more details, see this Microsoft TechNet article (deprecated, but still available via archive.org; the information is still valid).
    Note: Always back up your system before manipulating the Windows registry!
    • Key: HKLM\SYSTEM\CurrentControlSet\Services\Service-name\Performance
    • Data type: REG_DWORD
    • Range: 0|1
    • Default value: 0
    • According to Microsoft: "If the value of this entry is 1, then the Performance Library (Perflib) does not retrieve performance data about these counters from the registry. As a result, System Monitor and other tools that use the data cannot display it. Instead, the tools display a value of 0 or 100 percent, depending on the counter."
    • To activate any changes, restart the Windows computer.

Nonsensical or Wrong Results

Although the WMI counters and their results are well defined, this does not necessarily mean that Windows always adheres to this fact.

Depending on the Windows version and the current patch level, it can happen that Windows limits 64-bit counters to 32-bit values.

This causes wrong results. For example, the system memory is reported incorrectly, or 64-bit processes never show more than 4 GB in PRTG. You might also see strange errors as described in My WMI sensors show errors with a PE code. What does that mean?, section WMI Counter Value-Related Errors.

Windows Process Sensor

We have found the reason for the limitation to 4 GB with 64-bit systems or processes. It seems that Windows' own WoW64 emulation layer for 32-bit applications (which PRTG is) somehow caps off these values at 4 GB.

Yes, you read that right: for the correct monitoring of 64-bit processes, you have to run the PRTG probe on a 32-bit machine until Microsoft fixes that bug in the WoW64 layer.

The only solution we can recommend at the moment is that you use a (remote) probe running on a 32-bit Windows for these sensors.

Windows Network Card Sensor

If your graphs and live data tables show large gaps and/or the results are too low, decrease the scanning interval for this sensor. As WMI only features 32-bit counters for Traffic In/Out, it is very likely that overflows happen during scanning intervals that are too long, especially on fast (Gbit) interfaces.

Shorter scanning intervals may solve this problem.
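
To estimate how short the interval has to be, you can calculate how long a 32-bit counter takes to overflow at a given data rate. The following Python sketch does this under two assumptions that are not confirmed by this article: the counter counts bytes, and the interface runs at full line speed. The function name is hypothetical.

```python
def seconds_until_wrap(link_bits_per_second: float, counter_bits: int = 32) -> float:
    """Seconds until a byte counter of the given width overflows once.

    Assumptions: the counter counts bytes and the link is fully saturated.
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    bytes_per_second = link_bits_per_second / 8
    return 2 ** counter_bits / bytes_per_second


print(f"{seconds_until_wrap(1e9):.1f} s at 1 Gbit/s")    # 34.4 s at 1 Gbit/s
print(f"{seconds_until_wrap(1e8):.1f} s at 100 Mbit/s")  # 343.6 s at 100 Mbit/s
```

Under these assumptions, a saturated Gbit interface overflows the counter in roughly half a minute, so a 60-second interval can already miss an overflow, whereas a 100 Mbit interface leaves several minutes of headroom.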

Note:

Unfortunately, for all other cases there is nothing we can do to fix these errors as they are caused by Windows. You can try to use the alternative query of a sensor if applicable.


0% Free Disk Space

Windows 2008

One customer found the following solution for this problem on W2k8 servers:

"A while back I started closely monitoring the WMI service on the various servers that often false alarm during WMI overload. Most often, the free space check would fail returning a null or zero value instead of the correct value for free disk space. To troubleshoot, I added the PRTG WMI service monitor for the WMI Service itself with a 60-second interval. I also monitored the system and application event logs with 15 minute intervals. It helped to get more specific errors about what was going on with WMI. Most often I'd get one of these errors in succession:

  1. WMI Free Disk Space [Multi Drive]: 0 % (Free Space C:) is below the error limit of 10 %
  2. Windows Management Instrumentation (WMI Service) Warning: 80041006: There was not enough memory for the operation.
  3. Event Log (Windows API) Value changed: Faulting application wmiprvse.exe, version 6.1.7600.16385, faulting module 4a5bc794, version ole32.dll, fault address 0x6.1.7600.16624

Armed with this new information, I was able to find two Microsoft KB articles that seemed to match the symptoms:

  • 958124 A wmiprvse.exe process may leak memory when a WMI notification query is used heavily on a Windows Server 2008-based or Windows Vista-based computer
  • 954563 Memory corruption may occur with the Windows Management Instrumentation (WMI) service on a computer that is running Windows Server 2008 or Windows Vista Service Pack 1

Applying these two hotfixes stopped the false alarms on my Windows 2008 and 2008 R2 servers."

ESXi 5 and Windows 2012

Another customer found a correlation between this error and the hot plug capability of drives under ESXi. After disabling the hot plug capability, WMI was able to report the correct values of the drives again. See this article in VMware's knowledgebase for further instructions.

Here is an excerpt of what the customer told us: "The circumstances appear to only be with Server 2012 and ESXi 5.x. Hard disks and other devices are viewed as hot swappable, these devices are actually not picked up by a bunch of windows services. [...] If you are unable to see the drives via the WMI Test and/or by browsing to the administrative IPC share of the disk this should fix the problems."


More information about WMI and PRTG


Disclaimer:
The information in the Paessler Knowledge Base comes without warranty of any kind. Use at your own risk. Before applying any instructions please exercise proper system administrator housekeeping. You must make sure that a proper backup of all your data is available.