Hi,

We have an HPE DL380G10, among many other servers that work fine with the VMware Hardware Status sensor. This particular one, however, shows a ton of errors for hard drives that do not exist in real life:

Error. 13 elements return an error state: Disk 15 on HPSA1 : Port Box 0 Bay 185 : 0GB : Unconfigured Disk : Disk Error; Disk 14 on HPSA1 : Port Box 0 Bay 169 : 0GB : Unconfigured Disk : Disk Error; Disk 13 on HPSA1 : Port Box 0 Bay 163 : 0GB : Unconfigured Disk : Disk Error; Disk 12 on HPSA1 : Port Box 0 Bay 162 : 0GB : Unconfigured Disk : Disk Error; Disk 11 on HPSA1 : Port Box 0 Bay 161 : 0GB : Unconfigured Disk : Disk Error; Disk 10 on HPSA1 : Port Box 0 Bay 157 : 0GB : Unconfigured Disk : Disk Error; Disk 9 on HPSA1 : Port Box 0 Bay 121 : 0GB : Unconfigured Disk : Disk Error; Disk 8 on HPSA1 : Port Box 0 Bay 105 : 0GB : Unconfigured Disk : Disk Error; Disk 7 on HPSA1 : Port Box 0 Bay 49 : 0GB : Unconfigured Disk : Disk Error; Disk 6 on HPSA1 : Port Box 0 Bay 44 : 0GB : Unconfigured Disk : Disk Error; Disk 5 on HPSA1 : Port Box 0 Bay 43 : 0GB : Unconfigured Disk : Disk Error; Disk 4 on HPSA1 : Port Box 0 Bay 41 : 0GB : Unconfigured Disk : Disk Error; Disk 3 on HPSA1 : Port 2I Box 0 Bay 7 : 0GB : Unconfigured Disk : Disk Error

What could have gone wrong there?

Things I have tried so far:

- Cleared the IPMI system event log (SEL) like this: localcli hardware ipmi sel clear
- Cleared the iLO logs (Integrated Management Log and Event Log)

Unfortunately, neither helped.


Article Comments

Hello,

This kind of error points to the CIM service not recognizing a hardware change. Sometimes it is not enough to restart the service from the vSphere Client; instead, you have to connect to the host via SSH and execute the commands

/etc/init.d/sfcbd-watchdog stop
/etc/init.d/sfcbd-watchdog start

to get it working again.

Some users could only get the CIM service working again by rebooting the machine. If the SSH commands do not help, please schedule a maintenance window, move the VMs to another host, and restart the physical machine.


Sep, 2022 - Permalink

Hi,

Thanks for your suggestion, Arne, but it did not change the error message.

I'm also not sure what you mean by "CIM service not recognizing the change". This is a fresh PRTG setup at a new customer. This sensor was never used on their VMware servers before, and the error has existed on that particular machine ever since we set the sensors up.


Sep, 2022 - Permalink

Hello,

The CIM service runs on the ESXi host. It can happen, for example, that a hardware component gets removed and the service still thinks it should be present, so it reports that error.

If it cannot be fixed on the host side, you can copy the errors as "known errors" into the sensor, so the sensor knows to ignore them.
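
To copy the errors one by one, it helps to split the sensor message into its individual elements first. The following is only a minimal sketch in Python, assuming the message format shown in the original post (elements separated by ";" and fields separated by " : "). The function names are hypothetical, and the exact format the sensor expects for known errors is not stated in this thread, so check the sensor settings before pasting anything.

```python
import re

# Sensor message shortened to three of the thirteen elements from the post above.
MESSAGE = (
    "Error. 13 elements return an error state: "
    "Disk 15 on HPSA1 : Port Box 0 Bay 185 : 0GB : Unconfigured Disk : Disk Error; "
    "Disk 14 on HPSA1 : Port Box 0 Bay 169 : 0GB : Unconfigured Disk : Disk Error; "
    "Disk 3 on HPSA1 : Port 2I Box 0 Bay 7 : 0GB : Unconfigured Disk : Disk Error"
)


def split_elements(message):
    """Return the individual element strings from a sensor error message."""
    # Everything after the first "error state:" is the element list.
    _, _, body = message.partition("error state:")
    return [part.strip() for part in body.split(";") if part.strip()]


def parse_element(element):
    """Split one element into its ' : '-separated fields (assumed layout)."""
    fields = [f.strip() for f in element.split(" : ")]
    bay = re.search(r"Bay (\d+)", element)
    return {
        "name": fields[0],
        "size": fields[2] if len(fields) > 2 else None,
        "bay": int(bay.group(1)) if bay else None,
        "raw": element,
    }


elements = [parse_element(e) for e in split_elements(MESSAGE)]

# Print one element per line, ready to be copied individually.
for e in elements:
    print(e["raw"])
```

Each printed line is one element as reported by the sensor. The parsed "size" and "bay" fields make it easy to see that all entries are 0GB disks in bays that do not physically exist.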


Sep, 2022 - Permalink

Hi,

Well, that is strange, because there have never been more than two disks attached to the server, so I really don't know where this error could come from. There have been no hardware changes to any of the VMware ESXi servers since they were taken into service three years ago. The Fibre Channel SAN hasn't been changed since then either.


Sep, 2022 - Permalink

Hi,

I have some news:

https://communities.vmware.com/t5/ESXi-Discussions/Hardware-Status-Alerts-Unconfigured-Disk-Disk-Error-HPSA/td-p/2847302/page/4

This seems to be a common problem, and there is no fix available for the HPE Smart Array P408i-a SR Gen10 RAID controller. Apparently it is a bug in the smartpqi driver or in the controller's firmware. Uninstalling the SMX provider fixes the issue, but then you can't monitor any disks at all. So I guess the best option for now is to ignore the current errors in the sensor and only watch for changes in the alarm.
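
Watching for changes can be as simple as comparing the current set of erroring elements against an accepted baseline. This is just a rough Python sketch of that idea, not something PRTG provides in this form. The function name is hypothetical, and the "current" reading below is invented purely for illustration.

```python
def diff_elements(baseline, current):
    """Return (new, cleared) element sets relative to the accepted baseline."""
    baseline, current = set(baseline), set(current)
    return current - baseline, baseline - current


# Baseline: phantom entries accepted as known (two of the thirteen from this thread).
baseline = [
    "Disk 15 on HPSA1 : Port Box 0 Bay 185 : 0GB : Unconfigured Disk : Disk Error",
    "Disk 14 on HPSA1 : Port Box 0 Bay 169 : 0GB : Unconfigured Disk : Disk Error",
]

# Hypothetical later reading: one phantom entry is gone, one invented entry is new.
current = [
    "Disk 15 on HPSA1 : Port Box 0 Bay 185 : 0GB : Unconfigured Disk : Disk Error",
    "Disk 1 on HPSA1 : EXAMPLE ONLY : Disk Error",
]

new, cleared = diff_elements(baseline, current)
print("new:", sorted(new))
print("cleared:", sorted(cleared))
```

Anything that shows up under "new" would be worth a closer look, since it is not one of the known phantom disks.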


Sep, 2022 - Permalink

Hello,

Thank you for sharing that link!


Sep, 2022 - Permalink