I have two identical machines running PRTG. The master node gradually eats up system memory over time while the failover node is connected. Once the failover is disconnected, either by setting it inactive or by stopping the core server service, memory usage drops sharply and the memory is freed. It gets to the point that over a few hours the master drops to around 20% health or almost crashes. I have completely removed PRTG from both machines and reinstalled it, but the same thing happens. Any suggestions to fix this?
Article Comments
Currently we are using around 375 sensors to monitor various devices. I would say almost all of them read SNMP numeric values. Everything operates fine, and then the master starts to increase its memory usage and continues until we power cycle the computer or disconnect the cluster. Once that is done, memory goes back to normal; sometimes it stays normal for days, sometimes only for 5-10 minutes before it begins all over again. Nothing changes on the sensors that would make the cluster talk excessively. One thing I have just noticed: once the problem starts, the "Cluster Health" sensor shows a huge spike in outgoing cluster messages (195 Msg/min), which continues until we shut the cluster down. Then memory returns to normal and we get back to 2-5 Msg/min out.
One workaround that sometimes helps is to leave the cluster link broken for a few hours and then re-engage it; sometimes this lets everything run normally. Sounds like a software glitch to me.
Jul, 2012 - Permalink
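For anyone trying to pin down when the message spike described above begins, here is a minimal sketch in Python. It assumes you have copied the outgoing message rate (Msg/min) from the "Cluster Health" sensor into a list by hand or from an export; the function name, the factor of 10, and the sample readings are all hypothetical and only reuse the figures quoted in the post.

```python
from statistics import median

def find_spikes(samples, factor=10.0):
    """Return the indices of samples that exceed `factor` times the median.

    The median serves as a rough baseline because a few spike readings
    barely move it. `factor` is an arbitrary, hypothetical threshold.
    """
    baseline = median(samples)
    return [i for i, value in enumerate(samples) if value > factor * baseline]

# Hypothetical readings in Msg/min, based on the 2-5 and 195 figures above.
rates = [2, 5, 3, 195, 195, 4]
print(find_spikes(rates))
```

Lining up the flagged samples with the memory graph of the master node would show whether the message spike comes before or after the memory starts climbing.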
We've asked you and your colleague Scott several times for the logs from both nodes, but so far we have only received the logs from the master node. It is very hard for us to help this way.
Jul, 2012 - Permalink
Hello,
I'm afraid we need many more details in such a case, starting with the number of sensors in your cluster, the underlying machines, and some screenshots showing the memory usage of "PRTG Server.exe" in the Task Manager. Most of these initial details can be sent to us by uploading the PRTG system logs with the FTP upload option in the "PRTG Server Admin" tool.
Best regards.
Jul, 2012 - Permalink
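As a supplement to Task Manager screenshots, the memory usage of "PRTG Server.exe" could also be sampled from a script. The following is only a sketch, assuming Python is available on the Windows machine; it calls the standard Windows `tasklist` command, and the function names are made up for illustration. It is not a PRTG feature.

```python
import csv
import io
import subprocess

def parse_tasklist_csv(text):
    """Sum the memory column (in KB) of `tasklist /FO CSV /NH` output."""
    total_kb = 0
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip e.g. the "INFO: No tasks are running ..." line
        # The memory column looks like "123,456 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            total_kb += int(digits)
    return total_kb

def sample_memory_kb(image_name="PRTG Server.exe"):
    """Run tasklist (Windows only) and return the process memory in KB."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)
```

Calling `sample_memory_kb()` once a minute on the master node and writing the result to a file with a timestamp would give a record of the memory growth that can be compared with the cluster logs from both nodes.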