We are gradually bringing our implementation of PRTG live and are currently enabling notifications for our sensors, although the sensors themselves have been active for nearly two months.

We have four Cloud HTTP sensors monitoring customer-facing sites with a simple GET, and notifications set with basic Down state and threshold triggers. Scanning intervals are set to 10 minutes, and Timeout to the maximum supported value of 5 seconds.

Since enabling the Down state notification we have noticed that one or more of our four sensors occasionally return socket error # 10060 for one or more scanning intervals. Checking the site itself whenever a sensor reports the error shows no issue with it.
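For anyone wanting to repeat that manual check in a scripted way, here is a minimal sketch in Python. It assumes nothing about how PRTG itself performs the request: the function name `probe` and the `opener` parameter are made up for illustration, and the 5-second default simply mirrors the sensor Timeout mentioned above. Winsock error 10060 is WSAETIMEDOUT (connection timed out), so the sketch reports timeouts separately from other failures.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Issue one GET and return (outcome, elapsed_seconds).

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
        outcome = "up" if 200 <= status < 400 else f"http {status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        outcome = f"http {exc.code}"
    except (socket.timeout, TimeoutError):
        # Timed out while reading the response.
        outcome = "timeout"
    except urllib.error.URLError as exc:
        # Connection-level failure; urllib wraps connect timeouts here.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            outcome = "timeout"
        else:
            outcome = f"error: {exc.reason}"
    return outcome, time.monotonic() - start
```

Running `probe("https://www.example.com/")` from a machine outside the monitoring server at the moment a sensor goes down gives a second opinion on whether the site or the monitoring path is at fault.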

Reviewing the historical data for one such sensor shows that these occur infrequently but regularly enough to be a concern, and occasionally for extended periods of time (to a maximum of 90 minutes in July).

Today (26th August) we've had 4 such occurrences between 8:30 am and 11:21 am, prompting me to ask the following questions:

1. Is this expected behaviour at present with these sensors?

2. Is there something I can do to mitigate the issue?

NB: We do also have external monitoring in place for these sites (running from a Raspberry Pi at my house) but I was hoping to be able to prioritise in-house monitoring using PRTG.


Article Comments

This continues to be a major concern, with several down alerts over the UK bank holiday weekend reporting the same socket 10060 error.

I'd really appreciate some input on this.


Aug, 2016 - Permalink

Hello,

Please forward us some screenshots showing the "Overview", "Log" and "Settings" tabs of the Cloud HTTP sensor to support@paessler.com. Please refer to this kb post.


Sep, 2016 - Permalink

Screenshots sent and acknowledged as received.

Thank you.


Sep, 2016 - Permalink

Sorry to bump this, however 11 days on from sending the info as requested I've received nothing further. I've also had no reply to a follow-up email yesterday reporting another set of #10060 errors across 4 of 5 sensors.

The intention is to bring the monitoring fully online shortly - at the moment I'm the only one getting the notifications produced as I try to tune out any static produced by normal operations.

At the moment I cannot rely on the cloud http sensors.


Sep, 2016 - Permalink

I came here with this same issue... ALL cloud sensors reporting socket error, yet sites are up... Scary... Is something going on with your cloud servers?


Feb, 2017 - Permalink

@pir8radio

Not sure if this will help, however...

We traced our issue to our firewall having problems resolving api.prtgcloud.com: the host is on Amazon, and our firewall does not currently support wildcard FQDNs (i.e. *.prtgcloud.com). This meant that periodically our monitoring server would lose contact with the cloud sensors and report the socket error. I worked round this by permitting HTTP and HTTPS traffic to Amazon CloudFront edge servers based on Amazon's own documentation:

http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/LocationsOfEdgeServers.html
http://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
https://ip-ranges.amazonaws.com/ip-ranges.json

This was then backed up with a web filter to restrict traffic to the relevant FQDNs (api.prtgcloud.com, download-cdn.paessler.com, download-s3-eu.paessler.com.s3.amazonaws.com) to prevent unwarranted traffic to other (possibly malicious) AWS sites from the monitoring server.
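As a rough illustration of the allow-list side of this, the sketch below pulls the CloudFront IPv4 blocks out of Amazon's published ip-ranges.json and checks whether a given address falls inside them. It is not the exact procedure we used on our firewall: the function names are made up for the example, and filtering on the `CLOUDFRONT` service tag is an assumption about which entries matter for your setup.

```python
import ipaddress
import json
import urllib.request

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_ranges(url=IP_RANGES_URL, timeout=10.0):
    """Download and parse Amazon's published IP range document."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def cloudfront_prefixes(ranges):
    """Return the sorted, de-duplicated IPv4 CIDR blocks tagged CLOUDFRONT."""
    return sorted({
        entry["ip_prefix"]
        for entry in ranges.get("prefixes", [])
        if entry.get("service") == "CLOUDFRONT"
    })


def is_covered(ip, cidrs):
    """True if `ip` lies inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


if __name__ == "__main__":
    # Print one CIDR block per line, ready to paste into a firewall object.
    for cidr in cloudfront_prefixes(fetch_ranges()):
        print(cidr)
```

Amazon updates the published ranges over time, so the list would need refreshing periodically; `is_covered` is handy for checking whether an address seen in a firewall drop log is one the current list should have permitted.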

This seems to have resolved our issues with the cloud sensors so far - we've had no failures since adopting this approach in September last year.

YMMV depending on firewall provider.


Feb, 2017 - Permalink

Hi there, thanks for helping out here, @CodeMonkey. Regarding your ticket, we haven't received one from you (that is, if you were using the same mail address as in here) :(


Feb, 2017 - Permalink