I need to review a web page's contents to find out whether a particular service is running. The only identifier that shows whether the service is active is an HTML label, so there is no plain string on the page to read.

Is there a known script that can read an HTML object in the source and return to PRTG whether its string contains "Running"?


Article Comments

Hi Gary,

I can write you a small script that does exactly what you want. For this purpose, please upload the source code of the website to Pastebin or any other code-paste service.

Best regards.


May, 2017 - Permalink

Thank you Dariusz,

I have added the page's source to Pastebin: https://pastebin.com/2xs5wEH6

I've set it to be active for a week...

The object we are looking for is label id="status", found in div id="container-content-source". We want to identify the string within the tags. If it is "Running", everything is OK. If it is "Completed", the service is not active and must be reviewed.

However, as I have been digging through the source, I discovered the div id="container-content-source" object is entirely dynamic and doesn't exist in the actual HTML file source but only in the active source in the web browser.

Regards, Gary


May, 2017 - Permalink

Hi Gary,

Please try the following script in PowerShell first. It should result in:

<prtg>
    <result>
        <channel>HTTP Status Code</channel>
        <value>200</value>
        <float>0</float>
        <unit>CustomUnit</unit>
        <customunit>HTTP</customunit>
    </result>     
    <text>Keyword "Running" found, as expected!</text>
</prtg>

Afterwards, just add the custom sensor script to the following directory on your PRTG core server and/or your remote probe servers: C:\Program Files (x86)\PRTG Network Monitor\Custom Sensors\EXEXML\. Then add an "EXE/Script Advanced" sensor in PRTG and use the following parameters:

-url "<URL-TO-THE-WEBSITE>"
-keyword "<KEYWORD TO LOOK FOR (Default: Running)>"

The script:

#    ____  ____  ____________
#   / __ \/ __ \/_  __/ ____/
#  / /_/ / /_/ / / / / / __  
# / ____/ _, _/ / / / /_/ /  
#/_/   /_/ |_| /_/  \____/                         
#    NETWORK MONITOR
#-------------------
#(c) 2017 Dariusz Gorka, Paessler AG
#
# Checks certain website for keyword.
#

# Parameter "-url" for the URL of the website
# Parameter "-keyword" for the keyword that has to be available

param(
        $url="http://my-computer/test.html",
        $keyword="Running"
)

$regex_status = "(<label id=.status. class=.pull-right.*.>)(.*)(</label>\s+</div>\s+<div class=.content-source-statistics.>)"

$site = Invoke-WebRequest -Uri $url -UseBasicParsing
$site_content = $site.Content
$site_statuscode = $site.StatusCode

$site_result = $site_content -match $regex_status
$site_result = $matches[2]

if($site_result -eq $keyword){
$keyword_found = $true
} else {
$keyword_found = $false
}

Write-Host @"
<prtg>
    <result>
        <channel>HTTP Status Code</channel>
        <value>$site_statuscode</value>
        <float>0</float>
        <unit>CustomUnit</unit>
        <customunit>HTTP</customunit>
    </result>     
"@

if($keyword_found){
Write-Host @"
    <text>Keyword "$keyword" found, as expected!</text>
"@
} else {
Write-Host @"
    <text>Keyword not found, found "$site_result" instead!</text>
    <error>1</error>
"@
}

Write-Host "</prtg>"

Best regards


May, 2017 - Permalink

Hi Dariusz,

I really appreciate your help here.

When running the script in PowerShell, it breaks with the following error:

Cannot index into a null array.
At U:\Scripting\Scripts\PRTG-Find-Web-page-keyword.ps1:28 char:1
+ $site_result = $matches[2]
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray

I assume this is because the $regex_status pattern does not match anything on the page. Could this be down to the content being dynamic?

Regards, Gary


May, 2017 - Permalink
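
The "Cannot index into a null array" error above is what PowerShell reports when -match returns $false, because $matches is then never populated. A minimal sketch of guarding that lookup follows. The helper name Get-StatusLabelText and the simplified pattern are assumptions for illustration and are not part of the original script. The sketch also assumes the label text sits on a single line.

```powershell
# Hypothetical helper (not part of the original script): returns the text inside
# <label id="status" ...>...</label>, or $null when the label is absent.
function Get-StatusLabelText {
    param([string]$Html)

    # Simplified pattern (an assumption): lazy capture of the label's inner text.
    $pattern = '<label\s+id=["'']status["''][^>]*>(.*?)</label>'

    # -match returns $false when nothing matches; only read $Matches on success.
    if ($Html -match $pattern) {
        return $Matches[1].Trim()
    }
    return $null
}

$sample = '<div><label id="status" class="pull-right">Running</label></div>'
Get-StatusLabelText -Html $sample           # label present
Get-StatusLabelText -Html '<html></html>'   # label absent, returns $null
```

With a guard like this, a page that lacks the label (for example because the element is only created in the browser) would produce a clean "not found" result instead of a script error.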

Hi Dariusz,

When calling $site.content outside of the running script, it returns 20,000 lines of numbers that appear to be pretty random, like below:

60 33 68 79 67 84 89 80 69 32 72 84 77 76 32 80 85 66 76 73 ...

I guess this is why $Matches ends up empty: $site_result = $site_content -match $regex_status does not find the string in the content.

Any ideas why this may be?

Thanks and Regards, Gary


May, 2017 - Permalink
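
One observation on the numbers quoted above: the 20 sample values are the ASCII codes for "<!DOCTYPE HTML PUBLI". That suggests $site.Content is a byte array holding the page rather than a string, which Invoke-WebRequest can return when the response's content type is not recognized as text. A minimal sketch of decoding such content follows. The helper name ConvertTo-ContentString is hypothetical, and UTF-8 is an assumed encoding.

```powershell
# Hypothetical helper: converts response content to text when it arrives as bytes.
function ConvertTo-ContentString {
    param($Content)

    if ($Content -is [byte[]]) {
        # Assumption: the page is UTF-8 (ASCII-compatible); adjust for other charsets.
        return [System.Text.Encoding]::UTF8.GetString($Content)
    }
    return [string]$Content
}

# The 20 values quoted above, cast to bytes.
[byte[]]$sample = 60,33,68,79,67,84,89,80,69,32,72,84,77,76,32,80,85,66,76,73
ConvertTo-ContentString -Content $sample   # → <!DOCTYPE HTML PUBLI
```

Decoding only recovers the static HTML. If the status label is created by JavaScript in the browser, as noted earlier in the thread, it may still be missing from the decoded text.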

Hi Gary,

What kind of site is this? Is it a normal HTML/PHP site or some Java application? The thousands of lines of numbers indicate that PRTG queried some kind of file or executable instead of a normal web page.

What exact URL are you using in the "-url" parameter?


May, 2017 - Permalink

Hi Dariusz,

The URL is http://localhost:50505/aspire/files/home.html, and I am running PowerShell on the local server. It is a third-party tool we use. It does use JavaScript, and the underlying service uses Java.

Regards, Gary


May, 2017 - Permalink

Sorry, just checking now and it does run as a Java application.

Gary


May, 2017 - Permalink

Hi Gary,

Then I am afraid there is no way to monitor the status via the Java application. :(

Perhaps the software offers some kind of API in XML/JSON?


May, 2017 - Permalink
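
If the software does turn out to offer a JSON interface, the sensor output could be built from that instead of from the rendered page. The sketch below is only an illustration. The thread does not confirm that such an API exists, and the "status" field, the channel name and the function name ConvertTo-PrtgStatusXml are all assumptions.

```powershell
# Hypothetical: builds EXE/Script Advanced sensor output from a JSON document
# that has a "status" field. The field name is an assumption for illustration.
function ConvertTo-PrtgStatusXml {
    param(
        [string]$Json,
        [string]$Keyword = "Running"
    )

    $status = ($Json | ConvertFrom-Json).status
    $isRunning = ($status -eq $Keyword)

    # Channel value: 1 when the status equals the keyword, 0 otherwise.
    $lines = @(
        "<prtg>",
        "    <result>",
        "        <channel>Service Running</channel>",
        "        <value>$([int]$isRunning)</value>",
        "    </result>"
    )
    if ($isRunning) {
        $lines += "    <text>Status is ""$status"", as expected.</text>"
    } else {
        $lines += "    <text>Status is ""$status"", expected ""$Keyword"".</text>"
        $lines += "    <error>1</error>"
    }
    $lines += "</prtg>"
    return ($lines -join "`n")
}

ConvertTo-PrtgStatusXml -Json '{"status":"Completed"}'
```

In a real sensor, the JSON string would come from the response content of a request to whatever endpoint the vendor documents, and the result would be written to standard output as in the script earlier in this thread.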

Thanks Dariusz, I'll see if we can do something along those lines. Your help is greatly appreciated.

Regards, Gary


May, 2017 - Permalink