DX Unified Infrastructure Management

  • 1.  URL Response Probe

    Posted Mar 02, 2012 07:26 PM

    I have some URL profiles set up with the following settings:

     

    Check interval - 300 sec

    Timeout - 30 sec

    Retries before failure - 3

    Average over 5 samples

     

    I have this set up to alert when a certain substring is found in the page.  Can someone explain how the timeout and retries work?

    I'm confused about whether this timeout only covers not being able to access the URL, or something different.

     

     

    The reason I ask is that if the substring is found, I want it to somehow retry before alarming.  I'm assuming this is the Retries before failure?



  • 2.  Re: URL Response Probe

    Posted Mar 02, 2012 09:41 PM

    You are right. Retries just means trying to load the URL again if there was a failure loading. If the URL successfully loads, the probe assumes it has good information and alarms on the string check if the condition is met.
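The distinction described above can be sketched in Python. This is only an illustration of the stated behavior, not the probe's actual code: `check_profile` and the `fetch` callable are hypothetical names, and treating "Retries before failure" as one initial attempt plus that many further attempts is an assumption.

```python
import re
from typing import Callable, Optional


def check_profile(fetch: Callable[[], Optional[str]], pattern: str, retries: int) -> str:
    """Return 'load_failure', 'alarm', or 'ok' for one check interval.

    `fetch` is a hypothetical callable returning the page text,
    or None when the page could not be loaded.
    """
    page = None
    # Assumed: one initial attempt plus `retries` further attempts,
    # and only while loading keeps failing.
    for _ in range(1 + retries):
        page = fetch()
        if page is not None:
            break
    if page is None:
        return "load_failure"
    # The page loaded, so the string check runs once with no further retries.
    return "alarm" if re.search(pattern, page) else "ok"
```

In this model a page that loads on the first try and contains the substring alarms immediately, which is why the load retries alone do not delay a substring alarm.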



  • 3.  Re: URL Response Probe

    Posted Mar 02, 2012 11:48 PM

    I see now there is an option under Alarm substring: "Retry page fetch on alarm situation".

     

    I have it set to alarm when the substring is found, but I don't want it to alarm unless it has retried 3 times with 30 seconds between each.

     

    Do I just check "Retry page fetch on alarm situation", set the Retries to 30 and Retry timeout to 30 seconds?
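For clarity, the behavior being asked for here can be written down as a small Python sketch. It describes the desired logic only and makes no claim about what the probe option actually does; `confirmed_alarm`, `fetch`, and the injectable `sleep` parameter are hypothetical names introduced for illustration.

```python
import re
import time
from typing import Callable


def confirmed_alarm(fetch: Callable[[], str], pattern: str,
                    retries: int = 3, wait_seconds: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True only if the pattern is found on the first fetch and on every retry."""
    if not re.search(pattern, fetch()):
        return False  # no alarm condition, so no retries at all
    for _ in range(retries):
        sleep(wait_seconds)  # wait before re-fetching the page
        if not re.search(pattern, fetch()):
            return False  # condition cleared during the retries
    return True
```

With `retries=3` and `wait_seconds=30`, an alarm would be raised roughly 90 seconds plus fetch time after the substring first appears, and only if it is still present on every re-fetch.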



  • 4.  Re: URL Response Probe

    Posted Mar 03, 2012 02:36 AM

    It sounds like that option is probably just what you need. You should be able to test it by using a different string or URL just to be sure it behaves the way you want.



  • 5.  Re: URL Response Probe

    Posted Mar 03, 2012 03:06 AM

    I set this up and I see in the logs that it is retrying 9 times, waiting 30 seconds before each retry, which is what I specified.  I thought it would only retry if the substring was found, but it looks like it's retrying 9 times regardless of whether the substring is found or not.

     

    I see this in the logs:

     

    "http.Get() returned 200 (The request completed successfully.)
    Mar  3 00:01:46:189 [58704] url_response: Set for Redirect Retry."

     

    I am also seeing multiple alerts saying "Profile is delayed because it is already running"

     

    How can I stop it from being delayed so that we don't receive those alarms?

     



  • 6.  Re: URL Response Probe

    Posted Mar 05, 2012 05:57 AM

    How is the profile still running 300 seconds later if it is retrying 9 times with 30 seconds between each retry? The waits add up to 270 seconds, so it should stop 30 seconds before the next run unless the page loads are taking a long time. Can you get the exact timings of each retry from the log to figure out what is happening? The simple way to make the profile not run into the next interval would probably be to cut the retries down to 8. But I think it would make sense to figure out exactly why it is still running.
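The arithmetic behind that question can be checked with a few lines of Python. This is a rough model, assuming one initial fetch plus one fetch per retry and a fixed wait before each retry; `retry_window` and the sample fetch times are made up for illustration and are not taken from the probe.

```python
def retry_window(retries: int, wait_seconds: float, fetch_seconds: float) -> float:
    """Seconds spent if every retry waits `wait_seconds` and every fetch takes `fetch_seconds`."""
    # One initial fetch plus one fetch per retry, with a wait before each retry.
    return retries * wait_seconds + (1 + retries) * fetch_seconds


CHECK_INTERVAL = 300  # seconds, from the profile settings

for fetch_seconds in (0.0, 2.0, 3.0, 5.0):
    total = retry_window(9, 30, fetch_seconds)
    print(f"fetch={fetch_seconds:>4}s  total={total:>6}s  overruns={total >= CHECK_INTERVAL}")
```

Under these assumptions, 9 retries leave only 30 seconds of slack across 10 page fetches, so an average fetch time of about 3 seconds is already enough for the run to collide with the next interval. Dropping to 8 retries lowers the waiting time to 240 seconds.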

     

    Are you sure the probe is retrying regardless of whether it finds the substring or not? Does it log something differently that lets you know when the substring is there and when it is not?



  • 7.  Re: URL Response Probe

    Posted Mar 05, 2012 06:38 PM

    Keith,

     

    I am going to go through the logs to get the exact times it is retrying. 

     

    Our URLs are set up with a health status page that we are checking with the url_response probe.  We have a regular expression in the substring field to check for a Dead Rollup in the health status page of this URL.  I have been checking the health status page while viewing the logs, and I can see that it is retrying even though I am not seeing the dead rollup in the health status page.

     

    I've verified that the substring and URL response check work: before the retries were configured, the url_response probe would generate an alert only when the dead rollup was there.
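One way to double-check what a regular expression matches in the raw page source, as opposed to what is visible in a browser, is a small Python helper like the one below. It is a sketch only: `find_matches` is a hypothetical helper, and the pattern `Dead\s+Rollup` and the sample markup are placeholders, since the actual expression and page are not shown in this thread.

```python
import re
from typing import List


def find_matches(page_text: str, pattern: str, context: int = 20) -> List[str]:
    """Return each match of `pattern` with `context` characters on either side."""
    hits = []
    for m in re.finditer(pattern, page_text):
        start = max(0, m.start() - context)
        end = min(len(page_text), m.end() + context)
        hits.append(page_text[start:end])
    return hits


# Placeholder page source: the text only appears inside an HTML comment,
# so it would not be visible when viewing the rendered page.
sample = "<td>Rollup A</td><td>OK</td><!-- Dead Rollup handler -->"
for hit in find_matches(sample, r"Dead\s+Rollup"):
    print(repr(hit))
```

Running the same check against the saved source of the health status page would show whether the expression can match somewhere that is not visible on the rendered page.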



  • 8.  Re: URL Response Probe

    Posted Mar 05, 2012 10:42 PM

    Yeah, I was specifically wondering what the probe sees (which might not be easy to tell) rather than what you see on the page, :) but your description of the behavior before you enabled the retries probably answers my question. I thought we might need to make sure the "bad string" did not show up in an unexpected place even when everything is okay, but if the probe worked properly without the retries, it must be seeing the right thing.

     

    I find it odd that the log message is "Set for Redirect Retry", which would indicate the retry is the result of a redirect. But if the redirects were the problem, you should see an issue even without retries. It might just be a bad log message.

     

    The configuration option seems to be pretty explicit that the probe should only retry when there is an alarm condition, which would be exactly what you want. Retrying when everything is okay does not make any sense, so it seems like there must either be a bug or something that looks wrong to the probe but is not obvious. Maybe I can set up a test URL and check if I am able to reproduce the problem.



  • 9.  Re: URL Response Probe

    Posted Mar 07, 2012 09:10 AM

    I seem to be having the opposite problem. When the substring is found, the probe is not retrying at all but just generates the alarm. The probe help seems to indicate that the retry on alarm situation has something to do with redirects, which would be consistent with the message you see in the log. The test page I am using does not have a redirect.

     

    What is the substring you are generating an alarm on? I would like to test with the same string. Maybe you can also try my test URL once I have it set up properly.

     

    BTW, what version of url_response? It might matter...



  • 10.  Re: URL Response Probe

    Posted Oct 04, 2013 04:36 PM
    Sorry for the stupidly late response, but I figured I'd just add my two cents' worth.

    According to Nimsoft, the documentation regarding "Retry page fetch on alarm situation" is incorrect. The wording implies that if you are unable to load the page, or if your content match fails, it would retry.

    However, for some absurd reason, this option is actually related to redirects.