DX Unified Infrastructure Management

  • 1.  Hub queue check tool

    Posted Jun 08, 2022 07:03 AM
    Edited by Luc Christiaens Mar 08, 2023 01:02 AM

    The attached Lua script queuecheck.lua (version 2.9.2) will:
    - monitor all hub and nas queues in your environment
    - send alarms when thresholds are reached (optional)
    - send email when thresholds are reached (optional)
    - create QoS metrics for dashboard overviews or Jaspersoft reports (optional)
    - a sample dashboard/PRD/Listview/CABI is included
    This is a UIM 20.4 version that can create valid QoS entries with a metricid when launched via NSA 20.30 or higher.
    The attached zip contains the Lua script, the doc file, a sample dashboard zip that can be imported, and the script used to send email.
    Sample dashboard:

    You can now select a queue to obtain more detail:

    In version 2.9.1:

    • you can now run without creating QoS metrics, only generating alarms
    • you can now run without generating alarms, only creating QoS metrics for reporting
    • there are now 3 threshold levels for the queued-messages alarm
    • as recommended, the script was modified to make use of local variables
    • added an example Listview report
    • each type of alarm can now have a custom severity level, or the value 'n' to not generate that alarm (see the configuration sketch after this list)
    • added a threshold and alarm for the internal nas queues
    • tested on 20.4.6 (CU6)
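
    To give an idea of what that looks like, here is a simplified configuration sketch. The variable names are illustrative only, not the actual parameters of queuecheck.lua (check the doc file in the zip for the real ones):

        -- Illustrative only: hypothetical configuration variables, not the real queuecheck.lua names.
        -- Three thresholds for the queued-messages alarm, each with its own severity,
        -- and 'n' to suppress a given alarm type entirely.
        local queued_thresholds = {
          { limit = 1000,  severity = 2 },   -- minor
          { limit = 5000,  severity = 4 },   -- major
          { limit = 10000, severity = 5 },   -- critical
        }
        local inactive_queue_severity = 3    -- severity for the "queue inactive" alarm
        local nas_queue_severity      = "n"  -- 'n' = do not generate this alarm type

        -- Pick the highest matching severity for a measured queue depth.
        local function severity_for(queued)
          local sev = nil
          for _, t in ipairs(queued_thresholds) do
            if queued >= t.limit then sev = t.severity end
          end
          return sev
        end

        print(severity_for(6200))  --> 4 (major)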

    In version 2.9.2:

    • added a controlled retry loop for accessing the hubs (instead of only 1 retry); the maximum number of iterations can be set in a variable (a sketch follows below)
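
    In simplified form the idea looks like this. It is only a sketch: the variable name and the gethubs callback are illustrative, and it assumes the NSA-provided nimbus.request() and sleep() functions, so it only runs inside NSA:

        -- Sketch of a controlled retry loop; names are illustrative, not the script's own.
        local max_iterations = 3   -- maximum attempts per hub request, set via a variable

        local function request_with_retry(address, command, args)
          for attempt = 1, max_iterations do
            local resp, rc = nimbus.request(address, command, args)
            if resp ~= nil then
              return resp                                  -- got an answer, stop retrying
            end
            print("attempt " .. attempt .. "/" .. max_iterations .. " failed for " ..
                  address .. "/" .. command .. " (rc=" .. tostring(rc) .. ")")
            sleep(2000)                                    -- back off (milliseconds) before the next try
          end
          return nil                                       -- caller decides what to do with an unreachable hub
        end

        local hublist = request_with_retry("hub", "gethubs")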

    The updated 2.9.2 package can be found at the end of the thread.
    All comments and ideas are very welcome.

    #uim #queue #check #tool #lua #threshold #monitor #hub #nas #script



  • 2.  RE: Hub queue check tool

    Posted Jun 08, 2022 09:58 AM
    Seems to be a delay on posts.....


  • 3.  RE: Hub queue check tool

    Posted Jun 09, 2022 09:57 AM
    The zip does not contain an example dashboard for the case where you set the parameter:
    target_source='target'
    Note: if you want to create PRD graphs of the created QoS metrics, you need to set this target_source parameter to 'target'.
    This means the QoS will be created with:
    - source column: your robot name
    - target column: the queue name prefixed by hubq_ or nasq_, and optionally the hub name (if parameter long_qos='y' is set)
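
    As an illustration of that naming, a small sketch; the exact separator and ordering used by queuecheck.lua may differ, so treat this purely as an example of the idea:

        -- Hypothetical illustration of the target naming described above; not the script's real code.
        local long_qos = "y"   -- when 'y', the hub name is added to the target

        local function build_target(queue_name, hub_name, is_nas_queue)
          local prefix = is_nas_queue and "nasq_" or "hubq_"
          local target = prefix .. queue_name
          if long_qos == "y" then
            target = target .. "_" .. hub_name   -- separator and order assumed for the example
          end
          return target
        end

        -- source stays the robot running the script; target identifies the queue:
        print(build_target("data_engine", "primaryhub", false))  --> hubq_data_engine_primaryhub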


  • 4.  RE: Hub queue check tool

    Posted Jun 10, 2022 10:16 AM
    Really appreciate the presentation of this content. 

    To highlight a couple things:

    I am really jealous of your attention to detail regarding documentation of your efforts. I strive to match it but understand that I'll never come close (I'm too lazy I think....)

    I struggled a long time trying to get the new QoS metrics feature to work right and never got it where I needed it to be. I read through your code and it looks just like mine, so I'll have to go back and compare again. The main issue I was having was the inability to specify a source that wasn't the robot running the script; not sure if that comes into play here. Regardless, having a working example is a huge benefit given the number of errors and omissions in the documentation and the physical implementation. For those who do try this: discovery_server has to vacuum up the new metric ids created locally before you'll see correct results, which could take a long time depending on how you have discovery configured to run.

    Were I writing this code for my environment I'd make these additions:

    Nothing in Nimsoft happens reliably 100% of the time, especially when the number of hubs gets large and/or you have something more than robots talking to a hub or a hub talking to a hub. As soon as you hit hub talking to hub talking to hub, the likelihood of any single communication failing (in my personal experience) starts being a two-digit percentage. Ultimately I've wound up wrapping all the nimbus.* calls in a retry loop so that can be addressed; the pattern looks roughly like the sketch below.
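
    Just a sketch of that wrapper idea, assuming the NSA-provided nimbus.request() and sleep() functions; the retry count and the getrobots call are only examples:

        -- Generic retry wrapper: call any function up to 'retries' times before giving up.
        local function with_retry(retries, fn, ...)
          for attempt = 1, retries do
            local ok, a, b = pcall(fn, ...)  -- pcall also catches calls that raise an error
            if ok and a ~= nil then
              return a, b                    -- success: hand back the function's results
            end
            sleep(1000)                      -- wait a second before the next attempt
          end
          return nil                         -- every attempt failed
        end

        -- Usage: retry a hub request up to 5 times.
        local resp = with_retry(5, nimbus.request, "hub", "getrobots")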

    I haven't tested with the latest Lua 5.4 (I think that's the current version), but the 5.2 version was very slow when accessing global variables. There's not a lot of looping in this code, but once you get a couple hundred accesses to a global variable you can measure the runtime difference against a local with a stopwatch. I've gotten in the habit of wrapping all code within a do/end pair and declaring everything local so nothing ends up in the global table by accident. I've not looked at the actual Lua interpreter code, but the usual explanation is that globals live in a table, so every access is a lookup by name, while locals are resolved at compile time to stack slots; the runtime is comparable with a small number of accesses but very quickly favors local variables as the number increases.
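
    A quick sketch of that habit in plain Lua (nothing specific to queuecheck.lua):

        -- Everything lives inside one do/end block and is declared local,
        -- so repeated accesses hit cheap local slots instead of the global table.
        do
          local total = 0
          local count = 100000

          for i = 1, count do
            total = total + i        -- 'total' and 'i' are locals: no global-table lookups
          end

          print(total)               -- 5000050000
        end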

    Similar to the above, everything in Lua winds up being a table, so when you execute nimbus.request, for instance, the code has to locate the global nimbus table by looking in the table of globals, then find the entry named "request" in that nimbus table, and then call that value as a function. It's a lot of lookups. You can save a bunch of that lookup time with something like "local NimbusRequest = nimbus.request" and then use the local NimbusRequest() to invoke the function. Again, it makes little difference if you hit it 10 times in a script, but if you do it a thousand times you can measure it.
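
    For example (a sketch; the addresses and the getrobots call are just placeholders):

        -- Cache the function reference once; each later call skips the two table lookups.
        local NimbusRequest = nimbus.request

        -- Example nimbus addresses, purely illustrative.
        local addresses = { "hub", "/domain/hub1/robot1/hub", "/domain/hub2/robot2/hub" }

        for _, addr in ipairs(addresses) do
          local resp, rc = NimbusRequest(addr, "getrobots")
          if resp ~= nil then
            -- work with the response table here
          end
        end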

    Thank you again

    Garin


  • 6.  RE: Hub queue check tool

    Posted Jun 10, 2022 11:17 AM
    Only took 48 hours for those posts to be moderated - wonder what happened to cause that.....


  • 7.  RE: Hub queue check tool

    Posted Aug 01, 2022 02:34 AM
    To have this Lua script generate correct QoS metrics (that can also be used in the OC metric view) you will need:
    nsa 20.50 build 190


  • 8.  RE: Hub queue check tool

    Posted Feb 26, 2023 05:46 AM

    Version 2.9.1 Lua package (doc file included in the zip file)




  • 9.  RE: Hub queue check tool

    Posted Mar 08, 2023 01:06 AM

    Version 2.9.2 Lua package (doc and dashboard included in the zip)

    See initial post for details


    Attachment(s)

    queuecheck_2.9.2.zip (zip, 1.43 MB)