Layer7 Identity Management

Identity Manager Health Checks

  • 1.  Identity Manager Health Checks

    Posted 02-18-2015 12:09 PM

    What kinds of health metrics is everyone using for Identity Manager?  I'm looking for any good ideas that I might have overlooked.

     

    All of our servers have an automated log maintenance job that runs daily to zip log files older than 24 hours and delete log files older than 90 days.  I check the health of our Identity Manager infrastructure through a series of metrics.  I check these interactively (via script) every day at about 9:30am, right after my log file maintenance job finishes.  Many of these are also plugged into our company's official monitoring system so it will alert me if a service goes down.  The log thresholds are based on just watching each service to see how many files the system typically generates, since some are more prolific than others.

     

    • Automation server: # uncompressed log files older than today <= 3
    • Directory server: # DXserver services configured == # running
    • Directory server: Latest .zdb (binary) backup date == today
    • Directory server: Latest .ldif (text) backup date == Latest .zdb (binary) backup date
    • Directory server: # .zdb (binary) backups >= # non-Router DXservers
    • Directory server: # uncompressed .ldif (text) backups older than today <= # non-Router DXservers
    • Directory server: SNMP monitoring of each DXserver
    • JBoss server: # services configured == # running
    • JBoss server: IME status web page at /iam/im/status.jsp reports "OK"
    • JBoss server: JBoss status web page at /status?XML=true
      • memory used % < 80%
      • request error % (# request errors / # requests) < 40%
      • current thread % (# current threads / # max. threads) < 80%
      • busy thread % (# current busy threads / # current threads) < 80% (unless there is only 1 - the status page thread)
    • Provisioning server: # services configured == # running
    • Provisioning server: # uncompressed log files older than today <= # services * 4
    • Connector server: # services configured == # running
    • Connector server: # uncompressed log files older than today <= # services * 6
    • Reporting server: # services configured == # running
    • Reporting server: # uncompressed log files older than today <= # services + 20
    • Web Proxy server: # services configured == # running
    • Web Proxy server: # uncompressed log files older than today <= # services
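As a rough illustration, two of the checks in the list above (the count of uncompressed log files older than today, and the JBoss status-page percentages) might be sketched in Python as follows. This is a minimal sketch, not the script described in the post: the function names, the set of "compressed" suffixes, and the choice of maximum memory as the denominator for "memory used %" are all assumptions, and the raw numbers are presumed to have already been read from the status page by some other means.

```python
from datetime import date, datetime
from pathlib import Path

# Suffixes treated as "already compressed"; an assumption for this sketch.
COMPRESSED_SUFFIXES = {".zip", ".gz"}


def count_stale_uncompressed_logs(log_dir, today=None):
    """Count uncompressed files in log_dir that were last modified before today."""
    today = today or date.today()
    count = 0
    for path in Path(log_dir).iterdir():
        if not path.is_file() or path.suffix.lower() in COMPRESSED_SUFFIXES:
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime).date()
        if modified < today:
            count += 1
    return count


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def jboss_status_problems(used_memory, max_memory, error_count, request_count,
                          current_threads, busy_threads, max_threads):
    """Return the names of the JBoss thresholds from the list that are exceeded."""
    problems = []
    # Assumption: "memory used %" is measured against the maximum memory.
    if percent(used_memory, max_memory) >= 80:
        problems.append("memory used %")
    if percent(error_count, request_count) >= 40:
        problems.append("request error %")
    if percent(current_threads, max_threads) >= 80:
        problems.append("current thread %")
    # A lone thread is the one serving the status page, so skip the busy check.
    if current_threads > 1 and percent(busy_threads, current_threads) >= 80:
        problems.append("busy thread %")
    return problems
```

A caller would compare the log count against the per-server limit from the list (for example, `count <= services * 4` on a Provisioning server) and treat an empty list from `jboss_status_problems` as healthy.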


  • 2.  Re: Identity Manager Health Checks

    Posted 02-27-2015 01:58 PM

    Is anyone able to assist Eric with his question?

     

    Thank you

    Eric Laney wrote:

     

    What kinds of health metrics is everyone using for Identity Manager?  I'm looking for any good ideas that I might have overlooked.



  • 3.  Re: Identity Manager Health Checks

    Posted 04-07-2015 10:19 AM

    You may also want to monitor other items that can impact performance, such as:

    Number of records in task persistence store.

    Number of records in audit tables.

    In addition, for SNMP monitoring of Directory ensure that you are monitoring the health of the multi-write queues.
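A record-count check along these lines could be sketched as below. It is only an illustration under stated assumptions: the connection is any Python DB-API connection (the test uses an in-memory SQLite database as a stand-in), the table name passed in is hypothetical rather than an actual Identity Manager table, and the yellow/red thresholds are placeholders to be tuned for each environment.

```python
import sqlite3


def classify(count, yellow_at, red_at):
    """Map a row count to a status using caller-supplied thresholds."""
    if count >= red_at:
        return "red"
    if count >= yellow_at:
        return "yellow"
    return "green"


def table_row_count(conn, table):
    """Return the number of rows in a table via a DB-API connection."""
    # Table names cannot be bound as query parameters, so pass only
    # trusted, hard-coded names here.
    cursor = conn.cursor()
    cursor.execute(f'SELECT COUNT(*) FROM "{table}"')
    return cursor.fetchone()[0]


def demo():
    """Build a throwaway SQLite table and classify its row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_audit (id INTEGER)")  # hypothetical table
    conn.executemany("INSERT INTO demo_audit VALUES (?)", [(i,) for i in range(5)])
    count = table_row_count(conn, "demo_audit")
    conn.close()
    return count, classify(count, yellow_at=3, red_at=10)
```

Keeping the thresholds as arguments lets the same check be reused for the task persistence store and the audit tables with different limits.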



  • 4.  Re: Identity Manager Health Checks

    Posted 04-08-2015 04:07 PM

    Thanks, Sidney.  Do you have suggestions for what record counts are "green," "yellow," or "red"?  I'm currently working on some age-based record cleanup of the audit and archive tables in order to meet corporate records retention policies, so it shouldn't be difficult to add the record count checks as well.



  • 5.  Re: Identity Manager Health Checks

    Posted 04-07-2015 10:19 AM

    Hello,

    As far as CA Directory goes, you have everything in place. What I would like to know is what kind of SNMP monitoring (traps) you have in place. Can you please elaborate?

     

    Thanks,

    Hitesh Patel

    CA Support



  • 6.  Re: Identity Manager Health Checks

    Posted 04-08-2015 04:04 PM

    That's a great question.  This is very much a work in progress.  I need to follow up with the monitoring group to see what they configured (if anything) out of the MIB I sent them.  Based on the alerts I've gotten so far, it appears that they just have a simple up/down check in place.



  • 7.  Re: Identity Manager Health Checks

    Posted 04-24-2015 11:18 AM

    I'm still working with the team on what statistics to poll for an evaluation of server health, but at this time I have set up SNMP alerts for:

    • failed authentication (set auth-trap = true;)
    • multi-write replication error (set multi-write-error-trap = true;)
    • operation error (set op-error-trap = true;)
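To confirm that these three settings are actually switched on, a small check over the text of a DXserver configuration file could look like the sketch below. The setting names and the `set name = value;` form are taken from the list above; everything else is an assumption for illustration, including the function name, the idea that the settings sit one per line in a single text file, and the treatment of lines starting with `#` as comments.

```python
import re

# Setting names taken from the list above.
TRAP_SETTINGS = ("auth-trap", "multi-write-error-trap", "op-error-trap")

# Matches lines such as: set auth-trap = true;
SET_LINE = re.compile(r"^\s*set\s+([\w-]+)\s*=\s*(\w+)\s*;", re.IGNORECASE)


def enabled_traps(config_text):
    """Return {setting: bool} showing which trap settings are set to true."""
    state = {name: False for name in TRAP_SETTINGS}
    for line in config_text.splitlines():
        match = SET_LINE.match(line)
        if not match:
            continue  # blank lines, comments, and unrelated commands
        name = match.group(1).lower()
        if name in state:
            # A later line overrides an earlier one for the same setting.
            state[name] = match.group(2).lower() == "true"
    return state
```

Any setting that comes back `False` is either missing from the text or explicitly disabled, which makes it easy to flag in a daily health script.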


  • 8.  Re: Identity Manager Health Checks

    Posted 11-21-2017 12:44 PM

    Hi,

     

    It would be nice to have a set of predefined health checks in CA Identity Suite. We can start with the snmpd service of the vApp's embedded OS and add the processes monitored by the CA Identity Portal Dashboard.

    Can Product Management endorse this?

    Thanks

    Attachment: snmpd.conf.zip (5K)


  • 9.  Re: Identity Manager Health Checks

    Posted 11-22-2017 09:16 AM

    Good idea. Could you create an Idea item in this forum?