Symantec Identity Management


IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

  • 1.  IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 05-20-2016 02:05 PM

    Team,

     

    If you plan to use the IMPS CLI tool, etautil, please consider using the input file switch (-f) for a performance gain.

     

    Comparison examples:

     

    Slowest: etautil is used for 100,000 updates, with the etautil command called on every line.

    etautil.exe -o -d im -u etaadmin -p Password01   ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000001

    etautil.exe -o -d im -u etaadmin -p Password01   ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000002

    etautil.exe -o -d im -u etaadmin -p Password01   ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000003

    ...

    etautil.exe -o -d im -u etaadmin -p Password01   ACTION_VERB_+_BASE_DN_+_ITEM_HERE_100000

     

    The reason is that every etautil invocation runs a built-in error-checking routine to validate the input text or file.

    If this routine takes 1 second, then 100,000 separate invocations add 100,000 seconds of overhead to the job.

     

    Observed Rate:    1 account per 4 seconds

    100,000 rows x 4 sec/row = 400,000 seconds = about 111 hours (roughly 4.6 days)

     

    *** ***

     

    Faster: etautil is used for 100,000 updates, but etautil is called ONCE and reads its input from a file.

    Create an input file, e.g. input.txt, where each line holds the content followed by a SEMICOLON (except the very last line).

     

    input.txt

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000001;

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000002;

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000003;

    ...

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_100000

     

    And call this input file ONCE with etautil.

     

    @echo on

    "C:\Program Files (x86)\CA\Identity Manager\Provisioning Server\bin\etautil.exe" -o -d im -u etaadmin -p Password01 -f input.txt

    pause

     

    Since etautil is only called once, it will only perform its error checking routine once for a 100 K row input file.

    If any part of this file is not formatted correctly, etautil will stop and error out.

    If there are incorrect attributes that don't exist in the IMPD user store, the etautil script will report this error but continue.

     

    Observed Rate:   15-17 accounts per second

    100,000 rows / (15 rows/sec) = 6,667 seconds = just under 2 hours
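
    A minimal Python sketch for generating the input file described above: one statement per line, a semicolon after every line except the last. The function name write_etautil_input is hypothetical, and the sketch assumes each statement fits on a single line.

```python
from pathlib import Path


def write_etautil_input(statements, path):
    """Write one statement per line, separated by semicolons.

    Every line except the last ends with ';', matching the input.txt
    layout shown in this post. Returns the number of statements written.
    """
    # Drop blank entries, surrounding whitespace, and any semicolon
    # the caller already appended, so semicolons are never doubled.
    cleaned = [s.strip().rstrip(";").rstrip() for s in statements if s.strip()]
    # Joining with ";\n" leaves the final line without a semicolon.
    Path(path).write_text(";\n".join(cleaned), encoding="utf-8")
    return len(cleaned)
```

    Usage would look like write_etautil_input(list_of_statements, "input.txt"), followed by the single etautil -f call shown above.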

     

    *** ***

     

    Even Faster:   PARALLEL Processing

    etautil is used for 100,000 updates, but the input file is broken into two (2) halves AND etautil is called ONCE for each file.

     

    Create a single input file, e.g. input.txt, where each line holds the content followed by a SEMICOLON (except the very last line).

     

    input.txt

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000001;

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000002;

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_000003;

    ...

    ACTION_VERB_+_BASE_DN_+_ITEM_HERE_100000

     

    Then split this file into two (2) sections: input_01.txt and input_02.txt (the last line of each file must not end with a semicolon; see the notes below).

     

    And call each file ONCE with etautil in two (2) different batch processes

     

    @echo on

    "C:\Program Files (x86)\CA\Identity Manager\Provisioning Server\bin\etautil.exe" -o -d im -u etaadmin -p Password01 -f input_01.txt

    pause

     

    @echo on

    "C:\Program Files (x86)\CA\Identity Manager\Provisioning Server\bin\etautil.exe" -o -d im -u etaadmin -p Password01 -f input_02.txt

    pause

     

    The IMPS server AND IMPD user store can handle up to 8-16 parallel scripts before a 4 vCPU IMPS server is 100% constrained.

    Again: since etautil is called only once per script, it performs its error-checking routine once for each half of the 100 K row input file.

    If any part of this file is not formatted correctly, etautil will stop and error out.

    If there are incorrect attributes that don't exist in the IMPD user store, the etautil script will report this error but continue.

     

    Observed Rate:   15-17 accounts per second    x    2

    100,000 rows / (30 rows/sec) = 3,333 seconds = < 1 hour
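
    The split-and-run-in-parallel approach above can be sketched in Python as follows. This is an illustration, not a tested loader: split_input and run_parallel are hypothetical names, the etautil path and switches are copied from the batch examples in this post, and the sketch assumes one statement per line.

```python
import subprocess
from pathlib import Path

# Path and switches copied from the batch examples in this post.
ETAUTIL = r"C:\Program Files (x86)\CA\Identity Manager\Provisioning Server\bin\etautil.exe"


def split_input(path, parts=2):
    """Split a semicolon-terminated input file into `parts` files.

    Output files are named <stem>_01<suffix>, <stem>_02<suffix>, ...
    The last line of every output file has no semicolon.
    """
    src = Path(path)
    lines = [ln.strip().rstrip(";").rstrip()
             for ln in src.read_text(encoding="utf-8").splitlines()
             if ln.strip()]
    size = -(-len(lines) // parts)  # ceiling division
    outputs = []
    for i in range(parts):
        chunk = lines[i * size:(i + 1) * size]
        if not chunk:
            break
        out = src.with_name(f"{src.stem}_{i + 1:02d}{src.suffix}")
        out.write_text(";\n".join(chunk), encoding="utf-8")
        outputs.append(out)
    return outputs


def run_parallel(files, domain="im", user="etaadmin", password="Password01"):
    """Start one etautil process per file, then wait for all of them.

    Returns the exit code of each process, in the same order as `files`.
    """
    procs = [subprocess.Popen([ETAUTIL, "-o", "-d", domain, "-u", user,
                               "-p", password, "-f", str(f)])
             for f in files]
    return [p.wait() for p in procs]
```

    For example, run_parallel(split_input("input.txt", parts=2)) would reproduce the two batch files above; a larger parts value stays within the 8-16 parallel scripts mentioned earlier only if chosen accordingly.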

     

    *** ***

     

     

    Fastest:   IAMCS Load Balancing - Avoid single thread connectors to endpoints + PARALLEL Processing

    Ensure that the IAMCS "Routing Rules" allow for Round Robin Load Balancing to the CCS and JCS connector servers.

    This can be validated by monitoring the CCS logs or JCS logs, when two (2) separate transactions are pushed through for a single endpoint.

    Transaction 1 should show up on IMPS Server #1

    Transaction 2 should show up on IMPS Server #2      (If not, then IAMCS is set up in a FAILOVER model)

     

     

    Observed Rate:   15-17 accounts per second    x    2   x  2

    100,000 rows / (60 rows/sec) = 1,667 seconds = 28 minutes
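
    The four estimates in this post follow from one division (rows divided by rows per second). A small Python sketch that reproduces them, using the lower bound of 15 accounts per second from the observed 15-17 range:

```python
ROWS = 100_000

# Observed rates from this post, in rows per second.
SCENARIOS = {
    "one etautil call per row": 1 / 4,            # 1 account per 4 seconds
    "single input file": 15,                      # 15-17 observed, lower bound
    "two parallel input files": 15 * 2,
    "two files + load-balanced servers": 15 * 2 * 2,
}


def load_time_seconds(rows, rate_per_second):
    """Elapsed time for `rows` updates at a steady rate."""
    return rows / rate_per_second


for name, rate in SCENARIOS.items():
    seconds = load_time_seconds(ROWS, rate)
    print(f"{name}: {seconds:,.0f} s = {seconds / 3600:.1f} h")
# one etautil call per row: 400,000 s = 111.1 h
# single input file: 6,667 s = 1.9 h
# two parallel input files: 3,333 s = 0.9 h
# two files + load-balanced servers: 1,667 s = 0.5 h
```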

     

     

     

    *** ***

     

     

    -  Notes:

    etautil Gotchas -

    1) Ensure the attribute values are wrapped with single or double quotes if there are spaces or UTF characters in the values

    2) When using the input file, ensure that each line ends with a semicolon character, except for the very last line.  This lets etautil know there are additional lines to process and when to stop.

    3)  Use the -DYN switch if any of the content contains a dynamic endpoint, e.g. CX connectors.    Not needed for standard OOTB connectors
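
    Gotcha 2 is easy to check before a long run. A minimal Python sketch, assuming one statement per line; check_input_file is a hypothetical helper and it does not validate quoting (gotcha 1) or etautil syntax.

```python
def check_input_file(text):
    """Return (line_number, problem) tuples for an etautil input file.

    Rule checked: every non-blank line ends with ';' except the last
    non-blank line, which must not.
    """
    problems = []
    # Keep original line numbers, ignore blank lines.
    lines = [(n, ln.rstrip())
             for n, ln in enumerate(text.splitlines(), start=1)
             if ln.strip()]
    for idx, (n, ln) in enumerate(lines):
        is_last = idx == len(lines) - 1
        if is_last and ln.endswith(";"):
            problems.append((n, "last line should not end with ';'"))
        elif not is_last and not ln.endswith(";"):
            problems.append((n, "missing trailing ';'"))
    return problems
```

    An empty list means the semicolon layout matches the rule; anything else lists the offending line numbers.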

     

     

    2016-05-21 Edit:   Added reference deck to support observations and testing methodology

     

     

     

    2018-03-15 Edit:   Add clarity to the IM provisioning tier "Routing Rules" that provide the load-balancing feature.

    Clarity on the IM Provisioning Routing Rules for IAMCS/CCS Connectors 

     - Note:   IAMCS (with embedded CCS or remote managed single CCS service) may only have a 1:1 relationship with a single CCS service.   See note above for more information.

     

     

     

     

    Cheers,

     

    A.



  • 2.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 05-23-2016 10:32 AM

    Wow - from about 5 days down to 28 minutes. Really nice work!!



  • 3.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 03-27-2018 09:05 PM

    Nice analysis Alan!



  • 4.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 10-24-2018 07:45 AM

    As a PM on a large implementation I wish I had found this earlier.  I was looking for parallel processing during activities that were being executed, but I didn't know what to look for.  I only recently learned it was etautil that was being run and found your great article through Google search.  In any case, pursuing this now, with the backup of this great article.  With more than one PS I assume we could get even greater throughput and we have an opportunity to take advantage still.

     

    In the interest of even greater efficiency I wonder if you can address the apparent inability to query/extract the user ID's in a given state that require a certain etautil execution?  Currently the team is casting a wide net, triggering some kind of sync for users that were just assigned a new role, but for which the connection to that endpoint has not been established yet.  I apologize for not being able to be precise in describing the exact scenario.  They are able to drill in through the UI on a user by user basis and see the missing connection, but doing that for thousands of users obviously isn't practical.  As a result they are executing for a much larger group that is known to have the role assigned, regardless of whether the connection has been made or not.

     

    Even if there is no method through the UI or a command line utility to query the same, I'm thinking through a safe, read-only query of the underlying DB, an informed admin might be able to piece together a query that could extract the user ID's in the state needing action.  I know this is vague, but interested in your thoughts.

     

    Thanks much!



  • 5.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 10-24-2018 12:37 PM

    I would frame this as a question of how to build better reports for extracts that can feed scripts or bulk loads.

     

    This is very possible.    The great feature of the IMPS provisioning tier is that it acts like a virtual directory with its connector tier.    You can directly query the IMPS provisioning tier and ALL managed endpoints via any standard LDAP client tool.

     

    I recommend teams use either Softerra LDAP Browser (free & read-only) or Apache Directory Studio (free) to build CSV extracts of LDAP queries for reports.    If this needs to be automated, it is possible to use CLI commands (ldapsearch/dxsearch) and then convert the LDIF output to a CSV file.
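
    For the automated route, the LDIF-to-CSV step can be sketched with the Python standard library. This is a minimal parser, not a full RFC 2849 implementation: it handles folded lines, base64 ("::") values and multi-valued attributes, and skips anything else. The function names and the "|" separator for multi-valued attributes are assumptions.

```python
import base64
import csv
import io


def ldif_to_rows(ldif_text):
    """Parse LDIF text into a list of {attribute: [values]} dictionaries."""
    # Unfold continuation lines: a line starting with one space
    # continues the previous line.
    unfolded = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and line.strip() and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    entries, current = [], {}
    for line in unfolded + [""]:          # trailing "" flushes the last entry
        if not line.strip():              # blank line ends an entry
            if "dn" in current:           # drops "version: 1" style headers
                entries.append(current)
            current = {}
            continue
        if line.startswith("#") or ":" not in line:
            continue                      # comments and unparseable lines
        name, _, rest = line.partition(":")
        if rest.startswith(":"):          # "attr:: <base64 value>"
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        current.setdefault(name, []).append(value)
    return entries


def rows_to_csv(entries, columns, multi_sep="|"):
    """Render the chosen attributes as CSV text, one entry per row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for entry in entries:
        writer.writerow([multi_sep.join(entry.get(c, [])) for c in columns])
    return out.getvalue()
```

    Typical use would be rows_to_csv(ldif_to_rows(open("export.ldif").read()), ["dn", "eTRoleDN"]), where export.ldif is whatever file the ldapsearch/dxsearch run produced.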

     

     

     

    Here is a view of using Softerra LDAP Browser to connect to the IMPS Service (TCP 20389/20390) to view a managed endpoint, perform the exact query, and extract the data to a CSV file, LDIF file, or other format.

     

     

     

     

    IMPS Connection Information:

     

    Select an IMPS Admin or Auditor account (IMPS Profile) Example: etaadmin


    Connection DN:
    eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta

    Password: Password of above IMPS account

    Port: 20389    (non-TLS)    or   20390 (TLS)
    Host: IMPS HostName (FQDN or DNS name)

     

     

     

     

     

    Softerra LDAP Administrator & Browser: Directory Management Tool Download 

     

    Note:   I recommend Softerra LDAP Browser for two (2) reasons:   It is free & READ-ONLY.

    - You can give it to auditors or non-technical resources to build their reports, and reassure them that they cannot change any data by accident.

    - Note2:   Softerra also offers a paid, full-featured tool, LDAP Administrator, which I do recommend for any technical resource that needs update functionality.   To select the Browser version, you will need to tab over to that section of the download page.

     

    Below message from Softerra web site: 

    Softerra LDAP Browser is a lightweight version of Softerra LDAP Administrator. It supports read-only operations that do not modify LDAP directory data, e.g. browsing, search, export, etc. For complete, fully functional management of LDAP directories you need Softerra LDAP Administrator.

     

     

     

    As an alternative, if the customer cannot install Softerra LDAP Browser, I recommend Apache Directory Studio.

    - There is a method to "extract" rather than "install" that allows use of this tool.

    Use an LDAP GUI client on a locked-down Workstation: JDK & Apache Directory Studio 

     

    Downloads — Apache Directory 

     

     

     

    The other LDAP client tools that I always use for queries are JXplorer and the CLI tools (etautil/ldapsearch/dxsearch).   However, these tools cannot export to CSV files by themselves.

    JXplorer - an open source LDAP browser 

     

     

     

    Here is the process I have built to pull the Provisioning Server's inclusion entries, in addition to queries against endpoint attributes, to allow us to build better etautil scripts.

     

     

    0 - Validate Explore and Correlation (E&C) operation has already occurred to the defined endpoints. This will create the EA pointers (explore) and inclusions (correlation) to GU


    1-  Acquire IMPS Endpoint in SoftTerra LDAPBrowser r4.5 Tool or similar LDAP tool (Apache Directory Studio) that will export to CSV file.
    a) BIND DN: eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta
    b) PORT: 20389/20390
    c) Ensure Timeout/LDAP results are set to large values or unlimited

    2 - Perform LDAP query on Inclusion Branch {Returns GU-EA; PR-AT; AT-EP;}
    a) IM r8.1 BASE DN: eTInclusionContainerName=Inclusions,eTNamespaceName=CommonObjects,dc=im,dc=eta
    IM r12.5
    b) FILTER: (&(objectClass=eTInclusionObject)(eTSuperiorClass=*))
    ATTRIBUTES: eTSuperiorClassEntry, eTSubordinateClassEntry

    3 - Perform LDAP query on GU Branch {Returns Provisioning Roles per GU}
    a) IM r8.1 BASE DN: eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta
    IM r12.5
    b) FILTER: (&(objectClass=eTGlobalUser)(eTGlobalUserName=*))
    ATTRIBUTES: dn, eTRoleDN

    4 - Perform LDAP query for Endpoint Branches (Repeat as needed) {Return AT per EA}
    a) IM r8.1 BASE DN: dc=im,dc=eta
    IM r12.5
    b) FILTER: (one) (&(objectClass=eTADSAccount)(eTADSAccountName=*))
    ATTRIBUTES: dn, eTPolicyDN

    5 - Save results to CSV (not MS EXCEL format) {Repeat for each query}
    a) Save Results to CSV to avoid possible format issue with SoftTerra and MS EXCEL built-in assumptions.

    6 - Open CSV file with MS Excel 2007/2010
    a) Row Count < 64,000 lines; file may be opened with MS Excel 2003
    b) Row Count < 1,048,576 lines; file may be opened with MS Excel 2007/2010
    c) If Row Count > 1,048,576
    i) Open the file with TextPad (or Notepad++) and split it into two or more files, each with Row Count < 1,048,576
    ii) Alternative: Perform the LDAP query on a sub-branch of the Inclusion OU to ensure Row Count < 1,048,576

    7 - Edit CSV file
    a) Save file as XLS/XLSX format
    b) Label TAB as Raw Data
    c) Create new TAB called Working
    d) Copy data from Raw Data Tab to Working Tab
    e) Manipulate data on Working Tab into eTautil CLI format
    f) Save Worksheet into TAB delimited format TXT file.

    8 - Edit TXT file
    a) Remove duplicate double quotes
    b) Ensure double quotes or single quotes wrap any attribute that has a space character.
    c) If the variable has a single quote, ensure that variable is wrapped by double quote.
    d) Ensure each line has a semi-colon after the statement; except for the very last line.
    e) Save updated file with a new TXT name
    f) Save 10 lines from file into another TXT name

    9 - Execute test of script
    a) Run the limited 10 line batch file with the etautil command as a feed file ( -f switch)
    i) etautil -d DOMAIN -u etaadmin -p password -f FILENAME.TXT     {note: DYN endpoints will need extra switch}
    e.g. etautil -d im -u etaadmin -p password -f Limited_10_line_inclusion_update_for_gu_and_endpoint_account.txt
    b) Monitor output from batch process.

    10 - Execute full load of script
    a) Run the full batch file with the etautil command as a feed file ( -f switch)
    i) etautil -d DOMAIN -u etaadmin -p password -f FILENAME.TXT
    e.g. etautil -d im -u etaadmin -p password -f Full_inclusion_update_for_gu_and_endpoint_account.txt
    b) Monitor output from batch process.
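
    Steps 1 through 5 above could also be scripted instead of using a GUI client. The sketch below uses the third-party ldap3 library (pip install ldap3) and has not been tested against an IMPS server. The bind DN, base DNs, filters and attributes are copied from steps 1 to 3 (listed there for IM r8.1); export_query, flatten and the "|" separator are hypothetical choices.

```python
import csv

# Bind DN from step 1a.
BIND_DN = ("eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,"
           "eTNamespaceName=CommonObjects,dc=im,dc=eta")

# name -> (base DN, filter, attributes), copied from steps 2 and 3.
QUERIES = {
    "inclusions": (
        "eTInclusionContainerName=Inclusions,eTNamespaceName=CommonObjects,dc=im,dc=eta",
        "(&(objectClass=eTInclusionObject)(eTSuperiorClass=*))",
        ["eTSuperiorClassEntry", "eTSubordinateClassEntry"],
    ),
    "global_users": (
        "eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta",
        "(&(objectClass=eTGlobalUser)(eTGlobalUserName=*))",
        ["eTRoleDN"],
    ),
}


def flatten(value, sep="|"):
    """Turn a missing, single, or multi-valued attribute into one CSV cell."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return sep.join(str(v) for v in value)
    return str(value)


def export_query(host, password, name, csv_path, port=20389):
    """Run one of QUERIES against the IMPS LDAP port and save the result as CSV."""
    from ldap3 import Server, Connection, SUBTREE  # imported lazily: third-party

    base_dn, ldap_filter, attributes = QUERIES[name]
    server = Server(host, port=port, use_ssl=(port == 20390))  # 20390 is the TLS port
    conn = Connection(server, user=BIND_DN, password=password,
                      read_only=True, auto_bind=True)
    conn.search(base_dn, ldap_filter, search_scope=SUBTREE, attributes=attributes)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["dn"] + attributes)
        for entry in conn.response:
            if entry.get("type") != "searchResEntry":
                continue  # skip referrals and other non-entry responses
            values = entry["attributes"]
            writer.writerow([entry["dn"]] + [flatten(values.get(a)) for a in attributes])
    conn.unbind()
```

    A call such as export_query("imps-host.example.com", "password", "inclusions", "inclusions.csv") would stand in for steps 2 and 5; the host name here is a placeholder. Server-side size and time limits (step 1c) still apply to a scripted query.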

     

     

     

     

     

     

    See if this helps.

     

     

    Cheers,

     

    Alan



  • 6.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 10-25-2018 07:41 AM

    Thanks very much for the thoughtful response.  I'm going to see if I might be able to get RO access credentials to allow me to work with the above - and I'll share with the team too.  Thanks again.



  • 7.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 10-25-2018 10:29 AM

    Excellent.   

     

    Once the exact query has been confirmed to have value, it can be added to the IM business rule engine, e.g. Policy Xpress.

     

    Then, depending on the trigger event, the PX rule will act upon this data element (LDAP query through the provisioning tier) to give the response or action needed.

     

    Side note(s):   

    - Use PX UI rules as much as possible (for the submitted task) for performance.

    - Avoid PX Event rules that use ModifyUserEvent as the trigger (excessive MUE triggers impact performance).

     

     

    The above PX UI rule would then address the issue you listed.     (or the ldap query could be used with a CLI script process).

       "... the apparent inability to query/extract the user ID's in a given state ..."

     

     

    Cheers,

     

    Alan



  • 8.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 11-15-2018 11:14 PM

    I still haven't had a chance to try out above.  The CA SI team has not been able to provide read-only access to PM directory.  We have a great need for it.  We have two different PM instances, both suffering from a similar problem.  We need to query which global users don't have a role assigned or don't have a healthy/sync'd endpoint. 

     

    However, while working on that I have been experimenting with some IM directory queries.  I started with your suggested tools - both Softerra and OpenLDAP.  However, I ended up discovering ldap3, a pure-Python LDAP library.  An ldapsearch batch file iterating serially through 400 IDs took over 5 minutes; combining ldap3 with Python's concurrent.futures brought that down to 5 seconds using the following.  You struck me as someone who might be interested.  If a hack like me can make it work, you'll do more.

     

    ldap3 Tutorial — ldap3 2.5.1 documentation  

     

    import concurrent.futures
    from ldap3 import Server, Connection, ASYNC
    from ldap3.utils.conv import escape_filter_chars

    # Read one ID per line; strip the newline and skip blank lines.
    with open('./IDs.txt', 'r') as f:
        IDS = [line.strip() for line in f if line.strip()]
    print(len(IDS))

    ATTRIBUTES = ['imEnabledState','imString52','imString67','employeeNumber','givenName','initials','sn',
                  'imEmployeeStatus','employeeType','imString59','imString54','imCostCenter','imString53',
                  'imString50','street','l','st','postalCode','imString55','imManagerEmployeeNumber','imString56',
                  'imString62','imString61','imString66','imString57','imString58','imString68','imString64','imString71',
                  'imString72','imString51','imString69','departmentNumber','imLoginID','mail']

    server = Server('servername', port=19289, use_ssl=True)
    # ASYNC strategy: search() returns a message ID; results are fetched with get_response().
    conn = Connection(server, 'uid=ldapreporting,ou=people,ou=company,ou=im,ou=ca,o=com', 'password',
                      client_strategy=ASYNC, check_names=True, read_only=True, raise_exceptions=True,
                      return_empty_attributes=True, auto_bind=True)

    def ldap_search(ID, timeout):
        """Search for one ID and return the matching entries as LDIF text."""
        response_id = conn.search('ou=people,ou=sdm,ou=im,ou=ca,o=com',
                                  '(imString62=' + escape_filter_chars(ID) + ')',
                                  attributes=ATTRIBUTES)
        response, result = conn.get_response(response_id, timeout=timeout)
        return conn.response_to_ldif(response)

    n = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        # Map each future back to the ID it was submitted for.
        future_to_ldap = {executor.submit(ldap_search, ID, 30): ID for ID in IDS}
        for future in concurrent.futures.as_completed(future_to_ldap):
            searched_id = future_to_ldap[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (searched_id, exc))
            else:
                n = n + 1
                print(n)
                print(data)



  • 9.  Re: IM CLI - Faster scripts for loads - up to 60 updates/second with two (2) IMPS Servers

    Posted 11-16-2018 03:25 PM

    Excellent work!

     

     

    You might like this process as well.

     

    DXsoak:  Stress Testing & Scaling the CA Identity Suite Provisioning Tier 

     

     

     

    I have been using this process to identify which "variables" of possible configuration changes provide the highest value:

    -  I will write this up in full, but quick observations.

     

     

     

     

    - Load Balancing for the IAMCS (JCS)

         -  Thread count must be above 2 to see this kick in.   Once it does, full round-robin load sharing will be observed.

        -   If using the remote IAMCS(JCS)/CCS connector server architecture, ensure that ONLY the JCS is stopped/started; do not start the CCS manually or set it to start automatically in NT services.    JCS must have full ownership of the CCS to ensure it is "aware" of the state of the CCS service.

     

     

    Table to manage variables:   

     -   Side note:   ADS managed endpoint should have 4 vCPU 8 GB RAM (minimal) & have the PDC emulator role:   netdom query pdc    (to assist with rapid password changes)

     

     

     

     

    Ensure the CCS server has the correct OS ENV settings:

     

     

     

    IAMCS(JCS) must have "ownership" of the start/stop of the CCS service.   This ensures that the data path is built correctly.    The system will "self-heal", but this may take 2-10 minutes, depending on the state of the CCS service.

        -  Avoid this concern by only stopping/starting the im_jcs service.

     

     

     

     

     

     

    Cheers,

     

    Alan

     

     

     

     

    P.S.

     

    As a baseline check, I sent my queries directly to the CCS service (20402/20403) to see what rate it could sustain:

      -  I was able to see over 500 updates/second with the OS ENV settings (these enforced endpoint reliability)