CA Service Management

  • 1.  Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 11, 2017 07:10 AM

    CA officially supports the F5 load balancer, and an example configuration can be found in the Wiki article and its comments. However, we don't have an F5, so we're looking into alternative solutions. So far every solution we've tried has had shortcomings one way or another, making it not really fit for our needs. The implementation guide has a chapter that sounds promising, "Configure the Load Balancer", but it falls miles short: the only part actually talking about load balancing is "Configure the session persistence on each load balancer. For more information, see your load balancer document. This process ensures that a request coming from one application server is routed back to the same application server." This is complemented by the Wiki article and comments I mention above, but they all leave one major question unanswered: what does the user see when an application server is quiesced?

    I understand providing an example configuration for all the available software and hardware load balancers out there is impossible. What I'm looking for is whether anyone has ever managed to configure load balancing in a way that would meet our needs. Below are the key details of our environment and the requirements/wishes for the load balancer we use.

    Environment

    • Windows environment
    • 14.1 AA configuration with
      • 1 BG server
      • 1 stand-by server
      • 4 application servers
      • Using Cisco ACE for load balancing.

    Current scenario

    The Cisco ACE serves requests to a single host name and transparently redirects the traffic to the chosen app server, so the users only see the hostname of the LB. The LB is configured to monitor the HealthServlet response on each of the app servers and to take an app server out of the load balancing pool when the HealthServlet no longer returns an HTTP 200 response. The LB has sticky sessions configured to make sure the users always use the same app server.

    Challenges

    • As you might've already noticed in the scenario description, the LB health probes aren't the smartest ones. Whenever an app server is quiesced the LB instantly throws out every connected user instead of allowing them to log out gracefully and have their connection re-balanced.
    • Because of the first item, the possibilities the AA configuration theoretically grants us are lost; we can't do maintenance during service hours because the users would be disrupted in an unacceptable way. The quiesce mode with its "please log out" notification goes halfway, but the LB solutions we've seen so far do not play nice with it.
    • The HealthServlet returns a 5xx response even when the app server is still good to serve the existing users. The current LB solution can't tell a quiesced app server from one that is shut down.
    • Even if the above issues are resolved it still leaves the issue of sticky sessions and logout. For the duration of the quiesce the user would be logged out, unable to log back in but still stuck to the server being quiesced, which leads us to the last item...
    • Re-balancing of the connection seems to be rather hard. It seems the jsessionid (which most load balancers use for sticky sessions) is set somewhere inside the webengine. Creating a filter for Tomcat to remove the cookie from the response might address this issue, but it still leaves IIS unresolved.

    Goal

    We hope to retain a single host name towards the users, but so far none of the solutions capable of doing that has met the other requirements, so we're OK to let that requirement slide. In essence the requirements are these:

    • Sticky sessions until logout and a re-balancing of the connection on logout. This would spread the load evenly among remaining app servers when one is quiesced
    • Existing sessions must remain instead of instant re-balancing (and lost data) when the app server is quiesced. It seems nginx might be able to do this with the "drain" parameter on an upstream server, but I haven't seen it done yet so I'm a bit sceptical.
    • Single host name towards the users (negotiable)
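
    To make the first two requirements concrete, here is a minimal sketch of the routing behaviour we're after, written as a small Python model rather than a real load balancer configuration. The class name, the server names and the three states ("up", "draining", "down") are made up for illustration; "draining" stands for a quiesced app server that should keep its existing users but accept no new ones.

```python
class Pool:
    """Toy model of sticky sessions with draining (hypothetical, not a real LB API)."""

    def __init__(self, servers):
        # Every server starts in the "up" state.
        self.state = {s: "up" for s in servers}
        # Maps a session id to the server it is pinned to.
        self.sticky = {}

    def set_state(self, server, state):
        if state not in ("up", "draining", "down"):
            raise ValueError(state)
        self.state[server] = state
        if state == "down":
            # A server that is really gone loses all of its pinned sessions.
            self.sticky = {sid: s for sid, s in self.sticky.items() if s != server}

    def _load(self, server):
        # Number of sessions currently pinned to this server.
        return sum(1 for s in self.sticky.values() if s == server)

    def route(self, session_id):
        server = self.sticky.get(session_id)
        if server is not None and self.state[server] != "down":
            # Existing sessions stay where they are, even while draining.
            return server
        # New sessions only go to servers that are fully up.
        candidates = [s for s, st in self.state.items() if st == "up"]
        if not candidates:
            raise RuntimeError("no application server accepts new sessions")
        server = min(candidates, key=self._load)
        self.sticky[session_id] = server
        return server

    def logout(self, session_id):
        # Dropping the pin lets the next login be re-balanced.
        self.sticky.pop(session_id, None)
```

    With two servers, a user pinned to a draining server keeps landing on it until logout, and the next login goes to a server that is still up. Whether a given load balancer can be configured to behave like this is exactly the open question.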

    So do you have, or do you know of someone who has, a setup like this? If we let the single host name requirement go, even a simple round-robin balancer that doesn't channel the subsequent traffic through itself would work for us; we just haven't found a working solution yet.



  • 2.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 11, 2017 08:37 AM

    Hi Jussi,

    Great detail here, thanks for sharing with us. As you can see from the documentation, the only real tested solution is a BIG-IP F5 Load Balancer, which is the most popular load balancing solution these days. There are some folks using Cisco ACE load balancing appliances with SDM; however, I don't know if they are using the health servlet monitoring part of it, so I can't speak to that. Given that I can only provide you information from experience and from what is documented, unfortunately I don't have anything to really give you on this one. I think your biggest hurdle is that once you quiesce a server, the load balancer instantaneously drops all the sessions. I am really not sure how to get past that part. If you could figure out a way to have it continue those sessions until they are closed, that would be the best scenario, and I think it would allow the solution to work for you. There may be some folks out here who have gotten past that and may be willing to share their info with you.

    Anyone else have anything to share on this??  Anyone using Cisco ACE equipment with SDM in an AA configuration??

    Thanks,

    Jon I.



  • 3.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 11, 2017 08:49 AM

    Just to clarify, a solution without Cisco ACE is acceptable and even desirable; we're phasing out the ACEs as they're out of support life now. So anything goes, from Apache to nginx and beyond.



  • 4.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 11, 2017 09:39 AM

    Some folks have used Apache, but I haven't come across anyone using nginx for that. Those are also software load balancers, whereas F5 and ACE are hardware appliances, so they are very different things as far as configuration is concerned. We do recommend using hardware load balancing, but there may be folks out there who have gotten it to work using Apache or nginx and can give you some insight into how they have it configured.

    Thanks for the update!

    Jon



  • 5.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 11, 2017 01:21 PM

    Hello,

     

    I worked with a customer where we used NGINX+, the enterprise version of this load balancer. After 2 years everything is still working really well.

     

    The architecture was:

    ServiceDesk

    • 1 Background Server
    • 1 Standby Server
    • 8 Application Servers
    • 1300 concurrent analyst sessions.

    Catalog:

    • 2 Catalog Servers

    PAM:

    • 3 Nodes of PAM

     

    NOTE: Another customer that I know of used Cisco to balance everything and it works well.

     

    Regards,

    Yonatan Sosa Sanchez



  • 6.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 14, 2017 02:10 AM

    Can you confirm the load balancers (or only the nginx if you don't know about the other one) handled the quiesce mode gracefully without kicking the connected users out?

    This is our biggest challenge, and so far we have no solution in view that would include proper load balancing. We only have a half-way idea where the load balancer would do pure round-robin balancing: instead of handling all the sessions it would just redirect users with HTTP 302 and know nothing about the user's connection after that. This would tackle the quiesce issue but is otherwise inferior to any of the alternatives, so you can see why we're keen to find a proper solution.
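
    For reference, the half-way idea amounts to something like the following minimal sketch using only the Python standard library. The host names and port are placeholders, and a real deployment would at least need to skip app servers that are quiesced or down.

```python
import itertools
from http.server import BaseHTTPRequestHandler

# Hypothetical app server base URLs; replace with real host names.
APP_SERVERS = [
    "http://app1.example.com:8080",
    "http://app2.example.com:8080",
]

# Endless round-robin iterator over the app servers.
_rotation = itertools.cycle(APP_SERVERS)


def next_location(path: str) -> str:
    """Return the redirect target for the next request, keeping the path."""
    return next(_rotation) + path


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers every GET with a 302 to the next app server in the rotation."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", next_location(self.path))
        self.end_headers()
```

    It could be started with `HTTPServer(("", 8000), RedirectHandler).serve_forever()` from `http.server`. After the redirect the user talks to the app server directly, which is why the host name changes and why the balancer no longer knows anything about the session.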



  • 7.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 17, 2017 10:27 AM

    I'm going to start by saying I've not done this and I'm not a Load Balancer guy.  HOWEVER, we are planning our 17.x rollout and had all the exact same questions on setup that you outlined above.  After discussing with the CA Swat team, this is the information we were provided on how to use the Health Servlet.  With SD, don't just check for 200/500.  Actually evaluate the header information.

     

    Hopefully the information below, combined with the knowledge of your networking team who know the configuration options in your load balancer, will get you there. Really curious whether it helps, since this is where we're headed.

     

    From CA

    • If the server is not available, you would get the following as a header
      • AA-Server-Status: NOT OK
      • AA-Server-Role: BG or SB or AP
    • If the server is good, you would get:
      • AA-Server-Status: OK
      • AA-Server-Role: BG or SB or AP
    • If the server is not available and it is an AP (application server), you would get 1 extra header
      • AA-Quiesce-Status: APP_RUNNING or APP_QUIESCE_PENDING or APP_QUIESCED
    • In addition, if the time remaining for quiescing is greater than 0, you would get another header
      • AA-Quiesce-Time:

     

     

    So my anticipation is that we have 5 likely response conditions given the information above.

    1. Case 1: Server is up and running normally (balance as normal)
      1. AA-Server-Status: OK
      2. AA-Server-Role: AP
    2. Case 2: Server is up and running but going to quiesce soon (maintain current sessions, send new ones elsewhere)
      1. AA-Server-Status: NOT OK
      2. AA-Server-Role: AP
      3. AA-Quiesce-Status: APP_QUIESCE_PENDING
      4. AA-Quiesce-Time: some number
    3. Case 3: Server is unavailable in Quiesce mode for administrative reasons (kill all sessions, send everyone elsewhere)
      1. AA-Server-Status: NOT OK
      2. AA-Server-Role: AP
      3. AA-Quiesce-Status: APP_QUIESCED
    4. Case 4: Server is unavailable with SD stopped (kill all sessions, send everyone elsewhere)
      1. AA-Server-Status: NOT OK
      2. AA-Server-Role: AP
    5. Case 5: Health Servlet doesn’t respond at all (kill all sessions, send everyone elsewhere)
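
    As a rough illustration of how the five cases could be told apart, here is a minimal Python sketch that maps the Health Servlet response headers to an action. The header names and values are the ones listed above; the function name and the three action labels ("balance", "drain", "remove") are my own invention, and how you translate them into your load balancer's configuration is up to your networking team.

```python
from typing import Mapping, Optional


def classify_health(headers: Optional[Mapping[str, str]]) -> str:
    """Map Health Servlet headers to a hypothetical balancer action.

    Returns "balance" (case 1), "drain" (case 2) or "remove" (cases 3 to 5).
    Pass None when the probe got no response at all.
    """
    if headers is None:
        # Case 5: the Health Servlet did not respond.
        return "remove"

    # HTTP header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    if h.get("aa-server-status") == "OK":
        # Case 1: up and running normally.
        return "balance"

    if h.get("aa-quiesce-status") == "APP_QUIESCE_PENDING":
        # Case 2: keep current sessions, send new ones elsewhere.
        return "drain"

    # Case 3 (APP_QUIESCED) and case 4 (NOT OK without quiesce info).
    return "remove"
```

    For example, a response carrying "AA-Server-Status: NOT OK" and "AA-Quiesce-Status: APP_QUIESCE_PENDING" yields "drain", while the same status with APP_QUIESCED yields "remove".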


  • 8.  Re: Working (non-F5) load balancer setup in an AA configuration

    Posted Aug 21, 2017 06:40 AM

    Yup, you've laid out the exact same requirements we have. Cases 1, 3, 4 and 5 already work like a charm; it's the 2nd case that is giving me grey hair. So far I haven't seen or heard of a solution that works like this. My bet for the winning horse at the moment is nginx+ with its session draining capability, but I'm still missing confirmation from anyone that this is how they did it and that it's doable.