XAuthRadius Failover Issue

  • 1.  XAuthRadius Failover Issue

    Posted 06-20-2019 02:10 PM
    Hi All,

    We have XAuthRadius v6.2 integrated with our CA SSO R12.7 infrastructure. The setup enables MFA: an NPS server acts as the Radius server and the CA SSO Policy Server (PS) is the Radius client. The configuration is working as expected.

    However, we have run into an issue with failover. According to the XAuthRadius documentation, load balancing is not supported and a request fails over to the 2nd NPS server only when the 1st fails. However, we are seeing concurrent sessions to both NPS servers, resulting in the CA SSO PS becoming unresponsive. When we remove the additional NPS IPs from the radiusconfig file and leave a single NPS IP, there is no issue.

    Has anyone faced similar issues before? Any suggestions or input would be greatly appreciated.


    ------------------------------
    Thanks and Regards,
    Nirmala
    ------------------------------


  • 2.  RE: XAuthRadius Failover Issue

    Posted 06-25-2019 03:48 AM
    Hi Team,

    Any suggestions on this issue please?

    Thanks,
    Nirmala



  • 3.  RE: XAuthRadius Failover Issue

    Posted 06-26-2019 02:29 AM
    Hi Nirmala,

    As you run XAuthRadius integrated with MFA, could you test without MFA,
    with only the Radius Authentication Scheme enabled, and see whether
    the requests are still distributed to both NPS servers?

    You mentioned :

    "When we disable multiple NPS IP's in radiusconfig [...]"

    Could you share the config file with us?

    Best Regards,
    Patrick


  • 4.  RE: XAuthRadius Failover Issue

    Posted 06-26-2019 03:45 AM
    Hi Patrick,

    The radiusconfig file in our environment looks as shown below. When I say we disable NPS servers, I mean removing their IPs from the list.

    default.ip=<NPS IP1> <NPS IP2> <NPS IP3>
    default.secret=**********
    default.port=1645
    default.timeout=300
    default.retries=5
    default.reactivate=300

    Regarding disabling MFA and testing only the Auth scheme: the issue occurred in Prod and we cannot impact Prod users, so we are trying to replicate it in our lower environment first.

    Any suggestions would really help here.

    Thanks,
    Nirmala


  • 5.  RE: XAuthRadius Failover Issue

    Posted 06-26-2019 04:56 AM
    Hi Nirmala,

    I'm surprised that you see requests distributed among several of
    the Radius servers, because the protocol itself doesn't support
    load balancing:

    XauthRADIUS Integration for CA Single Sign-On Installation and Configuration Version 6.3

    RADIUS server failover

    The RADIUS protocol does not provide for round robin or load
    balancing of RADIUS servers;

    SmXauthRADIUS Installation Guide.pdf

    Are you sure you see requests distributed among all the Radius servers,
    and not just connections? How are you observing them?

    Also, the config file seems to be OK:

    Configuration File Format

    1. IP Description:

    • it begins with the name of the RADIUS server (comparisons are case insensitive)
    • followed by a period
    • followed by the word ip
    • followed by an equals sign
    • followed by the IP number

    To enable RADIUS Server failover a space and an additional IP
    address may be entered. An unlimited number of additional IP
    addresses may be specified using this notation.
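
    To sanity-check a radiusconfig file against the format quoted above, a small parser can be sketched as follows. This is only an illustration: `parse_radiusconfig` is a hypothetical helper, not part of XAuthRadius, and the IP addresses in the sample are placeholders. It handles only the `name.key=value` shape shown in this thread.

```python
def parse_radiusconfig(text):
    """Parse 'name.key=value' lines into {name: {key: value}}.

    Server names are lower-cased because the guide says name comparisons
    are case insensitive. The 'ip' value is split on whitespace, so the
    first entry is the primary server and the rest are failover servers.
    """
    servers = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        key, value = line.split("=", 1)
        name, _, field = key.strip().rpartition(".")
        if not name:
            continue  # no 'name.' prefix, not a server setting
        entry = servers.setdefault(name.lower(), {})
        field = field.lower()
        value = value.strip()
        entry[field] = value.split() if field == "ip" else value
    return servers


# Placeholder addresses standing in for <NPS IP1> <NPS IP2> <NPS IP3>.
sample = """\
default.ip=10.0.0.11 10.0.0.12 10.0.0.13
default.port=1645
default.timeout=300
default.retries=5
default.reactivate=300
"""

config = parse_radiusconfig(sample)
primary, *failover = config["default"]["ip"]
print("primary:", primary)
print("failover:", failover)
```

    Listing the parsed addresses this way makes it easy to confirm which server should receive all traffic while it is healthy.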

    That said, if you start the Policy Server with the XAuthRadius
    module and see requests going to all the defined Radius servers,
    then you should open a Support case and provide:

    - Full Policy Server logs and traces
    For the traces, enable the Profiler with all Components and all Data;

    - radiusconfig file;

    - Full Network traces from the Policy Server :

    WireShark if running on Windows;

    If running on Linux :

    D.3. tcpdump: Capturing with "tcpdump" for viewing with Wireshark
    tcpdump -i <interface> -s 65535 -w <some-file>
    https://www.wireshark.org/docs/wsug_html_chunked/AppToolstcpdump.html

    If running on Unix :

    snoop
    snoop -r -o arp11.snoop -q -d nxge0 -c 150000
    https://wiki.wireshark.org/snoop

    Please also specify the Policy Server, XAuthRadius, OS and Radius server versions.
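
    Once a capture exists, one way to check whether Access-Requests are really spread across the Radius servers (rather than just connections being open) is to count them per destination IP. The sketch below uses only the Python standard library and makes several assumptions: the file is a classic pcap as written by `tcpdump -w` (not pcapng), the link type is Ethernet, traffic is IPv4 without VLAN tags, and the Radius port is 1645 as in the radiusconfig posted earlier. `count_access_requests` is a hypothetical helper name.

```python
import struct
from collections import Counter

RADIUS_PORT = 1645      # default.port from the radiusconfig in this thread
ACCESS_REQUEST = 1      # RADIUS packet code for Access-Request (RFC 2865)


def count_access_requests(pcap_bytes, port=RADIUS_PORT):
    """Count RADIUS Access-Request packets per destination IP."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"    # capture written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"    # capture written on a big-endian host
    else:
        raise ValueError("not a classic pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    offset = 24         # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip_header_len = (frame[14] & 0x0F) * 4
        if frame[23] != 17:             # IP protocol 17 is UDP
            continue
        udp = frame[14 + ip_header_len:]
        if len(udp) < 9:                # 8-byte UDP header + RADIUS code
            continue
        dst_port = struct.unpack("!H", udp[2:4])[0]
        if dst_port == port and udp[8] == ACCESS_REQUEST:
            dst_ip = ".".join(str(b) for b in frame[30:34])
            counts[dst_ip] += 1
    return counts
```

    A possible usage, with `capture.pcap` as a placeholder file name: `count_access_requests(open("capture.pcap", "rb").read())`. If only the first configured server appears in the result while it is healthy, the module is behaving as documented.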

    I hope this helps,

    Best Regards,
    Patrick


  • 6.  RE: XAuthRadius Failover Issue

    Posted 06-26-2019 01:28 PM
    Hi Patrick,

    We tried replicating this issue in the lower environment and, unfortunately, we are unable to reproduce it there. We notice that the XAuthRadius module sends requests to a single NPS server unless there is a failover. We are not observing any distribution of requests among the NPS servers. However, the user load (number of requests) in the lower environment is drastically lower than in Prod.

    Is any tuning required on the CA SSO PS side to handle Radius requests, considering that the CA SSO PS has to wait some time for a successful authentication message from NPS while NPS performs primary and secondary authentication? Or have any issues been raised earlier about the CA SSO PS becoming unresponsive while handling multiple Radius requests?

    In Prod, we see a request go to the 1st NPS server and another request go to the 2nd NPS server, and the user is challenged by the 2nd NPS server even though the 1st is active. The user also gets multiple MFA notifications (e.g. multiple SMS messages from Microsoft). We don't observe this in the lower environment.

    CA SSO PS version is R12.7 and XAuthRadius version is 6.2.

    We do plan to open a case with CA on the same. However, any suggestions here will help.

    Thanks,
    Nirmala


  • 7.  RE: XAuthRadius Failover Issue

    Posted 06-27-2019 03:46 AM
    Hi Nirmala,

    Do you have a firewall or load balancer between the Policy Server and
    the Radius servers? A load balancer might explain why requests are
    distributed. Or does one Radius server take longer to connect to than the others?

    Best Regards,
    Patrick


  • 8.  RE: XAuthRadius Failover Issue

    Posted 07-01-2019 03:47 AM
    Hi Patrick,

    No, there are no load balancers or firewalls between the CA SSO PS and the Radius servers. We tried the same scenario in the lower environment and could not replicate it.

    The number of users in Production is drastically higher than in the lower environment. Do you think high user load could be causing this issue in Prod? Additionally, do we need to change any parameters for the CA SSO PS to handle Radius protocol requests? This is the first time we are using the Radius protocol in our environment.

    Thanks,
    Nirmala


  • 9.  RE: XAuthRadius Failover Issue
    Best Answer

    Posted 07-01-2019 08:45 AM
    Hi Nirmala,

    With load, you might need to check the performance of the Radius
    servers. If one Radius server is intermittently overloaded, then it
    may be possible that the XAuthRadius module fails over to
    the next Radius server, and as such, over time, you see
    connections to all the Radius servers.

    It's a thought. But again, to troubleshoot this, you need the full
    traces from the SiteMinder side and the Radius servers to determine
    what causes the connections to all the Radius servers.

    I hope this helps,

    Best Regards,
    Patrick


  • 10.  RE: XAuthRadius Failover Issue

    Posted 09-13-2019 06:48 AM
    Hi Patrick,

    We have still been unable to identify the cause of this issue. We have raised a CA case for it, but no luck so far. Would you be able to assist here?

    1. We still see connections from XAuthRadius to all 4 NPS servers (which are configured in failover mode) even though the 1st NPS is up and running.
    2. The CA SSO PS becomes unresponsive after a few MFA requests are processed.
    3. Can we use a load balancer to balance the NPS servers? Does XAuthRadius support this? We are planning to configure a single VIP on the XAuthRadius module, with the 4 NPS servers as pool members behind that VIP, so that it distributes the requests. Will this method work?
    4. Currently we have set the timeout in the radiusconfig file to 5 minutes. However, we still intermittently see timeout errors. What is the maximum timeout value we can use, and would increasing it impact performance?
    default.ip=<NPS IP1> <NPS IP2> <NPS IP3>
    default.secret=**********
    default.port=1645
    default.timeout=300
    default.retries=5
    default.reactivate=300
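
    For question 4, a rough upper bound on how long a single authentication could wait with these values can be sketched as below. This rests on assumptions the thread does not confirm: that `default.retries` is the number of attempts made against each server, that each attempt waits the full `default.timeout` (300 seconds, i.e. the 5 minutes mentioned above), and that servers are tried one after another. `worst_case_block_seconds` is a hypothetical helper, not an XAuthRadius function.

```python
def worst_case_block_seconds(timeout_s, attempts_per_server, server_count):
    """Upper bound on the wait for one authentication if no server answers.

    Assumption (not confirmed by the XAuthRadius documentation): every
    server gets attempts_per_server tries, each waiting timeout_s seconds,
    and servers are tried sequentially.
    """
    return timeout_s * attempts_per_server * server_count


# Values from the radiusconfig above: timeout=300, retries=5, three IPs.
current = worst_case_block_seconds(300, 5, 3)
# A shorter, purely illustrative timeout for comparison.
shorter = worst_case_block_seconds(30, 5, 3)

print(f"timeout=300: up to {current} s ({current // 60} min) per request")
print(f"timeout=30:  up to {shorter} s ({shorter / 60:.1f} min) per request")
```

    Under these assumptions, a larger timeout lengthens the time a request can stay outstanding, which is worth keeping in mind when the Policy Server becomes unresponsive under load.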

    Any help here would really be appreciated.

    Thanks,
    Nirmala


  • 11.  RE: XAuthRadius Failover Issue

    Posted 09-16-2019 02:48 AM
    Hi Nirmala,

    To identify the root cause, you need the full
    traces from the SiteMinder side and the Radius servers to determine
    what causes the connections to all the Radius servers.

    Then we might find a way to fix it.

    Best Regards,
    Patrick