Are there any API gateway experts with experience scaling the gateway against DDOS attacks? The idea is to scale our gateway so that, while hitting rate limiting assertions, we can process a load from an attack consuming 1 Gbps of bandwidth. Has anyone done something similar? Is there any idea of how much CPU we would need when hitting rate limits and dropping/responding to DDOS clients? Any input or experience would be greatly appreciated.
There are SO many different kinds of DDOS you could be talking about.
For an ICMP DDOS (ping flood), that's a router thing, not a gateway thing: specifically, rate limiting in the router itself.
For a fully formed HTTP request flood, it comes down to what the attack looks like... Load balancers in HTTP mode can help you here, as they can usually do valid-URL filtering too.
If the attack gets past the load balancer and it's just heavy random queries, the CPU in two or three 16-core hardware gateways, or four to six 8-core VMware gateways, should be plenty to process 1 gigabit. Overall CPU consumption should be fairly minimal as long as you don't have hundreds of wildcard services. There's a workaround assertion to use in a global message-received policy if you do.
The strategy for handling that kind of traffic is to discard queries as quickly as possible. The gateway only processes calls that have valid URLs. Assuming the attackers have found valid URLs and are attacking a specific API, there are several ways to limit the attack: requiring and validating credentials is the most obvious and cheapest, as we heavily cache authentications. By using the Customize Error Response assertion, you can drop the connection directly from policy.
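To make the discard-early idea concrete, here is a minimal sketch in Python of the ordering it describes: reject unknown URLs first, then uncached/invalid credentials, and only then do real work. This is illustrative only; on the Gateway this is expressed as policy assertions, not code, and the names (`VALID_PATHS`, `auth_cache`, `handle`) are hypothetical.

```python
# Illustrative sketch of discard-as-early-as-possible ordering.
# On the real Gateway this is policy, not application code.

VALID_PATHS = {"/orders", "/customers"}   # hypothetical published service URLs
auth_cache = {}                           # token -> bool; mimics cached authentications

def check_credential(token):
    """Stand-in for an expensive check (e.g. a directory lookup).
    The result is cached, so repeat hits from the same client are cheap."""
    if token not in auth_cache:
        auth_cache[token] = token.startswith("valid-")  # fake validation rule
    return auth_cache[token]

def handle(path, token):
    # 1. Cheapest check first: discard anything that is not a published URL.
    if path not in VALID_PATHS:
        return "drop"      # analogous to dropping the connection from policy
    # 2. Next: require and validate a credential (cached after the first hit).
    if not check_credential(token):
        return "drop"
    # 3. Only now do the real (expensive) routing/processing work.
    return "process"
```

The point of the ordering is that an attacker who never gets past step 1 or 2 costs you almost nothing per request.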
Customize Error Response Assertion - CA API Gateway - 9.2 - CA Technologies Documentation
Part of the struggle here is how many ways you can DDOS HTTP-based systems. What kind of attack are you thinking about mitigating at the Gateway layer (as compared to the firewall or load balancer)?
Thanks, that is already some amazing input. Discarding queries as quickly as possible seems to be of utmost importance. I am certainly interested in that workaround assertion, as we are building more REST APIs and getting more and more wildcard endpoints.
We are indeed mostly looking at attacks that get up to the gateway. We have two 2-core gateways at the moment, but as I understand it this is likely not sufficient. We are also looking at using the rate-limiting assertions to help discard the overload before doing additional logic, for example using multiple rate limits sequentially (for the entire policy, for a client IP, for a logged-in user). Any feedback is welcome.
Using rate limits in branches of policies with customized keys is a tried and very effective way to discard requests.
For instance, based on API composition, there may be no good reason for 30 extremely similar queries per second, which might well be the case in a DDOS.
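The "rate limit with a customized key" pattern can be sketched as a per-key token bucket. In real policy the key would be a context-variable expression (client IP, authenticated user, URL); here it is just a string. The 30-per-second figure follows the example above, and the class itself is an illustration, not a Gateway API.

```python
import time
from collections import defaultdict

class KeyedRateLimiter:
    """Token bucket per key: allows `rate` requests per second per key,
    with a burst of up to `burst` requests. Single-threaded sketch."""

    def __init__(self, rate, burst=None):
        self.rate = rate
        self.burst = burst if burst is not None else rate
        self.state = defaultdict(lambda: None)  # key -> (tokens, last_seen)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.state[key]
        tokens, last = entry if entry else (self.burst, now)
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.state[key] = (tokens - 1, now)
            return True   # under the limit: let the policy continue
        self.state[key] = (tokens, now)
        return False      # over the limit: discard immediately
```

Chaining several of these (one keyed on the policy, one on the client IP, one on the user) mirrors the sequential rate limits discussed earlier: a request must pass every layer before any expensive logic runs.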
Similarly, limiting to 30 concurrent requests from the same logged-in user (as might be the case if the attacker has compromised only one account) would be effective as well; concurrency limiting is built into the Rate Limit assertion.
Note that the Rate Limit assertion, not Throughput Quota, is preferred: Throughput Quota is database-backed and has TPS limitations when the keys are complex, while Rate Limit has no such limitation.
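A per-key concurrency limit (a cap on in-flight requests, as opposed to a rate over time) can be sketched as a simple counter. The threshold of 30 follows the example above; the class and method names are illustrative, and a real gateway would do this with atomic counters rather than this single-threaded sketch.

```python
from collections import Counter

class ConcurrencyLimiter:
    """Caps in-flight requests per key (e.g. per logged-in user).
    Single-threaded sketch; not safe for concurrent use as written."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.in_flight = Counter()   # key -> current in-flight count

    def try_acquire(self, key):
        if self.in_flight[key] >= self.max_concurrent:
            return False             # over the concurrency limit: reject now
        self.in_flight[key] += 1
        return True

    def release(self, key):
        # Must be called when the request finishes, success or failure.
        self.in_flight[key] -= 1
```

The difference from a rate limit is that slow responses count against the attacker here: a flood of long-running requests from one compromised account is capped regardless of how they are spread over time.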