Layer7 API Management

 API Gateway: How to optimize API TAT?

MARUBUN SUPPORT posted Jun 03, 2025 07:09 AM

Hello Team,

Please let me know if you have any ideas about the following.
Or should I open a support case for this instead?

The following issue is currently occurring.
- issue: once every several minutes, the TATs (turnaround times) of multiple API calls are simultaneously delayed to about 1 sec.
  The average TAT is usually 0.119 sec.
- hardware: 8 machines
- API call rate: 156 calls/s per machine, about 1250 calls/s in total
- Product: Layer7 API Gateway v11.0
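 
As a rough sanity check on these figures, the average number of JDBC connections busy at any moment on one machine can be estimated with Little's law (concurrency = arrival rate x holding time). This is only a sketch: it assumes each gateway node has its own pool and that a call holds a connection for at most its whole TAT, neither of which I have confirmed. The function name is made up for illustration.

```python
def connections_in_use(calls_per_sec: float, hold_time_sec: float) -> float:
    """Little's law: average concurrency = arrival rate * holding time."""
    return calls_per_sec * hold_time_sec


per_machine_rate = 156   # calls/s per machine, from the figures above
avg_tat_sec = 0.119      # average TAT, used as an upper bound on hold time (assumption)

estimate = connections_in_use(per_machine_rate, avg_tat_sec)
print(round(estimate, 1))  # about 18.6 connections busy on average
```

Under these assumptions the steady-state demand is well below a pool size of 50, so the average load alone would not seem to need 100 or 200 connections.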
 
 
The following was tested to tune performance.
 
The following properties were changed:
Tasks > Data Sources > Manage JDBC Connections > JDBC Connection Properties > Pool Configuration
(1) Minimum Pool Size
(2) Maximum Pool Size
 
Test 1
- changed both (1) and (2) from 100 to 200, and
- called the API at 1700 calls/s.
Result of Test 1
- once every several minutes, the TAT of API calls was delayed to 1 sec, and
- the number of delayed API calls increased.
 
Test 2
- changed both (1) and (2) from 100 to 50, and
- called the API at 1700 calls/s.
Result of Test 2
- all TATs were less than 1 sec.
 
Questions
Q1. Why did Test 2 improve the TAT? What is the mechanism?
Q2. /dev/urandom is used instead of /dev/random when Java creates connections.
       Is it possible that some control related to connection creation is causing the TAT delay?
 
 
The customer also ran the following tests in an environment where
- a 1700 TPS load test was running, and
- once every 5 minutes, the TATs of multiple APIs were simultaneously delayed to 1 sec.
 
Test 3
- changed properties (1) and (2) from 100 to 200.
Result of Test 3
- the multiple delays occurred once every 2 or 3 minutes, and
- the number of delayed API calls increased.
 
Test 4
- changed properties (1) and (2) from 100 to 200, and
- upgraded RDS from m5.large (2 vCPU / 8 GB RAM) + gp2 (600 IOPS) to
  m5.xlarge (4 vCPU / 16 GB RAM) + gp3 (3000 IOPS).
Result of Test 4
- same as the result of Test 3.
 
Test 5
- kept m5.xlarge (4 vCPU / 16 GB RAM) + gp3 (3000 IOPS), and
- changed properties (1) and (2) back from 200 to 100.
Result of Test 5
- the multiple delays occurred once every 2 or 3 minutes, and
- the number of delayed API calls was lower than in Test 4.
 
Test 6
- kept m5.xlarge (4 vCPU / 16 GB RAM) + gp3 (3000 IOPS), and
- changed properties (1) and (2) from 100 to 50.
Result of Test 6
- no delay occurred.
 
Questions
Q3. The number of simultaneous delays increases as the number of requests from clients increases.
       The customer supposes that the load of each connect/disconnect grows with the request rate.
       Why does the number of simultaneous delays increase?
Q4. The delay issue improves when the number of connections to the OTK DB is decreased.
       What is the reason for that?
 
Q5. Which log records Connected/Disconnected/Exhausted/Waiting events for the connection pool?
       How can we get that log?
There has been progress in an additional investigation, so please allow me to add a new inquiry.
 
[Investigation]
When we checked the status of the JDBC⇔OTK database connections with the netstat command, the local ports of the connections changed at regular intervals.
In Policy Manager, the number of JDBC⇔OTK database connections was set to 100, and those 100 connections fell into 5 groups. The connections in each group were refreshed every 150 seconds, and the refreshes of successive groups were 30 seconds apart.
 
Time        Number of connections refreshed
11:09:17    08 connections (Connection group A refreshed)
11:09:47    10 connections (Connection group B refreshed)
11:10:17    20 connections (Connection group C refreshed)
11:10:47    13 connections (Connection group D refreshed)
11:11:17    49 connections (Connection group E refreshed) (the cycle repeats from here)
 
11:11:47    08 connections
11:12:17    10 connections
11:12:47    20 connections
11:13:17    13 connections
11:13:47    49 connections
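 
For reference, counts like the ones above can be gathered by comparing two netstat snapshots and counting the local ports that are new in the later one. The following is only a minimal sketch: it assumes Linux `netstat -tn` style output and that the OTK database listens on port 3306 (the MySQL default), and the addresses, ports and function names are made up for illustration.

```python
def db_local_ports(netstat_output: str, db_port: int = 3306) -> set[int]:
    """Return local ports of ESTABLISHED TCP connections to the given remote port."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        if foreign.rsplit(":", 1)[-1] != str(db_port):
            continue
        ports.add(int(local.rsplit(":", 1)[-1]))
    return ports


def refreshed_count(before: str, after: str, db_port: int = 3306) -> int:
    """Count connections whose local port appears only in the later snapshot."""
    return len(db_local_ports(after, db_port) - db_local_ports(before, db_port))


# Hypothetical snapshots (addresses and ports are invented for this example).
snapshot_1 = """\
tcp        0      0 10.0.0.5:48212          10.0.1.20:3306          ESTABLISHED
tcp        0      0 10.0.0.5:48214          10.0.1.20:3306          ESTABLISHED
tcp        0      0 10.0.0.5:51000          10.0.1.30:443           ESTABLISHED
"""
snapshot_2 = """\
tcp        0      0 10.0.0.5:48212          10.0.1.20:3306          ESTABLISHED
tcp        0      0 10.0.0.5:49900          10.0.1.20:3306          ESTABLISHED
tcp        0      0 10.0.0.5:48214          10.0.1.20:3306          TIME_WAIT
"""
print(refreshed_count(snapshot_1, snapshot_2))  # prints 1
```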
 
[Hypothesis]
The refresh cycle of the JDBC connection pool (closing an old connection and establishing a new one) appears to be 150 seconds, and new requests from the application have to wait to obtain a connection while a refresh is in progress.
Reducing the number of connections reduced the load of the refresh process, which in turn shortened the wait to obtain a connection.
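 
To put numbers on this hypothesis, the small sketch below summarizes the observed refresh counts. It uses only the values from the table in [Investigation]; the function name is made up for illustration.

```python
def burst_profile(counts: list[int], slot_sec: int) -> dict:
    """Summarize how unevenly the refreshes are spread over one cycle."""
    pool_size = sum(counts)
    largest = max(counts)
    return {
        "pool_size": pool_size,
        "cycle_sec": slot_sec * len(counts),
        "largest_burst": largest,
        "largest_share": largest / pool_size,
    }


# Refresh counts per 30-second slot, copied from the observation table.
observed = [8, 10, 20, 13, 49]
profile = burst_profile(observed, slot_sec=30)
print(profile)
```

The five groups add up to the configured 100 connections over a 150-second cycle, and the largest group replaces 49 of them (49% of the pool) at once. If the hypothesis is right, that burst is where I would expect the waiting to be worst.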
 
[Question]
We are currently checking various settings (Java, DB, network) to see which of them take effect and which connections are being refreshed.
Q6. What is the setting value for the refresh cycle of the JDBC connection pool?
Based on my hypothesis I think it is 150 seconds, but what is the actual value?
 
Q7. Can the refresh cycle of the JDBC connection pool be changed?
Is it possible to change the cycle using Policy Manager or other means? If so, could you please tell me the procedure?
 
Q8. Is it possible to configure the pool so that connections are not refreshed simultaneously and the load is spread over time?
Example) Refresh in 10 groups with a 150-second cycle at 15-second intervals.
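 
To make the example in Q8 concrete, here is a minimal sketch of the schedule I have in mind. It only illustrates the desired behavior; I do not know whether the gateway or its connection pool offers such a setting, and the function name is made up.

```python
def staggered_schedule(pool_size: int, groups: int, cycle_sec: int) -> list[tuple[int, int]]:
    """Split the pool into groups refreshed at even offsets within one cycle.

    Returns (offset_sec, connections_refreshed) pairs.
    """
    if groups <= 0 or cycle_sec % groups != 0:
        raise ValueError("cycle_sec must be a positive multiple of groups")
    interval = cycle_sec // groups
    base, extra = divmod(pool_size, groups)
    # The first `extra` groups take one more connection when the split is uneven.
    return [(i * interval, base + (1 if i < extra else 0)) for i in range(groups)]


schedule = staggered_schedule(pool_size=100, groups=10, cycle_sec=150)
for offset, count in schedule:
    print(f"+{offset:3d}s  {count} connections")
```

With these values every group refreshes 10 connections, so the largest burst would drop from the observed 49 connections to 10.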
 
Regards,
MARUBUN