Description
The config-assessment-tool (v1.7.2) consistently fails with asyncio.exceptions.TimeoutError during the "Extracting Agent Details" phase (getAppServerAgents) when running against large SaaS controllers. The root cause is aiohttp's default 5-minute total timeout, which applies because aiohttp.ClientSession() in backend/api/appd/AuthMethod.py is created without a timeout argument.
Environment
- Tool version: v1.7.2 (Windows executable bundle)
- Controller: Large SaaS controller
- Controller size: 500+ APM applications, 4300+ servers, 1048 dashboards
- Auth method: API Client (token)
- OS: Windows
Steps to Reproduce
- Configure DefaultJob.json to point to a large SaaS controller
- Run: config-assessment-tool.exe -j DefaultJob -t DefaultThresholds -c 1
- Tool authenticates successfully, extracts APM/EUM/MRUM applications, servers, and dashboards without issue
- Tool reaches the "Extracting Agent Details" phase and calls getAppServerAgents
- The controller takes longer than 5 minutes to respond due to the large number of agents
- Tool crashes with asyncio.exceptions.TimeoutError
Error Output
[ERROR] root run: Traceback (most recent call last):
File "uplink/clients/io/asyncio_strategy.py", line 17, in invoke
File "uplink/hooks.py", line 109, in handle_exception
File "six.py", line 719, in reraise
File "uplink/clients/io/asyncio_strategy.py", line 17, in invoke
File "uplink/clients/aiohttp_.py", line 135, in send
File "aiohttp/client.py", line 544, in _request
File "aiohttp/client_reqrep.py", line 905, in start
File "aiohttp/helpers.py", line 656, in __exit__
asyncio.exceptions.TimeoutError
Root Cause
In backend/api/appd/AuthMethod.py, aiohttp.ClientSession() is created without a custom timeout parameter (lines ~64 and ~371), so it falls back to aiohttp's built-in 5-minute (total=300s) timeout.
For large controllers with thousands of agents, the controller can take 30+ minutes to respond to the getAppServerAgents API call. The 5-minute default is insufficient.
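One possible direction for a fix is to pass an explicit aiohttp.ClientTimeout when the session is created. The following is only a minimal sketch, not the tool's actual code: the constant AGENT_EXTRACTION_TIMEOUT_SECONDS and the helpers build_timeout and open_session are hypothetical names, and the one-hour value is an assumption chosen to cover the 30+ minute responses described above.

```python
import asyncio

import aiohttp

# Hypothetical value: long enough for the 30+ minute responses seen on
# large controllers. Ideally this would be configurable by the user.
AGENT_EXTRACTION_TIMEOUT_SECONDS = 60 * 60


def build_timeout(total_seconds=AGENT_EXTRACTION_TIMEOUT_SECONDS):
    """Return a ClientTimeout whose total limit replaces the 300 s default."""
    return aiohttp.ClientTimeout(total=total_seconds)


async def open_session():
    # Passing timeout= applies the limit to every request made through
    # this session, instead of aiohttp's built-in total=300.
    async with aiohttp.ClientSession(timeout=build_timeout()) as session:
        # No request is sent here; we only report the effective limit.
        return session.timeout.total


if __name__ == "__main__":
    print(asyncio.run(open_session()))
```

Exposing the value through the job file or a command-line option would let users with large controllers raise the limit without affecting smaller deployments.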
Additional Observations
- All phases prior to Agent extraction complete successfully (APM apps, EUM, MRUM, Servers, Dashboards)
- With -c 1 (single connection), the agent extraction request waits 30+ minutes for the controller to respond before timing out
- With higher concurrency (e.g. -c 50, the default for SaaS), the controller becomes unresponsive, likely because it only has about 10 threads dedicated to REST API processing
- With -c 5, the same long wait behavior is observed during agent extraction