Describe the bug
When you configure a root agent so that it can call the same AgentTool (configured with an MCP toolset) in parallel, and one of the calls finishes first, the rest of them fail.
To Reproduce
Before we go...
- Any stateful toolset (i.e. one whose `close()` method actually does something) may cause this error.
- Since I encountered this when using `McpToolset`, I will explain the steps to reproduce using `McpToolset`.
- It is more likely to reproduce the issue if:
  - The execution time of the AgentTool is fairly long (roughly >10s).
  - The execution time of the AgentTool is random.
  - These make the AgentTool invocations finish at different times, so one call always finishes first.
You may reproduce this issue in two ways.
First case: call a `Runner` which shares the same `Agent` concurrently.
- A root agent, configured with an `McpToolset`.
- Execute two runners simultaneously (a rough sketch follows this list).
- When any runner finishes first, the remaining runner fails, raising an exception (related to resource exhaustion).
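For reference, here is a minimal sketch of this setup. It is only an approximation: the import paths, the `InMemoryRunner`/`create_session` signatures, and the example MCP server are assumptions and may differ across ADK versions.

```python
import asyncio

from mcp import StdioServerParameters

from google.adk.agents import LlmAgent
from google.adk.runners import InMemoryRunner
from google.adk.tools.mcp_tool.mcp_toolset import McpToolset
from google.genai import types

# Any MCP server works; the slower its tools, the easier the race is to hit.
toolset = McpToolset(
    connection_params=StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-everything"],
    )
)

# A single agent (and therefore a single toolset) shared by both runners.
agent = LlmAgent(
    name="mcp_agent",
    model="gemini-2.5-flash",
    instruction="Answer by calling the MCP tools.",
    tools=[toolset],
)

async def run_once(tag: str) -> None:
    # Each runner wraps the *same* agent object, so they share the same toolset.
    runner = InMemoryRunner(agent=agent, app_name=f"repro_{tag}")
    session = await runner.session_service.create_session(
        app_name=f"repro_{tag}", user_id="user"
    )
    message = types.Content(role="user", parts=[types.Part(text="Use a tool, please.")])
    async for _ in runner.run_async(
        user_id="user", session_id=session.id, new_message=message
    ):
        pass

async def main() -> None:
    # When the faster run finishes and cleans up, the slower run's in-flight
    # MCP calls start failing with resource-exhaustion-style errors.
    await asyncio.gather(run_once("a"), run_once("b"))

asyncio.run(main())
```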
Second case: call `AgentTool` with `McpToolset` in parallel.
- A root agent, configured with an `AgentTool`.
- This `AgentTool` wraps an agent which has an MCP toolset.
- Write an instruction that triggers the root agent to call the AgentTool in parallel, e.g. `You must call the 'search_item' tool 10 times in parallel.` (a rough sketch follows this list).
- Execute the root agent.
- When any AgentTool call finishes first, the remaining calls fail, raising an exception (related to resource exhaustion).
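A rough sketch of this second setup, under the same assumptions about import paths and the MCP server as above:

```python
from mcp import StdioServerParameters

from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool
from google.adk.tools.mcp_tool.mcp_toolset import McpToolset

# The sub-agent owns the stateful MCP toolset.
search_agent = LlmAgent(
    name="search_item",
    model="gemini-2.5-flash",
    instruction="Call the MCP search tool and summarize what it returns.",
    tools=[
        McpToolset(
            connection_params=StdioServerParameters(
                command="npx",
                args=["-y", "@modelcontextprotocol/server-everything"],
            )
        )
    ],
)

# The root agent is nudged into issuing parallel calls to the same AgentTool.
root_agent = LlmAgent(
    name="root_agent",
    model="gemini-2.5-flash",
    instruction="You must call the 'search_item' tool 10 times in parallel.",
    tools=[AgentTool(agent=search_agent)],
)
```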
Expected behavior
Even if one of the runners finishes first, the rest of the runners should not fail.
Screenshots
None, but I could reproduce both cases by adding two sample test agent setups. You may look into them in PR #4046.
Desktop (please complete the following information):
- OS: macOS
- Python version (`python -V`): 3.13
- ADK version (`pip show google-adk`): 1.9.0
Model Information:
- Are you using LiteLLM: No
- Which model is being used: gemini-2.5-flash
Additional context
This happens due to the early release of a shared resource in the runner, especially when the toolset is stateful.
First case: call a `Runner` which shares the same `Agent` concurrently
Starting with the first case, I described the initial state in the diagram below.
Note that the two runners are connected to a single toolset; this shows that each runner will close the toolset after it finishes, since both runners are based on the same agent.
Assume that Runner 1 finishes first and cleans up the toolset by calling `self.close()`.
Now the toolset has been closed, but Runner 2 is not finished yet. Therefore, its tool call throws an exception, making the runner fail.
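Stripped of ADK specifics, the failure mode reduces to a plain asyncio race. Everything in the snippet below is illustrative (no ADK classes involved); the shared object stands in for a stateful toolset such as `McpToolset`.

```python
import asyncio

class SharedToolset:
    """Stand-in for a stateful toolset shared by several concurrent runs."""

    def __init__(self) -> None:
        self._open = True

    async def call_tool(self, delay: float) -> str:
        await asyncio.sleep(delay)  # simulate tool latency
        if not self._open:
            raise RuntimeError("toolset already closed")  # what the slow run sees
        return "ok"

    async def close(self) -> None:
        self._open = False

async def run(toolset: SharedToolset, delay: float) -> str:
    try:
        return await toolset.call_tool(delay)
    finally:
        # Each "runner" closes the shared toolset when it is done -- this is the
        # early release: the first finisher closes what the slower one still needs.
        await toolset.close()

async def main() -> None:
    toolset = SharedToolset()
    results = await asyncio.gather(
        run(toolset, 0.1), run(toolset, 0.5), return_exceptions=True
    )
    print(results)  # ['ok', RuntimeError('toolset already closed')]

asyncio.run(main())
```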
Second case: call `AgentTool` with `McpToolset` in parallel.
The same thing happens in the second case. The initial state is more complicated, but the configuration is basically the same: two runners are executed concurrently.
And, as you would expect, the same problem occurs.
IMO, since users usually do not use the runner explicitly, I think the first case would not occur frequently.
The second case will occur quite frequently, since calling the same AgentTool in parallel is common (and useful!), e.g.
- RAG search in parallel (this is my case)
- running a PR review bot in parallel
- fetch data chunks in parallel
Also, whether tools are called in parallel is decided by the LLM, and from what I know, LLMs usually do call tools in parallel. We could only control this behaviour by instructing the LLM to call tools sequentially, which is not deterministic (99% maybe, but not 100%).
I made a PR to address the second issue, adding an `AgentToolManager` which functions as a reference count for the agent when using `AgentTool`.
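Conceptually, the fix amounts to reference counting: the shared resource is only closed when its last concurrent user releases it. A minimal sketch of that general idea (my own illustration of the approach, not the PR's actual code):

```python
import asyncio

class RefCounted:
    """Close a shared resource only when its last user releases it."""

    def __init__(self, resource) -> None:
        self._resource = resource
        self._users = 0
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            self._users += 1
            return self._resource

    async def release(self) -> None:
        async with self._lock:
            self._users -= 1
            if self._users == 0:
                # Safe to close: no concurrent caller still depends on it.
                await self._resource.close()
```

Each parallel `AgentTool` invocation would call `acquire()` before running and `release()` in a `finally` block, so `close()` runs exactly once, after the slowest call has finished.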