Over the past few days, I’ve been exploring Google ADK, or Agent Development Kit. My initial impression is that it gives backend developers like me more flexibility and capability than before. Several AI companies have offered APIs, but those are largely equivalent to their UI versions. With MCP and ADK, though, it feels like there’s a whole new level of opportunity for backend integration with AI.
On day 1, I just followed the Quickstart and Tutorials on the ADK site and learned the basic concepts. ADK adopts many patterns from traditional backend SDKs, such as callbacks, so it was easier to understand than MCP. The multi-agent system, one of the genuinely new concepts, was easy enough to accept, but I wasn’t sure when or why to use it. Still, I believe ADK will help me a lot in integrating legacy backend services with LLMs.
On day 2, I started some self-driven experimental development. First of all, I ported the NWS API tools from the MCP Quickstart guide into ADK. Copying and pasting did most of the work, but it gave me some hints about the purpose of a multi-agent system.
There were two tools in the MCP Quickstart guide: get_alerts(state) and get_forecast(latitude, longitude). The names are self-explanatory: one returns the weather alerts for a given state, the other returns the forecast for a given latitude and longitude. Claude worked perfectly with these. Given a city name, Claude automatically converted it into a state name, latitude, and longitude using its own knowledge.
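For the curious, here is roughly what the port looked like. This is a minimal sketch, not my exact code: the NWS endpoints come from the MCP Quickstart, the response formatting is simplified, and the agent wiring follows the ADK Quickstart (you can then try it in the dev UI with adk web).

```python
# Minimal sketch of the NWS tools ported from the MCP Quickstart into ADK.
# Endpoint paths come from that guide; formatting is simplified for brevity.
import httpx
from google.adk.agents import Agent

NWS_API_BASE = "https://api.weather.gov"
HEADERS = {"User-Agent": "weather-app/1.0", "Accept": "application/geo+json"}

def get_alerts(state: str) -> str:
    """Get active weather alerts for a two-letter US state code, e.g. 'CA'."""
    resp = httpx.get(f"{NWS_API_BASE}/alerts/active/area/{state}",
                     headers=HEADERS, timeout=30.0)
    resp.raise_for_status()
    features = resp.json().get("features", [])
    if not features:
        return "No active alerts for this state."
    return "\n---\n".join(f["properties"]["headline"] for f in features)

def get_forecast(latitude: float, longitude: float) -> str:
    """Get the weather forecast for the given coordinates."""
    points = httpx.get(f"{NWS_API_BASE}/points/{latitude},{longitude}",
                       headers=HEADERS, timeout=30.0)
    points.raise_for_status()
    forecast = httpx.get(points.json()["properties"]["forecast"],
                         headers=HEADERS, timeout=30.0)
    forecast.raise_for_status()
    periods = forecast.json()["properties"]["periods"][:3]
    return "\n".join(f"{p['name']}: {p['detailedForecast']}" for p in periods)

root_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash-exp",
    description="Answers weather questions using the NWS API.",
    instruction="You can answer questions about the weather forecast "
                "in a city and alerts in a state.",
    tools=[get_alerts, get_forecast],
)
```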
However, things were a little different in the ADK dev UI with Gemini. It answered that it needed a latitude and longitude to return the weather forecast. Wait, doesn’t an LLM like Gemini know the coordinates of major cities? That didn’t make sense at all.
After a while, I modified the instruction I had given to the Gemini agent. Originally, it said, “You can answer questions about the weather forecast in a city and alerts in a state.” I changed it to: “You can answer questions about the weather forecast in a city and alerts in a state. If needed, infer latitude and longitude from city name.”
Then I asked again about the weather in Los Angeles — and bingo! It returned the forecast. However, since the updated instruction didn’t mention that it’s allowed to infer the state from a city name, I suspected the get_alerts tool still wouldn’t work. And as expected — it didn’t.
So I changed the instruction once again: “You can answer questions about the weather forecast in a city and alerts in a state. If needed, infer latitude and longitude from city name. Also infer state name from given information.” Then I ran the test questions again using just city names — and this time, I got the correct answers for both weather forecasts and alerts.
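In code terms, the instruction in the sketch above ended up as this third iteration (the string is verbatim; everything around it is still the same sketch):

```python
# Final (third) version of the instruction, plugged into the Agent above.
instruction = (
    "You can answer questions about the weather forecast in a city and "
    "alerts in a state. If needed, infer latitude and longitude from city "
    "name. Also infer state name from given information."
)
```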
At this point, I started to understand the intention behind the multi-agent system. It’s designed to define and control the role of each agent, and to enable their reuse, like teams in the real world. That’s likely why the tutorial document uses the term “agent team” when referring to a multi-agent system. An agent team really is similar to a human one: the team manager carries broader responsibilities, and each team member plays a different role.
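To make the team analogy concrete, here is a minimal sketch of how an agent team might be wired in ADK, loosely following the tutorial’s pattern. The agent names and instructions are my own illustration, get_forecast is reused from the earlier sketch, and sub_agents is what lets the root agent, the “team manager”, delegate to specialists.

```python
from google.adk.agents import Agent

# A specialist that only handles greetings and farewells.
greeting_agent = Agent(
    name="greeting_agent",
    model="gemini-2.0-flash-exp",
    description="Handles greetings and farewells only.",
    instruction="You only greet the user or say goodbye. Do nothing else.",
)

# A specialist that only answers forecast questions, reusing get_forecast.
forecast_agent = Agent(
    name="forecast_agent",
    model="gemini-2.0-flash-exp",
    description="Answers weather forecast questions for a city.",
    instruction="You answer weather forecast questions. If needed, "
                "infer latitude and longitude from the city name.",
    tools=[get_forecast],
)

# The "team manager": its job is routing, not answering.
root_agent = Agent(
    name="weather_team",
    model="gemini-2.0-flash-exp",
    description="Routes user requests to the right specialist.",
    instruction="Delegate greetings to greeting_agent and forecast "
                "questions to forecast_agent.",
    sub_agents=[greeting_agent, forecast_agent],
)
```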
Then it occurred to me that each LLM might follow instructions in a different manner, so I tested with different models. Until now it had been ‘gemini-2.0-flash-exp’. I changed the model to ‘openai/gpt-4o’ (switching models in ADK is nearly a one-line change; see the sketch after the transcripts below), rolled the instruction back to the original (“You can answer questions about the weather forecast in a city and alerts in a state.”), and asked the same questions. GPT-4o replied:
“To provide you with the weather in Los Angeles, I need its latitude and longitude. Los Angeles is approximately at latitude 34.05 and longitude -118.25. I’ll get the weather forecast for this location.”
“To check for weather alerts, I need to know the US state. Since Los Angeles is in California, I’ll check for alerts in CA.”
Interesting. I changed the model again, to ‘anthropic/claude-3-5-sonnet-20240620’, and it replied:
“To provide you with accurate weather information for Los Angeles, I’ll need to use the weather forecast tool. However, this tool requires specific latitude and longitude coordinates. Let me fetch those for Los Angeles and then get the forecast for you. Los Angeles, California has approximate coordinates of latitude 34.0522° N and longitude 118.2437° W. I’ll use these to get the forecast.”
“To check for weather alerts in Los Angeles, we need to look at the alerts for the state of California. I’ll use the get_alerts function to fetch this information for you.”
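On the mechanics of the swap: in ADK, non-Gemini models go through the LiteLLM wrapper, so only the model argument changes. The sketch below assumes the litellm package is installed and the matching API keys (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY) are set in the environment.

```python
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

ORIGINAL_INSTRUCTION = ("You can answer questions about the weather "
                        "forecast in a city and alerts in a state.")

# Same tools, same instruction, different brain: only `model` changes.
gpt_agent = Agent(
    name="weather_agent_gpt",
    model=LiteLlm(model="openai/gpt-4o"),
    instruction=ORIGINAL_INSTRUCTION,
    tools=[get_alerts, get_forecast],
)

claude_agent = Agent(
    name="weather_agent_claude",
    model=LiteLlm(model="anthropic/claude-3-5-sonnet-20240620"),
    instruction=ORIGINAL_INSTRUCTION,
    tools=[get_alerts, get_forecast],
)
```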
Both GPT-4o and Sonnet inferred the state name, latitude, and longitude from their own knowledge and answered the questions, even under the original minimal instruction. This clearly shows that different models follow instructions differently. For a multi-agent system to work correctly, instructions should be followed as strictly as Gemini followed them, since each agent needs to stay within its defined role. Instruction adherence could also be an interesting point of competition between AI models.
After this, I ran a few more tests that I can’t share in detail. Integrating legacy services with Gemini through ADK wasn’t too difficult. But fine-tuning the agent’s instruction so it behaves exactly as expected? That was challenging. I guess that’s also what people mean by prompt engineering. There’s still so much more to explore, and I’m really enjoying it!