Link
Building Gen AI apps using agentic workflow and prompt chaining
My thoughts
As AI applications become more sophisticated, developers are moving beyond single LLM calls to create more complex, interactive systems.
One basic pattern for this evolution is “Prompt Chaining” - starting with a single LLM call and building on top of it.
To chain LLM prompts effectively, you need infrastructure to retrieve information from the LLM, store responses in memory, process those responses with various tools, and then make additional LLM calls based on the results.
These calls build on top of each other, with the LLM acting as an agent making decisions at each step.
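To make the pattern concrete, here is a minimal sketch of that loop, assuming the OpenAI Python SDK; the prompts and the stand-in “tool” step are illustrative placeholders, not code from the linked post.

```python
# Minimal prompt-chaining loop: call the LLM, store the response in memory,
# process it with a tool, then make another call that builds on the result.
# Assumes the OpenAI Python SDK; prompts and the "tool" are placeholders.
from openai import OpenAI

client = OpenAI()
memory: list[str] = []  # responses accumulate here so later calls can use them

def llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# 1. Retrieve information from the LLM, and 2. store the response in memory.
memory.append(llm("List three common uses of prompt chaining."))

# 3. Process the response with a tool (a trivial stand-in here).
bullet_count = memory[-1].count("\n") + 1

# 4. Make an additional LLM call based on the earlier result.
memory.append(llm(f"Expand on the first of these {bullet_count} points: {memory[-1]}"))
print(memory[-1])
```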
The linked blog post leveraged three main tools to create AI Agents and implement LLM calls: CrewAI, WebsiteSearchTool, and GPT-4o-mini.
Project 1: CPF Policy Enquiry System using Two AI Agents
The first of the projects they made was:
a CPF Policy Enquiry search page where users can enter queries related to CPF policies and schemes. This tool uses an agentic workflow to retrieve information and summarizes the responses to be as accurate and detailed as possible. A website link is also generated at the end of the response for the user to click and find out more detailed information. The target audience is users who have queries related to CPF policies and schemes.
Technology Stack:
- CrewAI for agent orchestration
- WebsiteSearchTool for RAG capabilities
- GPT-4o-mini as the base LLM
Workflow Overview:
- CrewAI: orchestrates role-playing, autonomous AI agents that collaborate to tackle complex tasks
- WebsiteSearchTool: a RAG tool for searching website content, optimized for web data extraction
CrewAI was used to create and orchestrate two agents:
- Researcher: uses the WebsiteSearchTool to search specific URLs (CPF Policy FAQs) for information to answer the query
- Support researcher: fact-checks the information the Researcher retrieved from the specific URL; if no answer can be found, it uses the tool to search the CPF Overview page
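As a rough sketch of how these two agents might be wired together in CrewAI (the URL, role text, and sample query below are my assumptions, not code from the linked post):

```python
# Sketch of the two-agent setup in CrewAI. The CPF URL, role descriptions,
# and sample query are assumptions for illustration only.
from crewai import Agent, Crew, Task
from crewai_tools import WebsiteSearchTool

# RAG search scoped to a specific site (assumed URL).
faq_search = WebsiteSearchTool(website="https://www.cpf.gov.sg/member/faq")

researcher = Agent(
    role="Researcher",
    goal="Answer CPF policy questions using the FAQ pages",
    backstory="You research CPF policies and schemes for members.",
    tools=[faq_search],
)

support_researcher = Agent(
    role="Support Researcher",
    goal="Fact-check the Researcher's answer; fall back to the overview page",
    backstory="You verify answers and search the CPF overview page if needed.",
    tools=[faq_search],
)

research = Task(
    description="Answer the user's question: {query}",
    expected_output="A detailed answer with a link for further reading",
    agent=researcher,
)

verify = Task(
    description="Fact-check the previous answer; if nothing was found, "
                "search the overview page before replying",
    expected_output="A verified answer, or a note that no information exists",
    agent=support_researcher,
)

# Tasks run in order, and the second agent sees the first task's output.
crew = Crew(agents=[researcher, support_researcher], tasks=[research, verify])
print(crew.kickoff(inputs={"query": "When can I withdraw my CPF savings?"}))
```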
From this, you can see how the LLM prompts were chained:
- The “Researcher” takes in the query, runs it through the RAG tool, and then hits GPT-4o-mini
- The response is stored in memory
- The Support researcher looks at the response stored in memory:
  - If information is found, it verifies the information through the LLM
  - If no information is found, it searches a different page
    - If the different page has relevant information, it gives the user the information
    - If the information is not relevant, it returns that no information is available
This first project demonstrated agent collaboration.
The second project took a different approach, focusing on sequential prompt chaining.
Project 2: Retirement Savings Calculator using Prompt Chaining
The second project they made helps individual users plan for retirement in Singapore.
Technology Stack:
- GPT-4o-mini as the base LLM
Workflow Overview: The app:
- takes in personal inputs from the user, like current age
- takes in retirement details, like retirement age
Then, it does some calculations to find:
- total projected retirement savings
- total projected CPF savings
Then it does prompt chaining to provide a brief suggestion on whether the user has enough savings for retirement.
It then also advises which type of CPF LIFE plan is suitable.
Here, the workflow relies on LLMs and memory, plus basic calculators coded into the app, to work out the math.
In this second instance, the LLM calls are structured to run one after another.
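A minimal sketch of what that sequential chain might look like, again assuming the OpenAI Python SDK; the savings formula and prompt wording are illustrative assumptions, not the original app's logic:

```python
# Sequential prompt chaining: plain Python does the math, then each LLM call
# builds on the previous call's output. The formula and prompts are assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 0: a basic calculator coded into the app -- no LLM needed for the math.
current_age, retirement_age = 30, 65
monthly_savings, annual_return = 1_000, 0.04
savings = 0.0
for _ in range(retirement_age - current_age):
    savings = (savings + monthly_savings * 12) * (1 + annual_return)

# Step 1: the first LLM call assesses the computed figure.
verdict = ask(
    f"A user is projected to have SGD {savings:,.0f} at age {retirement_age}. "
    "Briefly suggest whether this is likely enough to retire in Singapore."
)

# Step 2: the second call chains on the first call's output.
print(ask(
    f"Given this assessment: {verdict!r}, advise which CPF LIFE plan "
    "(Standard, Basic, or Escalating) might suit the user, and why."
))
```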
Learnings from Implementation
After reviewing these projects and their write-up, several key insights emerge about building AI agent applications:
An interesting finding they made is that they could have precomputed the information retrievable from the webpage, so the ScrapeWebsiteTool didn’t actually need to scrape the policy page on every query.
This makes sense, and it’s worth weighing against the data you are aggregating and how frequently that data changes.
While it can make sense to hit an LLM or a RAG pipeline on every request, oftentimes the data changes so infrequently that you can precompute the information and serve it from a lookup rather than an LLM call.
In some applications this is worth considering: combining AI agents with prompt chaining this way lets you short-cut or skip certain calls, making your app easier to test and faster to run.
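In code, that short-cut can be as simple as a lookup table in front of the agent workflow; the FAQ entries and the rag_llm_call fallback below are hypothetical placeholders:

```python
# Precompute-then-lookup: answer from a static cache when possible, and only
# fall back to the RAG/LLM workflow for unseen queries. The FAQ entries and
# the rag_llm_call fallback are hypothetical placeholders.
PRECOMPUTED_FAQ = {
    "withdrawal age": "You can withdraw some CPF savings from age 55.",
    "cpf life payouts": "CPF LIFE payouts can start between ages 65 and 70.",
}

def rag_llm_call(query: str) -> str:
    # Stand-in for the full agent/RAG workflow described above.
    raise NotImplementedError("fall through to the agent workflow here")

def answer(query: str) -> str:
    for key, cached in PRECOMPUTED_FAQ.items():
        if key in query.lower():
            return cached           # cheap lookup: no LLM call, easy to test
    return rag_llm_call(query)      # only unseen queries pay for an LLM call
```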
Key Takeaways for AI Agent Development
Agent Collaboration vs Prompt Chains
In their first project, the first agent worked while the second agent was dormant. When the first agent finished, the second agent started up and ran its workflow.
I think they could have simplified it to a single prompt chain.
On the other hand, if they were set on using multiple AI agents, then an improvement would have been to add another agent.
We saw that the second agent did two jobs:
- checking what it thought of the first agent’s work
- deciding what information to show the user based on its own research
They could break this workflow up into two agents:
- one to check the first agent’s work
- one to decide what to show the user
This makes it easier to test agents under the theory that an agent should do one thing well.
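Sketched in CrewAI terms, the single-responsibility split might look like this (roles and goal text are my assumptions):

```python
# One responsibility per agent, so each can be tested in isolation.
# Roles and goals are assumptions about how the refactor might look.
from crewai import Agent

researcher = Agent(
    role="Researcher",
    goal="Retrieve candidate answers from the CPF FAQ pages",
    backstory="Finds raw information; does not judge or format it.",
)

fact_checker = Agent(
    role="Fact Checker",
    goal="Verify the Researcher's answer against the source pages",
    backstory="Checks accuracy only; never composes the final reply.",
)

presenter = Agent(
    role="Presenter",
    goal="Decide what verified information to show the user",
    backstory="Selects and formats output only; never searches.",
)
```

Each agent would then get its own Task, which is what makes them independently testable.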
Performance Optimizations
After finishing the project, the author realized they could have done some of the work ahead of time to optimize performance. Discovering only afterward that you could have precomputed some of the answers is a common experience.
This is often codified in the aphorism:
- Make it work
- Make it right
- Make it fast
When working with AI agents, doing the work by hand before launching will frequently show you where you don’t need an agent, or don’t need to chain multiple LLM calls.
What happens next
For commercial applications, success depends on fully solving user problems, not just providing partial solutions.
This usually means adding workflow steps or integrating with other services to create a complete solution path - ensuring users don’t need to seek out additional apps to achieve their goals.
These additional workflow steps and service integrations will typically involve either LLM Prompt Chaining or AI Agents, building upon the core application’s capabilities.