In this chapter, we continue from the introduction to agents and dive deeper,
learning how to build a custom agent execution loop for v0.3 of LangChain.
What is the Agent Executor?
When we talk about agents, a significant part of an "agent" is simple code logic,
iteratively rerunning LLM calls and processing their output. The exact logic varies
significantly, but one well-known example is the ReAct agent.
Reason + Action (ReAct) agents use iterative reasoning and action steps to
incorporate chain-of-thought and tool-use into their execution. During the reasoning
step, the LLM generates the steps to take to answer the query. Next, the LLM generates
the action input, which our code logic parses into a tool call.
Following our action step, we get an observation from the tool call. Then, we feed the
observation back into the agent executor logic for a final answer or further reasoning
and action steps.
The agent and agent executor we will be building will follow this pattern.
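Before building with LangChain, the loop just described can be sketched framework-free. The `stub_llm`, `tools`, and `react_loop` names below are purely illustrative stand-ins, not LangChain components:

```python
# Minimal ReAct-style loop with a stubbed "LLM". The stub and the tool
# are illustrative stand-ins, not LangChain components.

def stub_llm(scratchpad: list[str], query: str) -> dict:
    """Pretend LLM: reason, then either call a tool or give a final answer."""
    if not scratchpad:  # no observation yet -> reason and pick an action
        return {"thought": "I should add the numbers.",
                "action": "add", "args": {"x": 10, "y": 10}}
    return {"thought": "I have the result.",
            "final_answer": f"The answer is {scratchpad[-1]}"}

tools = {"add": lambda x, y: x + y}

def react_loop(query: str, max_steps: int = 5) -> str:
    scratchpad: list[str] = []
    for _ in range(max_steps):
        step = stub_llm(scratchpad, query)   # reasoning step
        if "final_answer" in step:           # stop and answer the user
            return step["final_answer"]
        observation = tools[step["action"]](**step["args"])  # action step
        scratchpad.append(str(observation))  # feed the observation back
    return "max steps reached"

print(react_loop("What is 10 + 10?"))  # -> The answer is 20
```

The loop body mirrors the description above: each iteration is a reasoning step, and the loop exits either on a final answer or after a step limit.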
Creating an Agent
We will construct the agent using LangChain Expression Language (LCEL).
We cover LCEL more in the LCEL chapter; for now, all we need to know is that
we construct our agent using LCEL syntax and components like so:
text
agent = (
<input parameters, including chat history and user query>
| <prompt>
| <LLM with tools>
)
We need this agent to remember previous interactions within the conversation. To do
that, we will use the ChatPromptTemplate with a system message, a placeholder for our
chat history, a placeholder for the user query, and a placeholder for the agent
scratchpad.
The agent scratchpad is where the agent writes its notes as it works through multiple
internal thought and tool-use steps to produce a final output for the user.
python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You're a helpful assistant. When answering a user's question "
        "you should first use one of the tools provided. After using a "
        "tool the tool output will be provided in the "
        "'scratchpad' below. If you have an answer in the "
        "scratchpad you should not use any more tools and "
        "instead answer directly to the user."
    )),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
To add tools to our LLM, we will use the bind_tools method within the LCEL
constructor, which binds our tools to the LLM. We'll also include the
tool_choice="any" argument to bind_tools, which tells the LLM that it MUST use a
tool, i.e., it cannot skip tool use and provide a final answer directly:
python
from langchain_core.runnables.base import RunnableSerializable

agent: RunnableSerializable = (
    <input parameters, including chat history and user query>
    | prompt
    | llm.bind_tools(tools, tool_choice="any")
)
Because we set tool_choice="any" to force tool use, the usual content field
will be empty, as LangChain reserves that field for natural language output, i.e.,
the final answer of the LLM. To find our tool output, we need to look at the
tool_calls field:
python
out.tool_calls
python
[{'name': 'add',
'args': {'x': 10, 'y': 10},
'id': 'call_bI8aZpMN1y907LncsX9rhY6y',
'type': 'tool_call'}]
From here, we have the tool name that our LLM wants to use and the args that it
wants to pass to that tool. We can see that the tool add is being used with the
arguments x=10 and y=10. The agent.invoke method has not executed the tool
function; we need to write that part of the agent code ourselves.
Executing the tool code requires two steps:
1. Map the tool name to the tool function.
2. Execute the tool function with the generated args.
python
# create tool name to function mapping
name2tool = {tool.name: tool.func for tool in tools}
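With the mapping in place, step 2 is a dictionary lookup plus keyword-argument unpacking. Here is a self-contained sketch; a plain function stands in for the decorated LangChain tool (so there is no `.func` attribute to unwrap), and the tool-call dict mimics the `out.tool_calls` entry shown earlier:

```python
# A plain function stands in for a LangChain @tool object, so we map
# the name to the function directly rather than via `tool.func`.
def add(x: int, y: int) -> int:
    return x + y

name2tool = {"add": add}

# A tool call shaped like the entries the LLM produces in `out.tool_calls`
tool_call = {"name": "add", "args": {"x": 10, "y": 10},
             "id": "call_123", "type": "tool_call"}

# Step 1: map the tool name to the function; step 2: unpack the args
tool_out = name2tool[tool_call["name"]](**tool_call["args"])
print(tool_out)  # -> 20
```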
Despite having the answer in our agent_scratchpad, the LLM still tries to use the tool
again. This behaviour happens because we bound the tools to the LLM with
tool_choice="any". When we set tool_choice to "any" or "required", we tell the
LLM that it MUST use a tool, i.e., it cannot provide a final answer.
There are two options to fix this:
1. Set tool_choice="auto" to tell the LLM that it can choose to use a tool or provide
a final answer.
2. Create a final_answer tool - we'll explain this shortly.
We now have the final answer in the content field! This method is perfectly
functional; however, we recommend option 2 as it provides more control over the
agent's output.
There are several reasons that option 2 can provide more control:
It removes the possibility of an agent using the direct content field when it is not
appropriate; for example, some LLMs (particularly smaller ones) may try to use the
content field when using a tool.
We can enforce a specific structured output in our answers. Structured outputs are
handy when we require particular fields for downstream code or multi-part answers. For
example, a RAG agent may return a natural language answer and a list of sources used to
generate that answer.
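For instance, the multi-part RAG answer just described could be modelled with a small schema. `RAGAnswer` and its field names below are illustrative choices of ours, not a LangChain convention:

```python
from dataclasses import dataclass, field

# Illustrative schema for a RAG agent's structured, multi-part answer.
@dataclass
class RAGAnswer:
    answer: str                                       # natural-language answer
    sources: list[str] = field(default_factory=list)  # documents used

out = RAGAnswer(answer="LCEL pipes runnables together.",
                sources=["docs/lcel.md"])
print(out.answer, out.sources)
```

Downstream code can then rely on both fields always being present, rather than parsing them out of free text.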
To implement option 2, we must create a final_answer tool. We will add a
tools_used field to give our output some structure—in a real-world use case, we
probably wouldn't want to generate this field, but it's useful for our example here.
"""Use this tool to provide a final answer to the user.
The answer should be in natural language as this will be provided
to the user directly. The tools_used must include a list of tool
names that were used within the `scratchpad`.
"""
return None
Our final_answer tool doesn't necessarily need to do anything; in this example,
we're using it purely to structure our final response. We can now add this tool to our
agent:
We see that content remains empty because we force tool use. But we now have a
final_answer tool call, which the LLM passes via the tool_calls field:
python
out.tool_calls
python
[
{
'name': 'final_answer',
'args': {
'answer': '10 + 10 equals 20.',
'tools_used': ['functions.add']
},
'id': 'call_reBCXwxUOIePCItSSEuTKGCn',
'type': 'tool_call'
}
]
Because we see the final_answer tool here, we don't pass this back into our agent;
instead, it tells us to stop execution and pass the args output onto our downstream
process or the user directly:
We've worked through each step of our agent code, but nothing runs end-to-end unless
we execute every step ourselves. We must write a class to handle all the logic we
just worked through.
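Framework aside, all that class needs to do is loop the steps above until `final_answer` appears. The sketch below is a framework-free outline of that control flow; `stub_agent` and the plain dicts are illustrative stand-ins for the LangChain runnable and message objects:

```python
# Framework-free sketch of the executor class. `stub_agent` fakes the
# LLM+tools runnable and reaches `final_answer` on its second step.
def stub_agent(state: dict) -> dict:
    if state["agent_scratchpad"]:
        return {"name": "final_answer",
                "args": {"answer": "10 + 10 equals 20.", "tools_used": ["add"]}}
    return {"name": "add", "args": {"x": 10, "y": 10}}

class CustomAgentExecutor:
    def __init__(self, agent, name2tool: dict, max_iterations: int = 3):
        self.agent = agent
        self.name2tool = name2tool            # e.g. {"add": <function>}
        self.max_iterations = max_iterations
        self.chat_history: list[dict] = []

    def invoke(self, query: str) -> dict:
        scratchpad: list[dict] = []
        for _ in range(self.max_iterations):
            call = self.agent({"input": query,
                               "chat_history": self.chat_history,
                               "agent_scratchpad": scratchpad})
            if call["name"] == "final_answer":     # stop condition
                self.chat_history.append({"role": "user", "content": query})
                self.chat_history.append(
                    {"role": "ai", "content": call["args"]["answer"]})
                return call["args"]
            # otherwise execute the tool and feed the observation back
            observation = self.name2tool[call["name"]](**call["args"])
            scratchpad.append({"tool": call["name"], "output": observation})
        return {"answer": "max iterations reached", "tools_used": []}

executor = CustomAgentExecutor(stub_agent, {"add": lambda x, y: x + y})
print(executor.invoke("What is 10 + 10?")["answer"])  # -> 10 + 10 equals 20.
```

The LangChain version we build next follows the same shape, swapping the stub for the LCEL agent and the dicts for proper message objects.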
python
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage