Technical

reAct (Reasoning + Action)

Aidan Reese
#AI#agents#LLM#reAct#reasoning#action#actions#OoriChat

an AI generated image of a robot talking to a human

OoriChat utilizes several novel LLM (Large Language Model) technologies and techniques, with reAct being one of the most central. Described by Yao et al., this method of creating agents chains multiple LLM calls into a “thought process” of reasoning, reaction, action, and reply.

The overall goal of our implementation of reAct is to allow an LLM-based agent to gain access to more information than just what it knows from its training data. It is a four-phase process (sketched in code below the list):

  1. User sends a message to the agent.
  2. Reasoning - Planning LLM determines from the user’s message what steps are needed to respond.
  3. Reaction - Decision-making LLM uses the steps laid out in the reasoning phase to decide exactly how to gather the correct information to respond, based on the available actions.
  4. Action - The selected action(s) activate, gathering and formatting information into a report of useful information which can be used to respond to the user.
  5. Reply - Conversational LLM takes in the report and uses the information within to generate a response to the user’s message.
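
Below is a minimal Python sketch of that loop. The call_llm helper, the inline prompt text, and the actions registry are placeholder assumptions standing in for OoriChat's actual components; the sketch shows the shape of the loop, not our implementation.

```python
# Minimal sketch of the four-phase reAct loop described above.
# call_llm stands in for whichever chat-completion client you use;
# the prompt strings and the actions registry are illustrative placeholders.

def call_llm(prompt: str) -> str:
    """Send a single prompt to your LLM of choice and return its text reply."""
    raise NotImplementedError("wire this up to your model provider")

def react_respond(user_message: str, actions: dict) -> str:
    # 1. Reasoning: plan the steps needed to answer the message.
    plan = call_llm(f"Reason about what steps are needed to answer: {user_message}")

    # 2. Reaction: pick which action(s) from the registry carry out that plan.
    choice = call_llm(
        f"User message: {user_message}\nPlan: {plan}\n"
        f"Available actions: {', '.join(actions)}\n"
        "Reply with just the action name(s), separated by ampersands."
    )
    chosen = [name.strip() for name in choice.split("&")]

    # 3. Action: run each chosen action and collect a report of useful information.
    report = "\n".join(actions[name](user_message) for name in chosen if name in actions)

    # 4. Reply: turn the report into a conversational answer to the user.
    return call_llm(f"Using this data:\n{report}\nRespond to: {user_message}")
```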

This approach to creating and running agents has several advantages, including:

  • Specialization of tasks
    • Taking the “thought process” piecemeal creates opportunities to use specialized LLMs for specific tasks, or even utilize actions that are not LLMs at all, such as APIs (search engines, real-time financial data).
  • Cheaper and more efficient than a single general-purpose LLM
    • Splitting the workload between several smaller, more specialized LLMs allows for several small models to contribute to responses that are greater than the sum of their parts. In many cases, this technique can even surpass the capabilities of a single, general-purpose LLM.
  • Increased transparency and easier debugging
    • The LLMs in the “thought process” must speak their intent at every step, which significantly decreases the chance of the agent “going rogue” and saying something that is not intended, as well as making it easier to develop and debug if the agent does go off the rails.
  • Increased flexibility
    • The “thought process” can be easily modified to fit different use cases, and is less reliant on the performance of any one individual LLM.

Reasoning

This step was by far the most difficult for me to work out. I didn’t even include it in my first prototypes of reAct, initially believing that if I simply used a stronger LLM such as GPT-4, it would be able to directly choose which action was needed to respond to the user’s message. This was not the case; even with a good system message and few-shot prompting, the model wanted to have a conversation more than it wanted to reason. I ended up spending a few too many hours trying to prompt-engineer my way into a quick and easy solution.

However, thanks to this blog post, I discovered a means of giving the LLM some extra direction by asking it to first give me the steps that the agent should take to respond to the user’s message. This was a game changer, and allowed me to create a much more robust system that was able to reason about the user’s message, and determine the steps that it should take to respond to it.

User: How did the CU buffs do in their last college football game?

Reasoning prompt: The OoriChat Ambassador is a useful assistant who will respond to users’ messages to the absolute best of their ability.

You are the decision making thought process for the OoriChat Ambassador, where you go through a loop of reasoning, reaction, action, and reply in order to make useful and helpful responses to a user.

The first step is reasoning, which you are currently performing; you are going to use reason and logic to determine the steps you need to take to respond to the user’s most recent message.

As the thought process of the OoriChat Ambassador, explain your reasoning on what to do to the Ambassador, who will then take action(s) based on your reasoning.

Reasoning LLM Response: I should use a search engine to find the most recent score from a Colorado University football game.

This first phase, while seemingly simple, is actually critical to the success of the reAct process. It lets the agent determine which actions it will need to use to respond to the user’s message, and establishes a plan which the rest of the reAct process will follow.
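
To make the phase concrete, here is an illustrative sketch of how the reasoning call might be assembled. The preamble paraphrases the prompt shown above, and call_llm is the same placeholder used in the earlier sketch, not a real client.

```python
# Illustrative reasoning phase: the planning LLM is asked only for the steps
# to take, never for a conversational answer. The preamble paraphrases the
# reasoning prompt above; call_llm is the placeholder from the earlier sketch.

REASONING_PREAMBLE = (
    "You are the decision-making thought process for the OoriChat Ambassador. "
    "This is the reasoning step: use reason and logic to determine the steps "
    "needed to respond to the user's most recent message. Explain your "
    "reasoning; do not answer the user directly."
)

def reason(user_message: str) -> str:
    prompt = f"{REASONING_PREAMBLE}\n\nUser message: \"{user_message}\""
    return call_llm(prompt)

# reason("How did the CU buffs do in their last college football game?")
# -> "I should use a search engine to find the most recent score ..."
```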

Reaction

Now that the agent has a plan, it needs to determine how it will execute upon that plan. This is accomplished by feeding the plan given in the reasoning phase into a prompt that asks an LLM to determine what action(s) to take from the repertoire of actions the agent can access.

Reaction prompt: The OoriChat Ambassador is a useful assistant who will respond to users’ messages to the absolute best of their ability.

You are the decision making thought process for the OoriChat Ambassador, where you go through a loop of reasoning, reaction, action and reply in order to make useful and helpful responses to a user.

This is now the action step, where based on the thoughts provided by the REASONING step, you decide on what action(s) to take. Only take more than one action if it is necessary.

Here is the user’s message: “How did the CU buffs do in their last college football game?”

Here is the reasoning that occurred on the previous step: “I should use a search engine to find the most recent score from a Colorado University football game.”

Here is a list of your available action(s):

• Google - Use the “Google” action to find information you don’t have. This action uses the Google Search engine to find information on the internet. This action should only be used if you don’t have the information you need.

• Stocks - Use the “Stocks” action to get stock information any time a stock (or its price) is mentioned.

• Oori - Use the “Oori” action to get information about Oori Data, the company you represent.

• Basic - Use the “Basic” action when no other actions are needed.

You will reply with just the name(s) of the action(s) which you will take, separated by ampersands. Then, after two newlines (“\n\n”), you will explain your reasoning for choosing the action(s) you did.

Reaction LLM Response: Google

I chose the Google action because I don’t have the information I need to respond to the user’s message.

The goal of this phase is modularity, allowing for simple additions of new actions.
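
One way that modularity can look in code is a small action registry: each action gets a name and a description for the prompt, the reaction prompt’s action list is rendered from the registry, and the reply is parsed according to the ampersand-and-explanation format above. The structure below is an illustrative sketch, not OoriChat’s actual implementation.

```python
# Illustrative action registry and parser for the reaction phase.
# Adding a new action only means adding an entry here and a matching handler.

ACTIONS = {
    "Google": "Use the Google action to find information you don't have.",
    "Stocks": "Use the Stocks action any time a stock or its price is mentioned.",
    "Oori": "Use the Oori action for information about Oori Data.",
    "Basic": "Use the Basic action when no other actions are needed.",
}

def action_menu() -> str:
    """Render the available actions as a bulleted list for the reaction prompt."""
    return "\n".join(f"- {name}: {desc}" for name, desc in ACTIONS.items())

def parse_reaction(reply: str) -> tuple[list[str], str]:
    """Split a reply like 'Google & Stocks' + blank line + rationale into parts."""
    head, _, rationale = reply.partition("\n\n")
    names = [n.strip() for n in head.split("&") if n.strip() in ACTIONS]
    return names, rationale.strip()
```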

Action

The next phase, action, executes the action(s) chosen in the previous step, gathering the information that will be formatted into a final prompt for generating a response back to the user. This phase will look very different on a case-by-case basis, as it is entirely dependent on the action(s) that were chosen. In our case, a Google search was chosen, so the agent will search for “How did the CU buffs do in their last college football game?”. The agent will then take the first result from the search and use it in the next phase.
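
For illustration, here is a rough sketch of what the Google action could look like, assuming a plain scrape with requests and Beautiful Soup 4. The request details and the way the first result is picked out are simplified placeholders rather than production-ready code.

```python
# Rough sketch of a Google search action: fetch the results page, follow the
# first outbound link, and hand its (messy) HTML back as the report.

import requests
from bs4 import BeautifulSoup

def google_action(query: str) -> str:
    # Fetch the search results page for the user's message.
    resp = requests.get(
        "https://www.google.com/search",
        params={"q": query},
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=10,
    )
    resp.raise_for_status()

    # Pull the first outbound result link out of the results page.
    soup = BeautifulSoup(resp.text, "html.parser")
    first_link = next(
        (a["href"] for a in soup.find_all("a", href=True) if a["href"].startswith("http")),
        None,
    )
    if first_link is None:
        return "No results found."

    # Return the raw HTML of that page; the reply LLM can read it even if we can't.
    page = requests.get(first_link, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    return page.text
```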

Reply

Reply is fairly straightforward and mostly dependent on the functionality of the action(s) that have been chosen.

I would include the response following the example, but Beautiful Soup 4, true to name, returns some pretty unreadable HTML. If you’re a human, that is. LLMs can parse it just fine, so I’ll just ask you to imagine it.

Reply Prompt: Based on this data “[result from Google]”, the OoriChat Ambassador is a useful assistant who will respond to users’ messages to the absolute best of their ability.

You are the decision making thought process for the OoriChat Ambassador, where you go through a loop of reasoning, reaction, action and reply in order to make useful and helpful responses to a user.

You are in the final step, RESPONSE, where you will use the action that you decided upon in the previous step to respond to the user.

As a representative of Oori, you also have access to a variety of actions that can help you answer questions based on real time information and data. Given this data from the action step:

“{search results from Google}”

which was generated from these action(s):

• Google

Respond to the following user message utilizing any combination of your own knowledge, the real time data provided, and the current message history:

“How did the CU buffs do in their last college football game?”

Reply LLM Response: The CU Buffs lost their last game 43-46 to the Stanford Cardinals.
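
In code terms, a sketch of the reply call might look like the following, again using the placeholder call_llm and a preamble paraphrased from the prompt above.

```python
# Illustrative reply phase: fold the action step's report into a final
# conversational prompt. call_llm is the placeholder from the earlier sketches.

def reply(user_message: str, report: str, chosen_actions: list[str]) -> str:
    prompt = (
        "You are the OoriChat Ambassador, in the final REPLY step.\n"
        f"Data gathered by the action step ({', '.join(chosen_actions)}):\n"
        f"\"{report}\"\n\n"
        "Respond to the following user message using any combination of your own "
        "knowledge and the data provided:\n"
        f"\"{user_message}\""
    )
    return call_llm(prompt)
```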


About the author

Aidan Reese (he/him)

Aidan is a full-stack and AI engineer at Oori contributing to the reAct engine project. Just before starting at Oori in the summer of 2023, Aidan got his undergraduate degree in computer science from the University of Colorado.

aidan@oori.dev
LinkedIn