Thursday, July 11, 2024

Some notes on the Supervisor Agent-based approach in LangChain

Everything is based on ChatPromptTemplate in LangChain.

- A supervisor is nothing but a chain of a prompt, a function definition to decide which agent should act next, and a JSON output parser.

- In the supervisor chain, the prompt is made with the ChatPromptTemplate.from_messages function, which accepts the system prompt and a MessagesPlaceholder.

- The supervisor's system prompt tells it that it is the coordinator and should coordinate among the various agents that are available. To give it the info on the available agents, it is given the array of agent names.

- Now the prompt is given only a string array of options. When selecting one of the options, how does it know which to pick? The selection is made by the LLM itself: the routing function's enum constrains its answer to one of the options, and the system prompts given to the individual agents, such as the researcher and coder agents, describe what each one does. With that context, the supervisor can figure out which agent to execute next.
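
For concreteness, the snippets below rely on a few variables that these notes don't show. A minimal sketch of what they could look like (the member names and the exact system prompt wording here are my assumptions):

members = ["Researcher", "Coder"]  # the worker agents the supervisor coordinates
options = ["FINISH"] + members     # the string array of options mentioned above

# Assumed wording; the key point is that it names the members and FINISH.
system_prompt = (
    "You are a supervisor tasked with managing a conversation between the"
    " following workers: {members}. Given the user request, respond with"
    " the worker to act next. When finished, respond with FINISH."
)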




So the most crucial pieces of code are as below.

# creation of the prompt for the supervisor


from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder(variable_name="messages"),
        (
            "system",
            "Given the conversation above, who should act next?"
            " Or should we FINISH? Select one of: {options}",
        ),
    ]
).partial(options=str(options), members=", ".join(members))



# now the next important piece is the function definition
# Using OpenAI function calling can make output parsing easier for us

function_def = {
    "name": "route",
    "description": "Select the next role.",
    "parameters": {
        "title": "routeSchema",
        "type": "object",
        "properties": {
            "next": {
                "title": "Next",
                "anyOf": [
                    {"enum": options},
                ],
            }
        },
        "required": ["next"],
    },
}



In this, the function name is route and the description is "Select the next role." Since the schema is generic, this part can possibly be reused in any project that involves a supervisor.
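
Putting the pieces together, the supervisor chain itself would look roughly like this. This is a sketch assuming an OpenAI chat model; the model name is my choice, and the parser import path can vary across LangChain versions:

from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")  # assumed model

# prompt and function_def are the pieces shown above. Binding the function
# with function_call="route" forces the model to call it, and the parser
# turns the function arguments into {"next": "<one of the options>"}.
supervisor_chain = (
    prompt
    | llm.bind_functions(functions=[function_def], function_call="route")
    | JsonOutputFunctionsParser()
)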

The next part is the individual agents.


In this case, the researcher agent is created with a helper like the one below.


from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI


def create_agent(llm: ChatOpenAI, tools: list, system_prompt: str):
    # Each worker node will be given a name and some tools.
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            MessagesPlaceholder(variable_name="messages"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ]
    )
    agent = create_openai_tools_agent(llm, tools, prompt)
    executor = AgentExecutor(agent=agent, tools=tools)
    return executor


The prompt is similar to what we have in other places. The difference here is that an agent and an AgentExecutor are created.

functools.partial is then used to create the per-agent node functions, instead of defining each one separately; a sketch follows below.
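
A sketch of that pattern (agent_node is an assumed helper name, and research_agent / code_agent are assumed to come from create_agent above): one shared node function is written once, and functools.partial binds the agent and its name for each member:

import functools

from langchain_core.messages import HumanMessage


def agent_node(state, agent, name):
    # Run the agent's executor on the current graph state and report the
    # output back into the shared message list as a named message.
    result = agent.invoke(state)
    return {"messages": [HumanMessage(content=result["output"], name=name)]}


# One partial per member instead of a separate function per agent.
research_node = functools.partial(agent_node, agent=research_agent, name="Researcher")
code_node = functools.partial(agent_node, agent=code_agent, name="Coder")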

Now the question is: how does the graph identify when to end, and whether it got the desired data from an agent's execution?

This is where the supervisor agent concept comes in. It is achieved with conditional edges in LangGraph. The implementation is like below:


for member in members:
    # We want our workers to ALWAYS "report back" to the supervisor when done
    workflow.add_edge(member, "supervisor")

# The supervisor populates the "next" field in the graph state,
# which routes to a node or finishes
conditional_map = {k: k for k in members}
conditional_map["FINISH"] = END

workflow.add_conditional_edges("supervisor", lambda x: x["next"], conditional_map)



With the above, each member is always connected back to the supervisor, and the supervisor decides where to move next. The conditional edge function decides which node runs next. This is how the whole thing is achieved.
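
For completeness, here is a sketch of how the whole graph might be assembled around those edges. The AgentState shape is an assumption, but it has to carry a "next" field for the x["next"] lookup above to work, and the node names must match the members list:

import operator
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import BaseMessage, HumanMessage
from langgraph.graph import END, StateGraph


class AgentState(TypedDict):
    # The shared conversation; operator.add makes updates append.
    messages: Annotated[Sequence[BaseMessage], operator.add]
    # Populated by the supervisor chain, read by the conditional edge.
    next: str


workflow = StateGraph(AgentState)
workflow.add_node("Researcher", research_node)
workflow.add_node("Coder", code_node)
workflow.add_node("supervisor", supervisor_chain)
# ... the add_edge / add_conditional_edges calls shown above go here ...
workflow.set_entry_point("supervisor")
graph = workflow.compile()

result = graph.invoke({"messages": [HumanMessage(content="What is 2 + 2?")]})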

