Question

I want to add a ConversationBufferMemory to the pandas dataframe agent, but so far I have been unsuccessful.

I have tried adding the memory via the constructor: create_pandas_dataframe_agent(llm, df, verbose=True, memory=memory). This didn't break the code, but the agent still didn't remember my previous questions.
I have also tried adding the memory to the agent with this piece of code: pd_agent.agent.llm_chain.memory = memory. That resulted in ValueError: One input key expected got ['input', 'agent_scratchpad']

This is my code so far (which doesn't work):

llm = ChatOpenAI(temperature=0, model_name="gpt-4-0613")

memory = ConversationBufferMemory()

pd_agent = create_pandas_dataframe_agent(llm, df, verbose=True, memory=memory)
# pd_agent.agent.llm_chain.memory = memory  # Alternative approach; with this the code breaks when calling .run()

pd_agent.run("Look into the data in step 12. Are there any weird patterns? What can we say about this part of the dataset.")
pd_agent.run("What was my previous question?")  # The agent doesn't remember

Answer 1

In version 0.0.202, the only way I found to add memory to the pandas agent is shown below (you also need to change the prompt.py file; how to do that is described below the code):

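# Imports as laid out in langchain 0.0.202 (module paths may differ in later versions)
from langchain.agents import create_pandas_dataframe_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import (
    CombinedMemory,
    ConversationBufferWindowMemory,
    ConversationKGMemory,
    ConversationSummaryMemory,
)

# Create two different models: one for generating code and one for the conversation context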
llm_code = ChatOpenAI(temperature=0, model_name="gpt-4-0613") #gpt-3.5-turbo-16k-0613
llm_context = ChatOpenAI(temperature=0.5, model_name="gpt-4") #gpt-3.5-turbo

chat_history_buffer = ConversationBufferWindowMemory(
    k=5,
    memory_key="chat_history_buffer",
    input_key="input"
    )

chat_history_summary = ConversationSummaryMemory(
    llm=llm_context, 
    memory_key="chat_history_summary",
    input_key="input"
    )

chat_history_KG = ConversationKGMemory(
    llm=llm_context, 
    memory_key="chat_history_KG",
    input_key="input",
    )

memory = CombinedMemory(memories=[chat_history_buffer, chat_history_summary, chat_history_KG])

pd_agent = create_pandas_dataframe_agent(
    llm_code, 
    df, 
    verbose=True, 
    agent_executor_kwargs={"memory": memory},
    input_variables=["df_head", "input", "agent_scratchpad", "chat_history_buffer", "chat_history_summary", "chat_history_KG"]
    )

First, specify a memory_key for each memory type you want to use. Each memory_key must also be listed in input_variables.

You also need to pass the memory object into the pandas_agent like this:

agent_executor_kwargs={"memory": memory}
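
The input_key="input" argument on each memory above also matters, and it is probably related to the ValueError from the question. As far as I can tell, a memory that has no input_key tries to infer one by taking the inputs it receives and removing its own memory variables (and "stop"). If more than one key is left over, for example input and agent_scratchpad, or the keys of the other memories in a CombinedMemory, it raises that error. Below is a minimal pure-Python sketch of that check, assuming this is roughly how the library behaves. infer_input_key is a hypothetical stand-in written for illustration, not a LangChain function.

```python
from typing import Dict, List, Optional


def infer_input_key(
    inputs: Dict[str, str],
    memory_variables: List[str],
    input_key: Optional[str] = None,
) -> str:
    """Hypothetical stand-in for the input-key check a memory performs."""
    # An explicitly configured input_key skips the inference entirely.
    if input_key is not None:
        return input_key
    # Otherwise, drop the memory's own variables and "stop" from the inputs.
    candidates = sorted(set(inputs) - set(memory_variables) - {"stop"})
    # Exactly one key must remain, or the memory cannot tell which is the user input.
    if len(candidates) != 1:
        raise ValueError(f"One input key expected got {candidates}")
    return candidates[0]


# Inputs similar to what the agent's inner chain receives.
chain_inputs = {"input": "What was my previous question?", "agent_scratchpad": ""}

# Without input_key, two candidates remain and the check fails.
try:
    infer_input_key(chain_inputs, memory_variables=["history"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With input_key="input", the ambiguity never comes up.
explicit_key = infer_input_key(chain_inputs, ["history"], input_key="input")
```

In this sketch the first call fails because both input and agent_scratchpad remain as candidates, while the second call returns "input" directly.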

VERY IMPORTANT!!!

You need to change the prompt.py file located in ../langchain/agents/agent_toolkits/pandas/prompt.py to take into account the new memory you added.

The only thing you need to change is PREFIX. This is the change that worked for me:

PREFIX = """
You are working with a pandas dataframe in Python. The name of the dataframe is `df`.
You should use the tools below to answer the question posed of you:

Summary of the whole conversation:
{chat_history_summary}

Last few messages between you and user:
{chat_history_buffer}

Entities that the conversation is about:
{chat_history_KG}
"""
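
A quick way to catch a mismatch between the prompt and the memory keys is to compare the placeholders in PREFIX with the memory_key values before building the agent. The sketch below uses only the standard library. placeholders is a hypothetical helper, and the check covers only the prefix, not the rest of the agent prompt (which also contains df_head, input and agent_scratchpad).

```python
from string import Formatter

PREFIX = """
You are working with a pandas dataframe in Python. The name of the dataframe is `df`.
You should use the tools below to answer the question posed of you:

Summary of the whole conversation:
{chat_history_summary}

Last few messages between you and user:
{chat_history_buffer}

Entities that the conversation is about:
{chat_history_KG}
"""

# The memory_key values used when the memories were created.
memory_keys = ["chat_history_buffer", "chat_history_summary", "chat_history_KG"]


def placeholders(template: str) -> set:
    """Hypothetical helper: collect the {name} fields of a format string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Keys a memory provides but the prefix never uses.
missing_in_prompt = set(memory_keys) - placeholders(PREFIX)
# Placeholders in the prefix that no memory will fill.
missing_in_memory = placeholders(PREFIX) - set(memory_keys)

# Fill the prefix with dummy values to see what the model would receive.
rendered = PREFIX.format(
    chat_history_summary="(summary)",
    chat_history_buffer="(recent messages)",
    chat_history_KG="(entities)",
)
```

If either set is non-empty, the prompt and the memories are out of sync, which would typically surface as a missing-variable error when the agent runs.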

Answer 2

What do you mean when you say:

"You need to change the prompt.py file located in ../langchain/agents/agent_toolkits/pandas/prompt.py to take into account the new memory you added."

Where is this prompt.py file? I can't see it anywhere.

I did it like this:

pd_agent = create_pandas_dataframe_agent(
    llm_code, 
    df1, 
    verbose=True,
    prefix=PREFIX,
    agent_executor_kwargs={"memory": memory},
    input_variables=["df_head", "input", "agent_scratchpad", "chat_history_buffer", "chat_history_summary", "chat_history_KG"]
    )

Here PREFIX is the custom prompt you gave. Is that correct? Another follow-up question: what if I want to use multiple dataframes, like df1, df2, etc.? How should I pass them in that case?

TIA
