Streaming data using Langchain RetrievalQA

I am using Django and Langchain with OpenAI to generate responses to my prompts. I was trying to enable streaming using Server-Sent Events (SSE) in my API function. When I run my code, it does stream the OpenAI output in the terminal, but it returns the output to the client as a whole once the streaming has ended.

I tried using StreamingHttpResponse, but without success. It would be really helpful if anyone could point out the mistake in this code.

Here is my code:

from django.http import StreamingHttpResponse
from rest_framework.decorators import api_view
from rest_framework.response import Response

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.document_loaders import DirectoryLoader, PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# process_url, set_custom_prompt, CustomRecursiveCharacterTextSplitter,
# DATA_PATH, DB_FAISS_PATH and openai_gpt_key are defined elsewhere in my project.

@api_view(["GET", "POST"])
def sse_view(request):
    if request.method != "POST":
        return Response({"message": "Use the POST method for the response"})

    url = request.data.get("url")
    questions = request.data.get("questions")
    prompt = request.data.get("promptName")

    if not url or not questions:
        return Response({"message": "Please provide valid URL and questions"})

    # Process the documents received from the user
    try:
        doc_store = process_url(url)
        if not doc_store:
            return Response({"message": "PDF document not loaded"})
    except Exception as e:
        return Response({"message": "Error loading PDF document"})

    custom_prompt_template = set_custom_prompt(url, prompt)

    # Load and process documents
    loader = DirectoryLoader(DATA_PATH, glob="*.pdf", loader_cls=PyPDFLoader)
    documents = loader.load()

    text_splitter = CustomRecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=30)
    texts = text_splitter.split_documents(documents)

    # Creating embeddings using OpenAI
    embeddings = OpenAIEmbeddings(chunk_size=16, openai_api_key=openai_gpt_key)

    db = FAISS.from_documents(texts, embeddings)
    db.save_local(DB_FAISS_PATH)
    search_kwargs = {
        "k": 30,
        "fetch_k": 100,
        "maximal_marginal_relevance": True,
        "distance_metric": "cos",
    }
    retriever = db.as_retriever(search_kwargs=search_kwargs)

    # Get the list of questions from the body
    questionList = request.data["questions"]

    # This function generates responses from OpenAI
    def openai_response_generator():
        # Create an instance of ChatOpenAI
        llm = ChatOpenAI(
            model_name="gpt-3.5-turbo-16k",
            streaming=True,
            callbacks=[StreamingStdOutCallbackHandler()],
            temperature=0,
            openai_api_key=openai_gpt_key,
        )

        # Iterating through the questions list
        for question in questionList:
            qa = RetrievalQA.from_chain_type(
                llm=llm,
                chain_type="stuff",
                retriever=retriever,
                return_source_documents=True,
                chain_type_kwargs={"prompt": custom_prompt_template},
            )

        res = qa({"query": question})
        yield f"data: {res['result']}\n\n"

    response = StreamingHttpResponse(openai_response_generator(), content_type="text/event-stream")
    return response

Here's what I see in Postman when I run the function.

Answers

It will help if you use a custom CallbackHandler to handle the new tokens streamed from the LLM.

langchain provides many built-in callback handlers, but we can use a custom one:

from queue import Queue
from typing import Any

from langchain.callbacks.base import BaseCallbackHandler


class CustomStreamingCallbackHandler(BaseCallbackHandler):
    """Callback handler that streams the LLM response."""

    def __init__(self, queue):
        self.queue = queue

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Run on a new LLM token. Only available when streaming is enabled."""
        self.queue.put(token)

What we are doing here is that whenever the OpenAI API streams a new token, we put it in a queue: our custom handler accepts the queue and puts each new token into it.
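
One thing the handler above leaves open is how the consumer will know the stream has finished. A minimal sketch of a common pattern (my addition, not part of the original answer; STOP_SENTINEL is a made-up marker) is to also override on_llm_end and enqueue a sentinel:

STOP_SENTINEL = object()  # hypothetical end-of-stream marker


class SentinelStreamingCallbackHandler(CustomStreamingCallbackHandler):
    """Additionally signals end-of-stream by enqueueing a sentinel."""

    def on_llm_end(self, response: Any, **kwargs: Any) -> None:
        # The LLM has finished generating; tell the consumer to stop reading.
        self.queue.put(STOP_SENTINEL)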

Now we need something like this to consume the queue:

def generate_stream(q: Queue):
    # Any custom condition to stop the while loop,
    # e.g. while the queue is not empty
    while (...):
        stream = q.get()
        yield stream

queue = Queue()
response = StreamingHttpResponse(generate_stream(queue), content_type="text/event-stream")
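
If you use the sentinel sketch from above, the open-ended while condition can be made concrete, and each token can be framed as a proper SSE message (again a sketch, assuming the hypothetical STOP_SENTINEL):

def generate_stream(q: Queue):
    while True:
        token = q.get()  # blocks until the producer enqueues something
        if token is STOP_SENTINEL:
            break
        # SSE framing: a "data:" line terminated by a blank line
        yield f"data: {token}\n\n"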

Besides that, you need to start the main openai_response_generator function on a separate thread:

import threading

queue = Queue()
task = threading.Thread(
    target=openai_response_generator,
    args=(queue,)
)
task.start()
response = StreamingHttpResponse(generate_stream(queue), content_type="text/event-stream")
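
One practical note (my addition, not from the answer): making the worker a daemon thread keeps it from holding the process open if the client disconnects mid-stream:

task = threading.Thread(
    target=openai_response_generator,
    args=(queue,),
    daemon=True,  # don't block process shutdown on this worker
)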

The openai_response_generator function then changes to:

def openai_response_generator(queue):

    # Create an instance of ChatOpenAI; pass the queue-backed handler
    # via callbacks so streamed tokens land in the queue
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo-16k",
        streaming=True,
        callbacks=[CustomStreamingCallbackHandler(queue)],
        temperature=0,
        openai_api_key=openai_gpt_key,
    )

    # Iterating through the questions list
    for question in questionList:
        qa = RetrievalQA.from_chain_type(
            llm=llm,
            chain_type="stuff",
            retriever=retriever,
            return_source_documents=True,
            chain_type_kwargs={"prompt": custom_prompt_template},
        )

    res = qa({"query": question})

Summary:

When defining the LLM, we set a custom callback handler that accepts a queue and puts each newly streamed token into it, and we return a StreamingHttpResponse that consumes the generator reading from that queue.

......

This may be moot by now, and I may have misunderstood, but like the question, this iterates through the questions, creating the qa object each time, but only calls it after the loop has finished. If you want each question answered individually, don't you want to query them one by one (inside the loop)?
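
For completeness, here is a sketch of what this comment suggests: build the chain once, then query it inside the loop so each question is answered (and streamed) one by one. All names are reused from the answer above:

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": custom_prompt_template},
)
for question in questionList:
    # each call streams its tokens through the callback handler as it runs
    res = qa({"query": question})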




