f string to pass file path issue

I have a function which accepts a file path. It's as below:

def document_loader(doc_path: str) -> Optional[Document]:
    """This function takes in a document in a particular format and
    converts it into a Langchain Document object.

    Args:
        doc_path (str): A string representing the path to the PDF document.

    Returns:
        Optional[Document]: A Langchain Document object, or None if the file is not found.
    """

    # try:
    loader = PyPDFLoader(doc_path)
    docs = loader.load()
    print("Document loader done")

PyPDFLoader is a wrapper around PyPDF2 that reads a PDF from a file path.

Now, when I call the function with a hardcoded file path string as below:

document_loader("/Users/Documents/hack/data/abc.pdf")

The function works fine and is able to read the pdf file path.

But now if I want a user to upload their pdf file via Streamlit file_uploader() as below:

uploaded_file = st.sidebar.file_uploader("Upload a file", key= "uploaded_file")
print(st.session_state.uploaded_file)

if uploaded_file is not None:
    filename = st.session_state.uploaded_file.name
    print(os.path.abspath(st.session_state.uploaded_file.name))
    document_loader(f"{os.path.abspath(filename)}")

I get the error:

ValueError: File path "/Users/Documents/hack/data/abc.pdf" is not a valid file or url

This statement print(os.path.abspath(st.session_state.uploaded_file.name)) prints out the same path as the hardcoded one.

Note: Streamlit is currently running on localhost on my laptop, and I am the "user" trying to upload a PDF via the locally running Streamlit app.

Answer

The object returned by st.file_uploader is a "file-like" object inheriting from BytesIO.

From the docs:

The UploadedFile class is a subclass of BytesIO, and therefore it is "file-like". This means you can pass them anywhere where a file is expected.

While the returned object does have a name attribute, it has no path. It exists in memory and is not associated with a real, saved file. Although Streamlit may be run locally, it actually has a server-client structure in which the Python backend is usually on a different computer from the user's. As such, the file_uploader widget is not designed to provide any real access or pointer to the user's file system.
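This is also why os.path.abspath on the uploaded file's name can look correct without pointing at anything real: it only joins the name onto the current working directory and never checks the disk. A minimal sketch of that behaviour, using a made-up random file name (an assumption for illustration, not something from the question):

```python
import os
import uuid

# Hypothetical file name that does not exist on disk (random, for illustration).
name = f"upload-{uuid.uuid4().hex}.pdf"

# abspath only prepends the current working directory and normalizes the result;
# it never touches the file system to check that the file is there.
path = os.path.abspath(name)

print(path)                  # an absolute path ending in the made-up name
print(os.path.isabs(path))   # True
print(os.path.exists(path))  # False
```

So a printed path that matches the hardcoded one is only meaningful if the app happens to be started from the directory that actually contains the file.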

You should do one of the following:

  1. use a method that allows you to pass a file buffer instead of a path,
  2. save the file to a known path,
  3. use tempfiles

Here is a brief example that works with temp files:

import streamlit as st
import tempfile
import pandas as pd

file = st.file_uploader("Upload a file", type="csv")
tempdir = tempfile.gettempdir()

if file is not None:
    with tempfile.NamedTemporaryFile(delete=False) as tf:
        tf.write(file.read())
        tf_path = tf.name
    st.write(tf_path)
    df = pd.read_csv(tf_path)
    st.write(df)
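For the PDF case in the question, the same temp-file idea could be wrapped in a small helper. This is a sketch under assumptions: save_upload_to_temp is a hypothetical name, and it relies only on the uploaded object being a BytesIO subclass (so getvalue() is available). The Streamlit and document_loader calls are shown as comments because they depend on the asker's app.

```python
import io
import os
import tempfile


def save_upload_to_temp(uploaded_file, suffix=".pdf"):
    """Write an in-memory upload to a temporary file and return its path.

    `uploaded_file` is any BytesIO-like object (hypothetical helper; the
    Streamlit UploadedFile qualifies because it subclasses BytesIO).
    """
    # delete=False keeps the file on disk after the `with` block closes it,
    # so a path-based loader can open it afterwards.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tf:
        tf.write(uploaded_file.getvalue())
        return tf.name


# Possible usage inside the Streamlit app from the question:
# uploaded_file = st.sidebar.file_uploader("Upload a file", type="pdf")
# if uploaded_file is not None:
#     pdf_path = save_upload_to_temp(uploaded_file)
#     try:
#         document_loader(pdf_path)
#     finally:
#         os.remove(pdf_path)  # clean up, since delete=False was used
```

The suffix is kept so that loaders which inspect the extension still see a .pdf file, and the explicit os.remove is needed because the temp file is not deleted automatically.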



