Question

I've been working on this problem for a while: I'm building a RAG chatbot for document Q&A over ingested data.

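I'm struggling to get my LLM to actually see the data. When developing locally, I either put the data loading in the same file as the embedding + vector store setup, or keep the two as close together as I can, yet I still seem to be losing the data.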

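When I'm developing locally, what is held "in memory", and when? I can load and chunk documents, then embed them, then put them in the vector store, then query. I suspect all of this has to happen in the same file. How does this work?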

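I've experimented both with separate files/folders and with everything in one script. The result is usually "unable to find any data to answer this question", even though I can print the ingested document text on the command line. I've also tried online and local LLM options, such as Google Gemini and local Ollama models.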

To clarify: if I run a Python script that only loads documents, would runtime memory still hold this data if I import my documents variable from that first script into a split/chunk process in a second Python script? If not, I could have the first script call the functions in the second, but then I'm back to everything effectively being one file again.

Tech stack details:

Python 3.12
LangChain tools including Supabase integration
Supabase vector store w/ PostgreSQL database
Streamlit for UI
Tried, but not set on using: Vecs (library for PostgreSQL vector storage), different LLMs (not set on Gemini), various LlamaIndex options

Answer 1

As it stands, we don't have any details about your implementation. Could you share the script and the libraries you are using?

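If I understand correctly, it sounds like nothing is being persisted between runs. A simple connection to the vector store database could solve your problem.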
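To illustrate the idea without guessing at your setup, here is a minimal sketch. It is not your stack: a local SQLite file stands in for the Supabase vector store, a plain keyword match stands in for similarity search, and the names `ingest`, `retrieve`, and the `chunks` table are made up for the example. The point is only that the ingestion step and the query step share nothing in memory. Variables in a Python process are gone once that process exits, so the query side has to connect to whatever the ingestion side persisted.

```python
import sqlite3
from pathlib import Path


def ingest(db_path: Path, texts: list[str], chunk_size: int = 200) -> int:
    """Ingestion step: chunk the texts and persist the chunks to disk.

    Assumption: SQLite is only a stand-in for the real vector store.
    """
    chunks = [
        text[i : i + chunk_size]
        for text in texts
        for i in range(0, len(text), chunk_size)
    ]
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS chunks (content TEXT NOT NULL)")
        conn.executemany(
            "INSERT INTO chunks (content) VALUES (?)", [(c,) for c in chunks]
        )
        conn.commit()  # without this, nothing is actually saved
    finally:
        conn.close()
    return len(chunks)


def retrieve(db_path: Path, keyword: str, limit: int = 3) -> list[str]:
    """Query step: opens a fresh connection and relies only on persisted data.

    Assumption: a LIKE match stands in for an embedding similarity search.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT content FROM chunks WHERE content LIKE ? LIMIT ?",
            (f"%{keyword}%", limit),
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

In this layout `ingest` could live in one script that you run once, and `retrieve` in your Streamlit app; the two only need to agree on where the store lives. If the query side gets no rows back, the model has no context to answer from, which may be what your "unable to find any data" message reflects.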
