Question

I am trying to load a model from a local path into LlamaIndex using the HuggingFaceLLM class, like this:

from llama_index.llms.huggingface import HuggingFaceLLM

llm = HuggingFaceLLM(
   context_window=2048,
   max_new_tokens=300,
   generate_kwargs={"temperature": 0.5, "do_sample": True},
   #query_wrapper_prompt=query_wrapper_prompt,
   tokenizer_name="local_path/leo-hessianai-7B-AWQ",
   model_name="local_path/leo-hessianai-7B-AWQ",
   device_map="auto"
)

The folder was downloaded from the Hugging Face Hub, and the model loads. However, when I run a query, it only returns gibberish (e.g. hohohohohohohohohohohohohohohoho).

The source nodes are plausible and correct (I checked); only the generation part seems to be wrong.

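Am I missing something here? When I load the model from the Hub using the link, it works fine, but in the IDE (and likewise with Ollama etc.) it does not work.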

I would appreciate your help, thanks!

Answer 1

This is not a regular model: it uses a custom quantization scheme (AWQ) that is probably not supported out of the box by the library you're using. See https://huggingface.co/TheBloke/leo-hessianai-7B-AWQ#about-awq

Indeed, the vLLM example specifies the quantization explicitly: https://huggingface.co/TheBloke/leo-hessianai-7B-AWQ#serving-this-model-from-vllm
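
One way to check this before loading is to look at what the downloaded folder itself declares. The sketch below is only an illustration: it assumes the folder contains a config.json that follows the Hugging Face Transformers convention of a `quantization_config` entry with a `quant_method` field, which may not hold for every checkpoint. The helper name `detect_quant_method` is made up for this example.

```python
import json
from pathlib import Path
from typing import Optional


def detect_quant_method(model_dir: str) -> Optional[str]:
    """Return the quantization method declared in config.json, or None.

    Assumption: the checkpoint stores its quantization settings under the
    "quantization_config" key, as Transformers-style configs usually do.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        # No config.json in the folder, so nothing can be inferred.
        return None
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A missing or null entry is treated as "not quantized".
    quant_config = config.get("quantization_config") or {}
    return quant_config.get("quant_method")
```

Calling `detect_quant_method("local_path/leo-hessianai-7B-AWQ")` on the folder from the question would show whether the checkpoint declares a quantization method. If it does, the loading library and its backend need to support that method; otherwise the weights can be misinterpreted, which would be consistent with gibberish output.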

I'm not sure what LlamaIndex is, but I would try a different model if I were you.
