Question

我正在与大型语言模型和即时工程合作,我注意到,许多即时模板使用了如下结构:<tag>{价值的}</tag>,以插入投入值或位置持有人。我很想知道这种做法背后的理由及其带来的任何好处。

例如,在《人类库手册》(https://github.com/anthropics/anthropic-cookbook/blob/main/misc/building_evals.ipynb 上,有一个使用这一格式的密码:

# Define our input prompt template for the task.
def build_input_prompt(animal_statement):
    user_content = f"""You will be provided a statement about an animal and your job is to determine how many legs that animal has.
    
    Here is the animal statment.
    <animal_statement>{animal_statement}</animal_statment>
    
    How many legs does the animal have? Return just the number of legs as an integer and nothing else."""

    messages = [{ role :  user ,  content : user_content}]
    return messages

具体地说,我要知道:

与将价值观直接纳入迅速案文相比,使用这种结构化、带有特别标志的迅速手段的目的或好处是什么?
在使用这种技术时,是否注意到对LLMs有特殊影响或业绩改进?
我能找到权威来源、研究文件或文件来讨论或介绍这种技术? 如果能更深入地探讨这一议题,我将不胜感激。
之所以使用这种格式,是因为对LLM进行了这种培训(对地主有特殊标志),或者说LMs一般不进行具体培训就理解和解释这种形式? 采用这种迅速与内陆发展中国家组合的典型做法是什么?

通过理解这种迅速安排办法背后的理由和潜在好处,我可以更好地利用这一方法与内陆发展中国家合作并迅速进行工程。请你提供的任何见解或参考材料将受到高度赞赏。

Answer 1

Actually most transformer-based LLMs, including Chat-LLama and Mistral, will not give any contextual significance to a lot of kind of tags, including html tags. Now xml tags as you show in your prompt? Remember text is tokenized. Then a specific embeddings model is utililized to convert the tokens into vectors in a vector space, such as in vectordbs like Pinecone or ChromaDB or FEISS. The list of numeric values provide both contextual and structural meaning and those k nearest neighbor is performed to fit the text and phrases close to other vectors in the vector store. If you have something semantically meaningless, then it will also have not have semantic meaning when vectorized. I have rarely seen xml tags used in prompt engineering. But I found an article that found a use case for it online:

a) “改进: 您通过明确标明您的时段,帮助您的工作。 Claude理解哪些部分是指示,哪些是例子,哪些是询问。 ......

b) “加强反应准确性:清晰的结构导致对Claude的混淆程度降低,导致反应更加接近您的意图。 ......

c) “加快处理后:结构化的产出,特别是在应急使用XML标记、简化提取和使用Claude提供的信息时。 ......

来源::https://cheatsheet.md/claude/claude-prompt-engineering.en# Chapter-7-ruct-with-xml-tags

What you mean by performance improvements? The performance has to do with the retrieval object which vector stores use in order to do a similarity search. You ask a question, and in LangChain using a chain like RetrievalQA, it will embed the question’s text, vectorize it in the vector store and find contextual text, usually recommended to have four or five contextual blocks, depending on the chunk size and chunk overlap when passing these input parameters. That’s where performance comes into play.

是否在快速工程中使用xml? 我不认为你会在网上查阅Llama 2号文件。但您可以在此读到:。我过去和现在都不知道使用xml的好处。但是,我发现上述第一条在网上链接。

法学硕士使用NLP技术,如 stemm、lem子化、NER等,并使用诸如BERT等co器和变压器等脱焦器。 NLP是一项复杂的任务。如果对模型具有属人性意义,那么对同类物进行类似的检测可显示质量结果。

Answer 2

许多即时模板使用了如下结构:<代码><tag>{数值}</tag>。

<代码>{数值><>>部分与LLM没有任何关系。参看 ,Despha,将参数输入该示意图;LLLM从来未实际见{t}/code> 自那时以来,Adhur在穿透模型/API之前便穿透了窒息。


另一方面,XML标记确实提供了更多一种结构,从用户提供的投入中抽出的速率,并在关于Claude模型的人类文献中加以讨论。 并非所有模式都知道如何处理XML标记,而Claude模式确实与它们一起接受了培训。


Claude特别熟悉具有XML标记的信号,因为Claude在培训期间暴露了这种信号。


它们提供了迅速和投入分离的好处,并且可以与其他一系列的迅速行动挂钩,或用于指定多个产出结果,并按方案分类。

友情链接