English 中文(简体)
在语言模式的快速模板中使用诸如<tag>{ > </tag>等特殊标志的目的是什么?
原标题:What is the purpose of using special tokens like <tag>{value}</tag> in prompt templates for language models?

我正在与大型语言模型和即时工程合作,我注意到,许多即时模板使用了如下结构:<tag>{价值的}</tag>,以插入投入值或位置持有人。 我很想知道这种做法背后的理由及其带来的任何好处。

例如,在《人类库手册》(https://github.com/anthropics/anthropic-cookbook/blob/main/misc/building_evals.ipynb上,有一个使用这一格式的密码:

# Define our input prompt template for the task.
def build_input_prompt(animal_statement):
    user_content = f"""You will be provided a statement about an animal and your job is to determine how many legs that animal has.
    
    Here is the animal statment.
    <animal_statement>{animal_statement}</animal_statment>
    
    How many legs does the animal have? Return just the number of legs as an integer and nothing else."""

    messages = [{ role :  user ,  content : user_content}]
    return messages

具体地说,我要知道:

  1. 与将价值观直接纳入迅速案文相比,使用这种结构化、带有特别标志的迅速手段的目的或好处是什么?

  2. 在使用这种技术时,是否注意到对LLMs有特殊影响或业绩改进?

  3. 我能找到权威来源、研究文件或文件来讨论或介绍这种技术? 如果能更深入地探讨这一议题,我将不胜感激。

  4. 之所以使用这种格式,是因为对LLM进行了这种培训(对地主有特殊标志),或者说LMs一般不进行具体培训就理解和解释这种形式? 采用这种迅速与内陆发展中国家组合的典型做法是什么?

通过理解这种迅速安排办法背后的理由和潜在好处,我可以更好地利用这一方法与内陆发展中国家合作并迅速进行工程。 请你提供的任何见解或参考材料将受到高度赞赏。

问题回答
  1. Actually most transformer-based LLMs, including Chat-LLama and Mistral, will not give any contextual significance to a lot of kind of tags, including html tags. Now xml tags as you show in your prompt? Remember text is tokenized. Then a specific embeddings model is utililized to convert the tokens into vectors in a vector space, such as in vectordbs like Pinecone or ChromaDB or FEISS. The list of numeric values provide both contextual and structural meaning and those k nearest neighbor is performed to fit the text and phrases close to other vectors in the vector store. If you have something semantically meaningless, then it will also have not have semantic meaning when vectorized. I have rarely seen xml tags used in prompt engineering. But I found an article that found a use case for it online:

a) “改进: 您通过明确标明您的时段,帮助您的工作。 Claude理解哪些部分是指示,哪些是例子,哪些是询问。 ......

b) “加强反应准确性:清晰的结构导致对Claude的混淆程度降低,导致反应更加接近您的意图。 ......

c) “加快处理后:结构化的产出,特别是在应急使用XML标记、简化提取和使用Claude提供的信息时。 ......

来源::https://cheatsheet.md/claude/claude-prompt-engineering.en# Chapter-7-ruct-with-xml-tags

  1. What you mean by performance improvements? The performance has to do with the retrieval object which vector stores use in order to do a similarity search. You ask a question, and in LangChain using a chain like RetrievalQA, it will embed the question’s text, vectorize it in the vector store and find contextual text, usually recommended to have four or five contextual blocks, depending on the chunk size and chunk overlap when passing these input parameters. That’s where performance comes into play.

  2. 是否在快速工程中使用xml? 我不认为你会在网上查阅Llama 2号文件。 但您可以在此读到:。 我过去和现在都不知道使用xml的好处。 但是,我发现上述第一条在网上链接。

  3. 法学硕士使用NLP技术,如 stemm、lem子化、NER等,并使用诸如BERT等co器和变压器等脱焦器。 NLP是一项复杂的任务。 如果对模型具有属人性意义,那么对同类物进行类似的检测可显示质量结果。





相关问题
Can Django models use MySQL functions?

Is there a way to force Django models to pass a field to a MySQL function every time the model data is read or loaded? To clarify what I mean in SQL, I want the Django model to produce something like ...

An enterprise scheduler for python (like quartz)

I am looking for an enterprise tasks scheduler for python, like quartz is for Java. Requirements: Persistent: if the process restarts or the machine restarts, then all the jobs must stay there and ...

How to remove unique, then duplicate dictionaries in a list?

Given the following list that contains some duplicate and some unique dictionaries, what is the best method to remove unique dictionaries first, then reduce the duplicate dictionaries to single ...

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

How can I make the PyDev editor selectively ignore errors?

I m using PyDev under Eclipse to write some Jython code. I ve got numerous instances where I need to do something like this: import com.work.project.component.client.Interface.ISubInterface as ...

How do I profile `paster serve` s startup time?

Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...

Pragmatically adding give-aways/freebies to an online store

Our business currently has an online store and recently we ve been offering free specials to our customers. Right now, we simply display the special and give the buyer a notice stating we will add the ...

Converting Dictionary to List? [duplicate]

I m trying to convert a Python dictionary into a Python list, in order to perform some calculations. #My dictionary dict = {} dict[ Capital ]="London" dict[ Food ]="Fish&Chips" dict[ 2012 ]="...