Why can we set an LLM's input and output to be the same when fine-tuning on a text generation task?
Answer

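GPT-2 is designed to predict the next token from the sequence that precedes it. For example, given the phrase "I love", it might predict "you".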

1. Why Use input_tensor for Both Input and Labels

Confusion often arises on seeing input_tensor used for both the input and the labels. This works because of the masking mechanism inherent in GPT-2.
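To make this concrete, here is a minimal sketch of how one sequence can serve as both input and labels. The helper name `shifted_pairs` and the word-level "tokens" are illustrative assumptions, not part of any library. Implementations such as Hugging Face Transformers apply this one-position shift inside the model's loss computation when labels are passed, so the caller hands over the same tensor twice.

```python
def shifted_pairs(input_tensor):
    """Pair each position with the token it is scored against (the next one)."""
    predicting_positions = input_tensor[:-1]  # the last token has nothing to predict
    targets = input_tensor[1:]                # same sequence, shifted left by one
    return list(zip(predicting_positions, targets))


# Word-level stand-ins for token IDs (an assumption made for readability).
input_tensor = ["I", "love", "you"]
labels = input_tensor  # the very same sequence is passed as labels

for current, target in shifted_pairs(labels):
    print(f"at {current!r} the target is {target!r}")
```

The output at each position is never compared with the token at that same position, only with the one after it, so copying the input would not minimize the loss.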

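Unlike BERT, where specific tokens are masked out and the model predicts them, GPT-2's mask controls which tokens are visible at each prediction step. The model predicts a sequence such as "I love music" like this: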

  1. "I" -> "love"
  2. "I love" -> "music"

This is achieved through internal masking. The model doesn't see future tokens, ensuring genuine next-token prediction based on the given context. So, using input_tensor for both input and labels doesn't make the model a mere repeater. It's training the model to predict subsequent tokens based on prior context.
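The masking described above can be sketched in plain Python. This is an illustration under simplifying assumptions, not GPT-2's actual code: `causal_mask` and `visible_context` are hypothetical helper names, and real models apply the mask to attention scores over token IDs rather than filtering a list of words.

```python
def causal_mask(n):
    """Lower-triangular mask: mask[i][j] is True when position i may see position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def visible_context(tokens, position):
    """Return the tokens that the given position is allowed to attend to."""
    row = causal_mask(len(tokens))[position]
    return [tok for tok, allowed in zip(tokens, row) if allowed]


tokens = ["I", "love", "music"]
for i in range(len(tokens) - 1):
    # Position i sees tokens 0..i and is trained to predict token i + 1.
    print(" ".join(visible_context(tokens, i)), "->", tokens[i + 1])
```

Because position i can never look at position i + 1, the label for that step is hidden from the model even though it sits in the same input sequence.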

2. Splitting Songs in Half

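Artificially splitting the song lyrics is not ideal. GPT-2 is inherently designed to predict the next token at every position in a sequence. Splitting songs may bias the model toward patterns in the later sections and can limit what it learns.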
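A rough way to see the difference is to count which tokens act as prediction targets under each scheme. The sketch below assumes the split scheme uses the first half of a song as a prompt and computes the loss only on the second half; that setup, the function names, and the whitespace tokenization are all assumptions for illustration.

```python
def targets_full_sequence(tokens):
    """Whole song as one sequence: every token after the first is a target."""
    return tokens[1:]


def targets_split_in_half(tokens):
    """Assumed split scheme: first half is prompt only, second half is the target."""
    half = len(tokens) // 2
    return tokens[half:]


# Whitespace tokens stand in for real GPT-2 tokens.
lyrics = "Snowflakes fall my heart calls for the love lost".split()

full = targets_full_sequence(lyrics)
split = targets_split_in_half(lyrics)
print(len(full), "targets when training on the whole song")
print(len(split), "targets when only the second half is supervised")
```

Under this assumption the opening lines never appear as targets in the split scheme, which is one way the bias toward later sections can arise.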

3. A Better Approach

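Consider using descriptive tokens:

  • When building your dataset, state the song's style or theme, followed by the corresponding lyrics. For example: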

    Input: <A melancholic ballad about lost love in winter>

    Output: "Snowflakes fall, my heart calls, for the love lost in winter's thrall..."

This approach can guide the model more effectively toward generating lyrics in the desired style.
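One possible way to build such training examples is sketched below. The layout (description in angle brackets, a newline, then the lyrics) and the helper name `format_example` are assumptions chosen for illustration; `<|endoftext|>` is GPT-2's end-of-text token. Each formatted string is used as a single sequence, so the description acts as context for every lyric token that follows it.

```python
EOS = "<|endoftext|>"  # GPT-2's end-of-text token


def format_example(description, lyrics):
    """Join a style/theme description and its lyrics into one training string.

    The "<description>\\nlyrics" layout is a hypothetical convention; any
    format works as long as it is used consistently at training and
    generation time.
    """
    return f"<{description.strip()}>\n{lyrics.strip()}{EOS}"


def format_prompt(description):
    """Prompt for generation: the description only, in the same layout."""
    return f"<{description.strip()}>\n"


example = format_example(
    "A melancholic ballad about winter",  # hypothetical description
    "Snowflakes fall, my heart calls",
)
print(example)
```

At generation time the output of `format_prompt` would be passed to the fine-tuned model, which then continues with lyrics in the described style.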




