Summary
- Starting from a seed instruction, Evol-Instruct randomly selects either In-Depth Evolving (rewriting the instruction into a more complex version) or In-Breadth Evolving (creating a brand-new instruction in the same domain).
- Since the evolved instructions are generated by LLMs, evolution sometimes fails. An instruction eliminator filters out the failed instructions; this step is called Elimination Evolving.
Detailed
Deepening prompt template
I want you act as a Prompt Rewriter.
Your objective is to rewrite a given prompt into a more complex version to make those famous AI systems
(e.g., ChatGPT and GPT4) a bit harder to handle.
But the rewritten prompt must be reasonable and must be understood and responded by humans.
Your rewriting cannot omit the non-text parts such as the table and code in #Given Prompt#:. Also, please
do not omit the input in #Given Prompt#.
You SHOULD complicate the given prompt using the following method:
Please add one more constraints/requirements into #Given Prompt#
You should try your best not to make the #Rewritten Prompt# become verbose, #Rewritten Prompt# can only
add 10 to 20 words into #Given Prompt#.
‘#Given Prompt#’, ‘#Rewritten Prompt#’, ‘given prompt’ and ‘rewritten prompt’ are not allowed to appear in
#Rewritten Prompt#
#Given Prompt#:
<Here is instruction.>
#Rewritten Prompt#:
- In practice, they provide few-shot examples too.
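To make the template concrete, here is a minimal sketch of how it could be filled in and sent to the rewriting LLM. The `call_llm` helper and the abridged template constant are assumptions for illustration, not code from the paper.

```python
# Minimal sketch: filling the in-depth evolving template shown above.
# `call_llm` is a hypothetical function that sends a prompt to an LLM and returns its text reply.

IN_DEPTH_TEMPLATE = """I want you act as a Prompt Rewriter.
Your objective is to rewrite a given prompt into a more complex version to make those famous AI systems (e.g., ChatGPT and GPT4) a bit harder to handle.
You SHOULD complicate the given prompt using the following method:
Please add one more constraints/requirements into #Given Prompt#
#Given Prompt#:
{instruction}
#Rewritten Prompt#:"""  # abridged; use the full template text above in practice


def evolve_in_depth(instruction: str, call_llm) -> str:
    """Ask the rewriting LLM for a more complex version of `instruction`."""
    prompt = IN_DEPTH_TEMPLATE.format(instruction=instruction)
    return call_llm(prompt).strip()
```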
Breadth prompt template
I want you act as a Prompt Creator.
Your goal is to draw inspiration from the #Given Prompt# to create a brand new prompt.
This new prompt should belong to the same domain as the #Given Prompt# but be even more rare.
The LENGTH and difficulty level of the #Created Prompt# should be similar to that of the #Given Prompt#.
The #Created Prompt# must be reasonable and must be understood and responded by humans.
‘#Given Prompt#’, ‘#Created Prompt#’, ‘given prompt’ and ‘created prompt’ are not allowed to appear in
#Created Prompt#.
#Given Prompt#:
<Here is instruction.>
#Created Prompt#:
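Combining the two templates, a single Evol-Instruct step can be sketched as a random choice between in-depth and in-breadth evolving, as described in the Summary. `IN_DEPTH_TEMPLATE` comes from the previous sketch, and the uniform random choice is an assumption rather than a detail stated here.

```python
import random

IN_BREADTH_TEMPLATE = """I want you act as a Prompt Creator.
Your goal is to draw inspiration from the #Given Prompt# to create a brand new prompt.
This new prompt should belong to the same domain as the #Given Prompt# but be even more rare.
#Given Prompt#:
{instruction}
#Created Prompt#:"""  # abridged; use the full template text above in practice


def evolve_once(instruction: str, call_llm) -> str:
    """One evolution step: randomly pick in-depth or in-breadth evolving and rewrite."""
    template = random.choice([IN_DEPTH_TEMPLATE, IN_BREADTH_TEMPLATE])
    return call_llm(template.format(instruction=instruction)).strip()
```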
Elimination Evolving
They classify the following four situations as instruction evolution failures (a heuristic sketch follows the list):
- The evolved instruction provides no information gain compared to the original one. They use an LLM judge to score this.
- The evolved instruction makes it difficult for the LLM to generate a response. They found that when the generated response contains “sorry” and is relatively short in length (i.e., less than 80 words), it often indicates that the LLM struggles to respond to the evolved instruction.
- The response generated by the LLM only contains punctuation and stop words.
- The evolved instruction obviously copies some words from the evolving prompt, such as “given prompt”, “rewritten prompt”, “#Rewritten Prompt#”, etc.
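The four criteria above map onto simple heuristics. Below is a rough sketch of such a filter, assuming the information-gain judgment is passed in from a separate LLM-judge call; the banned phrases and the 80-word threshold come directly from the list, while the stop-word set is only an illustrative subset.

```python
import string

BANNED_PHRASES = ("given prompt", "rewritten prompt", "#rewritten prompt#",
                  "created prompt", "#created prompt#")
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}  # illustrative subset


def is_failed_evolution(evolved: str, response: str, judge_says_no_gain: bool) -> bool:
    """Return True if the evolved instruction matches any of the four failure criteria."""
    # 1. No information gain over the original instruction (decided by an LLM judge).
    if judge_says_no_gain:
        return True
    # 2. The LLM struggles to respond: an apologetic and short reply (< 80 words).
    if "sorry" in response.lower() and len(response.split()) < 80:
        return True
    # 3. The response contains only punctuation and stop words.
    tokens = [t.strip(string.punctuation).lower() for t in response.split()]
    if all(t == "" or t in STOP_WORDS for t in tokens):
        return True
    # 4. The evolved instruction copies words from the evolving prompt itself.
    lowered = evolved.lower()
    return any(phrase in lowered for phrase in BANNED_PHRASES)
```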
Creating the responses to the instructions
- They concatenate the evolved instruction with "### Response:" as the prompt, and train the model to generate the response in a standard supervised fine-tuning setup.
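A minimal sketch of assembling one training example under this scheme; the exact whitespace around the "### Response:" marker is an assumption.

```python
def build_training_example(instruction: str, response: str) -> dict:
    """Concatenate the instruction with '### Response:' as the prompt; the response is the target."""
    prompt = f"{instruction}\n\n### Response:"
    return {"prompt": prompt, "completion": response}
```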