Chain-of-thought (CoT) prompting

  • Paper: “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models” (Wei et al., 2022)
  • Standard few-shot prompting: exemplars are <input, output> pairs
  • Chain-of-thought prompting: exemplars are <input, thought, output> triples, where the “thought” is a natural-language reasoning chain
  • Extremely data-efficient for a given task: only a handful of exemplars are needed (see the prompt sketch below)
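To make the two exemplar formats concrete, here is a minimal sketch of both prompt templates. The tennis-ball exemplar is the canonical one from the paper; the surrounding template wording is illustrative, not the paper’s exact prompt.

```python
# Standard few-shot exemplar: <input, output>.
STANDARD_FEWSHOT = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. \
Each can has 3 tennis balls. How many tennis balls does he have now?
A: The answer is 11.

Q: {question}
A:"""

# Chain-of-thought exemplar: <input, thought, output>. Only the added
# reasoning chain in the answer differs from the standard prompt.
CHAIN_OF_THOUGHT = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. \
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. \
5 + 6 = 11. The answer is 11.

Q: {question}
A:"""
```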

Self-consistency (SC)

  • Paper: “Self-Consistency Improves Chain of Thought Reasoning in Language Models” (Wang et al., 2023)
  • The self-consistency method contains three steps:
    • (1) prompt a language model using chain-of-thought (CoT) prompting;
    • (2) replace the “greedy decode” in CoT prompting by sampling from the language model’s decoder to generate a diverse set of reasoning paths;
    • (3) marginalize out the reasoning paths and aggregate by choosing the most consistent answer in the final answer set.
  • Marginalization is needed because each generated text is a (rationale, answer) pair: greedy decoding maximizes the joint probability P(rationale, answer | X), whereas the best answer maximizes P(answer | X) = Σ_rationale P(rationale, answer | X), so the single most likely sequence can yield the wrong answer
  • When there is no intermediate reasoning path, self-consistency is unnecessary: we can directly choose the most likely answer as argmax_Y P(Y|X) (a voting sketch follows this list)
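A minimal sketch of the three-step loop, assuming a generic `sample_fn(prompt, temperature)` helper that returns one sampled completion (any LLM client can stand in for it), and assuming the CoT exemplars end each rationale with “The answer is <number>.” so the final answer can be extracted:

```python
from collections import Counter
import re

def self_consistency(prompt: str, sample_fn, n_samples: int = 10) -> str:
    """Majority-vote over final answers from n sampled reasoning paths."""
    answers = []
    for _ in range(n_samples):
        # Step (2): temperature sampling instead of greedy decoding,
        # giving a diverse set of reasoning paths.
        completion = sample_fn(prompt, temperature=0.7)
        # Assumed exemplar convention: rationale ends "The answer is <n>."
        match = re.search(r"The answer is\s*(-?\d+)", completion)
        if match:
            answers.append(match.group(1))
    # Step (3): marginalize out the rationales by voting on answers alone.
    return Counter(answers).most_common(1)[0][0] if answers else ""
```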

Universal Self-Consistency (USC)

  • Instead of extracting and voting over answers, concatenate all sampled responses and ask the LLM itself to select the most consistent one based on majority consensus
  • USC consistently improves performance on free-form generation tasks (e.g., summarization, open-ended QA), where vote-based SC is inapplicable because free-form answers cannot be matched exactly (sketch below)
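A sketch of the USC selection step under the same assumed `sample_fn` helper; the selection prompt is loosely modeled on the one in the USC paper:

```python
import re

def universal_self_consistency(prompt: str, sample_fn, n_samples: int = 8) -> str:
    """Let the LLM itself pick the majority-consensus response.

    No answer extraction or exact matching is needed, so this works for
    free-form outputs such as summaries.
    """
    responses = [sample_fn(prompt, temperature=0.7) for _ in range(n_samples)]
    numbered = "\n\n".join(
        f"Response {i + 1}:\n{r}" for i, r in enumerate(responses)
    )
    selection_prompt = (
        "I have generated the following responses to a question:\n\n"
        f"{numbered}\n\n"
        "Evaluate these responses and select the most consistent one "
        "based on majority consensus. Reply with the response number only."
    )
    choice = sample_fn(selection_prompt, temperature=0.0)
    match = re.search(r"\d+", choice)
    idx = int(match.group()) - 1 if match else 0
    return responses[min(max(idx, 0), len(responses) - 1)]
```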

Least-to-most prompting

  1. Decompose a complex problem into a list of easier subproblems
  2. Sequentially solve the subproblems from least to most complex, appending each solved subproblem and its answer to the context before solving the next
  • Least-to-Most Prompting = Planning/Sketch (decomposition) + Reasoning (sequential solving); a sketch follows below
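A two-stage sketch, again assuming the generic `sample_fn` helper; the paper uses few-shot exemplars for both stages, but zero-shot instructions stand in here for brevity:

```python
def least_to_most(question: str, sample_fn) -> str:
    """Decompose, then solve subproblems sequentially with growing context."""
    # Stage 1 (planning/sketch): decompose into easier subproblems.
    decomposition = sample_fn(
        "To solve the problem below, list the subproblems that must be "
        "solved first, one per line, from easiest to hardest.\n\n"
        f"Problem: {question}",
        temperature=0.0,
    )
    subproblems = [ln.strip() for ln in decomposition.splitlines() if ln.strip()]

    # Stage 2 (reasoning): solve each subproblem in order, appending every
    # solved (subproblem, answer) pair to the context for the next one.
    context = f"Problem: {question}\n"
    answer = ""
    for sub in subproblems + [question]:  # end with the original question
        answer = sample_fn(f"{context}\nQ: {sub}\nA:", temperature=0.0)
        context += f"\nQ: {sub}\nA: {answer}"
    return answer
```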