⛓️

Chain-of-Thought (CoT)

Chain-of-Thought is a technique introduced in 2022 by Jason Wei, who had previously worked on Zero-shot prompting. As the name suggests, it is related to chain prompting; the key difference is that intermediate reasoning steps are deliberately added to the prompt to lead the model to better results. This is especially effective for complex tasks that require a detailed thought process.
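To make the idea concrete, here is a minimal sketch of the difference between a standard few-shot prompt and a CoT prompt, loosely following the arithmetic examples used in the original Chain-of-Thought paper. The `call_llm` function is a hypothetical placeholder for whatever model API you actually use, not a real library call.

```python
# A minimal sketch of few-shot CoT prompting.
# `call_llm` is a hypothetical placeholder for your model API of choice.

def call_llm(prompt: str) -> str:
    """Send the prompt to a language model and return its completion (placeholder)."""
    raise NotImplementedError("Wire this up to the LLM API you actually use.")

# Standard few-shot prompt: the worked example shows only the final answer.
standard_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: 11

Q: The cafeteria had 23 apples. They used 20 and bought 6 more. How many apples do they have?
A:"""

# Chain-of-Thought prompt: the same example, but the intermediate reasoning
# steps are written out before the answer.
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: Roger starts with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 and bought 6 more. How many apples do they have?
A:"""

# With a sufficiently large model, the CoT version tends to produce the
# reasoning "23 - 20 = 3, 3 + 6 = 9" before giving the answer 9.
# print(call_llm(cot_prompt))
```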
Advantages of CoT Prompting
Multi-step problem decomposition: CoT lets a model break multi-step problems into intermediate steps, so additional computation can be allocated to problems that require more reasoning steps.
Interpretability of model behavior: CoT provides an interpretable window into how the model arrives at a particular answer and an opportunity to debug where the reasoning path went wrong.
Applicability to a variety of tasks: CoT reasoning can be used for tasks such as math word problems, commonsense reasoning, and symbolic manipulation, and in principle for any task that humans can solve through language.
Easy to elicit in large language models: CoT reasoning can be readily elicited from sufficiently large existing language models simply by including examples of CoT sequences in the prompt.
Shall we look at an example? This is the example shown earlier in the section on arguments.
💡
Let's distinguish between odd and even numbers and add the odd numbers in order:
Odd numbers: 343, 1, 423, 3, 433, 21, 51
Sum of the odd numbers: 343 + 1 + 423 + 3 + 433 + 21 + 51 = 1275
Therefore, if you add up all odd numbers among the given numbers, you get 1275.
Here, the part that says 'Distinguish between odd and even numbers, then add up all the odd numbers. Please proceed in order.' is the instruction that separates the task into a chain of steps. When this method first received attention, the phrasing used was 'step by step'; asking the model to work step by step, in order, has been reported to produce better results.
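As a rough sketch, that instruction could be turned into a prompt like the one below. The `call_llm` helper is the same hypothetical placeholder as in the earlier sketch, and the even numbers in the list are made up for illustration, since the full original list is not repeated in this section.

```python
# The odd-number example above, written out as prompts.
# `call_llm` is the hypothetical placeholder defined in the earlier sketch.
# The even numbers (8, 12, 96) are invented for illustration.

numbers = [343, 1, 423, 8, 3, 433, 12, 21, 96, 51]

# Explicit chained instruction: separate first, then add, proceeding in order.
chained_prompt = (
    f"Numbers: {', '.join(map(str, numbers))}\n"
    "Distinguish between odd and even numbers, then add up all the odd numbers. "
    "Please proceed in order."
)

# Zero-shot "step by step" variant: the phrasing that first drew attention
# to this technique.
step_by_step_prompt = (
    f"Numbers: {', '.join(map(str, numbers))}\n"
    "Add up all the odd numbers. Let's think step by step."
)

# Either prompt nudges a large model to list the odd numbers
# (343, 1, 423, 3, 433, 21, 51) and sum them to 1275 before answering.
# print(call_llm(chained_prompt))
```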
Of course, the CoT method also has clear limitations. Put simply, it is only effective on sufficiently large models. In other words, CoT does not work well at all on models with a small number of parameters, so-called sLMs; in that case, the Few-shot or One-shot method gives better results.
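For comparison, a Few-shot prompt for a small model simply shows input-to-answer pairs without any written-out reasoning. A minimal sketch, again using the placeholder `call_llm`:

```python
# For a small model (sLM), plain Few-shot examples without reasoning steps
# are often the better choice. `call_llm` is the same placeholder as above.

few_shot_prompt = """Numbers: 2, 7, 10, 5 -> Sum of odd numbers: 12
Numbers: 4, 9, 9, 6 -> Sum of odd numbers: 18
Numbers: 343, 1, 423, 3, 433, 21, 51 -> Sum of odd numbers:"""

# The examples show only input -> answer pairs, with no chain of thought.
# print(call_llm(few_shot_prompt))
```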
Limitations of CoT
Dependence on model size: CoT prompting shows positive performance gains mainly for large models (roughly 70B parameters and up). For small models, CoT may be ineffective or even perform worse than standard prompting.
Other limitations: Although CoT mimics the thought process of a human reasoner, it is still unknown whether the neural network is actually 'reasoning'. In addition, while manually extending a prompt example with a chain of thought is cheap, the cost of annotating data for CoT fine-tuning can grow significantly. CoT does not guarantee a correct reasoning path, and the high cost of running large models in real applications must also be taken into account.
Nonetheless, CoT prompting is an effective way to improve reasoning ability across a variety of tasks that use language models. And since the models currently served to users are around the 100B scale, the small-model limitation is not a major concern in practice.
ⓒ 2023. Haebom, all rights reserved.
Attribution of the source is required, and the material may be used commercially with the copyright holder's permission.