🧮

Demonstration

Reasoning with language models is currently one of the most important and challenging areas in artificial intelligence. Reasoning tasks involve analyzing information and drawing logical conclusions from it, which goes beyond simple information retrieval or text generation.
🤖
The result of multiplying 57 and 32 is 1824.
Models can now solve simple problems like this easily. Let's try a slightly harder one.
🤖
Let's add odd numbers among the given numbers:
Odd numbers: 343, 1, 423, 3, 433, 21, 51
Adding: 343 + 1 + 423 + 3 + 433 + 21 + 51 = 1675
Therefore, if you add the odd numbers among the given numbers, you get 1675.
Is this correct? No, it is wrong: the correct answer is 1275. Why does this happen?
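We can check the model's claimed sum ourselves by adding the listed odd numbers directly:

```python
# Add the odd numbers the model listed, one at a time.
odds = [343, 1, 423, 3, 433, 21, 51]
total = 0
for n in odds:
    total += n
print(total)  # 1275, not the 1675 the model reported
```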
Language models, especially Large Language Models (LLMs), have the ability to process various types of data and extract meaning from complex contexts. However, these models still have limitations in performing inference tasks.
Text-based training: Language models are trained mainly on text data such as sentences, paragraphs, and conversations. This data reflects the meaning, context, and syntax of language, emphasizing linguistic meaning rather than numerical computation or precise mathematical processing.
Statistical inference: Language models use statistical methods to predict the most likely next word or sentence. This is effective for recognizing and imitating patterns in language, but limited for complex mathematical calculation or logical reasoning.
Data diversity and limitations: If the training data does not contain enough examples of numerical computation or mathematical problem solving, the model may struggle with these types of problems.
Language-centric design: Language models are designed primarily for language understanding and generation; mathematical ability is not their primary purpose or function.
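The point about statistical prediction can be illustrated with a toy example (the probabilities below are invented for illustration, not taken from any real model):

```python
# A language model selects the statistically most probable next token,
# which need not be the arithmetically correct one.
next_token_probs = {"1675": 0.40, "1275": 0.35, "1875": 0.25}  # hypothetical values
prediction = max(next_token_probs, key=next_token_probs.get)
print(prediction)  # prints "1675": plausible-looking, but wrong
```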
Of course, AI researchers have not left the problem there. GPT-4 and similar systems address it by generating code that calls libraries specialized for computation and inference; when the problem above is actually run on GPT-4 this way, it arrives at the correct answer.
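The code such a model generates might look roughly like this (a sketch under the assumption that the relevant numbers are the odd ones listed above; the actual code GPT-4 writes will differ):

```python
# Filter the odd numbers from the input, then sum them with exact
# integer arithmetic instead of token-by-token prediction.
numbers = [343, 1, 423, 3, 433, 21, 51]
odd_sum = sum(n for n in numbers if n % 2 == 1)
print(odd_sum)  # 1275
```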
Overcoming these limitations calls for more advanced prompt engineering. Prompt engineering is the practice of giving the model clearer, more specific instructions so that it can derive the desired information or answer more accurately. For example, it involves clearly stating the variables and conditions needed for a complex problem or decision so the model can follow them. Applied to the example above, it works like this.
Before
After
💡
Let's distinguish between odd and even numbers and add the odd numbers in order:
Odd numbers: 343, 1, 423, 3, 433, 21, 51
Adding: 343 + 1 + 423 + 3 + 433 + 21 + 51 = 1275
Therefore, if you add up all odd numbers among the given numbers, you get 1275.
We changed only the prompt, yet the model now arrives at the correct answer. As when teaching a person, having the model work through a problem step by step leads to faster understanding and higher-quality results.
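A minimal sketch of the prompt change involved; the exact wording of both prompts is an assumption, since the original prompts are not reproduced here:

```python
# Hypothetical before/after prompts illustrating step-by-step instruction.
before = "Add the odd numbers among the given numbers."
after = (
    "First, list which of the given numbers are odd. "
    "Then add the odd numbers one at a time, showing each step. "
    "Finally, state the total."
)
print(after)
```

The improved prompt breaks the task into explicit stages, which encourages the model to produce intermediate steps instead of predicting the final number in one shot.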
Reasoning with language models is still a developing area, and doing it effectively requires the ability to give clear instructions. Techniques like these draw out the model's full reasoning ability and can help solve complex problems.
ⓒ 2023. Haebom, all rights reserved.
The source must be credited; commercial use requires the permission of the copyright holder.