Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Effects of structure on reasoning in instance-level Self-Discover

Created by
  • Haebom

Authors

Sachith Gunasekara, Yasiru Ratnayake

Outline

The demand for predictable LLM reasoning in complex systems has popularized structured outputs, but concerns remain that they perform worse than unstructured natural language. Meanwhile, training on unstructured Chain of Thought (CoT) traces has produced powerful new reasoning models, but these raise computational cost and reliability issues. The paper presents iSelf-Discover, an instance-level adaptation of the Self-Discover framework, and uses it to compare dynamically generated structured JSON reasoning with unstructured reasoning. Experimental results on various benchmarks show that unstructured reasoning consistently outperforms structured reasoning. In particular, on the complex MATH benchmark, unstructured plans achieve a relative performance gain of up to 18.90% over structured approaches. The zero-shot unstructured variant of iSelf-Discover outperforms the five-shot structured variant, which shows that the gap matters even when the structured plans are generated dynamically so that reasoning precedes the final answer. The paper also shows that the optimal plan generation granularity (instance-level vs. task-level) varies depending on the context. These results suggest re-evaluating the reliance on structured formats for solving complex problems, and how complex systems are organized.
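The 18.90% figure above is a relative gain, not a difference in percentage points. As a minimal sketch of that arithmetic, the snippet below computes both quantities. The accuracy values are hypothetical placeholders chosen for easy arithmetic and are not taken from the paper, and `relative_gain` is an illustrative helper name.

```python
def relative_gain(unstructured_acc: float, structured_acc: float) -> float:
    """Relative gain (in percent) of unstructured over structured accuracy."""
    if structured_acc <= 0:
        raise ValueError("structured accuracy must be positive")
    return (unstructured_acc - structured_acc) * 100.0 / structured_acc


# Hypothetical accuracies in percent, NOT results from the paper.
structured = 50.0
unstructured = 60.0

# Absolute difference: percentage points between the two accuracies.
print(f"absolute: {unstructured - structured:.2f} points")  # absolute: 10.00 points

# Relative gain: the same difference expressed as a share of the baseline.
print(f"relative: {relative_gain(unstructured, structured):.2f}%")  # relative: 20.00%
```

With a lower structured baseline, the same point difference gives a larger relative gain, so relative figures are best read alongside the underlying accuracies.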

Takeaways, Limitations

Takeaways:
Experiments show that unstructured reasoning can outperform structured reasoning on complex problems.
On the MATH benchmark, unstructured plans achieve a relative gain of up to 18.90% over structured plans.
The zero-shot unstructured variant of iSelf-Discover outperforms the five-shot structured variant.
The optimal granularity of plan generation (instance-level vs. task-level) depends on the task and context.
The results suggest reconsidering the reliance on structured formats when designing complex systems.
Limitations:
Results may be limited to specific benchmarks and models.
Further research on different types of problems and models is needed.
Further analysis is needed on computational cost and reliability issues.
No clear guidance is given on how to choose the optimal granularity of plan generation.
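To make the two granularities concrete, here is a rough sketch of the difference between task-level and instance-level plan generation. It is not the authors' implementation: `CountingPlanner`, `task_level_plans`, and `instance_level_plans` are hypothetical names, and the planner is a stub standing in for an LLM call.

```python
from typing import Callable, Dict, List

# Hypothetical planner interface: takes a prompt, returns a reasoning plan as text.
Planner = Callable[[str], str]


class CountingPlanner:
    """Stub standing in for an LLM call; it only records how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return f"plan for: {prompt}"


def task_level_plans(task: str, instances: List[str], planner: Planner) -> Dict[str, str]:
    """Task-level: generate one plan per task and reuse it for every instance."""
    plan = planner(task)
    return {instance: plan for instance in instances}


def instance_level_plans(instances: List[str], planner: Planner) -> Dict[str, str]:
    """Instance-level: generate a separate plan for each individual problem."""
    return {instance: planner(instance) for instance in instances}


questions = ["q1", "q2", "q3"]

task_planner = CountingPlanner()
task_level_plans("solve math word problems", questions, task_planner)
print(task_planner.calls)  # 1

instance_planner = CountingPlanner()
instance_level_plans(questions, instance_planner)
print(instance_planner.calls)  # 3
```

The sketch only illustrates the cost side of the trade-off: instance-level planning needs one planner call per problem, while task-level planning amortizes a single plan across all of them. Which one yields better answers is the context-dependent question the summary describes.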