Sun, H., Hüyük, A., & van der Schaar, M. (2023). Query-dependent prompt evaluation and optimization with offline inverse RL. In The Twelfth International Conference on Learning Representations.
Identifies the overlooked query-dependent prompt optimization objective and its challenges, and introduces offline inverse reinforcement learning (Prompt-OIRL) as a systematic approach to integrating rich human expertise.

Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., ... & Du, M. (2024). The impact of reasoning step length on large language models. arXiv preprint arXiv:2401.04925.
Increasing the number of reasoning steps increases the effectiveness of CoT, but the effect depends on the complexity of the task.

Nayab, S., Rossolini, G., Buttazzo, G., Manes, N., & Giacomelli, F. (2024). Concise thoughts: Impact of output length on llm reasoning and cost. arXiv preprint arXiv:2407.19825.
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!