The fine-tuning performance of large language models (LLMs) depends heavily on the composition of the training data mixture, yet selecting that mixture remains a manual, heuristic-driven process. We therefore propose TASKPGM, a principled and scalable mixture optimization framework that selects continuous task proportions by minimizing an energy function defined over a Markov Random Field (MRF). TASKPGM models pairwise relationships between tasks using behavioral divergences, such as Jensen-Shannon Divergence and Pointwise Mutual Information, computed from the predictive distributions of single-task fine-tuned models. The method admits a closed-form solution under simplex constraints and provably balances representativeness and diversity across tasks. It delivers consistent empirical gains on evaluation suites such as MMLU and BIGBench with Llama 2 and Mistral, and comes with theoretical guarantees, including weak submodularity for budget-constrained variants. Beyond raw performance, TASKPGM yields interpretable insights into task influence and mixture composition, making it an effective tool for efficient and robust LLM fine-tuning.
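To make the described pipeline concrete, the following is a minimal NumPy sketch of one plausible instantiation. The quadratic energy form E(w) = -alpha^T w + 1/2 w^T S w, the exp(-JSD) similarity, the choice of alpha as average similarity to other tasks, and the projected-gradient solver are all illustrative assumptions made here, not the paper's exact formulation (which admits a closed-form solution under simplex constraints that this sketch does not reproduce).

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete predictive distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (Duchi et al., 2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1) / idx > 0)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0)

def optimize_mixture(S, alpha, lr=0.1, steps=500):
    """Minimize E(w) = -alpha^T w + 0.5 * w^T S w over the simplex via
    projected gradient descent. The unary term -alpha^T w rewards
    representative tasks; the pairwise term w^T S w penalizes piling
    weight onto mutually similar tasks, encouraging diversity."""
    n = len(alpha)
    w = np.full(n, 1.0 / n)  # start from the uniform mixture
    for _ in range(steps):
        grad = -alpha + S @ w  # gradient of the energy (S symmetric)
        w = project_to_simplex(w - lr * grad)
    return w

# Toy example: 3 tasks with hypothetical predictive distributions over 4 classes.
# Tasks 0 and 1 behave alike; task 2 is behaviorally distinct.
preds = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.6, 0.2, 0.1, 0.1],
                  [0.1, 0.1, 0.1, 0.7]])
n = len(preds)
S = np.array([[np.exp(-js_divergence(preds[i], preds[j])) for j in range(n)]
              for i in range(n)])   # similarity: high when tasks behave alike
alpha = S.mean(axis=1)              # representativeness: average similarity to all tasks
print(optimize_mixture(S, alpha))   # mixture weights summing to 1
```

On the toy inputs, the diversity term shifts weight toward the behaviorally distinct task rather than splitting it evenly between the two near-duplicate tasks, which is the representativeness-versus-diversity trade-off the abstract refers to.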