In this paper, we systematically study two distributed training optimization strategies, model parallelism and data parallelism, to address the computational and communication bottlenecks arising from the rapid adoption of large language models (LLMs) in recommender systems. For model parallelism, we implement tensor parallelism and pipeline parallelism, and introduce an adaptive load-balancing mechanism to reduce inter-device communication overhead. For data parallelism, we compare synchronous and asynchronous modes, and combine gradient compression and sparsification techniques with an efficient collective communication framework to substantially improve bandwidth utilization. Experiments on real-world recommendation datasets show that the proposed hybrid parallel approach improves training throughput by more than 30% and resource utilization by about 20% compared with existing single-mode parallel approaches, while maintaining strong scalability and stability. Finally, we discuss the trade-offs among different parallel strategies in online deployments and outline future directions, including heterogeneous hardware integration and automatic scheduling.
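To make the data-parallel side concrete, the following is a minimal illustrative sketch, not the paper's actual implementation, of top-k gradient sparsification combined with a collective exchange in PyTorch. The function names and the `compress_ratio` parameter are hypothetical; a real system would also handle error feedback, momentum correction, and overlap with computation.

```python
# Illustrative sketch only: top-k gradient sparsification ahead of a collective
# all_gather, assuming a torch.distributed process group has been initialized.
# `sparsify_gradient`, `exchange_sparse_gradients`, and `compress_ratio` are
# hypothetical names for this example, not identifiers from the paper.
import torch
import torch.distributed as dist


def sparsify_gradient(grad: torch.Tensor, compress_ratio: float = 0.01):
    """Keep only the largest-magnitude `compress_ratio` fraction of entries."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * compress_ratio))
    _, indices = torch.topk(flat.abs(), k)
    # Recover the signed values at the selected positions.
    return flat[indices], indices, grad.shape


def exchange_sparse_gradients(values: torch.Tensor, indices: torch.Tensor,
                              shape: torch.Size) -> torch.Tensor:
    """All-gather sparse (value, index) pairs and rebuild an averaged dense gradient."""
    world_size = dist.get_world_size()
    gathered_vals = [torch.empty_like(values) for _ in range(world_size)]
    gathered_idx = [torch.empty_like(indices) for _ in range(world_size)]
    dist.all_gather(gathered_vals, values)
    dist.all_gather(gathered_idx, indices)

    dense = torch.zeros(shape, device=values.device).flatten()
    for vals, idx in zip(gathered_vals, gathered_idx):
        dense.index_add_(0, idx, vals)
    return (dense / world_size).reshape(shape)


if __name__ == "__main__":
    # Local demonstration of the sparsification step (no process group required).
    grad = torch.randn(1024, 1024)
    values, indices, shape = sparsify_gradient(grad, compress_ratio=0.01)
    print(f"kept {values.numel()} of {grad.numel()} entries")
```

Under this scheme each worker transmits only about `compress_ratio` of its gradient entries per step, which is the bandwidth saving the abstract refers to; the exact compression ratio and communication primitive used in the paper may differ.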