This paper focuses on the practical deployment of Deep Reinforcement Learning (DRL), which has emerged as a powerful solution for meeting the increasing demands for connectivity, reliability, low latency, and operational efficiency in advanced networks. We present an orchestration framework that integrates ETSI MEC and Open RAN to facilitate the seamless adoption of DRL-based strategies across multiple time scales while also managing the agent life cycle. We identify three major challenges that hinder practical deployment: asynchronous requests caused by unpredictable or bursty traffic; limited adaptability and generalization to heterogeneous topologies and changing service requirements; and long convergence times and service interruptions caused by exploration in live operational environments. We propose three corresponding solutions: advanced time-series integration to handle asynchronous traffic; flexible architecture designs, such as multi-agent DRL and incremental learning, to support heterogeneous scenarios; and simulation-based deployment with transfer learning to reduce convergence time and service interruptions. Finally, we verify the feasibility of the MEC-O-RAN architecture on a city-wide test infrastructure and, through two real-world use cases, demonstrate the effectiveness of the proposed solutions against the three identified challenges.
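To make the simulation-to-deployment transfer concrete, the following is a minimal sketch of how a DRL policy pretrained in a simulator could initialize a live agent that is then fine-tuned online with a small learning rate, limiting exploration in the operational environment. The Policy class, its dimensions, and the learning rate are illustrative assumptions, not the paper's actual implementation.

import copy
import torch
import torch.nn as nn

# Hypothetical policy network for illustration; the paper does not
# specify an architecture, so this two-layer MLP is an assumption.
class Policy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# 1) Pretrain in simulation (training loop elided; any DRL algorithm applies).
sim_policy = Policy(obs_dim=8, act_dim=4)
# ... train sim_policy against a network simulator ...

# 2) Transfer: initialize the live agent from the simulation weights, then
#    fine-tune online with a small learning rate so exploration in the
#    live network stays limited, reducing service interruptions.
live_policy = copy.deepcopy(sim_policy)
optimizer = torch.optim.Adam(live_policy.parameters(), lr=1e-4)

Under these assumptions, the live agent starts from a near-converged policy rather than a random one, which is the mechanism by which transfer learning shortens convergence time in the deployed environment.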