This paper surveys the intersection of distributed intelligence and model optimization in Edge-Cloud Collaborative Computing (ECCC). ECCC integrates edge devices and cloud resources to enable efficient, low-latency processing, and has emerged as a key paradigm for meeting the computing demands of modern intelligent applications. The paper provides a structured tutorial on the underlying architecture, enabling technologies, and emerging applications. It systematically analyzes model optimization methods, including model compression, model adaptation, and neural architecture search, along with AI-based resource management strategies that balance performance, energy efficiency, and latency requirements. It also examines approaches to enhancing privacy and security within ECCC systems and reviews real-world deployments in applications such as autonomous driving, healthcare, and industrial automation. Performance analysis and benchmarking techniques are covered as a basis for establishing evaluation standards for these complex systems. Finally, the paper presents a roadmap for addressing the open challenges of heterogeneity management, real-time processing, and scalability, highlighting key research directions that include LLM deployment, 6G integration, neuromorphic computing, and quantum computing.