This paper proposes Driver-Net, a novel deep learning framework that provides an accurate and timely assessment of driver readiness to ensure safe control transfer in autonomous vehicles. Unlike conventional vision-based driver monitoring systems that focus on head pose or gaze, Driver-Net captures synchronized visual cues from three cameras covering the driver's head, hands, and body posture. It integrates spatio-temporal data through a dual-path architecture comprising context blocks and feature blocks, and employs a multi-modal fusion strategy to enhance prediction accuracy. Evaluation on a diverse dataset collected from the University of Leeds Driving Simulator demonstrates a maximum accuracy of 95.8% in driver readiness classification, a significant improvement over existing methods that underscores the value of multi-modal and multi-view fusion. As a real-time, non-invasive solution, Driver-Net contributes to the development of safer and more reliable autonomous vehicles, meeting emerging regulations and future safety standards.