This paper proposes a novel hybrid deep learning framework for accurately extracting key information from 2D engineering drawings. Conventional OCR techniques produce unstructured output on such drawings because of their complex layouts and overlapping symbols; to address this, we integrate an oriented bounding box (OBB) detection model with a transformer-based document parsing model (Donut). Using YOLOv11, we detect nine major categories (GD&T, general tolerances, dimensions, materials, annotations, radii, surface roughness, threads, and title blocks) and fine-tune Donut to generate structured JSON output. We compare two fine-tuning strategies: a single model trained on all categories and a separate model per category. Across all evaluation metrics, the single model performs better, achieving higher precision (94.77% for GD&T), recall (100% for most categories), and F1 score (97.3%), while also reducing hallucinations (a 5.23% hallucination rate). The proposed framework improves extraction accuracy, reduces manual effort, and supports scalable deployment in precision-driven industries.
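To make the two-stage pipeline concrete, the following is a minimal sketch of the detect-then-parse flow described above, assuming hypothetical checkpoint names (`drawing_obb.pt`, `my-org/donut-drawings`), an assumed Donut task-start token `<s_drawing>`, and illustrative category labels; none of these are the paper's released artifacts, and the actual fine-tuned models are those described in the method section.

```python
# Sketch of the hybrid pipeline: stage 1 localizes drawing regions with a
# YOLOv11 OBB model; stage 2 parses each cropped region into JSON with a
# fine-tuned Donut model. Checkpoint names, the task prompt, and the
# category list are placeholders for illustration only.
import re

import torch
from PIL import Image
from ultralytics import YOLO
from transformers import DonutProcessor, VisionEncoderDecoderModel

# Assumed class-index-to-category mapping; the real mapping comes from the
# detector's training configuration.
CATEGORIES = [
    "gdt", "general_tolerance", "dimension", "material", "annotation",
    "radius", "surface_roughness", "thread", "title_block",
]

detector = YOLO("drawing_obb.pt")  # hypothetical fine-tuned OBB weights
processor = DonutProcessor.from_pretrained("my-org/donut-drawings")  # hypothetical
parser = VisionEncoderDecoderModel.from_pretrained("my-org/donut-drawings")

def extract(image_path: str) -> list[dict]:
    """Detect drawing regions, then parse each crop into structured fields."""
    image = Image.open(image_path).convert("RGB")
    records = []
    result = detector(image)[0]
    # OBB results expose axis-aligned enclosing boxes via .xyxy, which is
    # sufficient for cropping each detected region before parsing.
    for box, cls in zip(result.obb.xyxy, result.obb.cls):
        x1, y1, x2, y2 = box.int().tolist()
        crop = image.crop((x1, y1, x2, y2))
        pixel_values = processor(crop, return_tensors="pt").pixel_values
        task_prompt = "<s_drawing>"  # assumed task-start token from fine-tuning
        decoder_input_ids = processor.tokenizer(
            task_prompt, add_special_tokens=False, return_tensors="pt"
        ).input_ids
        with torch.no_grad():
            outputs = parser.generate(
                pixel_values,
                decoder_input_ids=decoder_input_ids,
                max_length=512,
                pad_token_id=processor.tokenizer.pad_token_id,
                eos_token_id=processor.tokenizer.eos_token_id,
            )
        sequence = processor.batch_decode(outputs)[0]
        # Strip special tokens and the leading task token before converting
        # the token sequence into a JSON-like dict.
        sequence = sequence.replace(processor.tokenizer.eos_token, "")
        sequence = sequence.replace(processor.tokenizer.pad_token, "")
        sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
        records.append({
            "category": CATEGORIES[int(cls)],
            "fields": processor.token2json(sequence),
        })
    return records
```

In the single-model strategy this one Donut checkpoint serves all nine categories; the category-specific strategy would instead select one fine-tuned checkpoint per detected class.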