This paper presents the design and implementation of a pipeline for extracting tabular data from scanned invoices using Optical Character Recognition (OCR). Tesseract OCR recognizes the text, and custom postprocessing logic then detects, aligns, and extracts the structured tabular data. The method combines dynamic preprocessing, table boundary detection, and row-to-column mapping, each tuned for noisy and non-standard invoice formats. The resulting pipeline improves the accuracy and consistency of data extraction, supporting real-world use cases such as automated financial workflows and digital archiving.
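
To make the row-to-column mapping step concrete, the following is a minimal sketch and not the pipeline's actual postprocessing logic. It assumes each recognized word arrives as a dictionary with `text`, `left`, `top`, `width`, and `height` fields (the kind of per-word bounding box that Tesseract reports), that the first detected row is the table header, and that each header cell is a single word. The function names `group_rows` and `map_to_columns` and the half-median-height row tolerance are hypothetical choices made for illustration.

```python
from statistics import median


def group_rows(words, y_tol=None):
    """Cluster word boxes into rows by comparing vertical centers.

    Assumption: words on the same table row have vertical centers within
    y_tol pixels of the previous word in that row (default: half the
    median word height).
    """
    words = [w for w in words if w["text"].strip()]
    if not words:
        return []
    if y_tol is None:
        y_tol = 0.5 * median(w["height"] for w in words)

    def center_y(w):
        return w["top"] + w["height"] / 2

    words = sorted(words, key=center_y)
    rows, current = [], [words[0]]
    for w in words[1:]:
        if abs(center_y(w) - center_y(current[-1])) <= y_tol:
            current.append(w)
        else:
            rows.append(current)
            current = [w]
    rows.append(current)
    # Within each row, read words left to right.
    return [sorted(r, key=lambda w: w["left"]) for r in rows]


def map_to_columns(rows):
    """Treat the first row as the header and assign every later word to
    the header cell whose horizontal center is nearest."""
    header, body = rows[0], rows[1:]
    names = [w["text"] for w in header]
    centers = [w["left"] + w["width"] / 2 for w in header]
    table = []
    for row in body:
        record = {name: "" for name in names}
        for w in row:
            cx = w["left"] + w["width"] / 2
            idx = min(range(len(centers)), key=lambda i: abs(centers[i] - cx))
            record[names[idx]] = (record[names[idx]] + " " + w["text"]).strip()
        table.append(record)
    return table


# Hand-made word boxes with slight vertical jitter, standing in for OCR output.
words = [
    {"text": "Item", "left": 10, "top": 10, "width": 40, "height": 12},
    {"text": "Qty", "left": 200, "top": 11, "width": 30, "height": 12},
    {"text": "Price", "left": 300, "top": 9, "width": 50, "height": 12},
    {"text": "Blue", "left": 8, "top": 40, "width": 35, "height": 12},
    {"text": "Widget", "left": 48, "top": 41, "width": 50, "height": 12},
    {"text": "2", "left": 210, "top": 42, "width": 8, "height": 12},
    {"text": "9.50", "left": 305, "top": 40, "width": 36, "height": 12},
]

table = map_to_columns(group_rows(words))
print(table)  # [{'Item': 'Blue Widget', 'Qty': '2', 'Price': '9.50'}]
```

Nearest-header assignment tolerates small horizontal drift between rows, but it would misplace cells in tables with merged or multi-word headers; a production pipeline would likely need explicit column boundaries for those cases.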