Hespi (HErbarium Specimen sheet PIpeline) is a pipeline that uses computer vision to extract pre-catalog data from the primary specimen labels of herbarium specimens. It integrates two object detection models that detect the components and fields of specimen labels, classifies each label by type (printed, typed, handwritten, or mixed), and extracts text using optical character recognition (OCR) and handwritten text recognition (HTR). The extracted text is proofread against an authoritative taxonomic database and then refined using a multimodal large language model (LLM). Hespi accurately detects and extracts text from international herbarium specimen sheets, and its modular design allows custom models to be trained and integrated.
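The proofreading step can be illustrated with a minimal sketch, assuming that recognized text is compared with a list of accepted names by approximate string matching. This is not Hespi's own implementation: the function `proofread`, the list `REFERENCE_GENERA`, and the similarity cutoff of 0.8 are hypothetical and chosen only for illustration, and the matching uses `difflib` from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical reference list; a real pipeline would load accepted names
# from an authoritative taxonomic database.
REFERENCE_GENERA = ["Eucalyptus", "Acacia", "Banksia", "Grevillea"]


def proofread(text, reference, cutoff=0.8):
    """Return (name, changed): the closest reference name, if one is similar enough.

    The cutoff is an assumed similarity threshold between 0 and 1.
    Text with no sufficiently similar reference name is returned unchanged.
    """
    if text in reference:
        # Exact match: nothing to correct.
        return text, False
    matches = get_close_matches(text, reference, n=1, cutoff=cutoff)
    if matches:
        # Replace the recognized text with the best-matching reference name.
        return matches[0], True
    return text, False


if __name__ == "__main__":
    # A typical recognition error: "u" misread as "v".
    print(proofread("Eucalyptvs", REFERENCE_GENERA))
```

Returning a flag alongside the name lets a later stage, or a human reviewer, distinguish text that matched the reference exactly from text that was altered during proofreading.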