Accurately enumerating facility types and their spatial distribution is a critical step in building code compliance inspection, yet it has been largely overlooked in the literature. To reduce the time and labor this manual task requires, we propose a novel method that combines the visual recognition and reasoning capabilities of a large language model (LLM). We further improve performance through a statement-based inference pipeline. Experiments on real and synthetic floor plan data demonstrate the effectiveness and robustness of the proposed method.