This paper addresses Embodied Question Answering (EQA), a critical yet challenging task for robotic assistants. Existing approaches either treat the task as static video question answering or restrict answers to a closed set of choices, hindering practical application. To overcome these limitations, we present EfficientEQA, a novel framework that combines efficient exploration with free-form answer generation. EfficientEQA features three key innovations: (1) efficient exploration via Semantic-Value-Weighted Frontier Exploration (SFE), which prioritizes frontiers using Verbalized Confidence (VC) scores from a black-box VLM; (2) a BLIP-based early-stopping mechanism that ends exploration once highly relevant observations are flagged as outliers; and (3) a Retrieval-Augmented Generation (RAG) method that answers questions from the most relevant images in the agent's observation history without relying on predefined choices. Experimental results show that EfficientEQA achieves over 15% higher accuracy than state-of-the-art methods while requiring over 20% fewer exploration steps.