This paper studies the problem of active sequential hypothesis testing (AST), where a learner adaptively gathers data from the environment to efficiently determine a fundamentally correct hypothesis for a new task. We present examples of the Best-Arm Identification (BAI) task in the multi-loss bandit problem and the generalized search problem. We introduce In-Context Pure Exploration (ICPE), which meta-trains a Transformer to map observation history to query behavior and predicted hypotheses. ICPE actively gathers evidence for a new task and infers the true hypothesis without parameter updates at inference time. On deterministic, probabilistic, and structured benchmarks, including BAI and generalized search, ICPE outperforms adaptive baselines without explicitly modeling the information structure.