Existing offline reinforcement learning (RL) methods mainly operate in batch-constrained settings, which confine the algorithm to the state-action distribution present in the dataset; this reduces the impact of distributional shift but restricts the policy to the observed actions. In this paper, we alleviate these limitations by introducing state-constrained offline RL, a novel framework that constrains only the state distribution of the dataset. This approach allows the policy to take high-quality actions outside the dataset's action distribution, provided they lead to states within the dataset's state distribution, greatly enhancing learning potential. The proposed setting not only broadens the learning horizon but also improves the ability to combine different trajectories from the dataset effectively, a uniquely desirable property of offline RL. This study is grounded in theoretical findings that pave the way for further progress in this area. In addition, we introduce StaCQ, a deep learning algorithm that achieves state-of-the-art performance on the D4RL benchmark datasets and is closely aligned with our theoretical propositions. StaCQ establishes a strong baseline for future exploration in state-constrained offline RL.
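To make the contrast concrete, the following is only an illustrative sketch: the notation assumes a dataset $\mathcal{D}$ with state-action support $\mathrm{supp}(\mathcal{D})$ and state support $\mathrm{supp}_{S}(\mathcal{D})$, together with a deterministic transition function $f$; the formal definitions used in this work may differ.
\[
\text{batch-constrained:}\; \pi(a \mid s) > 0 \Rightarrow (s,a) \in \mathrm{supp}(\mathcal{D}),
\qquad
\text{state-constrained:}\; \pi(a \mid s) > 0 \Rightarrow f(s,a) \in \mathrm{supp}_{S}(\mathcal{D}).
\]
Under this reading, a state-constrained policy may select actions never observed in the dataset, so long as the resulting successor states remain within the dataset's state support.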