This paper provides insights into how selective state-space models (SSMs) work by analyzing their expressive power and length-generalization performance on regular language tasks, i.e., finite-state automaton (FSA) emulation. To overcome the limitations of existing SSM-based architectures, we present selective dense state-space models (SD-SSMs), the first selective SSMs to exhibit perfect length generalization on a variety of regular language tasks using a single layer. SD-SSMs employ a dictionary of dense transition matrices, a softmax selection mechanism that forms a convex combination of the dictionary matrices at each time step, and a readout consisting of layer normalization followed by a linear map. We further evaluate variants of diagonal selective SSMs by considering their empirical performance on commutative and non-commutative automata, and we explain the experimental results through theoretical considerations.
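To make the recurrence described above concrete, the following is a minimal NumPy sketch of a single SD-SSM layer as the abstract describes it: a dictionary of dense transition matrices, a softmax selection that yields a convex combination per time step, and a layer-norm-plus-linear readout. All dimensions, parameter names, and the random initialization are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): state dim n, input dim d,
# dictionary size K, sequence length T.
n, d, K, T = 4, 3, 5, 10

# Dictionary of K dense transition matrices (learned parameters in practice).
A_dict = rng.standard_normal((K, n, n)) * 0.1
W_sel = rng.standard_normal((K, d))   # selection weights (assumed parametrization)
B = rng.standard_normal((n, d))       # input projection
W_out = rng.standard_normal((1, n))   # linear readout map

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sd_ssm_step(h, x):
    # Softmax over the dictionary gives convex weights alpha_t (>= 0, sum to 1).
    alpha = softmax(W_sel @ x)
    # Convex combination of dense matrices -> input-dependent transition A_t.
    A_t = np.tensordot(alpha, A_dict, axes=1)
    return A_t @ h + B @ x

def layer_norm(h, eps=1e-5):
    return (h - h.mean()) / np.sqrt(h.var() + eps)

# Run the recurrence over a random input sequence, then apply the readout.
h = np.zeros(n)
xs = rng.standard_normal((T, d))
for x in xs:
    h = sd_ssm_step(h, x)
y = W_out @ layer_norm(h)  # readout: layer normalization followed by a linear map
```

Because the selection weights are a softmax output, the effective transition matrix at each step stays inside the convex hull of the dictionary, while still being dense (unlike diagonal selective SSMs, whose transitions commute and thus cannot emulate non-commutative automata with a single layer).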