This paper provides theoretical support for the claim that language models can perform regression by decoding numeric predictions as digit strings, and explores the use of a causal sequence decoding model as a digit-based regression head over various feature representations. Despite being trained with the standard next-token cross-entropy objective, the decoder-based head matches a conventional pointwise head on standard regression tasks, while retaining the flexibility to capture smooth numeric distributions, enabling applications such as density estimation.
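To make the digit-string representation concrete, the following is a minimal, self-contained sketch of how a real-valued target could be encoded as a fixed-length token sequence (a sign token followed by digits) and decoded back. The function names, fixed-point scheme, and digit budget are illustrative assumptions, not the paper's exact tokenization.

```python
# Illustrative digit-string codec for decoding-based regression.
# Assumed scheme: a sign token, then int_digits + frac_digits digit tokens,
# i.e. fixed-point with frac_digits places after the decimal point.

def encode_number(x: float, int_digits: int = 3, frac_digits: int = 2) -> list[str]:
    """Encode x as a fixed-length token sequence: sign token plus digits."""
    sign = "+" if x >= 0 else "-"
    scaled = round(abs(x) * 10 ** frac_digits)       # fixed-point integer
    digits = str(scaled).zfill(int_digits + frac_digits)
    assert len(digits) == int_digits + frac_digits, "x outside representable range"
    return [sign] + list(digits)

def decode_tokens(tokens: list[str], frac_digits: int = 2) -> float:
    """Invert encode_number: map a token sequence back to a float."""
    sign = 1.0 if tokens[0] == "+" else -1.0
    return sign * int("".join(tokens[1:])) / 10 ** frac_digits

tokens = encode_number(-3.14)   # ['-', '0', '0', '3', '1', '4']
value = decode_tokens(tokens)   # -3.14
```

A decoder head would emit a softmax distribution over the digit vocabulary at each position; sampling or taking expectations over those per-digit distributions is what yields smooth distributions over numeric outputs rather than a single point estimate.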