Decoder-only language models, such as GPT and LLaMA, typically decode text only at the final layer. Inspired by hierarchical human reasoning, this study proposes a hierarchical decoder architecture that decodes text simultaneously at different layers. To adapt a pretrained language model to this hierarchical decoder configuration, we copy the language-modeling head from the last layer to selected intermediate layers and fine-tune these heads with different task inputs. Experiments demonstrate that the selected intermediate layers can generate meaningful and reasonable content, and that the resulting hierarchical decoder language model (HdLM) achieves state-of-the-art performance on multiple tasks, including hierarchical text classification, classification-based generation, and hierarchical text generation. HdLM outperforms all baselines on WoS, DBpedia, ESConv, EmpatheticDialogues, and several cognitive tests. Furthermore, we provide a thorough theoretical analysis of the method's convergence and computational savings. These results point to the potential of a generalized hierarchical reasoning language model trained from scratch.
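To make the architectural idea concrete, the following is a minimal sketch, not the authors' implementation, of copying a final LM head to selected intermediate layers of a decoder-only model and decoding from each layer's hidden states; the choice of `gpt2` as the checkpoint and of layers 6 and 9 as the intermediate decoding points is purely illustrative.

```python
# Minimal sketch: attach copied LM heads to selected intermediate layers
# of a decoder-only transformer (assumed setup, not the paper's code).
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any decoder-only checkpoint could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

# Hypothetical choice of intermediate layers that receive their own decoding head.
selected_layers = [6, 9]

# Copy the final LM head once per selected layer; in the hierarchical setup,
# each copy would then be fine-tuned on its own task input.
intermediate_heads = torch.nn.ModuleDict({
    str(layer): copy.deepcopy(model.lm_head) for layer in selected_layers
})

inputs = tokenizer("Hierarchical decoding example", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states[0] is the embedding output, so layer i lives at index i.
for layer, head in intermediate_heads.items():
    hidden = outputs.hidden_states[int(layer)]
    logits = head(hidden)                      # per-layer vocabulary distribution
    next_token = int(logits[0, -1].argmax())   # greedy pick at this layer
    print(f"layer {layer}: {tokenizer.decode([next_token])}")
```

Before fine-tuning, the intermediate heads simply mirror the final head's predictions on earlier representations; the hierarchical behavior described above would emerge from training each head on its designated task.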