This paper proposes model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing model size without fine-tuning or access to training data. Unlike existing methods, it leverages k-means clustering and a novel data-free technique to prevent variance collapse or explosion, thereby preserving data statistics during compression. Through a theoretical framework and experiments on standard benchmarks, including ResNet18 and LLaMA-7B, we demonstrate that model folding matches the performance of data-driven compression techniques and outperforms recently proposed data-free methods, particularly at high sparsity levels. The method is especially effective for large-scale model compression, making it suitable for deployment in resource-constrained environments.
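To make the cluster-and-merge idea concrete, the sketch below illustrates one plausible folding step on a pair of fully connected layers: rows of the first layer's weight matrix (its output neurons) are grouped with k-means, each group is replaced by its centroid, and the corresponding incoming weights of the next layer are summed so the downstream pre-activations are approximately preserved. This is a minimal illustration under assumed names (`fold_layer_pair`, cluster count `k`), not the paper's implementation; in particular, the data-free variance-correction step the paper describes is omitted here.

```python
# Illustrative sketch of folding two consecutive fully connected layers by
# clustering the output neurons of the first layer. Assumption-based example,
# not the authors' reference implementation; the paper's data-free
# variance-correction step is omitted.
import numpy as np
from sklearn.cluster import KMeans

def fold_layer_pair(W1, b1, W2, k):
    """Cluster the rows of W1 (output neurons) into k groups, merge each
    group into its centroid, and aggregate the matching columns of W2."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(W1)

    # Merged first layer: centroid weights and biases per cluster.
    W1_folded = np.stack([W1[labels == c].mean(axis=0) for c in range(k)])
    b1_folded = np.array([b1[labels == c].mean() for c in range(k)])

    # Next layer: sum incoming weights of merged neurons so downstream
    # pre-activations are approximately preserved.
    W2_folded = np.stack([W2[:, labels == c].sum(axis=1) for c in range(k)], axis=1)
    return W1_folded, b1_folded, W2_folded

# Toy usage: fold a 64-neuron hidden layer down to 32 merged neurons.
rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(64, 128)), rng.normal(size=64), rng.normal(size=(10, 64))
W1_f, b1_f, W2_f = fold_layer_pair(W1, b1, W2, k=32)
print(W1_f.shape, b1_f.shape, W2_f.shape)  # (32, 128) (32,) (10, 32)
```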