This paper presents SLIM, a novel one-shot compression framework for large language models (LLMs) that addresses their high memory consumption and slow inference. While existing model compression techniques require computationally expensive retraining to maintain accuracy, SLIM reduces model size without retraining while preserving accuracy. SLIM integrates hardware-friendly quantization, sparsity, and low-rank approximation into a single framework. Its key components are a probabilistic quantization method (SLIM-Quant), semi-structured sparsity obtained with conventional one-shot pruning, and a low-rank adapter, computed with a novel importance function, that compensates for quantization and sparsity errors. Experimental results show that SLIM achieves up to 5.66% higher accuracy than existing methods, shrinks the memory footprint to as little as 0.23x that of the dense model, and delivers up to 4.3x and 3.8x speedups on Nvidia RTX3060 and A100 GPUs, respectively. We further show that accuracy can be improved through an optional Parameter-Efficient Fine-Tuning (PEFT) recipe.
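
To make the pipeline concrete, the sketch below is a minimal, illustrative NumPy reconstruction of the idea rather than the paper's algorithm: it uses plain magnitude-based 2:4 pruning in place of the one-shot pruner, a naive symmetric uniform quantizer in place of SLIM-Quant, and an unweighted truncated SVD in place of the saliency-based low-rank adapter. All function names and parameters here are assumptions for illustration only.

```python
import numpy as np

def magnitude_prune_2_4(w):
    """Illustrative 2:4 semi-structured pruning: keep the 2 largest-magnitude
    entries in every group of 4 along the last axis."""
    rows, cols = w.shape
    groups = w.reshape(rows, cols // 4, 4)
    # Zero out the 2 smallest-magnitude entries in each group of 4.
    idx = np.argsort(np.abs(groups), axis=-1)[..., :2]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, idx, 0.0, axis=-1)
    return (groups * mask).reshape(rows, cols)

def quantize_symmetric(w, bits=4):
    """Naive symmetric uniform quantization (stand-in for SLIM-Quant)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale).clip(-qmax, qmax) * scale

def low_rank_adapter(err, rank=32):
    """Truncated SVD of the compression error; the paper instead uses a
    saliency-weighted objective, which is omitted here."""
    u, s, vt = np.linalg.svd(err, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]

# Toy example: compress one weight matrix and compensate the error.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512)).astype(np.float32)
w_c = quantize_symmetric(magnitude_prune_2_4(w), bits=4)  # sparse + quantized weight
L, R = low_rank_adapter(w - w_c, rank=32)                  # low-rank error compensation
print("error without adapter:", np.linalg.norm(w - w_c))
print("error with adapter:   ", np.linalg.norm(w - (w_c + L @ R)))
```

The low-rank factors play the same structural role as a PEFT-style adapter: at inference the compressed weight and the small correction term are applied together, so the compensation adds only a modest memory and compute overhead on top of the quantized, sparse weight.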