This paper proposes a low-rank optimization method that restricts learning to a low-dimensional space, improving the running time of large language model (LLM) training and reducing the memory footprint of the adaptive optimizer. Previous studies have projected the gradients of linear layers using the singular value decomposition (SVD) or QR decomposition. However, applying these decompositions individually to each layer is computationally expensive, and storing the projection matrices incurs additional memory costs. In this study, we propose a computationally efficient and simple two-step procedure that approximates SVD/QR-based gradient projection onto a low-dimensional space using the predefined orthogonal matrix of the discrete cosine transform (DCT). For each layer, the projection matrix is obtained through a single matmul with the DCT matrix in O(n³) time, followed by a lightweight alignment step that dynamically selects the DCT columns best aligned with that layer's gradient. For large layers, the DCT can instead be computed in O(n² log n) time using Makhoul's N-point algorithm based on the fast Fourier transform (FFT). Because the orthogonal basis is predefined, it is computed only once at the start of training. Experimental results on pretraining and fine-tuning tasks demonstrate that the proposed method performs comparably to the more expensive SVD/QR-based methods while offering rank-independent running times, up to 25% faster execution, and reduced memory usage across a range of model sizes.
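A minimal sketch of the two-step construction described above, assuming an n × m layer gradient and a rank-r subspace; the function names, the NumPy/SciPy usage, and the coefficient-energy alignment criterion are illustrative assumptions rather than the authors' implementation:

```python
import numpy as np
from scipy.fft import dct


def dct_basis(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix D with D @ x == dct(x).

    The basis is predefined, so it is computed once at the start of
    training and reused for every layer and every optimization step.
    """
    return dct(np.eye(n), type=2, norm="ortho", axis=0)


def dct_projector(grad: np.ndarray, D: np.ndarray, rank: int) -> np.ndarray:
    """Two-step construction of an n x rank projection matrix for one layer.

    Step 1: a single matmul with the DCT matrix (O(n^2 m), i.e. O(n^3) for
            square layers).  For large layers, the same coefficients can be
            obtained without forming D via `dct(grad, axis=0, norm="ortho")`
            in O(n m log n) time (FFT-based, Makhoul's algorithm).
    Step 2: lightweight alignment -- keep the `rank` DCT basis vectors that
            carry the most gradient energy (an assumed, illustrative
            selection criterion).
    """
    coeffs = D @ grad                        # step 1: DCT coefficients of the gradient
    energy = np.linalg.norm(coeffs, axis=1)  # step 2: score each basis vector
    top = np.argsort(energy)[-rank:]         # indices of best-aligned basis vectors
    return D[top, :].T                       # n x rank, orthonormal columns


# Usage sketch: compress the gradient for the adaptive optimizer and map back.
n, m, r = 1024, 4096, 128
G = np.random.randn(n, m).astype(np.float32)   # stand-in for a layer gradient
D = dct_basis(n)                               # computed once, reused thereafter
P = dct_projector(G, D, r)
G_low = P.T @ G    # r x m low-rank gradient fed to the optimizer states
G_full = P @ G_low # project the resulting update back to the full space
```

The sketch only illustrates how a fixed orthogonal basis replaces a per-layer SVD/QR: the expensive decomposition is reduced to one matmul (or an FFT-based DCT) plus a column-selection pass, and no learned projection matrix needs to be stored beyond the selected column indices.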