This paper addresses the safety of large language models (LLMs), particularly the risk of answering socially harmful questions. We experimentally demonstrate that, despite prior safety-alignment efforts, aligned models can be compromised by additional fine-tuning. We reveal that this vulnerability stems from the sensitivity of the safety-related low-rank subspace in the LLM parameters to fine-tuning, and, based on this insight, we propose a novel training-free method, Low-Rank Extrapolation (LoX). LoX improves safety robustness by extrapolating the safety subspace of aligned LLMs. Experimental results show that LoX significantly improves robustness against harmful or malicious fine-tuning attacks while preserving the model's adaptability to new tasks; for example, LoX reduces the attack success rate (ASR) of such attacks by 11% to 54%. By examining the ASR landscape over the parameter space, we attribute the success of LoX to extrapolation moving the LLM parameters into a flatter region, making them less sensitive to perturbations. The code is available at github.com/VITA-Group/LoX.
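To make the core idea concrete, the following is a minimal sketch, not the released implementation, of what extrapolating a safety subspace could look like for a single weight matrix. It assumes the "safety subspace" is the top-k singular subspace of the alignment update (aligned minus base weights) and that extrapolation means amplifying that component by a factor alpha; the rank k, the factor alpha, and the function name are illustrative assumptions.

```python
import torch

def lox_extrapolate(w_aligned: torch.Tensor,
                    w_base: torch.Tensor,
                    k: int = 8,
                    alpha: float = 0.5) -> torch.Tensor:
    """Hypothetical sketch: extrapolate one weight matrix along its safety subspace."""
    # Alignment update introduced by safety alignment.
    delta = w_aligned - w_base
    # Top-k singular subspace of the update, taken here as the safety subspace.
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    delta_k = u[:, :k] @ torch.diag(s[:k]) @ vh[:k, :]
    # Extrapolate: push the aligned weights further along that subspace.
    return w_aligned + alpha * delta_k
```

In this sketch, the training-free nature of the method shows up as a single closed-form weight edit per matrix; no gradient updates or additional data are involved.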