This paper proposes MemoryTalker, which synthesizes realistic and accurate 3D facial motion that reflects a speaker's speech tone from audio input alone. Previous studies have limited practicality because they require prior information, such as a speaker class label or an additional 3D face mesh, to adequately reflect speech tone. MemoryTalker is trained in two stages: Memorizing, which stores and retrieves common facial motions in a motion memory, and Animating, which synthesizes personalized facial motion by stylizing the motion memory with audio-derived speech characteristics. In this second stage, the model learns which facial motions to emphasize for a given audio input. Consequently, MemoryTalker can generate reliable personalized facial animation without any additional prior information. Through quantitative and qualitative evaluations and user studies, we demonstrate the model's effectiveness and its performance improvement over existing state-of-the-art methods for personalized facial animation.
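The memory-and-retrieval idea described above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: it models a motion memory as learnable key/value slots that are soft-addressed by an audio-derived query, with the weighted sum of value slots serving as the retrieved motion feature. All names (`keys`, `values`, `retrieve`) and the slot/feature sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical motion memory: M key/value slots (in the paper these
# would be learned; here they are random for illustration).
rng = np.random.default_rng(0)
M, d = 8, 16                        # number of memory slots, feature dim
keys = rng.standard_normal((M, d))   # addressing keys
values = rng.standard_normal((M, d)) # stored motion features

def retrieve(query):
    """Soft-address the memory with an audio-derived query of shape (d,).

    Attention weights over slots select which stored motion features
    to emphasize; their weighted sum is the retrieved motion feature.
    """
    weights = softmax(keys @ query / np.sqrt(d))  # (M,) weights, sum to 1
    return weights @ values                        # (d,) motion feature

audio_query = rng.standard_normal(d)   # stand-in for an audio feature
motion_feature = retrieve(audio_query)
print(motion_feature.shape)  # (16,)
```

In this reading, the Animating stage would learn to shape the query (and reweight the memory) from speech-style cues so that speaker-specific motions receive higher attention weights.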