This paper presents FaceEditTalker, a framework that integrates facial attribute editing into audio-driven talking head generation. Unlike previous studies, which focus on lip synchronization and emotional expression, FaceEditTalker flexibly adjusts visual attributes such as hairstyle, accessories, and fine facial features, broadening its applicability to personalized digital avatars, online educational content, and brand-specific digital customer service. The framework consists of two components: an image feature space editing module that extracts semantic and detail features and enables controllable editing of facial attributes, and an audio-driven video generation module that fuses the edited features with audio-guided facial landmarks to condition a diffusion-based generator. Experimental results demonstrate that FaceEditTalker achieves comparable or superior performance to existing methods in lip synchronization accuracy, video quality, and attribute controllability.
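The two-stage pipeline described above can be summarized as a dataflow sketch. The following is a minimal, hypothetical illustration of how the modules compose; all function names and data shapes are assumptions for exposition, not the paper's actual API or model internals.

```python
# Hypothetical dataflow sketch of the FaceEditTalker pipeline.
# All names and structures are illustrative assumptions.

def extract_features(image):
    # Image feature space editing module, step 1:
    # split the portrait into coarse semantic features and fine detail features.
    return {"semantic": dict(image["identity"]), "detail": image["texture"]}

def edit_attributes(features, edits):
    # Image feature space editing module, step 2:
    # apply user-specified attribute edits (e.g. hairstyle) in feature space,
    # leaving detail features untouched to preserve identity.
    edited = dict(features)
    edited["semantic"] = {**features["semantic"], **edits}
    return edited

def audio_to_landmarks(audio):
    # Audio-driven module, step 1: predict one facial-landmark set per
    # audio frame (placeholder for a learned audio-to-motion model).
    return [f"landmarks@{t}" for t in range(audio["n_frames"])]

def generate_video(features, landmarks):
    # Audio-driven module, step 2: a diffusion-based generator conditioned
    # on the edited appearance features and the per-frame landmarks
    # (represented here as plain dicts rather than rendered frames).
    return [{"motion": lm, "appearance": features["semantic"]}
            for lm in landmarks]

def face_edit_talker(image, audio, edits):
    # Full pipeline: edit appearance in feature space, then drive the
    # generator with audio-derived landmarks.
    feats = edit_attributes(extract_features(image), edits)
    return generate_video(feats, audio_to_landmarks(audio))
```

For example, editing the hairstyle attribute before generation yields frames whose appearance reflects the edit while motion follows the audio: `face_edit_talker({"identity": {"hair": "short"}, "texture": "tex"}, {"n_frames": 3}, {"hair": "long"})` returns three frames, each carrying `{"hair": "long"}` as its appearance.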