This paper addresses text-based virtual fitting, building on recent virtual fitting approaches that harness the powerful generative capabilities of pre-trained text-to-image diffusion models through fine-tuning. Specifically, we focus on the text-editable virtual fitting task, which changes the clothing based on a provided clothing image and edits the wearing style (e.g., tuck-in style, fit) based on text descriptions. To achieve this, we tackle three key challenges: (i) designing rich text descriptions of paired person-clothing data for model training; (ii) resolving conflicts in which textual information about the person's existing clothing interferes with the generation of the new clothing; and (iii) adaptively adjusting the inpainting mask according to the text description so that the editing area is appropriate while the parts of the person's appearance unrelated to the new clothing are preserved. To address these challenges, we propose PromptDresser, a text-editable virtual fitting model that leverages large multimodal models (LMMs) to enable high-quality, versatile manipulation driven by text prompts. PromptDresser uses LMMs via in-context learning to generate detailed text descriptions of person and clothing images, covering fine-grained details and editing attributes, with minimal human intervention. In addition, the inpainting mask is adaptively adjusted based on the text prompt to secure an appropriate editing area. Experimental results show that PromptDresser outperforms existing methods, demonstrating superior text-based control and diverse garment manipulation.
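To make the in-context learning step concrete, the sketch below illustrates how few-shot image-description pairs could be assembled into a single prompt that elicits structured clothing and wearing-style descriptions from a multimodal model. This is not the paper's implementation: the function `query_lmm`, the example file names, and the attribute keys (`upper_cloth`, `tuck_style`, `fit`, `sleeve`) are hypothetical placeholders chosen for illustration.

```python
# Illustrative sketch only: assembling an in-context prompt for an LMM to
# produce structured clothing / wearing-style descriptions with minimal
# human intervention. All names below are hypothetical placeholders.

IN_CONTEXT_EXAMPLES = [
    {
        "image": "example_person_01.jpg",  # hypothetical annotated example
        "description": {
            "upper_cloth": "white cotton shirt with button-down collar",
            "tuck_style": "fully tucked in",
            "fit": "regular fit",
            "sleeve": "long sleeves rolled to the elbow",
        },
    },
]

def build_prompt(target_image_path: str) -> str:
    """Concatenate few-shot examples and the query image reference into one prompt."""
    lines = [
        "Describe the clothing and wearing style of the person in each image "
        "as a JSON object with keys: upper_cloth, tuck_style, fit, sleeve."
    ]
    for ex in IN_CONTEXT_EXAMPLES:
        lines.append(f"Image: {ex['image']}")
        lines.append(f"Answer: {ex['description']}")
    # The target image is appended last so the model completes its description.
    lines.append(f"Image: {target_image_path}")
    lines.append("Answer:")
    return "\n".join(lines)

# description = query_lmm(build_prompt("target_person.jpg"))  # hypothetical LMM call
```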