This paper examines a central challenge of dermatology consultations in remote settings: diagnoses must be made from limited information, typically images and brief descriptions. To address this, we propose a medical AI system that mimics clinical reasoning. We compared seven vision-language models across six configurations, including a baseline model, a fine-tuned model, a model with an additional inference layer, and a model with added medical literature search capabilities. Fine-tuning decreased performance, whereas the architecture that mimics clinical reasoning achieved up to 70% accuracy and generated explainable, literature-based output, a crucial element for clinical application. These results indicate that medical AI can succeed by reimagining the collaborative, evidence-based practice of clinical diagnosis.