This paper proposes CoQuIR, a large-scale, multilingual benchmark for evaluating the quality awareness of code retrieval models, a capability essential for improving code reuse and debugging speed in software development. Unlike existing benchmarks that focus solely on functional relevance, CoQuIR provides fine-grained quality annotations for 42,725 queries and 134,907 code snippets across 11 programming languages, covering four core dimensions: accuracy, efficiency, security, and maintainability. Using two quality-focused evaluation metrics, Pairwise Preference Accuracy and Margin-based Ranking Score, we benchmark 23 retrieval models and find that even the best-performing models struggle to distinguish buggy or unsafe code from more robust alternatives. Furthermore, we conduct a preliminary investigation into training methods that explicitly encourage quality awareness, showing that fine-tuning on synthetic datasets improves quality-awareness metrics across various models. We then validate the effectiveness of our approach through downstream code generation experiments. In conclusion, this study highlights the importance of integrating quality signals into code search systems, laying the foundation for more reliable and robust software development tools.
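To make the two quality-focused metrics concrete, the sketch below shows one plausible interpretation of Pairwise Preference Accuracy and Margin-based Ranking Score. The exact definitions are given in the paper; the function names, the pair representation (retrieval score for the higher-quality snippet vs. the lower-quality snippet of the same query), and the sample scores here are illustrative assumptions, not CoQuIR's implementation.

```python
def pairwise_preference_accuracy(pairs):
    """pairs: list of (score_high_quality, score_low_quality) retrieval
    scores for the same query. Returns the fraction of pairs in which
    the retriever scores the higher-quality snippet above the
    lower-quality one (illustrative interpretation)."""
    wins = sum(1 for hi, lo in pairs if hi > lo)
    return wins / len(pairs)


def margin_based_ranking_score(pairs):
    """Average score margin between the higher- and lower-quality
    snippet; a positive value means the retriever is, on average,
    quality-aware in its ranking (illustrative interpretation)."""
    return sum(hi - lo for hi, lo in pairs) / len(pairs)


# Hypothetical retrieval scores for three query-level pairs.
pairs = [(0.82, 0.75), (0.64, 0.70), (0.91, 0.55)]
print(round(pairwise_preference_accuracy(pairs), 3))  # → 0.667
print(round(margin_based_ranking_score(pairs), 3))    # → 0.123
```

Under this reading, the first metric is rank-based (did the better snippet win?), while the second also rewards the size of the score gap, so a model can improve the margin metric without changing any pairwise orderings.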