This paper presents SciRerankBench, a novel benchmark for evaluating rerankers within two-stage Retrieval-Augmented Generation with Large Language Models (RAG-LLM) systems for scientific literature question answering. Rerankers play a critical role in scientific domains, where subtle differences in terminology can substantially affect answer accuracy. SciRerankBench spans five scientific domains and provides three types of question-context-answer (QCA) pairs, namely Noisy Contexts, Semantically Similar but Logically Irrelevant Contexts, and Counterfactual Contexts, to rigorously evaluate rerankers in terms of noise robustness, relevance disambiguation, and factual consistency. Through a systematic evaluation of 13 rerankers and five families of LLMs, we provide insights into the strengths and limitations of each reranker; to our knowledge, SciRerankBench is the first benchmark dedicated to evaluating rerankers within RAG-LLM systems.
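
To make the benchmarked setting concrete, the sketch below illustrates where a reranker sits in a two-stage RAG-LLM pipeline: a bi-encoder retrieves candidate contexts, a cross-encoder reranker reorders them, and the top contexts would then be passed to an LLM for answer generation. This is an illustrative sketch only, not code from the paper; the model names, the `retrieve_and_rerank` helper, and the parameter choices are assumptions for demonstration.

```python
# Minimal sketch of a two-stage retrieve-then-rerank pipeline (illustrative only;
# model names and helpers are assumptions, not part of SciRerankBench).
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

# Stage 1: dense retrieval with a bi-encoder (fast, recall-oriented).
retriever = SentenceTransformer("all-MiniLM-L6-v2")
# Stage 2: cross-encoder reranker (slower, precision-oriented).
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def retrieve_and_rerank(question, corpus, k_retrieve=20, k_rerank=5):
    # Stage 1: score all passages by embedding similarity and keep the top-k.
    q_emb = retriever.encode([question], normalize_embeddings=True)
    d_emb = retriever.encode(corpus, normalize_embeddings=True)
    sims = (q_emb @ d_emb.T)[0]
    candidates = [corpus[i] for i in np.argsort(-sims)[:k_retrieve]]

    # Stage 2: rerank candidates with the cross-encoder and keep the best few;
    # these reranked contexts would be placed into the LLM prompt.
    scores = reranker.predict([(question, c) for c in candidates])
    order = np.argsort(-np.asarray(scores))
    return [candidates[i] for i in order[:k_rerank]]
```

In a benchmark like SciRerankBench, the QCA pairs probe exactly this second stage: whether the reranker can push aside noisy, semantically similar but logically irrelevant, or counterfactual contexts before they reach the LLM.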