Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence

Created by
  • Haebom

Authors

Alisa Vinogradova (Optic Inc), Vlad Vinogradov (Optic Inc), Dmitrii Radkevich (Optic Inc), Ilya Yasny (Optic Inc), Dmitry Kobyzev (Optic Inc), Ivan Izmailov (Optic Inc), Katsiaryna Yanchanka (Optic Inc), Roman Doronin (Optic Inc), Andrey Doronichev (Optic Inc)

Overview

This paper describes and benchmarks the competitor discovery component of an agent-based AI system for rapid due diligence on pharmaceutical assets. Given an indication, the competitor discovery agent retrieves all drugs that make up the competitive landscape for that indication and extracts standardized properties for each drug. The task is hard for several reasons: the definition of a competitor varies across investors; the relevant data is paywalled or licensed, scattered across multiple registries, inconsistent in its ontologies across indications, full of aliases, and multimodal; and the landscape changes rapidly. Existing LLM-based AI systems cannot reliably retrieve all competitor drug names, and there is no public benchmark for this task. To address this, the authors converted five years of multimodal, unstructured due diligence notes from a private biotech VC fund into a structured evaluation corpus that maps indications to competitor drugs with standardized properties. They also introduce a competitor validation agent, an LLM-as-a-judge, that filters out false positives to improve precision and suppress hallucinations. The competitor discovery agent achieves 83% recall on this benchmark, outperforming OpenAI Deep Research (65%) and Perplexity Labs (60%). The system is deployed for enterprise users; in a case study at a biotech VC fund, analyst turnaround time for competitive analysis fell from 2.5 days to approximately 3 hours (approximately 20x).
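As a rough illustration of how a recall figure like the ones above can be computed when drugs have many aliases, here is a minimal Python sketch. It is not the paper's evaluation code: the `normalize` rule, the reference schema (canonical name mapped to a set of aliases), and all drug names are hypothetical placeholders.

```python
from typing import Dict, Iterable, Set


def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so 'DA-001' and 'da 001' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def recall(predicted: Iterable[str], reference: Dict[str, Set[str]]) -> float:
    """Fraction of reference drugs that were found, counting any alias as a hit.

    `reference` maps a canonical drug name to its known aliases
    (an assumed schema, chosen only for this sketch).
    """
    if not reference:
        raise ValueError("reference set is empty")
    found = {normalize(p) for p in predicted}
    hits = 0
    for canonical, aliases in reference.items():
        names = {normalize(canonical)} | {normalize(a) for a in aliases}
        if names & found:  # at least one name for this drug was predicted
            hits += 1
    return hits / len(reference)


# Made-up placeholder names, not real drugs and not data from the paper.
reference = {
    "drug-alpha": {"DA-001"},
    "drug-beta": {"DB-002", "betamab"},
    "drug-gamma": set(),
}
predicted = ["da 001", "Betamab", "drug-delta"]
score = recall(predicted, reference)
print(f"recall = {score:.2f}")
```

Here two of the three reference drugs are matched through aliases, so the recall is 2/3. The extra prediction ("drug-delta") does not lower recall; it would only show up in a precision measurement.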

Takeaways, Limitations

Takeaways:
Presents a successful case study of developing and deploying an AI system that discovers competitor drugs from multimodal, unstructured data.
Demonstrates the efficiency gains possible with LLM-based agents: analyst time for competitive analysis dropped roughly 20x.
Builds a new benchmark dataset for evaluating and comparing LLM-based competitor discovery systems.
Presents a strategy that uses an LLM-as-a-judge agent to filter out false positives and improve accuracy.
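The LLM-as-a-judge validation step mentioned in the takeaways can be pictured as a filter over candidate drugs. The sketch below is only an assumption about the general shape of such a step, not the authors' implementation: the judge is passed in as a plain callable, and `stub_judge` is a hypothetical stand-in for a real LLM call, with placeholder names throughout.

```python
from typing import Callable, Iterable, List

# A judge takes (indication, candidate drug) and returns True to keep the candidate.
Judge = Callable[[str, str], bool]


def filter_candidates(indication: str, candidates: Iterable[str], judge: Judge) -> List[str]:
    """Keep only the candidates the judge accepts as competitors for the indication."""
    return [drug for drug in candidates if judge(indication, drug)]


def stub_judge(indication: str, drug: str) -> bool:
    """Hypothetical stand-in for an LLM call.

    A real judge would prompt a model with the indication, the candidate,
    and supporting evidence, then parse a yes/no verdict.
    """
    accepted = {
        ("indication-x", "drug-alpha"),
        ("indication-x", "drug-beta"),
    }
    return (indication, drug) in accepted


candidates = ["drug-alpha", "made-up-drug", "drug-beta"]
kept = filter_candidates("indication-x", candidates, stub_judge)
print(kept)
```

Keeping the judge behind a simple callable makes it easy to swap the stub for a model-backed implementation and to measure how many false positives the filter removes.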
Limitations:
The data comes from a single private biotech VC fund, so generalizability still needs to be verified.
The size and diversity of the benchmark dataset could be improved in future research.
Because the definition of a competitor is investor-specific, the results may not carry over to other investors.
Accessibility is limited because the underlying data is paywalled or licensed.