Daily Arxiv
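This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.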
Emotions as Ambiguity-aware Ordinal Representations
From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Enhancing Model Privacy in Federated Learning with Random Masking and Quantization
Scaling Laws for Task-Stratified Knowledge in Post-Training Quantized Large Language Models
Principled Detection of Hallucinations in Large Language Models via Multiple Testing
Vocoder-Projected Feature Discriminator
ControlEchoSynth: Boosting Ejection Fraction Estimation Models via Controlled Video Diffusion
Explain Before You Answer: A Survey on Compositional Visual Reasoning
Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
PediatricsMQA: a Multi-modal Pediatrics Question Answering Benchmark
VideoEraser: Concept Erasure in Text-to-Video Diffusion Models
A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives
GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
Input-Time Scaling
LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models
A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models
StreetViewAI: Making Street View Accessible Using Context-Aware Multimodal AI
Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Human-Centered Human-AI Interaction (HC-HAII): A Human-Centered AI Perspective
GTPO: Trajectory-Based Policy Optimization in Large Language Models
Contrastive Multi-Task Learning with Solvent-Aware Augmentation for Drug Discovery
A Large-Scale Benchmark of Cross-Modal Learning for Histology and Gene Expression in Spatial Transcriptomics
Invisible Architectures of Thought: Toward a New Science of AI as Cognitive Infrastructure
Revisiting Pre-trained Language Models for Vulnerability Detection
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
Scaling Decentralized Learning with FLock
SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
Apple Intelligence Foundation Language Models: Tech Report 2025
Optimistic Exploration for Risk-Averse Constrained Reinforcement Learning
PyVision: Agentic Vision with Dynamic Tooling
DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective
RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
Analyzing Character Representation in Media Content using Multimodal Foundation Model: Effectiveness and Trust
MEraser: An Effective Fingerprint Erasure Approach for Large Language Models
CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval
DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers
Pseudo-Simulation for Autonomous Driving
BinConv: A Neural Architecture for Ordinal Encoding in Time-Series Forecasting
FaceEditTalker: Controllable Talking Head Generation with Facial Attribute Editing
EnvInjection: Environmental Prompt Injection Attack to Multi-modal Web Agents
X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real
Heat Diffusion Models - Interpixel Attention Mechanism
Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation
Multi-Type Context-Aware Conversational Recommender Systems via Mixture-of-Experts
Pricing AI Model Accuracy
Evaluating the Fitness of Ontologies for the Task of Question Generation
Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation
PGAD: Prototype-Guided Adaptive Distillation for Multi-Modal Learning in AD Diagnosis
Constructing a Norm for Children's Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models
An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model
Efficient PINNs via Multi-Head Unimodular Regularization of the Solutions Space
Statistical learning does not always entail knowledge
Score-based Generative Diffusion Models for Social Recommendations
PromptKeeper: Safeguarding System Prompts for LLMs
X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models
Understanding Fairness-Accuracy Trade-offs in Machine Learning Models: Does Promoting Fairness Undermine Performance?
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Leveraging Multi-facet Paths for Heterogeneous Graph Representation Learning
Training with Explanations Alone: A New Paradigm to Prevent Shortcut Learning
Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints
TabSketchFM: Sketch-based Tabular Representation Learning for Data Discovery over Data Lakes
HoneyBee: A Scalable Modular Framework for Creating Multimodal Oncology Datasets with Foundational Embedding Models
StepWiser: Stepwise Generative Judges for Wiser Reasoning
AniME: Adaptive Multi-Agent Planning for Long Animation Generation
AppAgent-Pro: A Proactive GUI Agent System for Multidomain Information Integration and User Assistance
AI Chaperones Are (Really) All You Need to Prevent Parasocial Relationships with Chatbots
Nemori: Self-Organizing Agent Memory Inspired by Cognitive Science
General agents contain world models
Approximate Lifted Model Construction
Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search
Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents
Preference Elicitation for Multi-objective Combinatorial Optimization with Active Learning and Maximum Likelihood Estimation
Reference-Aligned Retrieval-Augmented Question Answering over Heterogeneous Proprietary Documents
Demonstrating specification gaming in reasoning models
AirRAG: Autonomous Strategic Planning and Reasoning Steer Retrieval Augmented Generation
Think Smart, Act SMARL! Analyzing Probabilistic Logic Shields for Multi-Agent Reinforcement Learning
From Evidence to Decision: Exploring Evaluative AI
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning
Patch Progression Masked Autoencoder with Fusion CNN Network for Classifying Evolution Between Two Pairs of 2D OCT Slices
DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis
Large Language Models (LLMs) for Electronic Design Automation (EDA)
Symphony: A Decentralized Multi-Agent Framework for Scalable Collective Intelligence
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Decomposing Behavioral Phase Transitions in LLMs: Order Parameters for Emergent Misalignment
Cross-Platform E-Commerce Product Categorization and Recategorization: A Multimodal Hierarchical Classification Approach
Linear-Time Demonstration Selection for In-Context Learning via Gradient Estimation
MathBuddy: A Multimodal System for Affective Math Tutoring
Diffusion Language Models Know the Answer Before Decoding
GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity
Dhati+: Fine-tuned Large Language Models for Arabic Subjectivity Evaluation
WaveHiT-SR: Hierarchical Wavelet Network for Efficient Image Super-Resolution
The Next Layer: Augmenting Foundation Models with Structure-Preserving and Attention-Guided Learning for Local Patches to Global Context Awareness in Computational Pathology
Logical Reasoning with Outcome Reward Models for Test-Time Scaling
The Information Dynamics of Generative Diffusion
AI-Powered Detection of Inappropriate Language in Medical School Curricula
Generative AI for Testing of Autonomous Driving Systems: A Survey
Multispectral LiDAR data for extracting tree points in urban and suburban areas
HoneyBee: A Scalable Modular Framework for Creating Multimodal Oncology Datasets with Foundational Embedding Models
Created by
Haebom
Authors
Aakash Tripathi, Asim Waqas, Matthew B. Schabath, Yasin Yilmaz, Ghulam Rasool
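Abstract
HONeYBEE is an open-source framework for integrating multimodal biomedical data in oncology applications. It processes structured and unstructured clinical data, whole-slide images, imaging scans, and molecular profiles, and uses domain-specific foundation models and fusion strategies to produce unified patient-level embeddings. These embeddings support survival prediction, cancer-type classification, patient similarity retrieval, and cohort clustering. In an evaluation on more than 11,400 patients across 33 cancer types from TCGA, clinical embeddings showed the strongest single-modality performance, with 98.5% classification accuracy and 96.4% precision@10 in patient retrieval. They also achieved the highest survival-prediction concordance index for most cancer types. Multimodal fusion provided complementary benefits for specific cancers, improving overall survival prediction beyond what clinical features alone achieved. A comparative evaluation of four large language models found that general-purpose models such as Qwen3 outperformed specialized medical models for clinical text representation, while task-specific fine-tuning improved performance on heterogeneous data such as pathology reports.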
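The precision@10 retrieval figure in the abstract can be made concrete with a small sketch. The summary does not say how retrieval was scored, so everything below is an assumption for illustration: cosine similarity as the ranking function, "same cancer-type label" as the relevance criterion, and the helper names `cosine_similarity` and `precision_at_k` are hypothetical rather than part of the framework's API. The toy embeddings and labels are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def precision_at_k(embeddings, labels, k=10):
    """Mean fraction of each patient's k nearest neighbours sharing its label.

    Assumption: relevance means "same label as the query patient"; the
    query itself is excluded from its own result list.
    """
    n = len(embeddings)
    if n < 2:
        raise ValueError("need at least two patients")
    k = min(k, n - 1)  # cannot retrieve more neighbours than exist
    total = 0.0
    for i in range(n):
        # Rank every other patient by similarity to patient i.
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine_similarity(embeddings[i], embeddings[j]),
            reverse=True,
        )
        hits = sum(1 for j in ranked[:k] if labels[j] == labels[i])
        total += hits / k
    return total / n


# Toy data: hypothetical 2-D embeddings for two made-up cancer-type labels.
toy_embeddings = [
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
    [0.1, 1.0], [0.0, 0.9], [0.2, 1.0],
]
toy_labels = ["A", "A", "A", "B", "B", "B"]
score = precision_at_k(toy_embeddings, toy_labels, k=2)
```

On this toy data each patient's two nearest neighbours share its label, so `score` is 1.0. On real patient-level embeddings the same function would be called with `k=10`, though the reported 96.4% may have been computed under a different protocol.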
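Takeaways, Limitations
• Takeaways: Presents an effective framework that integrates diverse medical data modalities to improve oncology research and prediction performance. Confirms the particularly strong performance of embeddings built from clinical data. Shows that multimodal fusion can improve survival prediction. Confirms that general-purpose LLMs can process medical data effectively.
• Limitations: The evaluation depends on the TCGA dataset, so generalizability to other datasets still needs to be verified. The benefit of multimodal fusion may be limited for certain cancer types. Further research is needed on model interpretability and explainability.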