Daily Arxiv

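This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.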

Neural Hamiltonian Operator

VLAD: A VLM-Augmented Autonomous Driving Framework with Hierarchical Planning and Interpretable Decision Process

Rethinking All Evidence: Enhancing Trustworthy Retrieval-Augmented Generation via Conflict-Driven Summarization

AI Meets Maritime Training: Precision Analytics for Enhanced Safety and Performance

PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning

LLM-based Realistic Safety-Critical Driving Video Generation

GAIus: Combining Genai with Legal Clauses Retrieval for Knowledge-based Assistant

Beyond First-Order: Training LLMs with Stochastic Conjugate Subgradients and AdamW

Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration

Search-Based Robot Motion Planning With Distance-Based Adaptive Motion Primitives

Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning

Geometry-aware 4D Video Generation for Robot Manipulation

AI-guided digital intervention with physiological monitoring reduces intrusive memories after experimental trauma

Empirical Analysis Of Heuristic and Approximation Algorithms for the The Mutual-Visibility Problem

Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services

FAIR-MATCH: A Multi-Objective Framework for Bias Mitigation in Reciprocal Dating Recommendations

Quantifying Student Success with Generative AI: A Monte Carlo Simulation Informed by Systematic Review

Epitome: Pioneering an Experimental Platform for AI-Social Science Integration

Automated Vehicles Should be Connected with Natural Language

A Data Science Approach to Calcutta High Court Judgments: An Efficient LLM and RAG-powered Framework for Summarization and Similar Cases Retrieval

Prompt Mechanisms in Medical Imaging: A Comprehensive Survey

XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science

Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis

Long-Sequence Memory with Temporal Kernels and Dense Hopfield Functionals

Can AI be Consentful?

Text Detoxification: Data Efficiency, Semantic Preservation and Model Generalization

Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals

Data Classification with Dynamically Growing and Shrinking Neural Networks

Can Argus Judge Them All? Comparing VLMs Across Domains

Fast AI Model Splitting over Edge Networks

Fast Clifford Neural Layers

On-Policy Optimization of ANFIS Policies Using Proximal Policy Optimization

Learning to Segment for Vehicle Routing Problems

Systemic Constraints of Undecidability

Research on Low-Latency Inference and Training Efficiency Optimization for Graph Neural Network and Large Language Model-Based Recommendation Systems

Data-driven Insights for Informed Decision-Making: Applying LSTM Networks for Robust Electricity Forecasting in Libya

An Uncertainty-Aware Dynamic Decision Framework for Progressive Multi-Omics Integration in Classification Tasks

PathCoT: Chain-of-Thought Prompting for Zero-shot Pathology Visual Reasoning

HPC-AI Coupling Methodology for Scientific Applications

Hello Afrika: Speech Commands in Kinyarwanda

A Systematic Review of Security Vulnerabilities in Smart Home Devices and Mitigation Techniques

Refining Gelfond Rationality Principle Towards More Comprehensive Foundational Principles for Answer Set Semantics

Joint Matching and Pricing for Crowd-shipping with In-store Customers

Agent Ideate: A Framework for Product Idea Generation from Patents Using Agentic AI

T3DM: Test-Time Training-Guided Distribution Shift Modelling for Temporal Knowledge Graph Reasoning

Agent-as-Tool: A Study on the Hierarchical Decision Making with Reinforcement Learning

Using multi-agent architecture to mitigate the risk of LLM hallucinations

Pensieve Grader: An AI-Powered, Ready-to-Use Platform for Effortless Handwritten STEM Grading

A Fuzzy Approach to the Specification, Verification and Validation of Risk-Based Ethical Decision Making Models

AI Agents and Agentic AI-Navigating a Plethora of Concepts for Future Manufacturing

Beyond Black-Box AI: Interpretable Hybrid Systems for Dementia Care

Rethinking the Illusion of Thinking

Autonomy by Design: Preserving Human Autonomy in AI Decision-Support

Adapt Your Body: Mitigating Proprioception Shifts in Imitation Learning

The Impact of AI on Educational Assessment: A Framework for Constructive Alignment

ZonUI-3B: A Lightweight Vision-Language Model for Cross-Resolution GUI Grounding

Pipelined Decoder for Efficient Context-Aware Text Generation

Flow-Modulated Scoring for Semantic-Aware Knowledge Graph Completion

Ovis-U1 Technical Report

Against 'softmaxing' culture

Listener-Rewarded Thinking in VLMs for Image Preferences

Beyond Code: The Multidimensional Impacts of Large Language Models in Software Development

Text Production and Comprehension by Human and Artificial Intelligence: Interdisciplinary Workshop Report

Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset

Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center

HyperCLOVA X THINK Technical Report

Dehazing Light Microscopy Images with Guided Conditional Flow Matching: finding a sweet spot between fidelity and realism

Binned semiparametric Bayesian networks

ComRAG: Retrieval-Augmented Generation with Dynamic Vector Stores for Real-time Community Question Answering in Industry

Generating and Customizing Robotic Arm Trajectories using Neural Networks

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

Language Models Might Not Understand You: Evaluating Theory of Mind via Story Prompting

Benchmarking the Pedagogical Knowledge of Large Language Models

CARTS: Collaborative Agents for Recommendation Textual Summarization

Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases

Studying and Improving Graph Neural Network-based Motif Estimation

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Unleashing Diffusion and State Space Models for Medical Image Segmentation

A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models

Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs

Making a Pipeline Production-Ready: Challenges and Lessons Learned in the Healthcare Domain

Bregman Centroid Guided Cross-Entropy Method

eACGM: Non-instrumented Performance Tracing and Anomaly Detection towards Machine Learning Systems

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Avoid Forgetting by Preserving Global Knowledge Gradients in Federated Learning with Non-IID Data

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Two-Stage Regularization-Based Structured Pruning for LLMs

From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning

Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment

Llama-Nemotron: Efficient Reasoning Models

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses

ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition

RLCAD: Reinforcement Learning Training Gym for Revolution Involved CAD Command Sequence Generation

CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception

Mitigating Hallucinations in YOLO-based Object Detection Models: A Revisit to Out-of-Distribution Detection

SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation

Conformal Inference under High-Dimensional Covariate Shifts via Likelihood-Ratio Regularization

SwarmThinkers: Learning Physically Consistent Atomic KMC Transitions at Scale

Created by

Haebom

Authors

Qi Li, Kun Li, Haozhi Han, Honghui Shang, Xinfu He, Yunquan Zhang, Hong An, Ting Cao, Mao Yang

Overview

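This paper presents SwarmThinkers, a reinforcement learning framework that addresses the difficulty of achieving physical consistency, interpretability, and scalability at the same time in scientific simulation systems. SwarmThinkers recasts atomic-scale simulation as a physically grounded swarm intelligence system, modeling each diffusing particle as a local decision-making agent that selects transitions through a shared policy network. A reweighting mechanism that fuses learned preferences with transition rates preserves statistical accuracy while enabling interpretable, step-by-step decisions. A centralized-training, decentralized-execution paradigm lets the policy generalize across system sizes, concentrations, and temperatures without retraining. On a benchmark simulating radiation-induced Fe-Cu alloy precipitation, SwarmThinkers is the first system to achieve full-scale, physically consistent simulation on a single A100 GPU, which was previously possible only with OpenKMC on a supercomputer. It delivers up to 4963x (3185x on average) faster computation and 485x lower memory usage. By treating particles as decision-makers rather than passive samplers, SwarmThinkers presents a paradigm shift in scientific simulation that unifies physical consistency, interpretability, and scalability through agent-based intelligence.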
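The reweighting idea in the overview can be illustrated with a generic importance-sampling sketch. This is not the paper's implementation: the summary does not give the exact fusion rule, so the proposal `q_i ∝ r_i * exp(logit_i)`, the function names, and the use of the total physical rate for the time step are all assumptions made for illustration. The sketch shows how a learned preference can bias which transition is picked while a per-step weight `p_i / q_i` keeps averages consistent with the physical rates.

```python
import math
import random


def physical_probs(rates):
    """Standard KMC selection probabilities: p_i = r_i / sum(r)."""
    total = sum(rates)
    return [r / total for r in rates]


def fused_probs(rates, logits):
    """Hypothetical fusion rule: q_i proportional to r_i * exp(logit_i).

    The logits stand in for the output of a learned policy (assumed here).
    """
    m = max(logits)  # subtract the max for numerical stability
    weights = [r * math.exp(l - m) for r, l in zip(rates, logits)]
    total = sum(weights)
    return [w / total for w in weights]


def biased_kmc_step(rates, logits, rng):
    """Pick a transition from the fused proposal and return its correction.

    Returns (event index, importance weight p_i / q_i, time increment).
    The time increment uses the total physical rate, as in plain KMC
    (an assumption of this sketch).
    """
    p = physical_probs(rates)
    q = fused_probs(rates, logits)
    i = rng.choices(range(len(rates)), weights=q, k=1)[0]
    weight = p[i] / q[i]
    dt = -math.log(1.0 - rng.random()) / sum(rates)
    return i, weight, dt


if __name__ == "__main__":
    rng = random.Random(0)
    rates = [1.0, 2.0, 7.0]    # illustrative transition rates
    logits = [2.0, 0.0, -1.0]  # illustrative policy preferences
    n = 20000
    weighted_hits = 0.0
    for _ in range(n):
        i, w, _dt = biased_kmc_step(rates, logits, rng)
        if i == 0:
            weighted_hits += w
    # The weighted frequency of event 0 should approach its physical
    # probability even though the proposal strongly favors it.
    print(weighted_hits / n, physical_probs(rates)[0])
```

With all logits set to zero the proposal reduces to ordinary rate-proportional selection and every weight equals 1, which is a convenient sanity check for a sketch like this.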

Implications and Limitations

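• Implications:
  ◦ Presents a new scientific simulation framework that satisfies physical consistency, interpretability, and scalability together.
  ◦ Much faster and more efficient computation than existing methods (up to 4963x speedup, 485x lower memory usage).
  ◦ Suggests that an agent-based approach could bring a paradigm shift in scientific simulation.
  ◦ Enables large-scale simulation on a single GPU.

• Limitations:
  ◦ Performance has so far been validated on only one simulation (radiation-induced Fe-Cu alloy precipitation); generalization to other systems or problems needs further study.
  ◦ Little discussion of the complexity of SwarmThinkers and the difficulty of implementing it.
  ◦ Further evaluation of performance and stability is needed for long-duration simulations and more complex systems.
  ◦ No quantitative evaluation of the interpretability of the learned policy.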
