Daily Arxiv

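This page organizes artificial intelligence papers published around the world.
Summaries on this page are produced with Google Gemini, and the page is run on a non-profit basis.
Copyright in each paper belongs to its authors and their institutions; when sharing, simply credit the source.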

Your Compiler is Backdooring Your Model: Understanding and Exploiting Compilation Inconsistency Vulnerabilities in Deep Learning Compilers

Created by Haebom

Authors

Simin Chen, Jinjun Peng, Yixin He, Junfeng Yang, Baishakhi Ray

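Overview

This paper exposes a fundamental vulnerability in the design of deep learning compilers. It asks whether an official, unmodified compiler can change a model's semantics during compilation and thereby introduce a hidden backdoor, and it studies the question in both an adversarial and a natural setting. In the adversarial setting, the authors craft a benign-looking model in which a trigger has no effect before compilation but becomes an effective backdoor after compilation. Tested on 6 models, 3 commercial compilers, and 2 hardware platforms, the attack reaches a 100% success rate on triggered inputs while preserving normal accuracy and evading state-of-the-art detectors. The attack generalizes across compilers, hardware, and floating-point settings. In the natural setting, the authors analyze the top 100 models on HuggingFace (including models downloaded more than 220 million times) and find natural triggers in 31 of them, which shows that compilers can introduce risk even without adversarial manipulation. Taken together, the results reveal an overlooked threat, namely that unmodified deep learning compilers can silently alter model semantics, and they point to new directions for secure and trustworthy machine learning.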
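The summary does not describe the mechanism behind the inconsistency, so the following is only a minimal sketch of one plausible source: floating-point arithmetic is not associative, so a rewrite that is algebraically equivalent can still change a result enough to flip a prediction. The functions `reference_score`, `rewritten_score`, `predict`, and `find_disagreements` are hypothetical names introduced for illustration; this is not the paper's attack and does not use any real compiler.

```python
from typing import Callable, Iterable, List


def reference_score(x: float, offset: float = 1e16) -> float:
    """Hypothetical 'uncompiled' computation: evaluates (x + offset) - offset in order."""
    return (x + offset) - offset


def rewritten_score(x: float, offset: float = 1e16) -> float:
    """Hypothetical 'compiled' computation: the algebraically equal x + (offset - offset)."""
    return x + (offset - offset)


def predict(score: float, threshold: float = 0.5) -> int:
    """Toy decision rule: class 1 if the score exceeds the threshold, else class 0."""
    return 1 if score > threshold else 0


def find_disagreements(
    f: Callable[[float], float],
    g: Callable[[float], float],
    inputs: Iterable[float],
) -> List[float]:
    """Return the inputs on which the two implementations yield different predictions."""
    return [x for x in inputs if predict(f(x)) != predict(g(x))]


inputs = [0.0, 0.25, 1.0, 3.0]
flipped = find_disagreements(reference_score, rewritten_score, inputs)
print(flipped)  # [1.0]
```

Both functions compute the same value in exact arithmetic, yet in double precision the input `1.0` is absorbed by the large offset in the reference version and survives in the rewritten one, so the two versions disagree on that input only. Comparing predictions before and after a transformation in this way is a simple form of differential testing.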

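Implications and Limitations

• Implications:
  ◦ It is the first work to expose a serious security vulnerability in the design of deep learning compilers.
  ◦ Through adversarial attacks and natural triggers, it demonstrates that a compiler can change a model's semantics and plant a backdoor.
  ◦ It shows that the attack generalizes across compilers, hardware, and floating-point settings.
  ◦ It points to new research directions for secure and trustworthy machine learning.

• Limitations:
  ◦ The results come from experiments on specific compilers and models, so further study is needed on how well they generalize to other compilers and models.
  ◦ More research is needed on techniques for detecting and preventing such backdoors.
  ◦ Further study is needed on how often these vulnerabilities could be exploited in real-world scenarios.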
