Daily Arxiv

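This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply credit the source.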

Defining and Quantifying Creative Behavior in Popular Image Generators

TS-SNN: Temporal Shift Module for Spiking Neural Networks

IntelliCardiac: An Intelligent Platform for Cardiac Image Segmentation and Classification

AI-Powered Agile Analog Circuit Design and Optimization

Demonstrating ViSafe: Vision-enabled Safety for High-speed Detect and Avoid

Motion-compensated cardiac MRI using low-rank diffeomorphic flow (DMoCo)

RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale

T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models

Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques

Humans can learn to detect AI-generated texts, or at least learn when they can't

Large Language Models Understanding: an Inherent Ambiguity Barrier

Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing

PINN-MEP: Continuous Neural Representations for Minimum-Energy Path Discovery in Molecular Systems

Building Trustworthy Multimodal AI: A Review of Fairness, Transparency, and Ethics in Vision-Language Tasks

GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions

A highly maneuverable flying squirrel drone with agility-improving foldable wings

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

Perils of Label Indeterminacy: A Case Study on Prediction of Neurological Recovery After Cardiac Arrest

CodeIF-Bench: Evaluating Instruction-Following Capabilities of Large Language Models in Interactive Code Generation

Novel Deep Neural OFDM Receiver Architectures for LLR Estimation

NaFM: Pre-training a Foundation Model for Small-Molecule Natural Products

Benchmarking Open-Source Large Language Models on Healthcare Text Classification Tasks

Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization

Integrating AI for Human-Centric Breast Cancer Diagnostics: A Multi-Scale and Multi-View Swin Transformer Framework

Negotiative Alignment: Embracing Disagreement to Achieve Fairer Outcomes -- Insights from Urban Studies

Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges

Semantic Shift Estimation via Dual-Projection and Classifier Reconstruction for Exemplar-Free Class-Incremental Learning

LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces

Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems

FLARE: A Framework for Stellar Flare Forecasting using Stellar Physical Properties and Historical Records

TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data

Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts

Safety Evaluation of DeepSeek Models in Chinese Contexts

DejAIvu: Identifying and Explaining AI Art on the Web in Real-Time with Saliency Maps

Texture Image Synthesis Using Spatial GAN Based on Vision Transformers

Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning

The Right to AI

Communicating Activations Between Language Model Agents

Guaranteed Recovery of Unambiguous Clusters

ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning

Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction

Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play

E2E-AFG: An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation

Jailbreaking and Mitigation of Vulnerabilities in Large Language Models

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

CATCH: Channel-Aware multivariate Time Series Anomaly Detection via Frequency Patching

Learning to Compare Hardware Designs for High-Level Synthesis

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Automated detection of underdiagnosed medical conditions via opportunistic imaging

Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant

On Synthetic Texture Datasets: Challenges, Creation, and Curation

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

XG-NID: Dual-Modality Network Intrusion Detection using a Heterogeneous Graph Neural Network and Large Language Model

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

Enhancing Differential Testing With LLMs For Testing Deep Learning Libraries

HORAE: A Domain-Agnostic Language for Automated Service Regulation

DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying

Analyzing Consumer IoT Traffic from Security and Privacy Perspectives: a Comprehensive Survey

DyCE: Dynamically Configurable Exiting for Deep Learning Compression and Real-time Scaling

The Inadequacy of Similarity-based Privacy Metrics: Privacy Attacks against "Truly Anonymous" Synthetic Datasets

Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics

An automated end-to-end deep learning-based framework for lung cancer diagnosis by detecting and classifying the lung nodules

Label-Efficient Deep Learning in Medical Image Analysis: Challenges and Future Directions

Transformer-based assignment decision network for multiple object tracking

An alignment safety case sketch based on debate

The Power of Stories: Narrative Priming Shapes How LLM Agents Collaborate and Compete

A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law

Agentic Neurodivergence as a Contingent Solution to the AI Alignment Problem

Theoretical Foundations for Semantic Cognition in Artificial Intelligence

Approximate Lifted Model Construction

MultiMind: Enhancing Werewolf Agents with Multimodal Reasoning and Theory of Mind

Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation

Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute

Recursive Inference Scaling: A Winning Path to Scalable Inference in Language and Multimodal Systems

Generating Symbolic World Models via Test-time Scaling of Large Language Models

Imagining and building wise machines: The centrality of AI metacognition

Public Perceptions of Fairness Metrics Across Borders

Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for Entity Alignment

Flow-GRPO: Training Flow Matching Models via Online RL

StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

ComPO: Preference Alignment via Comparison Oracles

TransProQA: an LLM-based literary Translation evaluation metric with Professional Question Answering

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

Reasoning Models Don't Always Say What They Think

Crosslingual Reasoning through Test-Time Scaling

CART-ELC: Oblique Decision Tree Induction via Exhaustive Search

Threshold Modulation for Online Test-Time Adaptation of Spiking Neural Networks

Time of the Flight of the Gaussians: Optimizing Depth Indirectly in Dynamic Radiance Fields

High-fidelity Grain Growth Modeling: Leveraging Deep Learning for Fast Computations

Feature-Augmented Deep Networks for Multiscale Building Segmentation in High-Resolution UAV and Satellite Imagery

Mapping User Trust in Vision Language Models: Research Landscape, Challenges, and Prospects

Scalable Chain of Thoughts via Elastic Reasoning

Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection

PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes

Software Development Life Cycle Perspective: A Survey of Benchmarks for CodeLLMs and Agents

T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction

Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration

Training Domain Draft Models for Speculative Decoding: Best Practices and Insights

Created by

Haebom

Authors

Fenglu Hong, Ravi Raju, Jonathan Lingjie Li, Bo Li, Urmish Thakker, Avinash Ravichandran, Swayambhoo Jain, Changran Hu

Overview

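This paper studies how to improve the efficiency of speculative decoding, a promising method for accelerating large language model (LLM) inference. Speculative decoding typically relies on a generic draft model, but when it is applied to a specific domain, the domain shift causes the acceptance rate to drop sharply. To address this, the paper trains domain-specific draft models through knowledge distillation, compares white-box and black-box distillation, and empirically evaluates them under several data-accessibility scenarios: historical user queries, curated domain data, and synthetic data. In experiments on function calling, biology, and Chinese domains, offline distillation outperforms online distillation by 11% to 25%, and white-box distillation outperforms black-box distillation by 2% to 10%. The paper also finds that training on synthetic data reaches 80% to 93% of the performance obtained by training on historical user queries.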
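To make the acceptance-rate problem concrete, here is a minimal sketch of how domain shift lowers acceptance. It assumes the standard speculative sampling rule, where a token drawn from the draft distribution q is accepted with probability min(1, p(x)/q(x)) under the target distribution p, so the expected acceptance is the sum of min(p(x), q(x)) over the vocabulary. The function name `acceptance_rate` and all probabilities are hypothetical illustrations, not values from the paper.

```python
def acceptance_rate(target_probs, draft_probs):
    """Expected probability that one drafted token is accepted.

    Assumes standard speculative sampling: a token x sampled from the
    draft distribution q is accepted with probability min(1, p(x) / q(x)),
    which gives an expected acceptance of sum_x min(p(x), q(x)).
    """
    if len(target_probs) != len(draft_probs):
        raise ValueError("distributions must share a vocabulary")
    return sum(min(p, q) for p, q in zip(target_probs, draft_probs))


# Toy 4-token vocabulary; every number here is made up for illustration.
target = [0.70, 0.20, 0.05, 0.05]         # target model on a domain prompt
generic_draft = [0.25, 0.25, 0.25, 0.25]  # draft model unaware of the domain
domain_draft = [0.60, 0.25, 0.10, 0.05]   # draft model after domain distillation

print("generic draft:", acceptance_rate(target, generic_draft))
print("domain draft: ", acceptance_rate(target, domain_draft))
```

In this toy setting the draft whose distribution sits closer to the target accepts more tokens per step, which is the effect domain-specific distillation aims for.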

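Implications and Limitations

• Implications:
  ◦ Presents and validates an effective knowledge distillation approach for training domain-specific draft models.
  ◦ Demonstrates the advantage of offline distillation and the benefit of white-box distillation.
  ◦ Shows that synthetic data can make training domain-specific draft models more efficient.
  ◦ Provides practical guidance for improving the real-world efficiency of domain-specific speculative decoding.

• Limitations:
  ◦ Experiments cover only three domains: function calling, biology, and Chinese.
  ◦ Generalization across LLM architectures and model sizes is not sufficiently verified.
  ◦ The method used to generate synthetic data, and its quality, are not described in detail.
  ◦ Further research is needed on generalization to domains beyond those tested.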
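The white-box and black-box distillation settings compared in the summary above can be sketched at a single token position as follows. This is an illustration under stated assumptions, not the paper's implementation: white-box distillation is modeled as a forward KL divergence against the teacher's full output distribution, and black-box distillation as cross-entropy on the one token the teacher emitted. The paper's actual loss may differ. The helper names `white_box_loss` and `black_box_loss` and the logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def white_box_loss(teacher_logits, draft_logits):
    """Assumed white-box objective: KL(teacher || draft) over the vocabulary.

    Requires access to the teacher's logits at this position.
    """
    p = softmax(teacher_logits)
    q = softmax(draft_logits)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0)


def black_box_loss(teacher_token, draft_logits):
    """Assumed black-box objective: cross-entropy on the teacher's emitted token.

    Only the teacher's generated text is needed, not its logits.
    """
    q = softmax(draft_logits)
    return -math.log(q[teacher_token])


# Toy 3-token vocabulary; values are made up for illustration.
teacher_logits = [2.0, 1.0, 0.1]
draft_logits = [1.0, 1.0, 1.0]
teacher_token = 0  # index of the token the teacher actually generated

print("white-box loss:", white_box_loss(teacher_logits, draft_logits))
print("black-box loss:", black_box_loss(teacher_token, draft_logits))
```

The white-box signal uses the whole teacher distribution at each position, while the black-box signal sees only one sampled token, which is one plausible reading of why the summary reports white-box distillation performing better.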
