Daily Arxiv

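This page curates artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.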

Superposition Yields Robust Neural Scaling

Created by

Haebom

Authors

Yizhou Liu, Ziming Liu, Jeff Gore

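Overview

This paper investigates the origin of neural scaling laws, the observation that loss falls as a power law with model size, which underlies the performance gains of today's large language models (LLMs). Building on two empirical principles, that LLMs represent more things than they have model dimensions (superposition of representations) and that words or concepts in language occur with different frequencies, the authors construct a simple model to study how loss scales with model size. Under weak superposition, where only the most frequent features are represented, the scaling of the loss depends on the underlying feature frequency distribution. Under strong superposition, where all features are represented but overlap with one another, the loss is inversely proportional to the model dimension. This robust scaling has a geometric explanation: when many more vectors than dimensions are packed into a lower-dimensional space, the interference between vectors (their squared overlaps) scales inversely with that dimension. An analysis of four open-source LLMs found that they exhibit strong superposition and agree quantitatively with the predictions of the simple model. The Chinchilla scaling law is also consistent with these results. The authors conclude that superposition of representations is an important mechanism behind the observed neural scaling laws, and they expect these insights to inform new training strategies and model architectures that achieve better performance with less compute and fewer parameters.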
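The geometric claim in the overview can be checked numerically with a small sketch. The code below is not the authors' model: it assumes random isotropic unit vectors as a stand-in for learned feature directions, and the function name `mean_squared_overlap` and the sizes used are hypothetical choices for illustration. For such vectors the expected squared dot product between two distinct vectors is 1/m, so the product of m and the measured mean squared overlap should stay close to 1 as m changes.

```python
import numpy as np


def mean_squared_overlap(n_vectors: int, dim: int, seed: int = 0) -> float:
    """Mean squared dot product between distinct random unit vectors in R^dim.

    Assumption: random isotropic directions stand in for feature vectors;
    this is an illustration, not the model used in the paper.
    """
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((n_vectors, dim))
    # Normalize each row to unit length.
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    gram = vecs @ vecs.T
    # Drop the diagonal (self-overlaps, which are all 1).
    off_diag = gram[~np.eye(n_vectors, dtype=bool)]
    return float(np.mean(off_diag ** 2))


if __name__ == "__main__":
    n = 1024  # many more vectors than dimensions
    for m in (16, 32, 64, 128, 256):
        msq = mean_squared_overlap(n, m)
        print(f"m={m:4d}  mean squared overlap={msq:.5f}  m * overlap={m * msq:.3f}")
```

Doubling the dimension should roughly halve the mean squared overlap, which is the 1/m behavior the summary attributes to the strong-superposition regime. Learned representations may be arranged more regularly than random vectors, so this only illustrates the scaling, not the exact prefactor.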

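Implications and Limitations

• Implications:
  ◦ Proposes superposition of representations as the mechanism underlying the neural scaling laws that govern LLM performance gains.
  ◦ Shows that under strong superposition the loss is inversely proportional to the model dimension, and explains this geometrically.
  ◦ Confirms quantitative agreement between the simple model's predictions and measurements on open-source LLMs.
  ◦ Confirms consistency with the Chinchilla scaling law.
  ◦ Points to the possibility of new training strategies and model architectures that achieve better performance with less compute and fewer parameters.

• Limitations:
  ◦ The analysis relies on a simple model, so it may not fully capture the complexity of real LLMs.
  ◦ The variety and range of LLMs analyzed may be limited.
  ◦ Further research is needed to determine whether the proposed mechanism applies to all LLMs.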
