Daily Arxiv

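This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.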

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

Accurate and scalable exchange-correlation with deep learning

AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation

Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models

Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs

GRAM: A Generative Foundation Reward Model for Reward Generalization

VideoMAR: Autoregressive Video Generation with Continuous Tokens

FrontendBench: A Benchmark for Evaluating LLMs on Front-End Development via Automatic Evaluation

Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models

No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

Serving Large Language Models on Huawei CloudMatrix384

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy

Refactoring Codebases through Library Design

TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding

Multi-Agent Language Models: Advancing Cooperation, Coordination, and Adaptation

Multi-Task Reward Learning from Human Ratings

Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Vision Transformers Don't Need Trained Registers

BIS Reasoning 1.0: The First Large-Scale Japanese Benchmark for Belief-Inconsistent Syllogistic Reasoning

LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles

CORA: Coalitional Rational Advantage Decomposition for Multi-Agent Policy Gradients

Supervised Quantum Machine Learning: A Future Outlook from Qubits to Enterprise Applications

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models

Think Twice before Adaptation: Improving Adaptability of DeepFake Detection via Online Test-Time Adaptation

Efficient Long CoT Reasoning in Small Language Models

Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning

MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion

J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization

Fractured Chain-of-Thought Reasoning

DreamGen: Unlocking Generalization in Robot Learning through Video World Models

UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions

Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs

Influential Bandits: Pulling an Arm May Change the Environment

SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning

Exploring Personalized Federated Learning Architectures for Violence Detection in Surveillance Videos

A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana

KANITE: Kolmogorov-Arnold Networks for ITE estimation

Beyond Propagation of Chaos: A Stochastic Algorithm for Mean Field Optimization

Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing

Adding Chocolate to Mint: Mitigating Metric Interference in Machine Translation

EgoBlind: Towards Egocentric Visual Assistance for the Blind

PsychBench: A comprehensive and professional benchmark for evaluating the performance of LLM-assisted psychiatric clinical practice

Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data

Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning

Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks

Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions

Perspective Transition of Large Language Models for Solving Subjective Tasks

Can LLMs Ask Good Questions?

Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review

SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation

Large Language Models for Automated Literature Review: An Evaluation of Reference Generation, Abstract Writing, and Review Composition

Multiclass Post-Earthquake Building Assessment Integrating High-Resolution Optical and SAR Satellite Imagery, Ground Motion, and Soil Data with Transformers

REVOLVE: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization

FLARE: Towards Universal Dataset Purification against Backdoor Attacks

Heterogeneous Relationships of Subjects and Shapelets for Semi-supervised Multivariate Series Classification

Contrast Similarity-Aware Dual-Pathway Mamba for Multivariate Time Series Node Classification

Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation

LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

The Epochal Sawtooth Phenomenon: Unveiling Training Loss Oscillations in Adam and Other Optimizers

Pap2Pat: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs

Deep Graph Anomaly Detection: A Survey and New Perspectives

A Novel Perturb-ability Score to Mitigate Evasion Adversarial Attacks on Flow-Based ML-NIDS

Style-Preserving Lip Sync via Audio-Aware Style Reference

Advancing oncology with federated learning: transcending boundaries in breast, lung, and prostate cancer. A systematic review

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Informed Correctors for Discrete Diffusion Models

RadioRAG: Online Retrieval-augmented Generation for Radiology Question Answering

A Systematic Survey of Natural Language Processing for the Greek Language

Predicting the Understandability of Computational Notebooks through Code Metrics Analysis

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction

The NordDRG AI Benchmark for Large Language Models

From Data-Driven to Purpose-Driven Artificial Intelligence: Systems Thinking for Data-Analytic Automation of Patient Care

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Entropy-based Exploration Conduction for Multi-step Reasoning

Solving Satisfiability Modulo Counting Exactly with Probabilistic Circuits

Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Optimal Transport for Probabilistic Circuits

OM4OV: Leveraging Ontology Matching for Ontology Versioning

Behaviour Planning: A Toolkit for Diverse Planning

Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition

"Generate" the Future of Work through AI: Empirical Evidence from Online Labor Markets

Dense SAE Latents Are Features, Not Bugs

Sekai: A Video Dataset towards World Exploration

Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers

AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability

Federated Learning for MRI-based BrainAGE: a multicenter study on post-stroke functional outcome prediction

GFLC: Graph-based Fairness-aware Label Correction for Fair Classification

The Compositional Architecture of Regret in Large Language Models

LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning

The Epochal Sawtooth Phenomenon: Unveiling Training Loss Oscillations in Adam and Other Optimizers

Created by

Haebom

Authors

Qi Liu, Wanjing Ma

Overview

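This paper identifies and analyzes a recurring training-loss pattern, named the "Epochal Sawtooth Phenomenon (ESP)", that is frequently observed during training with adaptive gradient-based optimizers, particularly Adam. The pattern is characterized by a sharp drop in loss at the start of each epoch followed by a gradual increase, which produces a sawtooth-shaped loss curve. Experiments show that the pattern is most pronounced with Adam but persists, less severely, with other optimizers such as RMSProp. The paper empirically analyzes the mechanism behind ESP, focusing on key factors such as Adam's β parameters, batch size, data shuffling, and sampling with replacement. The analysis shows that ESP arises from adaptive learning-rate adjustment governed by the second-moment estimate, and that an "immediate re-exposure to samples" effect during data shuffling causes the model to learn or memorize more at the start of each epoch. It also finds that smaller β₂ values exacerbate ESP but can act as a form of regularization. ESP does not indicate overfitting, but higher model capacity can amplify it. To further support the analysis, the authors replicate ESP on a high-dimensional quadratic minimization task, showing that it can appear even in simple optimization settings and reinforcing the generality of the pattern. Code to reproduce the experiments is available at https://github.com/qiliuchn/training-loss-pattern.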
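
For reference, the role of the second-moment estimate can be read off the standard Adam update rule. The notation below (gradient g_t, step size α, moments m_t and v_t, parameters θ_t) is the usual textbook notation and is an assumption here, not taken from the summary.

```latex
\begin{align}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2} \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^{t}}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^{t}} \\
\theta_t &= \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{align}
```

Because v_t is an exponential moving average with an effective window of roughly 1/(1 − β₂) steps, a smaller β₂ makes the per-coordinate step size α/(√v̂_t + ε) react more quickly to recent gradients. This is consistent with the summary's observation that β₂ strongly influences ESP, although the equations alone do not establish the sawtooth pattern.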

Implications and Limitations

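• Implications:
  ◦ Identifies and analyzes the mechanism behind ESP, a pattern commonly seen when training with the Adam optimizer, improving understanding of the training process.
  ◦ Shows that ESP arises from the interaction between adaptive learning-rate adjustment and data shuffling.
  ◦ Suggests that tuning the β₂ parameter can either mitigate ESP or provide a regularization effect.
  ◦ Confirms the generality of ESP through a high-dimensional quadratic minimization task.

• Limitations:
  ◦ The study focuses on specific optimizers and a limited set of experimental settings, so further work is needed across a wider range of optimizers, datasets, and model architectures.
  ◦ The relationship between ESP and overfitting requires further analysis.
  ◦ The analysis rests mainly on empirical observation and lacks a theoretical explanation.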
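
The quadratic-minimization setting mentioned in the summary can be approximated with a small NumPy experiment. The following is a minimal sketch, not the authors' code from the linked repository: the problem sizes, learning rate, and the function name `run_adam_quadratic` are all illustrative assumptions. It runs a hand-written Adam on a least-squares objective, reshuffles the data without replacement each epoch, and records the mini-batch loss before every update so that within-epoch trends can be inspected.

```python
import numpy as np

def run_adam_quadratic(n_samples=256, dim=50, batch_size=32, epochs=20,
                       lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Minimize mean_i 0.5 * (a_i . w - b_i)^2 with Adam, reshuffling each epoch.

    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns the per-step mini-batch loss, measured before each update.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n_samples, dim))
    b = A @ rng.normal(size=dim) + 0.1 * rng.normal(size=n_samples)

    w = np.zeros(dim)   # parameters
    m = np.zeros(dim)   # first-moment estimate
    v = np.zeros(dim)   # second-moment estimate
    t = 0
    losses = []
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # shuffle, sample without replacement
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            residual = A[idx] @ w - b[idx]
            losses.append(0.5 * float(np.mean(residual ** 2)))
            grad = A[idx].T @ residual / len(idx)

            # Standard Adam update with bias correction.
            t += 1
            m = beta1 * m + (1 - beta1) * grad
            v = beta2 * v + (1 - beta2) * grad ** 2
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return np.array(losses)

losses = run_adam_quadratic()
steps_per_epoch = 256 // 32
# One row per epoch, one column per step within the epoch.
per_epoch = losses.reshape(-1, steps_per_epoch)
```

Comparing `per_epoch[:, 0]` with `per_epoch[:, -1]` (loss at the first and last step of each epoch) is one way to look for a sawtooth shape. Whether it is visible depends on the settings; per the summary, varying `beta2`, the batch size, and the shuffling scheme are the factors worth exploring.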
