Daily Arxiv

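This page curates artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.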

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

Accurate and scalable exchange-correlation with deep learning

AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation

Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models

Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs

GRAM: A Generative Foundation Reward Model for Reward Generalization

VideoMAR: Autoregressive Video Generation with Continuous Tokens

FrontendBench: A Benchmark for Evaluating LLMs on Front-End Development via Automatic Evaluation

Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models

No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

Serving Large Language Models on Huawei CloudMatrix384

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy

Refactoring Codebases through Library Design

TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding

Multi-Agent Language Models: Advancing Cooperation, Coordination, and Adaptation

Multi-Task Reward Learning from Human Ratings

Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Vision Transformers Don't Need Trained Registers

BIS Reasoning 1.0: The First Large-Scale Japanese Benchmark for Belief-Inconsistent Syllogistic Reasoning

LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles

CORA: Coalitional Rational Advantage Decomposition for Multi-Agent Policy Gradients

Supervised Quantum Machine Learning: A Future Outlook from Qubits to Enterprise Applications

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models

Think Twice before Adaptation: Improving Adaptability of DeepFake Detection via Online Test-Time Adaptation

Efficient Long CoT Reasoning in Small Language Models

Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning

MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion

J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization

Fractured Chain-of-Thought Reasoning

DreamGen: Unlocking Generalization in Robot Learning through Video World Models

UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions

Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs

Influential Bandits: Pulling an Arm May Change the Environment

SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning

Exploring Personalized Federated Learning Architectures for Violence Detection in Surveillance Videos

A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana

KANITE: Kolmogorov-Arnold Networks for ITE estimation

Beyond Propagation of Chaos: A Stochastic Algorithm for Mean Field Optimization

Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing

Adding Chocolate to Mint: Mitigating Metric Interference in Machine Translation

EgoBlind: Towards Egocentric Visual Assistance for the Blind

PsychBench: A comprehensive and professional benchmark for evaluating the performance of LLM-assisted psychiatric clinical practice

Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data

Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning

Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks

Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions

Perspective Transition of Large Language Models for Solving Subjective Tasks

Can LLMs Ask Good Questions?

Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review

SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation

Large Language Models for Automated Literature Review: An Evaluation of Reference Generation, Abstract Writing, and Review Composition

Multiclass Post-Earthquake Building Assessment Integrating High-Resolution Optical and SAR Satellite Imagery, Ground Motion, and Soil Data with Transformers

REVOLVE: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization

FLARE: Towards Universal Dataset Purification against Backdoor Attacks

Heterogeneous Relationships of Subjects and Shapelets for Semi-supervised Multivariate Series Classification

Contrast Similarity-Aware Dual-Pathway Mamba for Multivariate Time Series Node Classification

Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation

LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

The Epochal Sawtooth Phenomenon: Unveiling Training Loss Oscillations in Adam and Other Optimizers

Pap2Pat: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs

Deep Graph Anomaly Detection: A Survey and New Perspectives

A Novel Perturb-ability Score to Mitigate Evasion Adversarial Attacks on Flow-Based ML-NIDS

Style-Preserving Lip Sync via Audio-Aware Style Reference

Advancing oncology with federated learning: transcending boundaries in breast, lung, and prostate cancer. A systematic review

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Informed Correctors for Discrete Diffusion Models

RadioRAG: Online Retrieval-augmented Generation for Radiology Question Answering

A Systematic Survey of Natural Language Processing for the Greek Language

Predicting the Understandability of Computational Notebooks through Code Metrics Analysis

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction

The NordDRG AI Benchmark for Large Language Models

From Data-Driven to Purpose-Driven Artificial Intelligence: Systems Thinking for Data-Analytic Automation of Patient Care

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Entropy-based Exploration Conduction for Multi-step Reasoning

Solving Satisfiability Modulo Counting Exactly with Probabilistic Circuits

Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Optimal Transport for Probabilistic Circuits

OM4OV: Leveraging Ontology Matching for Ontology Versioning

Behaviour Planning: A Toolkit for Diverse Planning

Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition

"Generate" the Future of Work through AI: Empirical Evidence from Online Labor Markets

Dense SAE Latents Are Features, Not Bugs

Sekai: A Video Dataset towards World Exploration

Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers

AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability

Federated Learning for MRI-based BrainAGE: a multicenter study on post-stroke functional outcome prediction

GFLC: Graph-based Fairness-aware Label Correction for Fair Classification

The Compositional Architecture of Regret in Large Language Models

LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning

Parameterized Synthetic Text Generation with SimpleStories

Created by

Haebom

Authors

Lennart Finke, Chandan Sreedhara, Thomas Dooms, Mat Allen, Emerald Zhang, Juan Diego Rodriguez, Noa Nabeshima, Thomas Marshall, Dan Braun

Overview

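SimpleStories is a large-scale synthetic story dataset written in simple language, with 2 million samples each in English and Japanese. By parameterizing prompts at multiple levels of abstraction, the authors induce syntactic and semantic diversity while controlling story characteristics at scale. Ablation studies on a newly trained suite of models show improved sample efficiency and model interpretability compared with the TinyStories dataset. All components of model creation are open-sourced, with the aim of enabling new ways to study the end-to-end training process. As a byproduct, the work pushes the limit on the fewest-parameter language model that still outputs grammatical natural language.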
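The idea of parameterizing prompts at several levels of abstraction can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the parameter pools, the `StoryParams` fields, and the prompt wording are hypothetical and are not taken from the paper, which this summary does not describe in that detail. The sketch only shows how sampling one value per level (semantic theme, style, syntactic feature) yields varied but controllable generation prompts.

```python
import random
from dataclasses import dataclass

# Hypothetical parameter pools, invented for illustration only.
# The actual parameters used for SimpleStories are not listed in this summary.
LANGUAGES = ["English", "Japanese"]
THEMES = ["friendship", "courage", "curiosity"]            # high-level, semantic
STYLES = ["fairy tale", "diary entry", "bedtime story"]    # mid-level, stylistic
FEATURES = ["a question", "past tense narration", "a short list"]  # low-level, syntactic


@dataclass(frozen=True)
class StoryParams:
    """One sampled combination of prompt parameters (hypothetical schema)."""
    language: str
    theme: str
    style: str
    feature: str


def sample_params(rng: random.Random) -> StoryParams:
    """Draw one value per abstraction level, independently and uniformly."""
    return StoryParams(
        language=rng.choice(LANGUAGES),
        theme=rng.choice(THEMES),
        style=rng.choice(STYLES),
        feature=rng.choice(FEATURES),
    )


def build_prompt(p: StoryParams) -> str:
    """Render the sampled parameters into a generation prompt."""
    return (
        f"Write a short story in {p.language} using simple words. "
        f"The story is about {p.theme}, written as a {p.style}, "
        f"and it should include {p.feature}."
    )


def generate_prompts(n: int, seed: int = 0) -> list[str]:
    """Build n prompts reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [build_prompt(sample_params(rng)) for _ in range(n)]


if __name__ == "__main__":
    for prompt in generate_prompts(3):
        print(prompt)
```

Because each level is sampled separately, the distribution of any single story characteristic can be adjusted by editing one pool without touching the others, which is one plausible reading of "control at scale".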

Implications and Limitations

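• Implications:
  ◦ Provides SimpleStories, a large-scale synthetic story dataset written in simple language
  ◦ Controls story characteristics and secures diversity through prompt parameterization
  ◦ Improves sample efficiency and model interpretability over the existing TinyStories dataset
  ◦ Open-sources all components to support research on the end-to-end training process
  ◦ Shows that grammatical natural language can be generated with very few parameters

• Limitations:
  ◦ Because the dataset is synthetic, it may differ from real-world data (not stated explicitly in the paper, but an inherent limitation of synthetic data)
  ◦ The scope and details of the ablation studies are presented only briefly, so further validation is needed (a conjecture, since the paper does not provide details)
  ◦ Extensibility to other languages still needs to be examined (the paper covers only English and Japanese)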
