Daily Arxiv

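This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright of each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.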

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Dynaword: From One-shot to Continuously Developed Datasets

Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor

Proof2Hybrid: Automatic Mathematical Benchmark Synthesis for Proof-Centric Problems

Collaborative Chain-of-Agents for Parametric-Retrieved Knowledge Synergy

BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability

SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy

Managing Escalation in Off-the-Shelf Large Language Models

FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models

A Foundational Schema.org Mapping for a Legal Knowledge Graph: Representing Brazilian Legal Norms as FRBR Works

D3: Training-Free AI-Generated Video Detection Using Second-Order Features

SMART-Editor: A Multi-Agent Framework for Human-Like Design Editing with Structural Integrity

Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints

MoCHA: Advanced Vision-Language Reasoning with MoE Connector and Hierarchical Group Attention

Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation

Memorization in Fine-Tuned Large Language Models

From Entanglement to Alignment: Representation Space Decomposition for Unsupervised Time Series Domain Adaptation

The Xeno Sutra: Can Meaning and Value be Ascribed to an AI-Generated "Sacred" Text?

Post-Completion Learning for Language Models

Rainbow Noise: Stress-Testing Multimodal Harmful-Meme Detectors on LGBTQ Content

Equivariant Volumetric Grasping

SemiSegECG: A Multi-Dataset Benchmark for Semi-Supervised Semantic Segmentation in ECG Delineation

FedSA-GCL: A Semi-Asynchronous Federated Graph Learning Framework with Personalized Aggregation and Cluster-Aware Broadcasting

Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility

R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning

P3SL: Personalized Privacy-Preserving Split Learning on Heterogeneous Edge Devices

Document Haystack: A Long Context Multimodal Image/Document Understanding Vision LLM Benchmark

Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation

TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models

Divide-Then-Rule: A Cluster-Driven Hierarchical Interpolator for Attribute-Missing Graphs

$\texttt{Droid}$: A Resource Suite for AI-Generated Code Detection

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

Principled Foundations for Preference Optimization

Evaluating LLMs on Real-World Forecasting Against Expert Forecasters

STRUCTSENSE: A Task-Agnostic Agentic Framework for Structured Information Extraction with Human-In-The-Loop Evaluation and Benchmarking

S2FGL: Spatial Spectral Federated Graph Learning

AI4Research: A Survey of Artificial Intelligence for Scientific Research

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Reinforcing VLMs to Use Tools for Detailed Visual Reasoning Under Resource Constraints

Causally Steered Diffusion for Automated Video Counterfactual Generation

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study

ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark

ProRefine: Inference-Time Prompt Refinement with Textual Feedback

SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware Design

MetaGen Blended RAG: Unlocking Zero-Shot Precision for Specialized Domain Question-Answering

Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning

LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference

Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind

Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI

All-optical temporal integration mediated by subwavelength heat antennas

GRILL: Gradient Signal Restoration in Ill-Conditioned Layers to Enhance Adversarial Attacks on Autoencoders

JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers

FFCBA: Feature-based Full-target Clean-label Backdoor Attacks

Multilingual Performance Biases of Large Language Models in Education

NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models

Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis

Efficient Generative Model Training via Embedded Representation Warmup

Graph Attention-Driven Bayesian Deep Unrolling for Dual-Peak Single-Photon Lidar Imaging

Spectral Architecture Search for Neural Network Models

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model

ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems

Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance

Ensemble Learning for Large Language Models in Text and Code Generation: A Survey

Augmented Adversarial Trigger Learning

ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness

M2S: Multi-turn to Single-turn jailbreak in Red Teaming for LLMs

A Causal Framework for Aligning Image Quality Metrics and Deep Neural Network Robustness

PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset

DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping

Entropy-Lens: The Information Signature of Transformer Computations

CAMEF: Causal-Augmented Multi-Modality Event-Driven Financial Forecasting by Integrating Time Series Patterns and Salient Macroeconomic Announcements

Shaping Sparse Rewards in Reinforcement Learning: A Semi-supervised Approach

AdaMCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Multilingual Chain-of-Thought

AI-driven Wireless Positioning: Fundamentals, Standards, State-of-the-art, and Challenges

CHIRP: A Fine-Grained Benchmark for Open-Ended Response Evaluation in Vision-Language Models

Average-Reward Soft Actor-Critic

Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation

From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation

SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate

IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves

Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification

Effective AGM Belief Contraction: A Journey beyond the Finitary Realm (Technical Report)

Beyond Images: Adaptive Fusion of Visual and Textual Data for Food Classification

TAPAS: Fast and Automatic Derivation of Tensor Parallel Strategies for Large Neural Networks

KCR: Resolving Long-Context Knowledge Conflicts via Reasoning in LLMs

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

CADDesigner: Conceptual Design of CAD Models Based on General-Purpose Agent

Mind the Gap: The Divergence Between Human and LLM-Generated Tasks

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

Model-Based Soft Maximization of Suitable Metrics of Long-Term Human Power

Tiny-BioMoE: a Lightweight Embedding Model for Biosignal Analysis

The AlphaPhysics Term Rewriting System for Marking Algebraic Expressions in Physics Exams

Modeling Deontic Modal Logic in the s(CASP) Goal-directed Predicate Answer Set Programming System

Automatic Prompt Optimization for Knowledge Graph Construction: Insights from an Empirical Study

The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning

Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory

UFEval: Unified Fine-grained Evaluation with Task and Aspect Generalization

Novice Developers' Perspectives on Adopting LLMs for Software Development: A Systematic Literature Review

Created by

Haebom

Authors

Samuel Ferino, Rashina Hoda, John Grundy, Christoph Treude

Overview

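This paper is a systematic literature review of 80 research papers published between April 2022 and June 2025 on how novice developers (computer science/software engineering students and early-career developers with less than two years of experience) adopt software development tools based on large language models (LLMs). The review answers four research questions (RQs), which cover the motivations and methodologies of the studies, the software development tasks for which novice developers use LLMs, the benefits, challenges, and recommendations of using LLMs, and the limitations of the studies and directions for future research. The findings present future research directions and takeaways for software engineering researchers, educators, and developers, and the related materials are publicly available.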

Takeaways, Limitations

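• Takeaways:

  ◦ Provides a comprehensive understanding of how novice developers adopt LLM-based software development tools.

  ◦ Systematically categorizes and summarizes the benefits, challenges, and recommendations of using LLMs.

  ◦ Presents future research directions and takeaways for software engineering researchers, educators, and developers.

  ◦ The findings can inform the effective use of LLM-based tools and the design of educational plans.

• Limitations:

  ◦ The period and scope of the analyzed studies may be limited (April 2022 to June 2025).

  ◦ The research may carry a qualitative bias.

  ◦ The results may be skewed toward particular research designs or methodologies.

  ◦ Given how rapidly LLMs are developing, the findings may quickly become outdated.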
