Daily Arxiv
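This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.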
Interleaving Reasoning for Better Text-to-Image Generation
Barycentric Neural Networks and Length-Weighted Persistent Entropy Loss: A Green Geometric and Topological Framework for Function Approximation
Signal-Based Malware Classification Using 1D CNNs
Toward a Metrology for Artificial Intelligence: Hidden-Rule Environments and Reinforcement Learning
BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
No Thoughts Just AI: Biased LLM Hiring Recommendations Alter Human Decision Making and Limit Human Autonomy
What Fundamental Structure in Reward Functions Enables Efficient Sparse-Reward Learning?
HodgeFormer: Transformers for Learnable Operators on Triangular Meshes through Data-Driven Hodge Matrices
CAT: Causal Attention Tuning For Injecting Fine-grained Causal Knowledge into Large Language Models
Pilot Study on Generative AI and Critical Thinking in Higher Education Classrooms
ZkLoRA: Fine-Tuning Large Language Models with Verifiable Security via Zero-Knowledge Proofs
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Ultra-Low-Latency Spiking Neural Networks with Temporal-Dependent Integrate-and-Fire Neuron Model for Objects Detection
Attacking LLMs and AI Agents: Advertisement Embedding Attacks Against Large Language Models
A Survey of Threats Against Voice Authentication and Anti-Spoofing Systems
Trust but Verify! A Survey on Verification Design for Test-time Scaling
Research on Conversational Recommender System Considering Consumer Types
A Systematic Literature Review of Retrieval-Augmented Generation: Techniques, Metrics, and Challenges
Grid-Agent: An LLM-Powered Multi-Agent System for Power Grid Control
Enhancing Dialogue Annotation with Speaker Characteristics Leveraging a Frozen LLM
A Mixed User-Centered Approach to Enable Augmented Intelligence in Intelligent Tutoring Systems: The Case of MathAIde app
Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs
MoRPI-PINN: A Physics-Informed Framework for Mobile Robot Pure Inertial Navigation
Conditional Video Generation for High-Efficiency Video Compression
Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges
Grounding DINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models
Language Models Might Not Understand You: Evaluating Theory of Mind via Story Prompting
From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations
HueManity: Probing Fine-Grained Visual Perception in MLLMs
Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments
Localizing Persona Representations in LLMs
Multi-output Classification using a Cross-talk Architecture for Compound Fault Diagnosis of Motors in Partially Labeled Condition
SCIZOR: A Self-Supervised Approach to Data Curation for Large-Scale Imitation Learning
Is Your LLM Overcharging You? Tokenization, Transparency, and Incentives
Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts
Visuospatial Cognitive Assistant
Overflow Prevention Enhances Long-Context Recurrent LLMs
GRADA: Graph-based Reranking against Adversarial Documents Attack
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Comparative Analysis of Lightweight Deep Learning Models for Memory-Constrained Devices
Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?
Llama-Nemotron: Efficient Reasoning Models
Tripartite-GraphRAG via Plugin Ontologies
DMS-Net:Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification
Enhancing Traffic Incident Response through Sub-Second Temporal Localization with HybridMamba
Audio-centric Video Understanding Benchmark without Text Shortcut
The Model Hears You: Audio Language Model Deployments Should Consider the Principle of Least Privilege
Involution and BSConv Multi-Depth Distillation Network for Lightweight Image Super-Resolution
DistJoin: A Decoupled Join Cardinality Estimator based on Adaptive Neural Predicate Modulation
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention
Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection
VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification
Cardiverse: Harnessing LLMs for Novel Card Game Prototyping
TrojanRobot: Physical-world Backdoor Attacks Against VLM-based Robotic Manipulation
Automatically Detecting Online Deceptive Patterns
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Solving Truly Massive Budgeted Monotonic POMDPs with Oracle-Guided Meta-Reinforcement Learning
CTourLLM: Enhancing LLMs with Chinese Tourism Knowledge
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
MSRFormer: Road Network Representation Learning using Multi-scale Feature Fusion of Heterogeneous Spatial Interactions
Attention of a Kiss: Exploring Attention Maps in Video Diffusion for XAIxArts
EvoEmo: Towards Evolved Emotional Policies for LLM Agents in Multi-Turn Negotiation
AI-SearchPlanner: Modular Agentic Search via Pareto-Optimal Multi-Objective Reinforcement Learning
MaRVL-QA: A Benchmark for Mathematical Reasoning over Visual Landscapes
Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
CountQA: How Well Do MLLMs Count in the Wild?
ASP-FZN: A Translation-based Constraint Answer Set Solver
MedGellan: LLM-Generated Medical Guidance to Support Physicians
Modeling the Diachronic Evolution of Legal Norms: An LRMoo-Based, Component-Level, Event-Centric Approach to Legal Knowledge Graphs
Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs
GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
Automatic Reward Shaping from Confounded Offline Data
Visualizing Thought: Conceptual Diagrams Enable Robust Combinatorial Planning in LMMs
COMMA: A Communicative Multimodal Multi-Agent Benchmark
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents
Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism
Self-Emotion-Mediated Exploration in Artificial Intelligence Mirrors: Findings from Cognitive Psychology
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
ACE and Diverse Generalization via Selective Disagreement
Bringing Multi-Modal Multi-Task Federated Foundation Models to Education Domain: Prospects and Challenges
ImportSnare: Directed "Code Manual" Hijacking in Retrieval-Augmented Code Generation
Breaking Android with AI: A Deep Dive into LLM-Powered Exploitation
Accelerating Local AI on Consumer GPUs: A Hardware-Aware Dynamic Strategy for YOLOv10s
GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models
Multimodal Contrastive Pretraining of CBCT and IOS for Enhanced Tooth Segmentation
Uncovering Scaling Laws for Large Language Models via Inverse Problems
Active Membership Inference Test (aMINT): Enhancing Model Auditability with Multi-Task Learning
Deep Learning-Based Burned Area Mapping Using Bi-Temporal Siamese Networks and AlphaEarth Foundation Datasets
Small Open Models Achieve Near Parity with Large Models in Low Resource Literary Translation at a Fraction of the Cost
Forecasting Russian Equipment Losses Using Time Series and Deep Learning Models
Enhanced SegNet with Integrated Grad-CAM for Interpretable Retinal Layer Segmentation in OCT Images
Individual utilities of life satisfaction reveal inequality aversion unrelated to political alignment
XSRD-Net: EXplainable Stroke Relapse Detection
Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning
What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects
Spectral and Rhythm Feature Performance Evaluation for Category and Class Level Audio Classification with Deep Convolutional Neural Networks
Enhancing Online Learning by Integrating Biosensors and Multimodal Learning Analytics for Detecting and Predicting Student Behavior: A Review
Spectral Masking and Interpolation Attack (SMIA): A Black-box Adversarial Attack against Voice Authentication and Anti-Spoofing Systems
ACE and Diverse Generalization via Selective Disagreement
Created by
Haebom
Authors
Oliver Daniels, Stuart Armstrong, Alexandre Maranhão, Mahirah Fairuz Rahman, Benjamin M. Marlin, Rebecca Gorman
Summary
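This paper proposes ACE, a new method for addressing the vulnerability of deep neural networks to spurious correlations. Prior work has mostly focused on incomplete spurious correlations, relying on access to labeled instances that break the correlation; when the spurious correlation is complete, however, the correct generalization is fundamentally underspecified. ACE addresses this underspecification by learning a set of concepts that are consistent with the training data but make different predictions on a subset of new unlabeled inputs. Using a self-training scheme that encourages confident, selective disagreement, ACE matches or outperforms existing methods on a range of complete spurious correlation benchmarks and remains robust to incomplete spurious correlations. ACE is also easier to configure than prior methods, allowing prior knowledge to be encoded directly and enabling principled unsupervised model selection. In an early application to language-model alignment, ACE achieved competitive performance on a measurement tampering detection benchmark without access to untrusted measurements.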
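To make the idea of "confident, selective disagreement" more concrete, here is a minimal toy sketch in Python. It is not the loss used in the paper: the function names, the `mix_rate` parameter, and the choice to rank unlabeled inputs by disagreement probability are all assumptions made for illustration. It assumes two binary prediction heads, requires both to fit the labeled data, and rewards disagreement only on the fraction of unlabeled inputs where the heads already disagree most.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy of predicted probability p against a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def disagreement_prob(p1, p2):
    """Probability that two independent binary heads output different labels."""
    return p1 * (1.0 - p2) + (1.0 - p1) * p2


def selective_disagreement_loss(labeled, unlabeled, mix_rate=0.5, eps=1e-12):
    """Toy objective (illustrative assumption, not the paper's formulation).

    labeled:   list of (p1, p2, y), head probabilities plus a 0/1 label
    unlabeled: list of (p1, p2), head probabilities on unlabeled inputs
    mix_rate:  hypothetical fraction of unlabeled inputs on which the
               heads are encouraged to disagree
    Returns (total, fit, disagree).
    """
    # Both heads must stay consistent with the training labels.
    fit = sum(
        binary_cross_entropy(p1, y) + binary_cross_entropy(p2, y)
        for p1, p2, y in labeled
    ) / max(len(labeled), 1)

    # Rank unlabeled inputs by current disagreement probability and
    # penalize lack of disagreement only on the top mix_rate fraction.
    scores = sorted(
        (disagreement_prob(p1, p2) for p1, p2 in unlabeled), reverse=True
    )
    k = int(mix_rate * len(scores))
    selected = scores[:k]
    disagree = sum(-math.log(max(s, eps)) for s in selected) / max(k, 1)

    return fit + disagree, fit, disagree
```

In this sketch, setting `mix_rate` to 0 reduces the objective to ordinary supervised fitting of both heads, while larger values ask the heads to disagree on more of the unlabeled data. A real implementation would compute these terms on differentiable model outputs and minimize them by gradient descent.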
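Takeaways, Limitations
• Takeaways:
  ◦ Presents a new approach to the complete spurious correlation problem (the ACE algorithm).
  ◦ Matches or outperforms existing methods across a range of benchmarks.
  ◦ Remains robust to incomplete spurious correlations.
  ◦ Allows prior knowledge to be encoded and supports principled unsupervised model selection.
  ◦ Achieves competitive performance in language-model alignment without access to untrusted measurements.
• Limitations:
  ◦ Important limitations remain (the specific limitations are not explicitly stated in the paper).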