Daily Arxiv
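This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright in each paper belongs to its authors and their institutions; please credit the source when sharing.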
Merge Kernel for Bayesian Optimization on Permutation Space
Demographic-aware fine-grained classification of pediatric wrist fractures
Generative Multi-Target Cross-Domain Recommendation
ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models
A Simple Baseline for Stable and Plastic Neural Networks
WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
How Not to Detect Prompt Injections with an LLM
Critiques of World Models
The role of large language models in UI/UX design: A systematic literature review
LearnLens: LLM-Enabled Personalised, Curriculum-Grounded Feedback with Educators in the Loop
STACK: Adversarial Attacks on LLM Safeguard Pipelines
ZonUI-3B: A Lightweight Vision-Language Model for Cross-Resolution GUI Grounding
Understanding Reasoning in Thinking Language Models via Steering Vectors
Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation
TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis
SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet
Exploring Graph Representations of Logical Forms for Language Modeling
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs
CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors
Hands-On: Segmenting Individual Signs from Continuous Sequences
Can we ease the Injectivity Bottleneck on Lorentzian Manifolds for Graph Neural Networks?
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation
AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results
An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model
Evaluating link prediction: New perspectives and recommendations
Learning to Reason at the Frontier of Learnability
Stonefish: Supporting Machine Learning Research in Marine Robotics
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
On the Transfer of Knowledge in Quantum Algorithms
Code Readability in the Age of Large Language Models: An Industrial Case Study from Atlassian
Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude
ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Consistency of Responses and Continuations Generated by Large Language Models on Social Media
From Code to Compliance: Assessing ChatGPT's Utility in Designing an Accessible Webpage -- A Case Study
Temporal reasoning for timeline summarisation in social media
Invisible Textual Backdoor Attacks based on Dual-Trigger
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Two-Stage Pretraining for Molecular Property Prediction in the Wild
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
An Approach for Auto Generation of Labeling Functions for Software Engineering Chatbots
Bridging Local and Global Knowledge via Transformer in Board Games
Entropy Loss: An Interpretability Amplifier of 3D Object Detection Network for Intelligent Driving
FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Visual Grounding Methods for Efficient Interaction with Desktop Graphical User Interfaces
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
SecurePose: Automated Face Blurring and Human Movement Kinematics Extraction from Videos Recorded in Clinical Settings
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges
From Roots to Rewards: Dynamic Tree Reasoning with RL
Illuminating the Three Dogmas of Reinforcement Learning under Evolutionary Light
Instance space analysis of the capacitated vehicle routing problem
Multi-Agent LLMs as Ethics Advocates for AI-Based Systems
GATSim: Urban Mobility Simulation with Generative Agents
Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know?
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Strategic Reflectivism In Intelligent Systems
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator
What the F*ck Is Artificial General Intelligence?
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios
To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization
BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems
UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception
CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis
Toward Temporal Causal Representation Learning with Tensor Decomposition
Kolmogorov Arnold Networks (KANs) for Imbalanced Data - An Empirical Perspective
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) Track
Multi-Centre Validation of a Deep Learning Model for Scoliosis Assessment
The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits
Edge Intelligence with Spiking Neural Networks
VLA-Mark: A cross modal watermark for large vision-language alignment model
Noradrenergic-inspired gain modulation attenuates the stability gap in joint training
A multi-strategy improved snake optimizer for 3-dimensional UAV path planning and engineering problems
Photonic Fabric Platform for AI Accelerators
OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models
A segmented robot grasping perception neural network for edge AI
Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation
Exploiting Primacy Effect To Improve Large Language Models
Generalist Forecasting with Frozen Video Models via Latent Diffusion
Convergent transformations of visual representation in brains and models
Preprint: Did I Just Browse A Website Written by LLMs?
The Levers of Political Persuasion with Conversational AI
Political Leaning and Politicalness Classification of Texts
Self-supervised learning on gene expression data
Using LLMs to identify features of personal and professional skills in an open-response situational judgment test
Bridging Local and Global Knowledge via Transformer in Board Games
Created by
Haebom
Authors
Yan-Ru Ju, Tai-Lin Wu, Chung-Chin Shih, Ti-Rong Wu
Abstract
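AlphaZero has achieved superhuman performance in board games, but it has limitations in scenarios that require a comprehensive understanding of the whole board, such as recognizing long-sequence patterns in Go. This paper proposes ResTNet, a network that interleaves residual blocks and Transformer blocks to bridge local and global knowledge. ResTNet raises win rates across several board games (9x9 Go: 54.6% → 60.8%; 19x19 Go: 53.6% → 60.9%; 19x19 Hex: 50.4% → 58.0%) and handles patterns on the 19x19 board. It reduces the mean squared error for circular pattern recognition from 2.58 to 1.07, lowers the attack probability against an adversarial program from 70.44% to 23.91%, and improves ladder pattern recognition accuracy from 59.15% to 80.01%. Attention map visualizations show that it captures important game concepts in both Go and Hex, offering insight into AlphaZero's decision-making process. ResTNet is a promising approach to integrating local and global knowledge and paves the way for more effective AlphaZero-based algorithms in board games. The code is available at https://rlg.iis.sinica.edu.tw/papers/restnet.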
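To make the local versus global distinction concrete, here is a minimal pure-Python sketch of interleaving the two kinds of block. It is not the authors' ResTNet: the function names (`local_block`, `global_block`, `interleaved`, `reach`), the fixed unlearned weights, the scalar per-cell features, and the block orderings are all assumptions chosen for illustration. It only shows how far the influence of a single stone spreads after each kind of block.

```python
import math

def local_block(board):
    """Stand-in for a residual conv block: skip connection + mean of the 3x3 window."""
    n = len(board)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            window = [board[rr][cc]
                      for rr in range(max(0, r - 1), min(n, r + 2))
                      for cc in range(max(0, c - 1), min(n, c + 2))]
            out[r][c] = board[r][c] + sum(window) / len(window)
    return out

def global_block(board):
    """Stand-in for a Transformer block: skip connection + softmax attention over all cells."""
    n = len(board)
    flat = [v for row in board for v in row]
    out = []
    for q in flat:
        scores = [q * k for k in flat]              # scalar query-key products
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        attended = sum(w * v for w, v in zip(weights, flat)) / total
        out.append(q + attended)
    return [out[i * n:(i + 1) * n] for i in range(n)]

def interleaved(board, pattern):
    """Apply blocks in order; pattern is a list of 'res' / 'transformer' (hypothetical labels)."""
    for kind in pattern:
        if kind == "res":
            board = local_block(board)
        elif kind == "transformer":
            board = global_block(board)
        else:
            raise ValueError(f"unknown block kind: {kind}")
    return board

def reach(board):
    """Number of cells with non-zero activation."""
    return sum(1 for row in board for v in row if v != 0.0)

# One stone in the centre of an empty 9x9 board.
board = [[0.0] * 9 for _ in range(9)]
board[4][4] = 1.0
for pattern in (["res"], ["res", "res"], ["transformer"], ["res", "transformer"]):
    print(pattern, reach(interleaved(board, pattern)))
```

With these toy blocks, each local block widens the stone's influence by one cell in every direction (9 cells, then 25), while a single attention block reaches all 81 cells at once. That is the intuition behind mixing the two block types for patterns that span the board.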
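Takeaways, Limitations
• Takeaways:
◦ ResTNet is a new architecture that addresses AlphaZero's limitations and improves performance in board games.
◦ It offers a way to integrate local and global information effectively.
◦ It improves long-sequence pattern recognition and helps explain AlphaZero's decision-making process.
◦ The performance gains were verified experimentally across several board games.
• Limitations:
◦ Further research is needed on whether the proposed method applies to all kinds of board games.
◦ ResTNet's performance gains may be skewed toward particular games or patterns.
◦ Its ability to generalize to more complex and diverse patterns needs further validation.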
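The figures quoted in the abstract can be put side by side as absolute and relative changes. The sketch below only rearranges those reported numbers; the labels and the `change` helper are illustrative names, and the summary does not say which baseline or opponent each figure was measured against.

```python
# (before, after) pairs exactly as quoted in the abstract.
REPORTED = {
    "9x9 Go win rate (%)": (54.6, 60.8),
    "19x19 Go win rate (%)": (53.6, 60.9),
    "19x19 Hex win rate (%)": (50.4, 58.0),
    "Circular pattern MSE": (2.58, 1.07),
    "Attack probability (%)": (70.44, 23.91),
    "Ladder pattern accuracy (%)": (59.15, 80.01),
}

def change(before, after):
    """Return (absolute change, relative change in percent of the 'before' value)."""
    delta = after - before
    return delta, 100.0 * delta / before

for name, (before, after) in REPORTED.items():
    delta, rel = change(before, after)
    print(f"{name:30s} {before:7.2f} -> {after:7.2f}  {delta:+7.2f}  ({rel:+6.1f}%)")
```

For the rows measured in percent, the absolute change is in percentage points. The error and attack-probability rows improve by going down, so a negative change is the desired direction there.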