Daily Arxiv

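This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.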

Are Vision Transformer Representations Semantically Meaningful? A Case Study in Medical Imaging

Probing Evaluation Awareness of Language Models

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

BranchNet: A Neuro-Symbolic Learning Framework for Structured Multi-Class Classification

GPU-based complete search for nonlinear minimization subject to bounds

Enhanced Generative Model Evaluation with Clipped Density and Coverage

Tuning without Peeking: Provable Privacy and Generalization Bounds for LLM Post-Training

ECCV 2024 W-CODA: 1st Workshop on Multimodal Perception and Comprehension of Corner Cases in Autonomous Driving

Towards culturally-appropriate conversational AI for health in the majority world: An exploratory study with citizens and professionals in Latin America

AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness

Exploring Advanced LLM Multi-Agent Systems Based on Blackboard Architecture

Relational Causal Discovery with Latent Confounders

GPT, But Backwards: Exactly Inverting Language Model Outputs

Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling

Deep Recommender Models Inference: Automatic Asymmetric Data Flow Optimization

Comparing Optimization Algorithms Through the Lens of Search Behavior Analysis

AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training

Autoregressive Image Generation with Linear Complexity: A Spatial-Aware Decay Perspective

GradMetaNet: An Equivariant Architecture for Learning on Gradients

Customized Exploration of Landscape Features Driving Multi-Objective Combinatorial Optimization Performance

Depth Anything at Any Condition

Tile and Slide : A New Framework for Scaling NeRF from Local to Global 3D Earth Observation

Prompt Guidance and Human Proximal Perception for HOT Prediction with Regional Joint Loss

Enhanced Influence-aware Group Recommendation for Online Media Propagation

Survivability of Backdoor Attacks on Unconstrained Face Recognition Systems

Data Agent: A Holistic Architecture for Orchestrating Data+AI Ecosystems

Autonomous AI Surveillance: Multimodal Deep Learning for Cognitive and Behavioral Monitoring

Exploring Classical Piano Performance Generation with Expressive Music Variational AutoEncoder

Real-Time Emergency Vehicle Siren Detection with Efficient CNNs on Embedded Hardware

Self-Guided Process Reward Optimization with Masked Step Advantage for Process Reinforcement Learning

Crafting Hanzi as Narrative Bridges: An AI Co-Creation Workshop for Elderly Migrants

AI and Remote Sensing for Resilient and Sustainable Built Environments: A Review of Current Methods, Open Data and Future Directions

Chargax: A JAX Accelerated EV Charging Simulator

Following the Clues: Experiments on Person Re-ID using Cross-Modal Intelligence

Integrating Traditional and Deep Learning Methods to Detect Tree Crowns in Satellite Images

Crop Pest Classification Using Deep Learning Techniques: A Review

BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments

Epistemic Scarcity: The Economics of Unresolvable Unknowns

Evaluating the Effectiveness of Direct Preference Optimization for Personalizing German Automatic Text Simplifications for Persons with Intellectual Disabilities

Zero-Incentive Dynamics: a look at reward sparsity through the lens of unrewarded subgoals

NOCTIS: Novel Object Cyclic Threshold based Instance Segmentation

Quantum-Assisted Automatic Path-Planning for Robotic Quality Inspection in Industry 4.0

Tensor Program Optimization for the RISC-V Vector Extension Using Probabilistic Programs

EdgeLoRA: An Efficient Multi-Tenant LLM Serving System on Edge Devices

Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems

DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal

Penalizing Transparency? How AI Disclosure and Author Demographics Shape Human and AI Judgments About Writing

Evaluating LLM Agent Collusion in Double Auctions

Age Sensitive Hippocampal Functional Connectivity: New Insights from 3D CNNs and Saliency Mapping

Medical-Knowledge Driven Multiple Instance Learning for Classifying Severe Abdominal Anomalies on Prenatal Ultrasound

Distributional Soft Actor-Critic with Diffusion Policy

RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

User-guided Generative Source Separation

LEDOM: An Open and Fundamental Reverse Language Model

Reasoner for Real-World Event Detection: Scaling Reinforcement Learning via Adaptive Perplexity-Aware Sampling Strategy

ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks

Neural Hamiltonian Operator

VLAD: A VLM-Augmented Autonomous Driving Framework with Hierarchical Planning and Interpretable Decision Process

Rethinking All Evidence: Enhancing Trustworthy Retrieval-Augmented Generation via Conflict-Driven Summarization

AI Meets Maritime Training: Precision Analytics for Enhanced Safety and Performance

PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning

LLM-based Realistic Safety-Critical Driving Video Generation

GAIus: Combining Genai with Legal Clauses Retrieval for Knowledge-based Assistant

Beyond First-Order: Training LLMs with Stochastic Conjugate Subgradients and AdamW

Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration

Search-Based Robot Motion Planning With Distance-Based Adaptive Motion Primitives

Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning

Geometry-aware 4D Video Generation for Robot Manipulation

AI-guided digital intervention with physiological monitoring reduces intrusive memories after experimental trauma

Empirical Analysis Of Heuristic and Approximation Algorithms for the The Mutual-Visibility Problem

Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services

FAIR-MATCH: A Multi-Objective Framework for Bias Mitigation in Reciprocal Dating Recommendations

Quantifying Student Success with Generative AI: A Monte Carlo Simulation Informed by Systematic Review

Epitome: Pioneering an Experimental Platform for AI-Social Science Integration

Automated Vehicles Should be Connected with Natural Language

A Data Science Approach to Calcutta High Court Judgments: An Efficient LLM and RAG-powered Framework for Summarization and Similar Cases Retrieval

Prompt Mechanisms in Medical Imaging: A Comprehensive Survey

XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science

Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis

Long-Sequence Memory with Temporal Kernels and Dense Hopfield Functionals

Can AI be Consentful?

Text Detoxification: Data Efficiency, Semantic Preservation and Model Generalization

Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals

Data Classification with Dynamically Growing and Shrinking Neural Networks

Can Argus Judge Them All? Comparing VLMs Across Domains

Fast AI Model Splitting over Edge Networks

Fast Clifford Neural Layers

On-Policy Optimization of ANFIS Policies Using Proximal Policy Optimization

Learning to Segment for Vehicle Routing Problems

Systemic Constraints of Undecidability

Research on Low-Latency Inference and Training Efficiency Optimization for Graph Neural Network and Large Language Model-Based Recommendation Systems

Data-driven Insights for Informed Decision-Making: Applying LSTM Networks for Robust Electricity Forecasting in Libya

An Uncertainty-Aware Dynamic Decision Framework for Progressive Multi-Omics Integration in Classification Tasks

PathCoT: Chain-of-Thought Prompting for Zero-shot Pathology Visual Reasoning

HPC-AI Coupling Methodology for Scientific Applications

Hello Afrika: Speech Commands in Kinyarwanda

A Systematic Review of Security Vulnerabilities in Smart Home Devices and Mitigation Techniques

Refining Gelfond Rationality Principle Towards More Comprehensive Foundational Principles for Answer Set Semantics

Joint Matching and Pricing for Crowd-shipping with In-store Customers

Program of Equations Thoughts to Solve Algebra Word Problems

Created by

Haebom

Author

Yunze Lin

Overview

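This paper addresses solving algebra word problems (AWPs) with large language models (LLMs). Existing Chain-of-Thought methods have achieved good results through step-by-step reasoning, but the LLM's own calculation errors reduce accuracy. To address this, the paper proposes Program of Equations Thoughts (POET), which separates equation generation from code generation. POET delegates complex computation to a Python interpreter, which avoids the LLM's calculation errors. The paper also proposes Zero-shot POET, which uses a manually designed template to generate Python code directly and solve the problem in a single step. The proposed method achieves accuracies of 95.3% and 98.0% on the PEN and ALG514 datasets, respectively, setting a new state of the art (SOTA), and also reaches a SOTA result of 95.5% on the DRAW-1K dataset.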
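The two-stage idea (write the equations first, then let Python do the arithmetic) can be illustrated with a minimal sketch. This is not the paper's prompt, template, or code: the word problem, the `EQUATIONS` encoding, and the `solve_2x2` helper are all hypothetical, and the equations an LLM would produce in the first stage are hard-coded here.

```python
from fractions import Fraction

# Stage 1 (assumed, hard-coded): equations an LLM might write for the word problem
# "Two numbers sum to 10 and differ by 2. What are the numbers?"
# Each equation is encoded as (coefficient of x, coefficient of y, right-hand side).
EQUATIONS = [
    (1, 1, 10),   # x + y = 10
    (1, -1, 2),   # x - y = 2
]


def solve_2x2(equations):
    """Stage 2: solve a 2x2 linear system exactly with Cramer's rule.

    The arithmetic runs in the Python interpreter, not in the language model,
    so the result does not depend on the model computing correctly.
    """
    (a1, b1, c1), (a2, b2, c2) = equations
    det = Fraction(a1 * b2 - a2 * b1)
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1) / det
    y = Fraction(a1 * c2 - a2 * c1) / det
    return x, y


x, y = solve_2x2(EQUATIONS)
print(f"x = {x}, y = {y}")  # prints "x = 6, y = 4"
```

Exact `Fraction` arithmetic is used so that answers such as 1/3 are not rounded; a real pipeline would more likely hand the equations to a symbolic solver.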

Implications and Limitations

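• Implications:
  ◦ Presents a new methodology (POET and Zero-shot POET) that improves the accuracy of LLM-based AWP solving.
  ◦ Achieves SOTA performance by effectively addressing the computational weakness of LLMs.
  ◦ Improves the transparency of the problem-solving process by separating step-by-step reasoning from code generation.
  ◦ Zero-shot POET suggests that high accuracy is attainable without additional training.

• Limitations:
  ◦ Zero-shot POET relies on manually designed templates, so its generalization needs further study.
  ◦ Dependence on the Python interpreter limits use in environments built on other programming languages.
  ◦ Generalization to more complex problem types still needs to be evaluated and improved.
  ◦ Further experiments on a wider range of datasets are needed to verify robustness.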
