haebom
Daily Arxiv
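This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.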
The Cross-Lingual Cost: Retrieval Biases in RAG over Arabic-English Corpora
Created by
Haebom
Authors
Chen Amiraz, Yaroslav Fyodorov, Elad Haramaty, Zohar Karnin, Liane Lewin-Eytan
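Overview
This paper studies Arabic-English cross-lingual Retrieval-Augmented Generation (RAG), using a domain-specific benchmark built from a real-world corporate dataset to address the limitations of prior work. In particular, it finds that retrieval performance degrades when the user query and the supporting document are in different languages, and it proposes two simple retrieval strategies that improve performance.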
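Implications and Limitations
• Implications:
  ◦ Identifies retrieval as a critical bottleneck in cross-lingual RAG.
  ◦ Confirms the difficulty of cross-lingual retrieval in a domain-specific setting.
  ◦ Proposes simple strategies for improving retrieval performance (equal retrieval from each language, query translation).
  ◦ Shows the potential for improving multilingual retrieval in real-world RAG applications.
• Limitations:
  ◦ Further research is needed on whether the proposed retrieval strategies generalize to all language pairs.
  ◦ The study may be limited to a specific domain and language pair.
  ◦ Whether the improved retrieval strategies remain effective in more complex settings still needs to be verified.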
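As a rough illustration of the "equal retrieval from each language" idea mentioned above, the sketch below reserves the same number of result slots for Arabic and English documents instead of taking a single top-k over the mixed pool. This summary does not describe the paper's actual implementation, so the `Hit` type, the function names, and the scores are all hypothetical and chosen only to show how a retriever that favors one language can crowd out the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A retrieved candidate (hypothetical structure, not from the paper)."""
    doc_id: str
    lang: str     # language tag, e.g. "ar" or "en"
    score: float  # retriever similarity score, higher is better


def top_k(hits, k):
    """Plain top-k over a mixed-language candidate pool."""
    return sorted(hits, key=lambda h: h.score, reverse=True)[:k]


def balanced_top_k(hits, k, langs=("ar", "en")):
    """Give each language an equal share of the k slots, then merge by score."""
    per_lang = k // len(langs)
    picked = []
    for lang in langs:
        pool = [h for h in hits if h.lang == lang]
        picked.extend(top_k(pool, per_lang))
    return sorted(picked, key=lambda h: h.score, reverse=True)


# Made-up scores for an English query: same-language documents score higher,
# so a plain top-k never surfaces the Arabic documents.
hits = [
    Hit("en-1", "en", 0.91),
    Hit("en-2", "en", 0.88),
    Hit("en-3", "en", 0.85),
    Hit("en-4", "en", 0.80),
    Hit("ar-1", "ar", 0.62),
    Hit("ar-2", "ar", 0.55),
]

plain = top_k(hits, 4)
balanced = balanced_top_k(hits, 4)

print([h.doc_id for h in plain])
print([h.doc_id for h in balanced])
```

In this toy example the plain top-4 contains only English documents, while the balanced variant keeps two from each language, so a supporting document written in the other language still reaches the generator.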