Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated using Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Unraveling Indirect In-Context Learning Using Influence Functions

Created by
  • Haebom

Author

Hadi Askari, Shivanshu Gupta, Terry Tong, Fei Wang, Anshuman Chhabra, Muhao Chen

Outline

In this study, we introduce Indirect In-Context Learning, a novel paradigm for generalized In-Context Learning (ICL). In Indirect ICL, we explore demo selection strategies tailored to two real-world scenarios: Mixture of Tasks and Noisy ICL. We systematically evaluate Influence Functions (IFs) as a selection tool for these settings, highlighting their potential to better capture the informativeness of examples within the demo pool.

For the Mixture of Tasks setting, we draw demos from 28 diverse tasks, including MMLU, BigBench, StrategyQA, and CommonsenseQA. Combining BertScore-Recall (BSR) with an IF surrogate model further improves performance, yielding mean absolute accuracy gains of 0.37% and 1.45% in the 3-shot and 5-shot settings, respectively, over traditional ICL metrics.

In the Noisy ICL setting, we investigate scenarios where demos are mislabeled or subject to adversarial noise. Experimental results show that reweighting traditional ICL selectors (BSR and Cosine Similarity) with an IF-based selector improves accuracy by an average of 2.90% for Cosine Similarity and 2.94% for BSR on the noisy GLUE benchmark. Under adversarial subsetting, task-agnostic demo selection using IFs helps mitigate backdoor attacks, reducing the attack success rate by 32.89% compared to task-aware methods.

In summary, we propose a robust framework for demo selection that generalizes beyond traditional ICL and provide valuable insights into the role of IFs in Indirect ICL.
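The reweighting idea above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's implementation: the function name `select_demos`, the blending weight `alpha`, and the precomputed `influence_scores` array are assumptions, and the actual computation of influence scores with a surrogate model is not shown.

```python
import numpy as np

def select_demos(query_emb, demo_embs, influence_scores, k=3, alpha=0.5):
    """Rank candidate demos by blending a traditional ICL selector score
    (here, cosine similarity to the query) with a per-demo influence score,
    then return the indices of the top-k demos.

    `influence_scores` stands in for influence estimates obtained from a
    surrogate model; how they are computed is outside this sketch.
    """
    # Cosine similarity between the query and each candidate demo.
    q = query_emb / np.linalg.norm(query_emb)
    d = demo_embs / np.linalg.norm(demo_embs, axis=1, keepdims=True)
    sim = d @ q

    # Min-max normalize both signals to [0, 1] so the blend is scale-free.
    def minmax(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    combined = alpha * minmax(sim) + (1 - alpha) * minmax(influence_scores)
    return np.argsort(combined)[::-1][:k]
```

With `alpha=1.0` this reduces to plain similarity-based selection; lowering `alpha` shifts weight toward the influence signal, which is what helps filter out mislabeled or adversarial demos in the noisy settings.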

Takeaways, Limitations

Takeaways:
Introduces an Indirect In-Context Learning (ICL) paradigm that leverages Influence Functions (IFs) for demo selection.
Combining BertScore-Recall (BSR) with IFs improves performance in the Mixture of Tasks setting (up to a 1.45% absolute accuracy gain).
Reweighting existing ICL selectors (BSR, Cosine Similarity) with IFs improves accuracy in the Noisy ICL setting (up to 2.94%).
Demonstrates the usefulness of IFs for task-agnostic demo selection in mitigating backdoor attacks (32.89% reduction in attack success rate).
Limitations:
Further validation of the generalizability of the proposed method is needed.
Evaluation against a broader range of noise types and adversarial attacks is required.
The computational cost of IF-based selection needs analysis and improvement.