Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Created by

Haebom

Author

David Chanin, James Wilken-Smith, Tomáš Dulka, Hardik Bhatnagar, Satvik Golechha, Joseph Bloom

Outline

This paper studies Sparse Autoencoders (SAEs), which aim to decompose the activation space of a large language model (LLM) into human-interpretable latent directions, or features. Increasing the number of features in an SAE leads to feature splitting, in which a hierarchical feature divides into finer-grained features (e.g., “mathematics” splits into “algebra”, “geometry”, etc.). However, the paper shows that sparse decomposition and splitting of hierarchical features are not robust. In particular, a feature that appears to track a single concept can fail to activate on inputs where it should, because its signal is “absorbed” into child features; this is called feature absorption. The phenomenon arises from optimizing for sparsity in SAEs whenever the underlying features form a hierarchy. The paper presents a metric for detecting absorption in SAEs and conducts experimental validation on hundreds of LLM SAEs. It finds that simply changing the size or sparsity of an SAE is not enough to solve the problem. Finally, it discusses fundamental theoretical issues that must be addressed before SAEs can be used to interpret LLMs robustly and at scale, along with potential solutions to these issues.
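To make the sparsity argument concrete, here is a minimal toy sketch in NumPy. It is not the paper's setup: the two-dimensional activations, the `d_parent` and `d_child` directions (standing in for “mathematics” and “algebra”), the `sae_loss` helper, and the hand-picked codes are all assumptions for illustration. It compares two dictionaries that reconstruct the same input exactly, one where the parent latent fires alongside the child and one where the child latent's direction has absorbed the parent component.

```python
import numpy as np

def sae_loss(x, codes, decoder, l1_coeff=1.0):
    """Squared reconstruction error plus an L1 sparsity penalty (hypothetical helper)."""
    recon = codes @ decoder
    mse = float(np.sum((x - recon) ** 2))
    l1 = float(np.sum(np.abs(codes)))
    return mse + l1_coeff * l1, mse, l1

# Assumed ground-truth feature directions; the child implies the parent.
d_parent = np.array([1.0, 0.0])
d_child = np.array([0.0, 1.0])

# An input on which both the parent and the child feature are present.
x = d_parent + d_child

# Faithful dictionary: one latent per feature, both fire on x.
faithful_decoder = np.stack([d_parent, d_child])
faithful_codes = np.array([1.0, 1.0])

# Absorbing dictionary: the child latent's unit-norm direction also carries
# the parent component, so the parent latent can stay silent on x.
absorbed_dir = (d_parent + d_child) / np.linalg.norm(d_parent + d_child)
absorbing_decoder = np.stack([d_parent, absorbed_dir])
absorbing_codes = np.array([0.0, np.sqrt(2.0)])

faithful = sae_loss(x, faithful_codes, faithful_decoder)
absorbing = sae_loss(x, absorbing_codes, absorbing_decoder)

print("faithful  (total, mse, l1):", faithful)
print("absorbing (total, mse, l1):", absorbing)
```

Both dictionaries reconstruct the input, but the absorbing one pays a smaller L1 penalty, so under these assumptions a sparsity-driven objective prefers the solution in which the parent latent does not fire on child-positive inputs.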

Takeaways, Limitations

• Takeaways: The paper shows that sparse decomposition and splitting of hierarchical features in SAEs are not robust, and it introduces the phenomenon of feature absorption. This exposes a limitation that matters when applying SAEs to LLM analysis. It also proposes a new metric for detecting feature absorption.

• Limitations: The paper shows that changing SAE size or sparsity alone cannot solve feature absorption, but it does not provide a concrete fix for the underlying problem. Detecting feature absorption with the presented metric requires further research. A more robust and scalable methodology for LLM interpretation is needed.
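The summary does not describe how the paper's absorption metric is computed, so the sketch below is only a crude stand-in and not that metric. Assuming binary ground-truth labels for a parent feature and a matrix of SAE latent activations, the hypothetical `parent_miss_rate` function measures how often the parent latent is silent on parent-positive inputs while some other latent is active. The data and the `threshold` parameter are made up for illustration.

```python
import numpy as np

def parent_miss_rate(parent_labels, latent_acts, parent_idx, threshold=0.0):
    """Fraction of parent-positive inputs where the parent latent is silent
    while at least one other latent is active (illustrative proxy only)."""
    labels = np.asarray(parent_labels, dtype=bool)
    acts = np.asarray(latent_acts, dtype=float)
    parent_silent = acts[:, parent_idx] <= threshold
    others = np.delete(acts, parent_idx, axis=1)
    other_active = (others > threshold).any(axis=1)
    missed = labels & parent_silent & other_active
    n_pos = int(labels.sum())
    return float(missed.sum()) / n_pos if n_pos else 0.0

# Made-up example: column 0 is the parent latent, column 1 a child latent.
labels = [1, 1, 1, 0]
acts = [
    [1.0, 0.0],  # parent only: parent latent fires
    [0.0, 1.4],  # parent and child present, parent latent silent
    [1.0, 1.0],  # parent and child present, both latents fire
    [0.0, 0.0],  # neither feature present
]

rate = parent_miss_rate(labels, acts, parent_idx=0)
print(f"parent miss rate: {rate:.3f}")
```

A proxy like this only flags candidate cases. Confirming that another latent really carries the parent's signal would need further analysis that this sketch does not attempt.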
