LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
Created by
Haebom
Author
Xujia Wang, Yunjia Qi, Bin Xu
Outline
Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA greatly reduce the number of trainable parameters by introducing low-rank decomposition matrices. However, existing methods still perform a large number of matrix multiplications and show low computational efficiency and weak fine-tuning performance on domain-specific tasks. This paper proposes Low-Resources Subnet Integration Adaptation (LoSiA), a method that dynamically identifies and optimizes critical parameters during training. Specifically, it localizes a subnetwork via gradient sparsity analysis and uses it as the trainable target. Updating only the subnetwork parameters enables effective high-rank adaptation while avoiding additional matrix multiplications. The paper also presents LoSiA-Pro, a faster implementation of LoSiA that reduces training latency by about 27% compared to LoRA. Extensive evaluations show that the method requires the least training time on domain-specific and common-sense reasoning tasks while minimizing performance degradation relative to full fine-tuning. Further analysis confirms that LoSiA also reduces forgetting during continual learning. The source code is available at https://github.com/KlozeWang/LoSiA.
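To make the mechanism concrete, below is a minimal PyTorch-style sketch of the general idea described above: periodically localizing a small subnetwork from gradient statistics and then updating only those parameters. This is not the authors' implementation; the keep ratio, re-localization interval, plain-SGD update, and all function names are assumptions for illustration, and the paper's actual localization procedure and LoSiA-Pro optimizations are more involved.
```python
import torch


def localize_subnet(model, keep_ratio=0.05):
    """Select the most salient entries of each parameter tensor by gradient
    magnitude -- a simplified stand-in for gradient-sparsity-based localization."""
    masks = {}
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        scores = param.grad.detach().abs()
        k = max(1, int(keep_ratio * scores.numel()))
        threshold = torch.topk(scores.flatten(), k).values.min()
        masks[name] = (scores >= threshold).to(param.dtype)
    return masks


def masked_sgd_step(model, masks, lr=1e-4):
    """Plain SGD step applied only to the localized subnetwork, so no extra
    low-rank matrix multiplications are introduced."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None or name not in masks:
                continue
            param -= lr * param.grad * masks[name]


def train(model, data_loader, loss_fn, relocate_every=100):
    """Hypothetical loop: re-localize the subnetwork every `relocate_every` steps."""
    masks = None
    for step, (inputs, targets) in enumerate(data_loader):
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        if masks is None or step % relocate_every == 0:
            masks = localize_subnet(model)  # dynamic re-identification of the subnet
        masked_sgd_step(model, masks)
```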
• Takeaways:
◦ LoSiA is proposed to address the computational inefficiency of existing PEFT methods.
◦ Gradient sparsity analysis enables efficient identification and optimization of target subnetworks.
◦ Training time is reduced while performance degradation relative to full fine-tuning is minimized.
◦ Forgetting during continual learning is reduced.
◦ The LoSiA-Pro implementation reduces training latency by roughly 27% compared to LoRA.
• Limitations:
◦ Whether the reported performance gains generalize to all model types and tasks requires further study.
◦ The limitations of gradient sparsity analysis as a subnetwork selection criterion are not discussed in depth.
◦ The sensitivity of LoSiA to different hyperparameter settings is not analyzed.