This paper highlights the importance of pre-deployment fairness and bias assessment, as large language models (LLMs) are increasingly used in high-risk fields such as clinical decision support, legal analysis, recruitment, and education. To address the shortcomings of existing evaluations, we propose HALF (Harm-Aware LLM Fairness), a deployment-centric framework that assesses model bias in realistic application settings and weights results by the severity of potential harm. HALF organizes nine application domains into three tiers (severe, moderate, and mild) and evaluates them through a five-stage pipeline. Results across eight LLMs show that (1) LLMs are not consistently fair across domains, (2) model size and overall performance do not guarantee fairness, and (3) reasoning models outperform other models in medical decision support but not in education.
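
To make the harm-aware weighting concrete, the sketch below shows one possible way to aggregate per-domain bias gaps by severity tier. The numeric tier weights, the `harm_aware_fairness` function, and the example gap values are illustrative assumptions, not the paper's actual scoring, which is defined by HALF's five-stage pipeline.

```python
from statistics import mean

# Hypothetical tier weights reflecting harm severity; HALF does not
# specify numeric weights here, so these values are illustrative only.
TIER_WEIGHTS = {"severe": 3.0, "moderate": 2.0, "mild": 1.0}

def harm_aware_fairness(bias_gaps: dict[str, list[float]]) -> float:
    """Aggregate per-domain bias gaps (one list of gaps per tier) into a
    single harm-weighted score. Lower is better: the same gap counts more
    in a severe-harm domain than in a mild-harm one."""
    weighted_total = 0.0
    weight_sum = 0.0
    for tier, gaps in bias_gaps.items():
        w = TIER_WEIGHTS[tier]
        weighted_total += w * mean(gaps)
        weight_sum += w
    return weighted_total / weight_sum

# Example: three tiers covering the nine application domains
# (gap values are made up for illustration).
score = harm_aware_fairness({
    "severe":   [0.12, 0.08, 0.15],
    "moderate": [0.05, 0.07, 0.04],
    "mild":     [0.03, 0.02, 0.04],
})
print(f"harm-weighted bias score: {score:.3f}")
```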