Daily Arxiv
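This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.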
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
Dynaword: From One-shot to Continuously Developed Datasets
Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor
Proof2Hybrid: Automatic Mathematical Benchmark Synthesis for Proof-Centric Problems
Collaborative Chain-of-Agents for Parametric-Retrieved Knowledge Synergy
BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability
SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy
Managing Escalation in Off-the-Shelf Large Language Models
FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models
A Foundational Schema.org Mapping for a Legal Knowledge Graph: Representing Brazilian Legal Norms as FRBR Works
D3: Training-Free AI-Generated Video Detection Using Second-Order Features
SMART-Editor: A Multi-Agent Framework for Human-Like Design Editing with Structural Integrity
Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints
MoCHA: Advanced Vision-Language Reasoning with MoE Connector and Hierarchical Group Attention
Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
Memorization in Fine-Tuned Large Language Models
From Entanglement to Alignment: Representation Space Decomposition for Unsupervised Time Series Domain Adaptation
The Xeno Sutra: Can Meaning and Value be Ascribed to an AI-Generated "Sacred" Text?
Post-Completion Learning for Language Models
Rainbow Noise: Stress-Testing Multimodal Harmful-Meme Detectors on LGBTQ Content
Equivariant Volumetric Grasping
SemiSegECG: A Multi-Dataset Benchmark for Semi-Supervised Semantic Segmentation in ECG Delineation
FedSA-GCL: A Semi-Asynchronous Federated Graph Learning Framework with Personalized Aggregation and Cluster-Aware Broadcasting
Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility
R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning
P3SL: Personalized Privacy-Preserving Split Learning on Heterogeneous Edge Devices
Document Haystack: A Long Context Multimodal Image/Document Understanding Vision LLM Benchmark
Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation
TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models
Divide-Then-Rule: A Cluster-Driven Hierarchical Interpolator for Attribute-Missing Graphs
$\texttt{Droid}$: A Resource Suite for AI-Generated Code Detection
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
Principled Foundations for Preference Optimization
Evaluating LLMs on Real-World Forecasting Against Expert Forecasters
STRUCTSENSE: A Task-Agnostic Agentic Framework for Structured Information Extraction with Human-In-The-Loop Evaluation and Benchmarking
S2FGL: Spatial Spectral Federated Graph Learning
AI4Research: A Survey of Artificial Intelligence for Scientific Research
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation
Reinforcing VLMs to Use Tools for Detailed Visual Reasoning Under Resource Constraints
Causally Steered Diffusion for Automated Video Counterfactual Generation
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study
ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark
ProRefine: Inference-Time Prompt Refinement with Textual Feedback
SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware Design
MetaGen Blended RAG: Unlocking Zero-Shot Precision for Specialized Domain Question-Answering
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning
LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference
Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind
Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI
All-optical temporal integration mediated by subwavelength heat antennas
GRILL: Gradient Signal Restoration in Ill-Conditioned Layers to Enhance Adversarial Attacks on Autoencoders
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
FFCBA: Feature-based Full-target Clean-label Backdoor Attacks
Multilingual Performance Biases of Large Language Models in Education
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models
Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis
Efficient Generative Model Training via Embedded Representation Warmup
Graph Attention-Driven Bayesian Deep Unrolling for Dual-Peak Single-Photon Lidar Imaging
Spectral Architecture Search for Neural Network Models
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems
Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Augmented Adversarial Trigger Learning
ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness
M2S: Multi-turn to Single-turn jailbreak in Red Teaming for LLMs
A Causal Framework for Aligning Image Quality Metrics and Deep Neural Network Robustness
PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
Entropy-Lens: The Information Signature of Transformer Computations
CAMEF: Causal-Augmented Multi-Modality Event-Driven Financial Forecasting by Integrating Time Series Patterns and Salient Macroeconomic Announcements
Shaping Sparse Rewards in Reinforcement Learning: A Semi-supervised Approach
AdaMCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Multilingual Chain-of-Thought
AI-driven Wireless Positioning: Fundamentals, Standards, State-of-the-art, and Challenges
CHIRP: A Fine-Grained Benchmark for Open-Ended Response Evaluation in Vision-Language Models
Average-Reward Soft Actor-Critic
Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation
From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate
IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves
Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification
Effective AGM Belief Contraction: A Journey beyond the Finitary Realm (Technical Report)
Beyond Images: Adaptive Fusion of Visual and Textual Data for Food Classification
TAPAS: Fast and Automatic Derivation of Tensor Parallel Strategies for Large Neural Networks
KCR: Resolving Long-Context Knowledge Conflicts via Reasoning in LLMs
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
CADDesigner: Conceptual Design of CAD Models Based on General-Purpose Agent
Mind the Gap: The Divergence Between Human and LLM-Generated Tasks
RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization
Model-Based Soft Maximization of Suitable Metrics of Long-Term Human Power
Tiny-BioMoE: a Lightweight Embedding Model for Biosignal Analysis
The AlphaPhysics Term Rewriting System for Marking Algebraic Expressions in Physics Exams
Modeling Deontic Modal Logic in the s(CASP) Goal-directed Predicate Answer Set Programming System
Automatic Prompt Optimization for Knowledge Graph Construction: Insights from an Empirical Study
The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning
Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
UFEval: Unified Fine-grained Evaluation with Task and Aspect Generalization
FakeIDet: Exploring Patches for Privacy-Preserving Fake ID Detection
Created by
Haebom
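Authors
Javier Muñoz-Haro, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez
Overview
This paper presents a patch-based, privacy-preserving approach to the data scarcity problem in identity-document forgery detection. Real ID documents contain personal data, so prior work has been unable to rely on public datasets. To overcome this limitation, the authors propose FakeIDet, a new forgery detection method that works on patch images taken from real ID documents. They test two levels of anonymization (full and partial) and several patch sizes to explore the balance between performance and privacy, using Vision Transformers and foundation models as backbones. On the DLC-2021 dataset, the method achieves an Equal Error Rate (EER) of 13.91% at the patch level and 0% at the whole-ID level. The authors also release FakeIDet-db, a public dataset of 48,400 patch images, as a basis for future research.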
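The results above are reported as Equal Error Rates. As a reminder of what that metric measures, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `equal_error_rate`, the score convention (higher means "more likely fake"), and the simple threshold sweep are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    real_scores: Sequence[float], fake_scores: Sequence[float]
) -> Tuple[float, float]:
    """Return (eer, threshold) for two lists of detector scores.

    Assumption: scores are "fakeness" scores, so a sample is flagged
    as fake when its score is >= the threshold.
    """
    if not real_scores or not fake_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above
    # all of them so that "flag nothing" is also considered.
    candidates = sorted(set(real_scores) | set(fake_scores))
    candidates.append(candidates[-1] + 1.0)

    best_gap, best_eer, best_thr = float("inf"), 1.0, candidates[0]
    for thr in candidates:
        # False positive rate: real documents wrongly flagged as fake.
        fpr = sum(s >= thr for s in real_scores) / len(real_scores)
        # False negative rate: fake documents that slip through.
        fnr = sum(s < thr for s in fake_scores) / len(fake_scores)
        gap = abs(fpr - fnr)
        if gap < best_gap:
            # The EER is read off where the two error rates meet; on
            # finite data they may not meet exactly, so average them.
            best_gap, best_eer, best_thr = gap, (fpr + fnr) / 2, thr
    return best_eer, best_thr


# Toy scores (hypothetical, not taken from the paper).
eer, thr = equal_error_rate([0.1, 0.2, 0.3, 0.6], [0.4, 0.7, 0.8, 0.9])
print(f"EER = {eer:.2%} at threshold {thr}")
```

An EER of 0% means some threshold separates the real and fake scores perfectly, which is what the whole-ID result on DLC-2021 corresponds to.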
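Takeaways and Limitations
• Takeaways:
◦ Presents a patch-based, privacy-preserving approach to the scarcity of real ID document data.
◦ Proposes FakeIDet, a new forgery detection method, and validates its performance (low EER).
◦ Releases FakeIDet-db, the first public dataset of real ID document patches, to enable further research.
◦ Experiments with different anonymization levels and patch sizes to find a balance between privacy and performance.
• Limitations:
◦ The size and diversity of the released dataset need further validation.
◦ Generalization to different types of fake IDs needs further evaluation.
◦ Robustness to the noise and interference that can occur in real-world deployment has not been evaluated.
◦ Because of the constraints of the patch-based approach, performance may be lower than that of methods that use the whole ID image.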
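To make the patch-based idea concrete, the sketch below shows one way an ID image could be cut into fixed-size patches and how per-patch scores could be combined into a single ID-level score. Everything here is an assumption for illustration: the summary does not say how FakeIDet samples patches or fuses their scores, so the non-overlapping grid, the mean fusion, and the names `patch_grid` and `fuse_patch_scores` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def patch_grid(height: int, width: int, patch_size: int) -> List[Box]:
    """List non-overlapping square patches that fit fully inside the image.

    Assumption: a plain grid with no overlap; pixels left over at the
    right and bottom edges are ignored.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    boxes: List[Box] = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            boxes.append((top, left, top + patch_size, left + patch_size))
    return boxes


def fuse_patch_scores(patch_scores: Sequence[float]) -> float:
    """Combine per-patch fakeness scores into one ID-level score.

    Assumption: simple averaging, chosen only for illustration.
    """
    if not patch_scores:
        raise ValueError("need at least one patch score")
    return mean(patch_scores)


# A hypothetical 256x384 image split into 128-pixel patches.
boxes = patch_grid(256, 384, 128)
print(len(boxes), "patches; first box:", boxes[0])

# Hypothetical per-patch scores from some detector.
print("ID-level score:", fuse_patch_scores([0.2, 0.4, 0.9]))
```

Smaller patches expose less of the document per image, which is the privacy side of the trade-off, but each patch also carries less context for the detector.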