Daily Arxiv
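This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.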
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks
Site-Level Fine-Tuning with Progressive Layer Freezing: Towards Robust Prediction of Bronchopulmonary Dysplasia from Day-1 Chest Radiographs in Extremely Preterm Infants
A Roadmap for Climate-Relevant Robotics Research
Fairness Is Not Enough: Auditing Competence and Intersectional Bias in AI-powered Resume Screening
MMOne: Representing Multiple Modalities in One Scene
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance
(Almost) Free Modality Stitching of Foundation Models
A Brain Tumor Segmentation Method Based on CLIP and 3D U-Net with Cross-Modal Semantic Guidance and Multi-Level Feature Fusion
KEN: Knowledge Augmentation and Emotion Guidance Network for Multimodal Fake News Detection
THOR: Transformer Heuristics for On-Demand Retrieval
SEALGuard: Safeguarding the Multilingual Conversations in Southeast Asian Languages for LLM Software Systems
KeyRe-ID: Keypoint-Guided Person Re-Identification using Part-Aware Representation in Videos
Prompt Perturbations Reveal Human-Like Biases in LLM Survey Responses
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling
VIDEE: Visual and Interactive Decomposition, Execution, and Evaluation of Text Analytics with Intelligent Agents
ReCode: Updating Code API Knowledge with Reinforcement Learning
Cross-Layer Discrete Concept Discovery for Interpreting Language Models
Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability
MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents
Multiple-Frequencies Population-Based Training
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows
ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations
GPU Performance Portability needs Autotuning
Generating Synthetic Data via Augmentations for Improved Facial Resemblance in DreamBooth and InstantID
Coral Protocol: Open Infrastructure Connecting The Internet of Agents
MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness
Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence
ConTextual: Improving Clinical Text Summarization in LLMs with Context-preserving Token Filtering and Knowledge Graphs
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model
KP Quantum Neural Networks
VectorFit : Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models
Data-Efficient Deep Operator Network for Unsteady Flow: A Multi-Fidelity Approach with Physics-Guided Subsampling
Learning Universal Human Mobility Patterns with a Foundation Model for Cross-domain Data Fusion
GeoFlow-SLAM: A Robust Tightly-Coupled RGBD-Inertial and Legged Odometry Fusion SLAM for Dynamic Legged Robotics
A Multi-Stage Framework with Taxonomy-Guided Reasoning for Occupation Classification Using Large Language Models
Multi-View Node Pruning for Accurate Graph Representation
V-Max: A Reinforcement Learning Framework for Autonomous Driving
Interpretable Transformation and Analysis of Timelines through Learning via Surprisability
AI Governance InternationaL Evaluation Index (AGILE Index) 2024
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
Improving Transformer World Models for Data-Efficient RL
LLM-RecG: A Semantic Bias-Aware Framework for Zero-Shot Sequential Recommendation
SIDDA: SInkhorn Dynamic Domain Adaptation for Image Classification with Equivariant Neural Networks
Determination of galaxy photometric redshifts using Conditional Generative Adversarial Networks (CGANs)
Speech-Forensics: Towards Comprehensive Synthetic Speech Dataset Establishment and Analysis
MRGen: Segmentation Data Engine for Underrepresented MRI Modalities
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization
Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy for Visuomotor Imitation Learning
Dataset resulting from the user study on comprehensibility of explainable AI algorithms
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information
DeFine: Decision-Making with Analogical Reasoning over Factor Profiles
Benchmarking Sub-Genre Classification For Mainstage Dance Music
Risks of ignoring uncertainty propagation in AI-augmented security pipelines
MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications with Retrieval Augmented Generation and Knowledge Graphs
Leveraging Quantum Superposition to Infer the Dynamic Behavior of a Spatial-Temporal Neural Network Signaling Model
Bounding the Worst-class Error: A Boosting Approach
TBDetector:Transformer-Based Detector for Advanced Persistent Threats with Provenance Graph
Machine Learning Systems: A Survey from a Data-Oriented Perspective
Aime: Towards Fully-Autonomous Multi-Agent Framework
SmartThinker: Learning to Compress and Preserve Reasoning by Step-Level Length Control
Ready Jurist One: Benchmarking Language Agents for Legal Intelligence in Dynamic Environments
NTRL: Encounter Generation via Reinforcement Learning for Dynamic Difficulty Adjustment in Dungeons and Dragons
Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplifications and Resistance in Multi-Agent Based LLM-as-Judge
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
BEARCUBS: A benchmark for computer-using web agents
Demystifying MuZero Planning: Interpreting the Learned Model
LLM-Enhanced User-Item Interactions: Leveraging Edge Information for Optimized Recommendations
VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Imbalance in Balance: Online Concept Balancing in Generation Models
Latent Policy Steering with Embodiment-Agnostic Pretrained World Models
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
Towards Formal Verification of LLM-Generated Code from Natural Language Prompts
Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation
Voxtral
Merge Kernel for Bayesian Optimization on Permutation Space
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
Automating Steering for Safe Multimodal Large Language Models
HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models
VITA: Vision-to-Action Flow Matching Policy
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
SHIELD: A Secure and Highly Enhanced Integrated Learning for Robust Deepfake Detection against Adversarial Attacks
Prompt Injection 2.0: Hybrid AI Threats
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Learning Universal Human Mobility Patterns with a Foundation Model for Cross-domain Data Fusion
Created by
Haebom
Authors
Haoxuan Ma, Xishun Liao, Yifan Liu, Qinhua Jiang, Chris Stanford, Shangqing Cao, Jiaqi Ma
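Abstract
This paper presents a foundation model framework for universal human mobility patterns, designed to overcome the limitations of existing approaches in incorporating diverse data sources. By integrating multimodal data of differing nature and spatio-temporal resolution, including geographic, mobility, socio-demographic, and traffic information, the framework constructs a privacy-preserving, semantically enriched dataset of human travel trajectories. Domain transfer techniques ensure transferability across diverse urban environments, as demonstrated in case studies of LA and Egypt. LLMs are used to semantically enrich the trajectory data, enabling a comprehensive understanding of mobility patterns. Quantitative evaluation shows that the generated synthetic dataset accurately reproduces the mobility patterns observed in empirical data, and a large-scale traffic simulation of LA County demonstrates the framework's practical utility. On California's I-405 corridor, the simulation shows a mean absolute percentage error of 5.85% for traffic volume and 4.36% for speed, indicating the framework's potential for intelligent transportation systems and urban mobility applications.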
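The I-405 figures are reported as mean absolute percentage error (MAPE). As a minimal sketch of how such a metric is typically computed, assuming paired observed and simulated values per measurement point, the function below averages the absolute relative errors. The function name `mape` and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mape(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Return the mean absolute percentage error, in percent.

    Assumes `observed` and `simulated` are aligned pairs (same detector,
    same time interval). Observed values must be non-zero because each
    error is expressed relative to the observation.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one pair of values is required")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")

    # Sum of |observed - simulated| / |observed| over all pairs.
    total = sum(abs((o - s) / o) for o, s in zip(observed, simulated))
    return 100.0 * total / len(observed)


# Hypothetical hourly traffic volumes (vehicles per hour), for illustration only.
observed_volume = [100.0, 200.0, 400.0]
simulated_volume = [110.0, 190.0, 400.0]
print(f"Volume MAPE: {mape(observed_volume, simulated_volume):.2f}%")
```

Because each error is normalized by the observed value, MAPE lets volume and speed be compared on the same percentage scale, though it is undefined wherever an observation is zero.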
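Takeaways, Limitations
• Takeaways:
◦ Presents a new framework that integrates diverse data sources to model human mobility patterns more accurately.
◦ Uses LLMs to semantically enrich mobility data and improve understanding of it.
◦ Demonstrates applicability to diverse urban environments through domain transfer techniques.
◦ Shows potential for use in intelligent transportation systems and urban planning through accurate traffic simulation.
• Limitations:
◦ Lacks a description of the specific LLM used and the details of its implementation.
◦ Lacks a detailed description of the specific technical methods used for privacy protection.
◦ Generalizability across diverse urban environments needs further validation.
◦ Long-term prediction accuracy is not sufficiently evaluated.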