Daily Arxiv
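This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.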
Merge Kernel for Bayesian Optimization on Permutation Space
Demographic-aware fine-grained classification of pediatric wrist fractures
Generative Multi-Target Cross-Domain Recommendation
ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models
A Simple Baseline for Stable and Plastic Neural Networks
WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
How Not to Detect Prompt Injections with an LLM
Critiques of World Models
The role of large language models in UI/UX design: A systematic literature review
LearnLens: LLM-Enabled Personalised, Curriculum-Grounded Feedback with Educators in the Loop
STACK: Adversarial Attacks on LLM Safeguard Pipelines
ZonUI-3B: A Lightweight Vision-Language Model for Cross-Resolution GUI Grounding
Understanding Reasoning in Thinking Language Models via Steering Vectors
Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation
TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis
SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet
Exploring Graph Representations of Logical Forms for Language Modeling
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs
CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors
Hands-On: Segmenting Individual Signs from Continuous Sequences
Can we ease the Injectivity Bottleneck on Lorentzian Manifolds for Graph Neural Networks?
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation
AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results
An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model
Evaluating link prediction: New perspectives and recommendations
Learning to Reason at the Frontier of Learnability
Stonefish: Supporting Machine Learning Research in Marine Robotics
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
On the Transfer of Knowledge in Quantum Algorithms
Code Readability in the Age of Large Language Models: An Industrial Case Study from Atlassian
Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude
ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Consistency of Responses and Continuations Generated by Large Language Models on Social Media
From Code to Compliance: Assessing ChatGPT's Utility in Designing an Accessible Webpage -- A Case Study
Temporal reasoning for timeline summarisation in social media
Invisible Textual Backdoor Attacks based on Dual-Trigger
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Two-Stage Pretraining for Molecular Property Prediction in the Wild
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
An Approach for Auto Generation of Labeling Functions for Software Engineering Chatbots
Bridging Local and Global Knowledge via Transformer in Board Games
Entropy Loss: An Interpretability Amplifier of 3D Object Detection Network for Intelligent Driving
FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Visual Grounding Methods for Efficient Interaction with Desktop Graphical User Interfaces
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
SecurePose: Automated Face Blurring and Human Movement Kinematics Extraction from Videos Recorded in Clinical Settings
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges
From Roots to Rewards: Dynamic Tree Reasoning with RL
Illuminating the Three Dogmas of Reinforcement Learning under Evolutionary Light
Instance space analysis of the capacitated vehicle routing problem
Multi-Agent LLMs as Ethics Advocates for AI-Based Systems
GATSim: Urban Mobility Simulation with Generative Agents
Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know?
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Strategic Reflectivism In Intelligent Systems
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator
What the F*ck Is Artificial General Intelligence?
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios
To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization
BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems
UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception
CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis
Toward Temporal Causal Representation Learning with Tensor Decomposition
Kolmogorov Arnold Networks (KANs) for Imbalanced Data - An Empirical Perspective
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) Track
Multi-Centre Validation of a Deep Learning Model for Scoliosis Assessment
The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits
Edge Intelligence with Spiking Neural Networks
VLA-Mark: A cross modal watermark for large vision-language alignment model
Noradrenergic-inspired gain modulation attenuates the stability gap in joint training
A multi-strategy improved snake optimizer for 3-dimensional UAV path planning and engineering problems
Photonic Fabric Platform for AI Accelerators
OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models
A segmented robot grasping perception neural network for edge AI
Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation
Exploiting Primacy Effect To Improve Large Language Models
Generalist Forecasting with Frozen Video Models via Latent Diffusion
Convergent transformations of visual representation in brains and models
Preprint: Did I Just Browse A Website Written by LLMs?
The Levers of Political Persuasion with Conversational AI
Political Leaning and Politicalness Classification of Texts
Self-supervised learning on gene expression data
Using LLMs to identify features of personal and professional skills in an open-response situational judgment test
Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder
Created by
Haebom
Authors
Feng Chen, Weizhe Xu, Changye Li, Serguei Pakhomov, Alex Cohen, Simran Bhola, Sandy Yin, Sunny X Tang, Michael Mackinley, Lena Palaniyappan, Dror Ben-Zeev, Trevor Cohen
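Abstract
This study uses automatic speech recognition (ASR) to enable objective, scalable assessment of formal thought disorder (FTD), a core symptom of schizophrenia spectrum disorders. To overcome the limitations of existing clinical rating scales, it analyzes linguistic and temporal features of speech obtained through ASR, in particular pause dynamics, and uses them to predict FTD severity. Across three datasets (naturalistic self-recorded diaries, structured picture descriptions, and dream narratives), pause-related features were combined with established semantic coherence measures in a support vector regression (SVR) analysis. Pause features alone robustly predicted FTD severity, and models that integrated pause features with semantic coherence measures outperformed models that used semantic measures only (maximum correlation ρ = 0.649, AUC = 83.71%). These results suggest that a framework combining temporal and semantic analysis can improve the assessment of disorganized speech and advance automated speech analysis in psychosis.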
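As a rough illustration of what "pause features" derived from ASR timestamps might look like, the sketch below computes a few summary statistics from word-level start and end times. It is not the authors' pipeline: the function name `pause_features`, the `(token, start, end)` input layout, the 0.5-second threshold, and the specific statistics are all assumptions made for this example.

```python
from statistics import mean


def pause_features(words, min_pause=0.5):
    """Summarize silent gaps between consecutive words.

    words: list of (token, start_sec, end_sec) tuples in spoken order,
           as might be exported from an ASR system with word timestamps.
    min_pause: gaps shorter than this (seconds) are not counted as pauses.
               The 0.5 s default is an illustrative assumption.
    """
    if len(words) < 2:
        return {"pause_count": 0, "mean_pause": 0.0,
                "max_pause": 0.0, "pause_ratio": 0.0}

    # Gap between the end of one word and the start of the next.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]

    # Duration from the first word's onset to the last word's offset.
    total_time = words[-1][2] - words[0][1]

    return {
        "pause_count": len(pauses),
        "mean_pause": mean(pauses) if pauses else 0.0,
        "max_pause": max(pauses, default=0.0),
        "pause_ratio": sum(pauses) / total_time if total_time > 0 else 0.0,
    }


# Hypothetical word timings for a short utterance.
words = [
    ("the", 0.0, 0.25),
    ("boy", 0.5, 0.75),
    ("is", 1.75, 2.0),
    ("reaching", 2.0, 2.5),
    ("up", 3.5, 3.75),
]
features = pause_features(words)
print(features)
```

A feature dictionary like this could be concatenated with semantic coherence scores and passed to a regressor such as scikit-learn's `SVR` to predict a clinical severity score; how the study actually combines the two feature sets is not detailed in this summary.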
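Takeaways, Limitations
• Takeaways:
◦ Presents an objective, scalable method for assessing FTD based on automatic speech recognition (ASR).
◦ Demonstrates that pause features play an important role in predicting FTD severity.
◦ Confirms that integrating pause features with semantic coherence measures improves FTD prediction performance.
◦ Shows consistent performance gains across settings (naturalistic diaries, picture descriptions, dream narratives).
• Limitations:
◦ The datasets used may be relatively small.
◦ Further research is needed on generalizability across languages and cultural backgrounds.
◦ Pause patterns are dataset-dependent.
◦ Additional validation and standardization are needed before clinical use.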