Daily Arxiv
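This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.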
Merge Kernel for Bayesian Optimization on Permutation Space
Demographic-aware fine-grained classification of pediatric wrist fractures
Generative Multi-Target Cross-Domain Recommendation
ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models
A Simple Baseline for Stable and Plastic Neural Networks
WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
How Not to Detect Prompt Injections with an LLM
Critiques of World Models
The role of large language models in UI/UX design: A systematic literature review
LearnLens: LLM-Enabled Personalised, Curriculum-Grounded Feedback with Educators in the Loop
STACK: Adversarial Attacks on LLM Safeguard Pipelines
ZonUI-3B: A Lightweight Vision-Language Model for Cross-Resolution GUI Grounding
Understanding Reasoning in Thinking Language Models via Steering Vectors
Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation
TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis
SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet
Exploring Graph Representations of Logical Forms for Language Modeling
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs
CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors
Hands-On: Segmenting Individual Signs from Continuous Sequences
Can we ease the Injectivity Bottleneck on Lorentzian Manifolds for Graph Neural Networks?
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation
AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results
An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model
Evaluating link prediction: New perspectives and recommendations
Learning to Reason at the Frontier of Learnability
Stonefish: Supporting Machine Learning Research in Marine Robotics
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
On the Transfer of Knowledge in Quantum Algorithms
Code Readability in the Age of Large Language Models: An Industrial Case Study from Atlassian
Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude
ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Consistency of Responses and Continuations Generated by Large Language Models on Social Media
From Code to Compliance: Assessing ChatGPT's Utility in Designing an Accessible Webpage -- A Case Study
Temporal reasoning for timeline summarisation in social media
Invisible Textual Backdoor Attacks based on Dual-Trigger
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Two-Stage Pretraining for Molecular Property Prediction in the Wild
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
An Approach for Auto Generation of Labeling Functions for Software Engineering Chatbots
Bridging Local and Global Knowledge via Transformer in Board Games
Entropy Loss: An Interpretability Amplifier of 3D Object Detection Network for Intelligent Driving
FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Visual Grounding Methods for Efficient Interaction with Desktop Graphical User Interfaces
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
SecurePose: Automated Face Blurring and Human Movement Kinematics Extraction from Videos Recorded in Clinical Settings
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges
From Roots to Rewards: Dynamic Tree Reasoning with RL
Illuminating the Three Dogmas of Reinforcement Learning under Evolutionary Light
Instance space analysis of the capacitated vehicle routing problem
Multi-Agent LLMs as Ethics Advocates for AI-Based Systems
GATSim: Urban Mobility Simulation with Generative Agents
Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know?
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Strategic Reflectivism In Intelligent Systems
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator
What the F*ck Is Artificial General Intelligence?
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios
To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization
BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems
UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception
CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis
Toward Temporal Causal Representation Learning with Tensor Decomposition
Kolmogorov Arnold Networks (KANs) for Imbalanced Data - An Empirical Perspective
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) Track
Multi-Centre Validation of a Deep Learning Model for Scoliosis Assessment
The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits
Edge Intelligence with Spiking Neural Networks
VLA-Mark: A cross modal watermark for large vision-language alignment model
Noradrenergic-inspired gain modulation attenuates the stability gap in joint training
A multi-strategy improved snake optimizer for 3-dimensional UAV path planning and engineering problems
Photonic Fabric Platform for AI Accelerators
OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models
A segmented robot grasping perception neural network for edge AI
Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation
Exploiting Primacy Effect To Improve Large Language Models
Generalist Forecasting with Frozen Video Models via Latent Diffusion
Convergent transformations of visual representation in brains and models
Preprint: Did I Just Browse A Website Written by LLMs?
The Levers of Political Persuasion with Conversational AI
Political Leaning and Politicalness Classification of Texts
Self-supervised learning on gene expression data
Using LLMs to identify features of personal and professional skills in an open-response situational judgment test
Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) Track
Created by
Haebom
Authors
Brian Ondov, William Xia, Kush Attal, Ishita Unde, Jerry He, Hoa Dang, Ian Soboroff, Dina Demner-Fushman
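Abstract
This paper presents the results of the Plain Language Adaptation of Biomedical Abstracts (PLABA) track, held at the 2023 and 2024 Text Retrieval Conferences. The PLABA track focused on adapting abstracts of specialized biomedical articles into plain language that the general public can understand. Through two tasks (Task 1: rewriting entire abstracts; Task 2: identifying and replacing difficult terms), it evaluated a range of models, from multilayer perceptrons to pre-trained large language models (LLMs). In Task 1, the top models reached expert-level accuracy and completeness but fell short on conciseness and clarity, and automatic evaluation metrics correlated poorly with manual evaluation. In Task 2, systems struggled to identify difficult terms and to classify how they should be replaced, but LLM-based systems performed well on accuracy, completeness, and conciseness when generating replacement terms.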
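Takeaways, Limitations
• Takeaways:
  ◦ Shows that large language models can adapt specialized biomedical articles into language for the general public.
  ◦ LLM-based systems performed well at replacing medical terms.
  ◦ Suggests that biomedical information adaptation models with expert-level accuracy and completeness are feasible.
• Limitations:
  ◦ Automatic evaluation metrics correlated poorly with manual evaluation, so better automatic evaluation tools are needed.
  ◦ Even the top models still have room to improve in conciseness and clarity.
  ◦ Systems struggled to identify difficult terms and to choose appropriate replacements.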