haebom
Daily Arxiv
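This page organizes artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.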
OPTIC-ER: A Reinforcement Learning Framework for Real-Time Emergency Response and Equitable Resource Allocation in Underserved African Communities
BioAnalyst: A Foundation Model for Biodiversity
Scaling Towards the Information Boundary of Instruction Sets: The Infinity Instruct Subject Technical Report
Dual-Objective Reinforcement Learning with Novel Hamilton-Jacobi-Bellman Formulations
NeuroPhysNet: A FitzHugh-Nagumo-Based Physics-Informed Neural Network Framework for Electroencephalograph (EEG) Analysis and Motor Imagery Classification
PRO-V-R1: Reasoning Enhanced Programming Agent for RTL Verification
Large language models can learn and generalize steganographic chain-of-thought under process supervision
Turing Test 2.0: The General Intelligence Threshold
Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments
Empowering Clients - Transformation of Design Processes Due to Generative AI
The Delusional Hedge Algorithm as a Model of Human Learning from Diverse Opinions
The Universal Weight Subspace Hypothesis
DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
Semantic Soft Bootstrapping: Long Context Reasoning in LLMs without Reinforcement Learning
TV2TV: A Unified Framework for Interleaved Language and Video Generation
Structured Document Translation via Format Reinforcement Learning
SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards
David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design?
Multi-LLM Collaboration for Medication Recommendation
Meta-Learning for Quantum Optimization via Quantum Sequence Model
QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
Model-Free Assessment of Simulator Fidelity via Quantile Curves
Reflection Removal through Efficient Adaptation of Diffusion Transformers
Evolutionary Architecture Search through Grammar-Based Sequence Alignment
Strategic Self-Improvement for Competitive Agents in AI Labour Markets
Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis
GeoPE:A Unified Geometric Positional Embedding for Structured Tensors
Realizable Abstractions: Near-Optimal Hierarchical Reinforcement Learning
LLMs Know More Than Words: A Genre Study with Syntax, Metaphor & Phonetics
CARL: Critical Action Focused Reinforcement Learning for Multi-Step Agent
Declarative Synthesis and Multi-Objective Optimization of Stripboard Circuit Layouts Using Answer Set Programming
ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching
Developing a General Personal Tutor for Education
SEAL: Self-Evolving Agentic Learning for Conversational Question Answering over Knowledge Graphs
Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding
Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
From Symptoms to Systems: An Expert-Guided Approach to Understanding Risks of Generative AI for Eating Disorders
SoK: a Comprehensive Causality Analysis Framework for Large Language Model Security
Setting up for failure: automatic discovery of the neural mechanisms of cognitive errors
287,872 Supermassive Black Holes Masses: Deep Learning Approaching Reverberation Mapping Accuracy
YingMusic-SVC: Real-World Robust Zero-Shot Singing Voice Conversion with Flow-GRPO and Singing-Specific Inductive Biases
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance
Using Machine Learning to Take Stay-or-Go Decisions in Data-driven Drone Missions
Embodied Co-Design for Rapidly Evolving Agents: Taxonomy, Frontiers, and Challenges
UnwrapDiff: Conditional Diffusion for Robust InSAR Phase Unwrapping
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
Neural Policy Composition from Free Energy Minimization
OsmT: Bridging OpenStreetMap Queries and Natural Language with Open-source Tag-aware Language Models
E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving
Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild
Towards an AI Fluid Scientist: LLM-Powered Scientific Discovery in Experimental Fluid Mechanics
Large Speech Model Enabled Semantic Communication
TimesNet-Gen: Deep Learning-based Site Specific Strong Motion Generation
Generative AI for Self-Adaptive Systems: State of the Art and Research Roadmap
Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs
Semi Centralized Training Decentralized Execution Architecture for Multi Agent Deep Reinforcement Learning in Traffic Signal Control
SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
When GenAI Meets Fake News: Understanding Image Cascade Dynamics on Reddit
When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering
A Light-Weight Large Language Model File Format for Highly-Secure Model Distribution
Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function
RRPO: Robust Reward Policy Optimization for LLM-based Emotional TTS
Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model
PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement
Prototype-Based Semantic Consistency Alignment for Domain Adaptive Retrieval
UW-BioNLP at ChemoTimelines 2025: Thinking, Fine-Tuning, and Dictionary-Enhanced LLM Systems for Chemotherapy Timeline Extraction
GraphBench: Next-generation graph learning benchmarking
GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis
Open-Ended Goal Inference through Actions and Language for Human-Robot Collaboration
NORi: An ML-Augmented Ocean Boundary Layer Parameterization
Automating Complex Document Workflows via Stepwise and Rollback-Enabled Operation Orchestration
Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models
Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection
Towards 6G Native-AI Edge Networks: A Semantic-Aware and Agentic Intelligence Paradigm
Adversarial Limits of Quantum Certification: When Eve Defeats Detection
FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring
MASE: Interpretable NLP Models via Model-Agnostic Saliency Estimation
STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Grained Pollution Forecasting
AutoGuard: A Self-Healing Proactive Security Layer for DevSecOps Pipelines Using Reinforcement Learning
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
Counting Without Running: Evaluating LLMs' Reasoning About Code Complexity
RGE-GCN: Recursive Gene Elimination with Graph Convolutional Networks for RNA-seq based Early Cancer Detection
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle
Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks
MANTRA: a Framework for Multi-stage Adaptive Noise TReAtment During Training
Evaluating Long-Context Reasoning in LLM-Based WebAgents
Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications
Learning Single-Image Super-Resolution in the JPEG Compressed Domain
Bootstrapped Mixed Rewards for RL Post-Training: Injecting Canonical Action Order
Quantitative Analysis of Technical Debt and Pattern Violation in Large Language Model Architectures
The Initialization Determines Whether In-Context Learning Is Gradient Descent
Catching UX Flaws in Code: Leveraging LLMs to Identify Usability Flaws at the Development Stage
Hey GPT-OSS, Looks Like You Got It - Now Walk Me Through It! An Assessment of the Reasoning Language Models Chain of Thought Mechanism for Digital Forensics
Fine-Tuning ChemBERTa for Predicting Inhibitory Activity Against TDP1 Using Deep Learning
MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models
CRAFT-E: A Neuro-Symbolic Framework for Embodied Affordance Grounding
Enhancing Public Speaking Skills in Engineering Students Through AI
Created by
Haebom
Authors
Amol Harsh, Brainerd Prince, Siddharth Siddharth, Deepan Raj Prabakar Muthirayan, Kabir S Bhalla, Esraaj Sarkar Gupta, Siddharth Sahu
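Overview
To address the problem of effective communication among engineering students, the authors developed an AI-based assessment model that combines speech analysis, computer vision, and sentiment analysis. The model evaluates verbal and nonverbal aspects as well as expressive coherence, and provides personalized, scalable feedback. Using LLMs, including Gemini Pro, it produced results similar to expert evaluations.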
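Implications and Limitations
• Suggests that AI-based personalized feedback can improve engineering students' presentation skills
• Presents a new approach that jointly evaluates verbal elements, nonverbal elements, and expressive coherence
• Uses Gemini Pro to improve the AI model's performance
• Supports repeated practice and more natural expression through an AI-based presentation training system
• In initial tests, agreement with expert evaluations was only moderate
• Depends on the performance of the LLM, so the model's limitations may become the system's limitations
• The model's ability to generalize across diverse audiences and presentation settings still needs to be verified
• Long-term evaluation of actual learning outcomes and further research are needed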