Daily Arxiv
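This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.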
DeepCell: Self-Supervised Multiview Fusion for Circuit Representation Learning
Created by
Haebom
Authors
Zhengyuan Shi, Chengyu Ma, Ziyang Zheng, Lingfeng Zhou, Hongyang Pan, Wentao Jiang, Fan Yang, Xiaoyan Yang, Zhufei Chu, Qiang Xu
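Abstract
DeepCell is a new circuit representation learning framework that effectively integrates information from two different views of a design: And-Inverter Graphs (AIGs) and post-mapping (PM) netlists. It uses Masked Circuit Modeling (MCM), a self-supervised learning strategy inspired by masked language modeling, to fuse the complementary circuit representations from different design stages into a unified, rich embedding. DeepCell is the first framework designed explicitly for PM netlist representation learning, and it sets a new benchmark in both prediction accuracy and reconstruction quality. The authors demonstrate its practical value by applying it to key EDA tasks such as functional ECO (Engineering Change Orders) and technology mapping, where extensive experiments show significant gains in efficiency and performance over state-of-the-art open-source EDA tools.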
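To make the masked-modeling idea concrete, the following is a minimal, generic sketch of masking node embeddings and scoring a reconstruction only on the masked positions. It is not the authors' implementation: the function names `mask_nodes` and `reconstruction_loss`, the zero-valued `MASK_VALUE`, the `mask_ratio` default, and the use of mean squared error are all assumptions made for illustration, and the summary does not specify how DeepCell masks or reconstructs circuits.

```python
import random

# Hypothetical placeholder written into masked positions (an assumption,
# not taken from the paper).
MASK_VALUE = 0.0


def mask_nodes(embeddings, mask_ratio=0.3, seed=0):
    """Hide a random subset of node embeddings.

    `embeddings` is a non-empty list of equal-length float lists, one per
    circuit node. Returns a masked copy and the sorted masked indices.
    """
    rng = random.Random(seed)
    n = len(embeddings)
    k = max(1, round(n * mask_ratio))  # mask at least one node
    masked_idx = sorted(rng.sample(range(n), k))
    masked = [list(e) for e in embeddings]  # copy so the input is untouched
    for i in masked_idx:
        masked[i] = [MASK_VALUE] * len(embeddings[i])
    return masked, masked_idx


def reconstruction_loss(predicted, target, masked_idx):
    """Mean squared error computed over the masked nodes only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    nodes = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    masked, idx = mask_nodes(nodes, mask_ratio=0.5, seed=1)
    # A real model would predict the hidden embeddings from context;
    # here the masked input itself stands in as a trivial "prediction".
    print(idx, reconstruction_loss(masked, nodes, idx))
```

In a full framework, a learned model would fill in the masked embeddings using the unmasked nodes (and, in a multiview setting, the other view of the circuit), and this loss would drive training.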
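Takeaways, Limitations
• Takeaways:
  ◦ Presents the first framework for PM netlist representation learning.
  ◦ Improves prediction accuracy and reconstruction quality.
  ◦ Improves the efficiency and performance of EDA tasks such as functional ECO and technology mapping.
  ◦ Outperforms existing state-of-the-art open-source EDA tools.
• Limitations:
  ◦ The paper does not mention specific limitations; future research may find room for further improvement.
  ◦ Further research is needed on whether the performance gains on specific EDA tasks generalize to other tasks.
  ◦ More detailed information on the datasets and experimental setup may be needed.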