Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation

Created by
  • Haebom

Authors

Yaowen Hu, Wenxuan Tu, Yue Liu, Xinhang Wan, Junyi Yan, Taichun Zhou, Xinwang Liu

Outline

This paper proposes Complementary Multi-View Neighborhood Differentiation (CMV-ND), a method for deep graph clustering (DGC) on large-scale, real-world attributed graphs with missing attributes. CMV-ND preprocesses the graph's structural information into multiple views that are complete and non-redundant. It does this by fully expanding each node's neighborhood across hop distances with a recursive neighbor search, then applying a neighborhood differentiation strategy that removes nodes shared between different hop representations. Finally, $K+1$ complementary views are constructed from the $K$ differential hop representations together with the features of the target node, and existing multi-view clustering or DGC methods are applied to these views. Experiments on six popular graph datasets show that CMV-ND significantly improves the performance of various methods.
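The preprocessing idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`differential_hops`, `mean_features`, `build_views`), the use of breadth-first search for the neighbor search, and mean aggregation of neighbor features are all assumptions made for illustration, and the handling of missing attributes is left out entirely. The sketch only shows how assigning each neighbor to the hop at which it is first reached gives $K$ non-overlapping hop sets, which together with the node's own features yield $K+1$ views.

```python
from collections import deque


def differential_hops(adj, source, K):
    """Return [N_1, ..., N_K], where N_k holds nodes at exactly k hops from source.

    Each node appears in at most one set, so the hop sets do not overlap.
    """
    dist = {source: 0}
    hops = [set() for _ in range(K)]
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == K:
            continue  # do not expand beyond K hops
        for v in adj[u]:
            if v not in dist:  # first visit gives the shortest hop distance
                dist[v] = dist[u] + 1
                hops[dist[v] - 1].add(v)
                queue.append(v)
    return hops


def mean_features(nodes, X, dim):
    """Average the feature vectors of `nodes` (assumed aggregation); zeros if empty."""
    if not nodes:
        return [0.0] * dim
    return [sum(X[v][j] for v in nodes) / len(nodes) for j in range(dim)]


def build_views(adj, X, K):
    """Build K+1 views: view 0 is each node's own features, view k aggregates hop k."""
    dim = len(next(iter(X.values())))
    views = [dict() for _ in range(K + 1)]
    for node in adj:
        views[0][node] = list(X[node])
        for k, hop in enumerate(differential_hops(adj, node, K), start=1):
            views[k][node] = mean_features(hop, X, dim)
    return views


# Toy path graph 0 - 1 - 2 - 3 with one-dimensional features (hypothetical data).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
X = {0: [0.0], 1: [1.0], 2: [2.0], 3: [3.0]}
views = build_views(adj, X, K=2)
```

In this toy graph, node 1 has the 1-hop set {0, 2} and the 2-hop set {3}. Node 1 itself and its 1-hop neighbors are excluded from the 2-hop set, which is the non-redundancy property the summary describes. Each of the resulting views could then be passed to any multi-view clustering method.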

Takeaways, Limitations

Takeaways:
Improves DGC performance on large-scale real-world graphs with missing attributes.
Shows that complete, non-overlapping multi-view representations can be generated via recursive neighbor search and a neighborhood differentiation strategy.
Offers flexibility through compatibility with various existing multi-view clustering and DGC methods.
Limitations:
The computational complexity of the proposed method is not analyzed. Recursive neighbor search can be computationally expensive, especially on large graphs.
Generalization to other types of graph data needs further validation.
No clear guidance is given on choosing $K$, the number of differential hop representations (which yields $K+1$ views).