- Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees Local graph clustering methods aim to detect small clusters in very large graphs without the need to process the whole graph. They are fundamental and scalable tools for a wide range of tasks such as local community detection, node ranking and node embedding. While prior work on local graph clustering mainly focuses on graphs without node attributes, modern real-world graph datasets typically come with node attributes that provide valuable additional information. We present a simple local graph clustering algorithm for graphs with node attributes, based on the idea of diffusing mass locally in the graph while accounting for both structural and attribute proximities. Using high-dimensional concentration results, we provide statistical guarantees on the performance of the algorithm for the recovery of a target cluster with a single seed node. We give conditions under which a target cluster generated from a fairly general contextual random graph model, which includes both the stochastic block model and the planted cluster model as special cases, can be fully recovered with bounded false positives. Empirically, we validate all theoretical claims using synthetic data, and we show that incorporating node attributes leads to superior local clustering performances using real-world graph datasets. 2 authors · Jan 30, 2023
2 Semantic Random Walk for Graph Representation Learning in Attributed Graphs In this study, we focus on the graph representation learning (a.k.a. network embedding) in attributed graphs. Different from existing embedding methods that treat the incorporation of graph structure and semantic as the simple combination of two optimization objectives, we propose a novel semantic graph representation (SGR) method to formulate the joint optimization of the two heterogeneous sources into a common high-order proximity based framework. Concretely, we first construct an auxiliary weighted graph, where the complex homogeneous and heterogeneous relations among nodes and attributes in the original graph are comprehensively encoded. Conventional embedding methods that consider high-order topology proximities can then be easily applied to the newly constructed graph to learn the representations of both node and attribute while capturing the nonlinear high-order intrinsic correlation inside or among graph structure and semantic. The learned attribute embeddings can also effectively support some semantic-oriented inference tasks (e.g., semantic community detection), helping to reveal the graph's deep semantic. The effectiveness of SGR is further verified on a series of real graphs, where it achieves impressive performance over other baselines. 1 authors · May 10, 2023