TY  - JOUR
AU  - Yang, Liang
AU  - Wang, Chuan
AU  - Gu, Junhua
AU  - Cao, Xiaochun
AU  - Niu, Bingxin
PY  - 2021/05/18
Y2  - 2024/03/29
TI  - Why Do Attributes Propagate in Graph Convolutional Neural Networks?
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
JA  - AAAI
VL  - 35
IS  - 5
SE  - AAAI Technical Track on Data Mining and Knowledge Management
DO  - 10.1609/aaai.v35i5.16588
UR  - https://ojs.aaai.org/index.php/AAAI/article/view/16588
SP  - 4590-4598
AB  - Much effort has been devoted to enhancing Graph Convolutional Networks from the perspective of propagation, under the philosophy that "Propagation is the essence of GCNNs". Unfortunately, propagation has an adverse effect, over-smoothing, which causes performance to drop dramatically. To prevent over-smoothing, many variants have been proposed. However, the propagation perspective cannot provide an intuitive and unified interpretation of how these variants prevent over-smoothing. In this paper, we aim to provide a novel explanation for the question "Why do attributes propagate in GCNNs?", which not only reveals the essence of over-smoothing but also illustrates why GCN extensions, including multi-scale GCN and GCN with initial residual, can improve performance. To this end, an intuitive Graph Representation Learning (GRL) framework is presented. GRL simply constrains each node representation to be similar to its original attributes and encourages connected nodes to possess similar representations (a pairwise constraint). Based on the proposed GRL, existing GCN and its extensions can be shown to be different numerical optimization algorithms, such as gradient descent, applied to the GRL framework. Inspired by the superiority of conjugate gradient descent over ordinary gradient descent, a novel Graph Conjugate Convolutional (GCC) network is presented to approximate the solution to GRL with fast convergence. Specifically, GCC takes the information obtained by the previous layer, which can be represented as the difference between that layer's input and output, as the input to the next layer. Extensive experiments demonstrate the superior performance of GCC.
ER  - 