An information-theoretic framework for resolving community structure in complex networks

Citation: Martin Rosvall, Carl T. Bergstrom (2007) An information-theoretic framework for resolving community structure in complex networks. PNAS (RSS)

doi: 10.1073

The authors develop an information theoretic method for finding structure in networks (undirected, unweighted graphs). Their method attempts to identify modules. A module is a collection of vertices (i.e. nodes) that has high intraconnectivity (many edges connecting vertices in the same module) but low interconnectivity (few edges connecting vertices in different modules). The choice of number of modules and assignment of vertices to module is the solution that minimizes $L(Y) + L(X given Y)$. $L(Y)$ is the description length of the number of modules, assignment of vertices to modules, and connectivity within and between modules. $L(X given Y)$ is the description length of specifying the exact network given the assignment of vertices to modules and connectivity between modules. This procedure is in fact MDL where $L(Y)$ is the description length of the model and $L(X given Y)$ is the likelihood given a particular model. The authors compare their method to the modularity approach of Newman and Girvan on simulated data, a network of journal articles and citations, and the famous Zachary's karate club network. The information theoretic method outperforms the modularity approach in most simulations and demonstrates interesting network structures in the two real data sets.
$\sum_{i=1}^n$