by
Charles F. Mann,
David W. Matula,
and
Eli V. Olinick
This paper appears Social Networks Vol. 30, No. 3, July 2008, pp. 223-234.
We show that good community structures can be obtained by partitioning a social network in a succession of divisive sparsest cuts. A network flow algorithm based on fundamental principles of graph theory is introduced to identify the sparsest cuts and an underlying hierarchical community structure of the network via maximum concurrent flow. Matula (1985) established the Maximum Concurrent Flow Problem (MCFP), and papers on divisive vs. agglomerative average linkage hierarchical clustering (e.g., Matula 1983, Matula 1986, and Thompson 1985) provide the basis for partitioning a social network by way of sparsest cuts and/or maximum concurrent flow. The MCFP extends the Ford-Fulkerson (1956 and 1962) source-sink max-flow problem to all-pairs maximum concurrent flow. The density of an (S, T)-cut is |(S, T)| / ( | S | • | T | ) where |(S, T)| is the number of edges (equivalently, links or bonds) between communities S and T with | S | • | T | being the maximum number of edges possible. The minimum density cut in the network is the sparsest cut. When the edges are weighted, the density is the average weight of the (S, T)-cut edges, with absent edges treated as edges of zero weight. Sparsest cuts (i.e., minimum density) of the remaining components are iteratively determined via linear programming until a multipartite cut is identified that is more constraining to concurrent flow than any sparsest cut. Empirical results on real-world networks with known hierarchical structures and random networks with embedded communities are used to compare this sparsest cut community structure criterion to the weighted Girvan-Newman community structure criterion (e.g., Newman 2004) that is based on edge betweenness centrality. A new measure of accuracy M is defined to evaluate the results of the graph partitionings found by these criteria.
The AMPL run file script.txt gives the sequence of AMPL commands to run our algorithm on the Florentine families network shown in Figure 1 of the paper. To use the run file on a system with AMPL and CPLEX (or another LP solver compatible with AMPL) download the following files:
Questions about running the code should be addressed to Eli Olinick.