This function generates overlapping communities using the OCG (Overlapping Cluster Generator) algorithm.

getOCG.clusters(network, init.class.sys = 3, max.class.card = 0, 
                cent.class.sys = 1, min.class = 2, verbose = TRUE, keep.out = FALSE)

Arguments

network

Either a character string naming the file containing the network as an edge list, or a data frame/matrix object containing the edge list.
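As an illustration of the two object forms, the sketch below builds a small, made-up edge list first as a data frame and then as a character matrix. The node names and column names are invented for the example; either object could then be supplied as network.

```r
# Hypothetical toy network: each row is one edge between two named nodes.
edges_df <- data.frame(
  node1 = c("A", "A", "B", "C", "D"),
  node2 = c("B", "C", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# The same edge list as a two-column character matrix.
edges_mat <- as.matrix(edges_df)

# Both forms hold one edge per row and two columns.
dim(edges_mat)

# With the package that provides getOCG.clusters loaded, a call could look like:
# oc <- getOCG.clusters(edges_df)
```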

init.class.sys

An integer number specifying the Initial Class System: 1 - Maximal Cliques, 2 - Edges, or 3 - Centered Cliques. Defaults to 3.

max.class.card

An integer number specifying the maximum allowed class cardinality. Defaults to 0, which indicates no constraint.

cent.class.sys

A binary value indicating the choice of class system for centered cliques: 0 - the final class system, which requires the expected minimum number of clusters and the maximum cardinality of the final clusters (see min.class and max.class.card), or 1 - the class system that maximizes modularity. Defaults to 1.

min.class

An integer number specifying the minimum number of expected classes. Defaults to 2.

verbose

Logical, whether to display progress of the algorithm to the screen. Defaults to TRUE.

keep.out

Logical, whether to keep the OCG partition intermediate file on disk or not. Defaults to FALSE.
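The value ranges described above can be checked before a long run is started. The following is only a sketch: check_ocg_args is a hypothetical helper, not part of the package, and the lower bound on min.class is an assumption rather than a documented rule.

```r
# Hypothetical helper (not part of the package): checks argument values
# against the ranges described in the argument list.
check_ocg_args <- function(init.class.sys = 3, max.class.card = 0,
                           cent.class.sys = 1, min.class = 2) {
  stopifnot(
    init.class.sys %in% 1:3,  # 1 = maximal cliques, 2 = edges, 3 = centered cliques
    cent.class.sys %in% 0:1,  # 0 = final class system, 1 = maximize modularity
    max.class.card >= 0,      # 0 means no constraint on class cardinality
    min.class >= 1            # assumed lower bound, not stated in the documentation
  )
  invisible(TRUE)
}

# The defaults pass the check.
check_ocg_args()

# A combination for cent.class.sys = 0, with illustrative values only.
ok <- check_ocg_args(cent.class.sys = 0, min.class = 4, max.class.card = 10)

# An out-of-range initial class system is rejected.
bad <- tryCatch(check_ocg_args(init.class.sys = 5), error = function(e) FALSE)
```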

Value

An object of class OCG, which is a list containing the following elements:

numbers

An integer vector with the number of edges, nodes, and communities.

modularity

An integer number specifying the modularity of the network.

Q

A real number specifying the value of Q generated by the OCG algorithm.

nodeclusters

A data frame consisting of 2 columns; the first contains node names, and the second contains single community IDs for each node. All communities and their nodes are represented, but not necessarily all nodes.

numclusters

A named integer vector. Names are node names and integer values are the number of communities to which each node belongs.

igraph

An object of class igraph. The network is represented here as an igraph object.

edgelist

A character matrix with 2 columns containing the nodes that interact with each other.

clustsizes

A named integer vector. Names are community IDs and integer values indicate the number of nodes that belong in each community.
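To show how nodeclusters, numclusters, and clustsizes relate to one another, the sketch below derives the two count vectors from a made-up node-to-community table. The data are invented, and the derivation is inferred from the descriptions above rather than taken from the package internals, so ordering and storage details in a real OCG object may differ.

```r
# Made-up node-to-community assignments; node "b" sits in two communities.
nodeclusters <- data.frame(
  node    = c("a", "b", "b", "c", "d"),
  cluster = c(1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Number of communities each node belongs to (compare numclusters).
tab_nodes <- table(nodeclusters$node)
numclusters <- setNames(as.integer(tab_nodes), names(tab_nodes))

# Number of nodes in each community (compare clustsizes).
tab_clusters <- table(nodeclusters$cluster)
clustsizes <- setNames(as.integer(tab_clusters), names(tab_clusters))

# Nodes that belong to more than one community.
overlapping <- names(numclusters)[numclusters > 1]
```

Nodes with a count above 1 are the ones shared between communities, which is the overlap that distinguishes OCG output from a strict partition.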

References

Becker, E., Robisson, B., Chapple, C.E., Guenoche, A. and Brun, C. (2012) Multifunctional proteins revealed by overlapping clustering in protein interaction network. Bioinformatics 28, 84-90.

Author

Alain Guenoche (main algorithm); ported to R by Alex T. Kalinka (alex.t.kalinka@gmail.com)

Note

For optimal results, the input network must contain at least one connected component (a subgraph in which any two vertices are connected by a path, which is not connected to additional vertices in the supergraph).
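Connectivity of an edge list can be inspected before clustering. The sketch below counts connected components in base R by repeatedly giving both endpoints of every edge the smaller of their two labels until nothing changes. count_components is a hypothetical helper written for this illustration, not a function of the package.

```r
# Hypothetical helper: count connected components of an undirected edge list
# by propagating the minimum label along every edge until labels are stable.
count_components <- function(edges) {
  edges <- as.matrix(edges)
  nodes <- unique(as.character(edges))
  label <- setNames(seq_along(nodes), nodes)
  repeat {
    changed <- FALSE
    for (i in seq_len(nrow(edges))) {
      a <- as.character(edges[i, 1])
      b <- as.character(edges[i, 2])
      m <- min(label[[a]], label[[b]])
      if (label[[a]] != m || label[[b]] != m) {
        label[[a]] <- m
        label[[b]] <- m
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  length(unique(label))
}

# Made-up network with two separate pieces: A-B-C and D-E.
edges <- data.frame(from = c("A", "B", "D"), to = c("B", "C", "E"))
n_comp <- count_components(edges)
```

A result greater than 1 indicates that the network splits into pieces with no path between them.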

Examples

## Generate graph and extract OCG communities.
g <- swiss[,3:4]
oc <- getOCG.clusters(g)
#> Calculating Initial class System....Done
#> Nb. of classes 21
#> Nb. of edges not within the classes 11
#> Number of initial classes 21
#> Running....
#> Remaining classes: 20 of 21 Remaining classes: 10 of 21 Remaining classes: None
#> Reading OCG data...
#> Extracting cluster sizes... 100%