An algorithm for minimizing the energy of the model usually works in the following way: It starts with an initial layout, where the positions of the vertices are randomly assigned. Then, in every iteration, the algorithm tries to improve the layout according to the energy model (by using the first derivation of the energy function to compute a direction and a distance for the movement of each vertex). Since the graphs we have to deal with are usually large (especially software graphs), we cannot afford to use algorithms with complexity in O(|V|2) per iteration. The algorithm of Barnes and Hut [BH86] is in O(|E| + |V| log|V|) per iteration, and is therefore sufficient for our purposes.
The energy model encodes decisions about what is considered to be a good layout. In the following, we briefly introduce some of the energy models that are supported by CCVisu. First give the concrete models, and at the end we explain our generic model that can be instantiated for each of the concrete models.
We use the following notation: A graph G=(V, E) consists of a set of vertices V and a set of edges E ⊆ V(2), where V(2) = {{u,v} | u,v∈ V} contains all sets with two vertices (i.e., undirected edges). A layout p: V → ℜd is a function that assigns to each vertex a position in the d-dimensional space (d ∈ {2, 3} is the number of dimensions, ℜ is the set of real numbers), P is the set of all layouts. An energy model U: P → ℜ is a function that assigns to each layout a real number (the smaller the value the better the layout). For a given layout p and two vertices u and v, the term ||p(u) − p(v)|| denotes the Euclidean distance of the two vertices.
The energy model of Fruchterman and Reingold
was designed to enforce layouts with uniform edge length [FR91].
It requires graph with a large diameter (or graph-theoretic distance),
for which it produces esthetic graph layouts.
The energy for a layout p is defined as:
|
The first term of the sum is interpreted as attraction between connected vertices, because its value decreases when the distance of such vertices decreases. The second term is interpreted as repulsion between all pairs of (different) vertices, because its value decreases when the distance between any two vertices increases.
The vertex-repulsion LinLog energy model
was designed for visual graph clustering,
i.e., to produce layouts that fulfill certain clustering criteria [Noa04a].
It was the first energy model that was explicitly designed
for computing clustering layouts (of software graphs).
This model requires graphs with uniform edge degree to produce good layouts.
The energy for a layout p is defined as:
|
The edge-repulsion LinLog energy model is an extension of the
vertex-repulsion LinLog model to overcome the limitation to graphs of uniform edge degree.
It was successfully used in the initial study
of co-change visualization [BN05a].
Noack provides a detailed introduction and comparison
with the vertex-repulsion LinLog model in the technical report [Noa04b].
For a comparison with the Fruchterman-Reingold model we refer to
Section 7 (or to the technical report [BN05b]).
The energy for a layout p is defined as
(the edge degree of a vertex v is denoted by deg(v)):
|
In this energy model, the second term can be interpreted as repulsion between all pairs of edges (more precisely, between the end vertices of the edges).
The weighted edge-repulsion LinLog model is defined for graphs with weighted edges,
as a straight forward extension of the edge-repulsion LinLog model.
A weighted graph G=(V, E, w) consists of
a set of vertices V,
a set of edges E ⊆ V(2), and
a function w: E → ℜ, which assigns a real number (edge weight) to each edge.
(A graph is a weighted graph with w(e) = 1 for all edges e ∈ E.)
The weighted edge degree of a vertex v is
degw(v) = ∑{u, v} ∈ E w({u, v}).
The energy for a layout p is defined as:
|
The tool CCVisu implements a generalization of the above mentioned energy models by using three parameters, which can be used to adjust the actual energy model by command-line options. (1) Option -attrExp a applies value a as exponent to the distance in the attraction term. (2) Option -repuExp r applies value r as exponent to the distance in the repulsion term if r ≠ 0, and applies the logarithm to the distance in the repulsion term if r = 0. (3) Option -vertRepu eliminates the edge-weight factor in the repulsion term by setting e to value 0. The default values are a=1, r=0, and e=1, i.e., the default energy model is the weighted edge-repulsion energy model. Table 1 describes how to set the parameters for each above-mentioned energy model.
For a weighted graph G=(V, E, w), the energy for a layout p is defined as:
|
with the attraction exponent a ∈ ℜ, the repulsion exponent r ∈ ℜ, and the edge-repulsion factor e ∈ {0, 1}.
The energy models for clustering software graphs —e.g., co-change graphs and call graphs— have to fulfill several clustering criteria, in particular, it should separate clusters and lead to interpretable distances. In difference to other graph-drawing applications, the energy models for software graphs must not enforce uniform edge length, must not be biased to the size of the clusters, and must be normalized to non-uniform degrees of the vertices. That is why we use the weighted edge-repulsion LinLog energy model (or its unweighted version) as the standard energy model. However, the generic energy model allows to express the many important energy models for software graphs in the current implementation, and the tool CCVisu is easy to extend to further variants and completely different energy models.
© Dirk Beyer