3 Input Formats

CCVisu accepts five input formats, each serving a different purpose: the CVS and SVN log-file formats, for extracting the co-change graph from a CVS or SVN repository; the Doxygen XML output format (DOX), for extracting software relations from a source-code directory after Doxygen has processed it; graphs in Relational Standard Format (RSF), which allow CCVisu to be used for arbitrary graphs or previously stored relations; and layouts in text format (LAY), previously computed with CCVisu or a compatible tool.

3.1 CVS/SVN Log File (CVS/SVN)

CCVisu can be used to extract a co-change graph [BN05a], which is an abstraction of the history in a software repository, from either CVS log files in text format (produced with 'cvs log -Nb') or SVN log files in XML format (produced with 'svn log -v --xml'). CCVisu uses the log files either to compute a co-change visualization directly or to dump the co-change information into an RSF file, which can serve as input for another tool (if CCVisu is used as a pure fact extractor) or as input for CCVisu in a later processing phase. The mode of operation depends on the parameter -outFormat.
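
As an illustration of the information contained in an SVN log, the following Python sketch reads the XML produced by 'svn log -v --xml' and collects, for each revision, the paths that were changed together. It is not CCVisu's reader; the function name transactions_from_svn_log and the sample log are hypothetical and serve only to show the structure of the input.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape produced by 'svn log -v --xml'.
SAMPLE_LOG = """<log>
<logentry revision="2">
<author>alice</author>
<date>2005-03-01T10:00:00.000000Z</date>
<paths>
<path action="M">/trunk/A.java</path>
<path action="M">/trunk/B.java</path>
</paths>
<msg>Fix parser</msg>
</logentry>
<logentry revision="1">
<author>bob</author>
<date>2005-02-28T09:00:00.000000Z</date>
<paths>
<path action="A">/trunk/A.java</path>
</paths>
<msg>Initial import</msg>
</logentry>
</log>
"""


def transactions_from_svn_log(xml_text):
    """Map each revision number to the list of paths changed in it."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("logentry"):
        revision = entry.get("revision")
        # Every <path> element below a <logentry> names one changed path.
        result[revision] = [path.text for path in entry.iter("path")]
    return result


if __name__ == "__main__":
    for revision, paths in transactions_from_svn_log(SAMPLE_LOG).items():
        print(revision, paths)
```

Because SVN records each commit as one revision, the changed paths of a log entry can be read off directly, without any recovery heuristic.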

The version control system CVS does not directly record which files were checked in together. The transactions need to be recovered from the logged information about time, user, and log message. The current CVS reader implements the heuristic used in cvs2cl (available at http://www.red-bean.com/cvs2cl): it considers a sequence of file changes as one change transaction if the changes have the same user login, the same log message, and time stamps that differ by at most 180 s (the constant can be adjusted with the parameter -timeWindow). The co-change graph is extracted on file level. However, if a more fine-grained visualization is necessary (e.g., on method level), the techniques used in Rose [ZDZ03] can be integrated as an additional reader. On the other hand, co-change graphs on higher levels (e.g., on package level) can be obtained by applying a technique called 'lifting'.
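
The recovery of transactions can be sketched in Python as follows. This is a minimal illustration, not the code of the CVS reader: the names Change and group_transactions are hypothetical, time stamps are assumed to be given in seconds, and the sketch assumes that the time window is measured between consecutive changes of a group (a sliding window) rather than from the first change of the transaction.

```python
from collections import namedtuple

# Hypothetical record for one logged file change; timestamp is in seconds.
Change = namedtuple("Change", "file user message timestamp")


def group_transactions(changes, time_window=180):
    """Group file changes into change transactions.

    Changes with the same user and log message are sorted by time; a new
    transaction starts whenever the gap to the previous change of the same
    group exceeds time_window seconds (assumed sliding-window reading).
    """
    by_key = {}
    for change in changes:
        by_key.setdefault((change.user, change.message), []).append(change)

    transactions = []
    for group in by_key.values():
        group.sort(key=lambda c: c.timestamp)
        current = [group[0]]
        for change in group[1:]:
            if change.timestamp - current[-1].timestamp <= time_window:
                current.append(change)
            else:
                transactions.append(current)
                current = [change]
        transactions.append(current)
    return transactions


if __name__ == "__main__":
    log = [
        Change("A", "alice", "fix", 0),
        Change("B", "alice", "fix", 100),
        Change("C", "alice", "fix", 400),
        Change("D", "bob", "fix", 50),
    ]
    for transaction in group_transactions(log):
        print([c.file for c in transaction])
```

In the sample, A and B form one transaction, whereas C is separated from B by more than 180 s and D was changed by a different user, so each of them forms a transaction of its own.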

3.2 Doxygen File (DOX)

CCVisu can be used as a fact extractor for Doxygen XML files. For a given software system, Doxygen can be applied to the source-code directory in order to produce a directory of XML files that describe the structure of the software system (the option to produce XML output can be enabled in the Doxygen configuration file Doxyfile).

Given such an XML output directory, CCVisu can be applied to extract an RSF file that contains the most important software graphs, such as the inheritance graph, containment graph, and call graph:

ccvisu.sh -inFormat DOX -i xml/index.xml -outFormat RSF -o DOXgraphs.rsf

The main XML file in this case is xml/index.xml, and the following relations are written to the file DOXgraphs.rsf.

Basic Doxygen relations:

COMPOUND     <compound-kind>    <compound-id>       <compound-name>
MEMBER       <member-kind>      <member-id>         <member-name>
CONTAINEDIN  <member-id>        <compound-id>
BASEDON      <compound-id>      <basecompound-id>   <protection-kind>  <virtual-kind>
REFERSTO     <member-id>        <referredmember-id>
LOCATEDAT    <compound-id>      <file-path>         <line-no>
LOCATEDAT    <member-id>        <file-path>         <line-no>

Derived relations:

REF<kind>    <referrer-name>         <referred-name>
refFile      <referrer-file-name>    <referred-file-name>
refClass     <referrer-class-name>   <referred-class-name>

The relation REF<kind> is derived from relation REFERSTO, where the ids are replaced by their names and the second element is of the category <kind>, e.g., REFvariable, REFfunction, ... The category <kind> can be any of the following: define, property, event, variable, typedef, enum, enumvalue, function, signal, prototype, friend, dcop, slot.

The relation refFile (refClass) is derived from relation REFERSTO, where the ids are replaced by the names of the containing files (classes), i.e., this relation is the ’lifting’ of the REF<kind> relation to the file (class) level.
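
The idea of lifting can be illustrated with a short Python sketch. It is not CCVisu's implementation: the function name lift and the sample ids are hypothetical, the relations are assumed to be available as plain Python collections, and whether references inside one container are kept or dropped is left open here (the sketch keeps them).

```python
def lift(refers_to, contained_in):
    """Lift member-level references to the level of their containers.

    refers_to:    iterable of (member, referred member) pairs
    contained_in: dict mapping each member to its containing class or file
    Returns the set of (container, referred container) pairs.
    """
    lifted = set()
    for source, target in refers_to:
        source_container = contained_in.get(source)
        target_container = contained_in.get(target)
        # Members without a known container cannot be lifted and are skipped.
        if source_container is not None and target_container is not None:
            lifted.add((source_container, target_container))
    return lifted


if __name__ == "__main__":
    refers_to = [("m1", "m2"), ("m1", "m3"), ("m4", "m1")]
    contained_in = {"m1": "ClassA", "m2": "ClassB", "m3": "ClassB", "m4": "ClassB"}
    print(sorted(lift(refers_to, contained_in)))
```

Several member-level references between the same two containers collapse into a single pair, which is why the lifted relation is usually much smaller than the original one.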

3.3 Graph (RSF)

Graphs are provided in RSF format. This format is used to provide co-change graphs, which were previously computed by CCVisu or other extraction tools, or graphs in general (e.g., graphs representing the static structure of a software system, gene expression networks, etc.).

Each line in an RSF file represents an edge in the following format:

<graph name> <source> <target> <weight>

For example, the line

CALL A B 0.5

represents an edge of weight 0.5 between vertices A and B in the graph CALL.
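
A reader for such lines can be sketched in a few lines of Python. The function name parse_rsf_line is hypothetical, and the sketch assumes that the four fields are separated by whitespace, that names contain no whitespace, and that the weight is always present.

```python
def parse_rsf_line(line):
    """Split one RSF line into (graph name, source, target, weight)."""
    parts = line.split()
    if len(parts) != 4:
        raise ValueError("expected 4 fields, found %d: %r" % (len(parts), line))
    graph_name, source, target, weight = parts
    return graph_name, source, target, float(weight)


if __name__ == "__main__":
    print(parse_rsf_line("CALL A B 0.5"))
```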

Graph requirements. Input graphs for force-directed graph layout must be irreflexive (no self-edges) and connected (no isolated subgraphs). The graph must be connected because, for most energy models, the distance between two unconnected vertices is infinite in a layout with minimal energy. A software system consisting of several unconnected components must be visualized using several layouts, one for each component (small unconnected components are usually skipped because they are unimportant).
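
A graph can be checked against these requirements before layout. The following Python sketch drops self-edges and splits the remaining graph into its connected components, largest first, so that each component can be laid out separately. The function name connected_components is hypothetical, edges are assumed to be given as (source, target) pairs, and edge direction is ignored for the purpose of connectivity.

```python
from collections import defaultdict


def connected_components(edges):
    """Return the vertex sets of the connected components, largest first.

    Self-edges are dropped; a vertex that occurs only in self-edges
    therefore does not appear in any component.
    """
    adjacency = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue
        adjacency[source].add(target)
        adjacency[target].add(source)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first search collecting every vertex reachable from start.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            vertex = stack.pop()
            component.add(vertex)
            for neighbour in adjacency[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)

    components.sort(key=len, reverse=True)
    return components


if __name__ == "__main__":
    edges = [("A", "B"), ("B", "C"), ("C", "C"), ("D", "E")]
    for component in connected_components(edges):
        print(sorted(component))
```

A graph satisfies the connectivity requirement exactly when this function returns a single component.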

3.4 Layout (LAY)

For the purpose of transforming a given layout to the VRML or SVG format, or to display the layout on the screen, CCVisu accepts layouts in the layout text format that is described in Section 4.2 (output formats).