VISON: An Ontology-Based Approach for Software Visualization Tool Discoverability

Although many tools have been presented in the research literature of software visualization, there is little evidence of their adoption. To choose a suitable visualization tool, practitioners need to analyze various characteristics of tools such as their supported software concerns and level of maturity. Indeed, some tools can be prototypes for which the lifespan is expected to be short, whereas others can be fairly mature products that are maintained for a longer time. Although such characteristics are often described in papers, we conjecture that practitioners willing to adopt software visualizations require additional support to discover suitable visualization tools. In this paper, we elaborate on our efforts to provide such support. To this end, we systematically analyzed research papers in the literature of software visualization and curated a catalog of 70 available tools that employ various visualization techniques to support the analysis of multiple software concerns. We further encapsulate these characteristics in an ontology. VISON, our software visualization ontology, captures these semantics as concepts and relationships. We report on early results of usage scenarios that demonstrate how the ontology can support (i) developers to find suitable tools for particular development concerns, and (ii) researchers who propose new software visualization tools to identify a baseline tool for a controlled experiment.


I. INTRODUCTION
Complex questions may arise during software development [1]-[4]. Over the last two decades, many software visualizations have been presented in the research literature and shown to be suitable to address some of these questions [5]. However, there is still little evidence of where and how software visualizations are being discovered and adopted by practitioners. To find a suitable tool, practitioners need to examine aspects such as the development tasks supported by the tool, the required execution environment, the level of maturity of the tool, and whether there is a maintenance plan for future improvements and bug fixes. For example, practitioners can be reluctant to adopt prototypical visualization tools, which often have a short lifespan, and more open to adopting tools that belong to long-term projects and are expected to be maintained for a fairly long time.
To address the gap between existing software visualizations and their practical applications, we build on previous studies [6], [7] in which we reviewed the literature of software visualization to collect their characteristics. This article is based on results that were reported in the doctoral thesis of the first author [8]. We updated the review and curated a catalog of 70 publicly available software visualization tools. For each tool in the catalog, we identify (i) the tool's name (e.g., Jive), (ii) software aspects (e.g., behavior, structure, evolution), (iii) software concerns (e.g., execution traces of Java programs), (iv) last update (e.g., 2017), (v) execution environment (e.g., Eclipse plug-in), (vi) employed visualization techniques (e.g., node-link diagram), (vii) display medium (e.g., standard computer screen (SCS), immersive virtual reality (I3D)), and (viii) evaluations (e.g., controlled experiment). Figure 1 shows a Sankey diagram of our catalog of visualization tools introduced in the literature of software visualization. The ontology, which enables both textual and visual search methods, can then be used by practitioners to find suitable software visualizations as well as by researchers who can reflect on the software visualization domain. The ontology can also enable higher-level frameworks to support practitioners to search for software visualization tools.
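The eight characteristics recorded per tool can be pictured as a simple record type. The following is a minimal sketch in Python, not the format of the actual catalog: the class name `ToolRecord` and its field names are assumptions made for illustration, and the sample values merely reuse the examples given in the enumeration above rather than reproducing a real catalog row.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToolRecord:
    """Hypothetical record mirroring characteristics (i)-(viii) of the catalog."""
    name: str                      # (i) tool name
    aspects: List[str]             # (ii) behavior, structure, evolution
    concerns: List[str]            # (iii) keyword description of the concern
    last_update: Optional[int]     # (iv) year of the last update, if known
    environment: str               # (v) execution environment
    techniques: List[str]          # (vi) employed visualization techniques
    media: List[str]               # (vii) display medium, e.g. "SCS" or "I3D"
    evaluations: List[str] = field(default_factory=list)  # (viii) evaluations


# Illustrative values assembled from the "e.g." examples in the text;
# they are NOT claimed to be the catalog entry of any particular tool.
example = ToolRecord(
    name="Jive",
    aspects=["behavior"],
    concerns=["execution traces of Java programs"],
    last_update=2017,
    environment="Eclipse plug-in",
    techniques=["node-link diagram"],
    media=["SCS"],
    evaluations=["controlled experiment"],
)
```

Keeping multi-valued characteristics as lists reflects that a tool may address several aspects or employ several techniques at once.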
Why a software visualization ontology? Ontologies are formal and explicit descriptions of concepts in a domain [9]. Ontologies can help (i) share a common understanding of the structure of information among people or software agents, (ii) reuse domain knowledge, (iii) enforce domain assumptions, (iv) separate domain knowledge from operational knowledge, and (v) analyze domain knowledge. Figure 2 shows an overview of VISON, our software visualization ontology that encapsulates main characteristics of software visualizations. To the best of our knowledge, VISON is the first ontology of software visualizations. We elaborate on lessons learned from developing the ontology, and early results of usability through usage scenarios.
To populate VISON, we built on a set of selected papers of previous surveys of the software visualization literature [6], [7]. Specifically, we scanned each classified design study paper to identify software visualization tools. For each tool that we found, we checked whether the tool is publicly available on the internet. In the end, we curated a catalog of 70 publicly available software visualization tools that we used to populate our ontology.
The main contribution of this paper is twofold: (i) a curated catalog of 70 available software visualization tools and, (ii) a publicly available [10] software visualization ontology.
The remainder of the paper is structured as follows: Section II describes related work that focuses on practical applications of software visualizations and catalogs of software visualization tools. Section III elaborates on the main concepts of ontologies that are addressed in our study. Section IV presents VISON, our software visualization ontology. We first elaborate on a catalog of 70 available software visualization tools, and then we discuss ontology implementation details. Section V demonstrates the ontology through two usage scenarios and discusses threats to validity. Section VI concludes and presents future work.

II. RELATED WORK
We group related work into two main themes. We first discuss research that proposes approaches to practical applications of software visualization tools. Then, we elaborate on studies that present catalogs of software visualization tools.

A. Practical Applications of Software Visualizations
We observe that only a few studies have been carried out to fill the gap between existing software visualizations and their practical applications. For instance, Hassaine et al. [11] elaborate on an approach for generating visualizations to support software maintenance tasks. Sfayhi and Sahraoui [12] describe how to generate interactive visualizations based on descriptions of code analysis tasks. To this end, developers are required to describe the task using a domain-specific language. Grammel et al. [13] investigate how novices construct information visualizations. Based on the analysis of the usage of simple visualizations such as charts and scatter plots, they identify a user's need for information visualization tools. However, we observe that these visualizations provide limited support for the analysis of development concerns. Three other studies [14]-[16] investigate software development tasks for which visualization tools have been proposed. However, we consider that the tasks in these studies are at too high a level for developers to find a visualization appropriate to their particular needs. Merino et al. [17] introduce a metavisualization approach of live visualization example objects annotated with the type of development questions that they can help investigate. In the visualization, developers can identify suitable visualization examples by detecting the surrounding keywords in the tag-iconic cloud-based visualization. Instead, we propose the use of an ontology that can encapsulate the semantics of the characteristics of software visualizations. As opposed to the described studies, our ontology-based approach leverages existing software visualization tools by attempting to provide practitioners a means for discovery.

B. Catalogs of Software Visualization Tools
Some studies examine software visualization tools, in particular, to create guidelines for designing and evaluating software visualizations. For example, Storey et al. [18] examine 12 software visualization tools and propose a framework to evaluate software visualizations based on intent, information, presentation, interaction, and effectiveness. Sensalire et al. [19], [20] classify the features users require in software visualization tools. To this end, they elaborate on lessons learned from evaluating 20 software visualization tools and identify dimensions that can help design an evaluation and then analyze the results. In our investigation, we do not attempt to provide a comprehensive catalog of software visualization tools, but we seek to provide a means to boost software visualization discoverability.
Some other studies present taxonomies that characterize software visualization tools. Myers [21] classifies software visualization tools based on whether they focus on code, data, or algorithms; and whether they are implemented in a static or dynamic fashion. Price et al. [22] present a taxonomy of software visualization tools based on six dimensions: scope, content, form, method, interaction, and effectiveness. Maletic et al. [23] propose a taxonomy of five dimensions to classify software visualization tools: tasks, audience, target, representation, and medium. Schots et al. [24] extend this taxonomy by adding two dimensions: resource requirements of visualizations, and evidence of their utility. Merino et al. [6] add needs as a main characteristic of software visualization tools. In their context, "needs" refers to the set of questions that are supported by software visualization tools. Although we consider these studies crucial for reflecting on the software visualization domain, we think that practitioners may require more comprehensive support to identify a suitable tool. In particular, we believe that the semantics of concepts and their relationships are often missing in taxonomies and other classifications. The use of an ontology enforces the analysis of these relationships, which can play an important role in identifying a suitable visualization tool.

III. ONTOLOGY DESIGN CONSIDERATIONS
An ontology is a formalization of a model to describe what is essential in a domain. That is, the ontology describes the concepts in the domain that can define various properties and restrictions. Hence, an ontology populated with a set of individual instances of the concepts is usually referred to as a knowledge base. However, defining what in the domain is modeled as a concept or an instance is subjective. We opted to follow the widely used guidelines proposed by Noy and McGuinness [25]. We now elaborate on how we addressed their suggested steps to create our software visualization ontology.
Step 1. Determine the domain and scope of the ontology.
• What is the domain that the ontology will cover? Software visualizations.
• For what are we going to use the ontology? To allow 1) developers to find suitable visualizations for their particular concerns and 2) researchers to reflect on the software visualization domain.
• For what types of questions should the information in the ontology provide answers? Questions that identify particular software visualizations that fulfill the restrictions imposed by the context of the developers' needs.
• Who will use and maintain the ontology? Software developers willing to adopt visualizations, and who have used a visualization from the ontology and want to add new supported questions to it. Also, researchers who want to add new data to the ontology for a new or an existing indexed visualization approach.
Step 4. Define the concepts and the concept hierarchy. We opt for a bottom-up development process in which we start from instances of proposed software visualizations. For each, we identify the various concepts involved in its context (e.g., tasks, media, environments, frameworks, questions, evaluation strategies). We define a hierarchy of concepts following an "is-a" relation. When defining the concepts, we avoid creating cycles and validate that sibling concepts (i.e., at the same level in the hierarchy) correspond to the same level of generality.
Step 5. Define the properties of concepts. We characterize the concepts based on their properties. For instance, for the concept medium we define the dimensionality (e.g., 2D/3D) property. Then, when we define particular software visualizations as instances in the ontology, we can specify a medium and its dimensionality. Thus, researchers can use the ontology to investigate, for instance, the correlation between evaluation strategies and visualizations that use visualization techniques of a higher dimensionality displayed on a medium of a lower dimensionality.
Step 6. Define the restrictions of the properties. We only use restrictions to define disjoint concepts.
Step 7. Create instances. We create instances in the ontology for each proposed software visualization in our data set. Thus, visualization tools are the materialization of a combination of property values of concepts.
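Steps 4 to 7 can be sketched in a few lines of code. The following Python fragment is a minimal illustration, not the VISON implementation: all concept names, the choice of which concepts are disjoint, and the helper functions are assumptions introduced for this example. It shows an "is-a" hierarchy with a cycle check (Step 4), a disjointness restriction (Step 6), and a validity check for the concepts assigned to an instance (Step 7).

```python
from typing import Dict, FrozenSet, List, Optional, Set

# Step 4: child -> parent ("is-a"). Concept names are hypothetical.
IS_A: Dict[str, Optional[str]] = {
    "Concept": None,
    "Medium": "Concept",
    "Technique": "Concept",
    "StandardComputerScreen": "Medium",
    "ImmersiveVirtualReality": "Medium",
    "NodeLinkDiagram": "Technique",
}

# Step 6: the only restrictions are disjointness axioms (assumed pair).
DISJOINT: List[FrozenSet[str]] = [frozenset({"Medium", "Technique"})]


def ancestors(concept: str, is_a: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of super-concepts, raising if the hierarchy has a cycle."""
    chain: List[str] = []
    current = is_a[concept]
    while current is not None:
        if current == concept or current in chain:
            raise ValueError(f"cycle detected while walking up from {concept!r}")
        chain.append(current)
        current = is_a[current]
    return chain


def violates_disjointness(types: Set[str]) -> bool:
    """Step 7 check: an instance must not belong to two disjoint concepts."""
    closure = set(types)
    for concept in types:
        closure.update(ancestors(concept, IS_A))
    return any(pair <= closure for pair in DISJOINT)
```

For example, `violates_disjointness({"StandardComputerScreen", "NodeLinkDiagram"})` evaluates to `True` under the assumed axiom, because the two concepts descend from `Medium` and `Technique`, respectively.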

IV. VISON: SOFTWARE VISUALIZATION ONTOLOGY
Certainly, an empty ontology that describes concepts and relationships but has no instances cannot be useful for practitioners. Therefore, before we describe an implementation of our ontology, we elaborate on the systematic approach that we used to populate it. In the following, we describe the process followed to collect a set of relevant software visualization tools and their characteristics from the research literature.

A. Software Visualization Tools
We built on the data sets from the proposed software visualization approaches (presented in previous studies [6], [7]). We reviewed the 387 software visualization papers published in the VISSOFT/SOFTVIS conferences. Since the goal of our investigation is to facilitate the discoverability of software visualization tools, we included in our catalog only software visualization tools that are: (C1) identified with a name and (C2) publicly available on the internet.
We scanned each paper to identify a name for the proposed software visualization approach. Then, we looked for a URL where the tool might be available. In most cases (where we did not find a URL in the paper), we searched the Web using the name of the tool (C1). When we did not find a positive result, we added "visualization" to the search keywords. When we found an available tool (C2), we checked the last time when the tool was updated.
Sometimes, we had to download the tool to look for the date in the files. In the end, we found 70 software visualization tools that fulfill the criteria and that we therefore included in our catalog.
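The selection procedure can be summarized as a small filter. The sketch below is only an illustration of criteria C1 and C2 and of the fallback search keywords; the `Candidate` type and both function names are hypothetical, and the actual Web search was carried out manually.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """Hypothetical record for an approach found in a reviewed paper."""
    name: Optional[str]  # C1: the approach is identified with a name
    url: Optional[str]   # C2: location where the tool is publicly available


def search_queries(name: str) -> List[str]:
    """Queries tried in order: the tool name, then the name plus 'visualization'."""
    return [name, f"{name} visualization"]


def include_in_catalog(candidate: Candidate) -> bool:
    """A tool enters the catalog only if both C1 and C2 hold."""
    return bool(candidate.name) and bool(candidate.url)
```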
To characterize a tool we first identified its name and whether it focuses on the structure, behavior, or evolution of software systems [26]. Then, for the tools in each category, we identified the development concern expressed by the visualization. Instead of describing high-level tasks (e.g., reverse-engineering), we formulated descriptions with the main keywords of the concerns (e.g., "reports that summarize methods execution"), which we think can help developers relate their particular context to the one envisioned by a proposed visualization tool. We also classified the tools based on their execution environment, the employed visualization technique, and the medium used to display them. Finally, we reused the data presented in our previous study [7] to highlight the maturity of tools that have proven effective to support the target task through evaluations.

1) Behavior:
Several visualization tools support teaching various subjects in computer science. ToonTalk [27] comes with a visual language (similar to Scratch [28]) that is to be used on the Web. The tool targets children as an audience. We are not aware of any evaluation of ToonTalk. However, the tool has been maintained over the last twelve years, which shows evidence of maturity. Similarly, Tiled Grace [29] offers a visual representation alternative to the textual mode when programming in the Grace language. Another mature tool is Clack [30], which helps students of network courses understand the behavior of routers. GraphWorks [31] focuses on supporting students of graph theory, although it has not been maintained in the last few years.
Some other tools are available to deal with understanding the execution of programs for testing. The Eclipse plug-in Jive [33] (shown in Figure 3) stands out since it has been maintained for the last eleven years, which is congruent with anecdotal evidence of its adoption. Even though all of these tools are available, almost none of them have been maintained lately. Amongst them, ProfVis [34] is the only one that has proven effective in an experiment. A few others (Jove [35] and Veld Visualizer [36]) have been presented only through usage scenarios. Other tools that, to our knowledge, have not been evaluated are Jive [37], TraceVis [38], and Evolve [39]. Two tools (Beat [40] and Synchrovis [41]) target the analysis of the behavior of concurrent Java programs, whereas the tool Cerebro [42] can be used to identify software features from the runtime data. Three visualization tools support debugging tasks based on the visualization of program behavior. Dyvise [43] supports the detection of memory problems through the visualization of the Java heap. GEM [44] (shown in Figure 4) is a graphical explorer of MPI programs. Gzoltar [45] has shown evidence of effectiveness through an experiment. More recently, SwarmDebugging [46] is an Eclipse plug-in that aims to reuse the knowledge of previous debugging sessions to recommend locations in the code to define breakpoints.
Four visualization tools support various aspects of teaching computer science courses. PlanAni [47] is a program animation system based on the concept of the roles of variables for teaching programming. ALVIS [48] enables algorithm visualization for learning programming. jGrasp [49] is an integrated development environment with visualizations for teaching Java. Jsvee and Kelmu [50] are visualization libraries to help instructors teach aspects of the runtime of a program.
Other visualization tools deal with various particular concerns. LTSView [51] is the oldest one, and it is still being maintained. It supports the visualization of transition systems that model the behavior of a software system. SIFEI [52] and xViZiT [53] focus on the visualization of spreadsheets, whereas regVis [54] deals with the visualization of assembler control-flow based on regular expressions. Method Execution Reports [55] embed word-size graphics in reports of method executions. Kayrebt [56] provides support for activity diagram extraction and a visualization toolset designed for the Linux codebase.
All twenty-eight listed tools that focus on the behavior of software systems are displayed on the standard computer screen.
2) Structure: Various other visualization tools focus on particular concerns. MetaVis [17] can be used to visualize annotated software visualization example objects. Orion-Planning [57] includes visualization for modularization and consistency of software projects. Explen [58] supports the visualization of large metamodels. iTraceVis [59] has shown evidence of being effective to investigate how developers read code through the visualization of their eye gazes. Spartan Refactoring allows automatic code refactoring in the editor. Visuocode [60] supports the navigation and composition of software systems. Some visualization tools are available for supporting architecture tasks such as SeeIT3D [61] and VisMOOS [62]. SolidFX [63], Softwarenaut [64] (shown in Figure 5), Rigi [65], and Barrio [66] are suitable for the analysis of structures and dependencies in object-oriented software systems, AspectMaps [67] supports aspect-oriented programs, and Variability blueprint [68] does so for feature-oriented programs.
Three tools support the visualization of software systems quality based on the analysis of code smells. CodeCity [69] (shown in Figure 6) and CodeMetropolis [70] visualize software metrics based on the city metaphor. StenchBlossom [71] augments the Eclipse source code editor with ambient visualizations. SolidSDD [72] supports visual clone analysis. Explora [73] is a visualization tool for software quality based on the analysis of metrics.
Twenty of the listed tools that focus on the structure of software systems are displayed using the standard computer screen. Only three use immersive virtual reality: PhysVis [76], in which users visualize software metrics shown as a physical particle system, ExplorViz [77], in which developers obtain an overview of the architecture of a system represented as a city, and CityVR [78], which adds interactions and visualization of software metrics and smells.
3) Evolution: A few tools support the visualization of the evolution of hierarchical structures in object-oriented programs such as AGG [79]. SHriMP [80] is the oldest one and has been maintained for twelve years. MetricView [81] presents a UML class diagram in 3D that is augmented with software metrics. Others deal with various concerns. CVSgrab [82] supports the visualization of the evolution of interactions of developers during debugging, whereas Visual Code Navigator [83] and CVSscan [84] (shown in Figure 7) focus on source code changes. DEVis [85] is used to visualize the evolution of technical documents. Object Evolution Blueprint [86] deals with the evolution of object mutations. Flask dashboard [87] supports the visualization of the performance evolution over versions of Web services implemented using the Flask framework for Python. TypeV [88] allows one to analyze the evolution of a system through the visualization of abstract syntax trees. ClonEvol [89] (shown in Figure 8) visualizes the evolution of code clones to improve the quality of systems. Software Evolution Storylines [90] supports the visualization of the interactions between developers during software project evolution.
All the twelve listed tools that focus on the evolution of software systems are displayed on the standard computer screen.

4) Behavior/Evolution/Structure:
Eight approaches correspond to frameworks that can be used to visualize multiple aspects of software systems. Four of them correspond to active projects introduced several years ago. Mondrian [92] is an engine for rapid lightweight visualization, which is currently supported in the Roassal engine [93]. CodeBubbles [94] is an environment that encapsulates code snippets into bubbles that can be reused through composition. Vizz3D [95] is a framework for online configuration of 3D information visualizations that was originally available for Eclipse, and later made available for Visual Studio. CHIVE [96] is a framework for developing, in particular, 3D software visualizations.
GEF3D [97] is a framework for developing 2D/2.5D/3D graphical editors. Graph [98] is a domain-specific language for visualizing software dependencies as a graph (shown in Figure 9). Getaviz [99] and SpiderSense [100] enable the design, implementation, and evaluation of software visualizations.
The framework Getaviz supports visualizations displayed in immersive virtual reality, whereas the seven other frameworks are limited to the standard computer screen.
The characterization presented in Table I contains only part of the content of our data set described in previous publications [6], [7]. Various other characteristics of software visualizations can help developers willing to adopt visualization to find a suitable approach. We believe that an ontology can be suitable to implement such a richer model.

B. Implementation Details
We implement our ontology using Protégé [102], a popular, free, and open-source framework for the design and use of ontologies. In it, we define the concepts (in the tool called classes), properties, restrictions, and instances. Figure 10 shows the classes view in Protégé with a detail of the hierarchy of concepts. As the concept, we selected the name of the tools, which are listed in the right pane. Figure 2 shows an overview of our implementation of the concept hierarchy using the OntoGraf visualization plug-in included in Protégé.
We present the list of metrics available in the Ontology metrics view of Protégé in Table II. Although we are aware that many more individuals and relationships must be added to the ontology to increase its usability, we observe that our current implementation is not small according to a survey of ontology metrics [103] that reported that ontologies on average contain 36.11 classes (standard deviation of 78.53) and 28.13 instances (standard deviation of 97.59).

V. USAGE SCENARIOS
We now demonstrate the ontology through two usage scenarios.

Scenario 1. Find suitable visualization tools that support the analysis of performance issues at runtime.
Two concepts are defined in the specification of this need: (1) the source of the data is the runtime and (2) the problem addressed is the performance of the software system. We translate this specification to the syntax specified by the Web Ontology Language (OWL). Figure 11 shows the resulting query and the suitable tools returned.

Scenario 2. Find visualization tools under a free license that support the analysis of source code.
Similarly, the specification of this need defines two concepts: (1) the license of the tool has to be free and (2) the source of the data must be the source code of the software system. Figure 12 shows the translated specification of the need in the OWL syntax and the suitable tools returned.
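Both scenarios amount to a conjunction of two restrictions evaluated against the instances of the knowledge base. The following Python sketch imitates that logic over a toy data set; it is not the OWL query shown in the figures. The property names (`dataSource`, `problem`, `license`), the tool names, and all values are hypothetical placeholders rather than content of VISON.

```python
from typing import Dict, List, Set

# Toy knowledge base: instance -> property -> set of values (all hypothetical).
KB: Dict[str, Dict[str, Set[str]]] = {
    "ToolA": {"dataSource": {"runtime"}, "problem": {"performance"}, "license": {"free"}},
    "ToolB": {"dataSource": {"source code"}, "problem": {"code smells"}, "license": {"free"}},
    "ToolC": {"dataSource": {"source code"}, "problem": {"architecture"}, "license": {"commercial"}},
}


def find_tools(kb: Dict[str, Dict[str, Set[str]]], **required: str) -> List[str]:
    """Return the instances that satisfy every required property value."""
    return sorted(
        name
        for name, props in kb.items()
        if all(value in props.get(prop, set()) for prop, value in required.items())
    )


# Scenario 1: data comes from the runtime AND the problem is performance.
scenario_1 = find_tools(KB, dataSource="runtime", problem="performance")

# Scenario 2: the license is free AND the data comes from the source code.
scenario_2 = find_tools(KB, license="free", dataSource="source code")
```

Unlike this closed-world filter, an OWL reasoner also takes the concept hierarchy into account, so a query for a general concept would return instances of its sub-concepts as well.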

Threats to Validity
In our paper selection process, we might have overlooked papers from relevant venues that describe important software visualization tools. We mitigated this bias by selecting papers published in the two most frequently cited venues dedicated to software visualization: SOFTVIS and VISSOFT. We selected software visualization papers published between 2002 and 2018 in SOFTVIS and VISSOFT. The excluded papers from other venues or published before 2002 may affect the generalizability of our results. We mitigated bias in the data collection procedure (which could obstruct reproducibility of our investigation) by establishing a protocol to extract the data of each paper equally, and by maintaining a spreadsheet to keep records, normalize terms, and identify anomalies.

VI. CONCLUSION
Although many software visualization approaches have been proposed to deal with various software concerns, usually developers are not aware of tools they can put into action. In this paper, we presented our attempts to fill the gap between existing software visualizations and their practical applications: (1) We presented a curated catalog of 70 available software visualization tools that we linked to their repositories; we classified the tools into various categories (e.g., task, data, environment) to help developers who look for suitable visualizations. (2) We summarized our results in developing VISON, our software visualization ontology.
The ontology offers a rich model to encapsulate the various characteristics of software visualizations. We reported on our experience designing and implementing our ontology of software visualizations in the Protégé tool. We demonstrated how the ontology can be used through usage scenarios. Our ontology is publicly available [10]. We expect the ontology will help developers find suitable software visualizations and researchers to reflect on the field. Users of the ontology will be able to contribute, for instance, by adding characteristics of new visualizations, or by adding the results of evaluations of existing visualizations. In the future, we plan to combine our previous work on metavisualization [17] with VISON.