Test Case Generation

Hybrid Multi-level Crossover for Unit Test Case Generation

State-of-the-art search-based approaches for test case generation work at test case level, where tests are represented as sequences of …

Mitchell Olsthoorn, Pouria Derakhshanfar, Annibale Panichella

Improving Test Case Generation for REST APIs Through Hierarchical Clustering

With the ever-increasing use of web APIs in modernday applications, it is becoming more important to test the system as a whole. In the …

Dimitri Stallenberg, Mitchell Olsthoorn, Annibale Panichella

EvoSuite at the SBST 2021 Tool Competition

EvoSuite is a search-based tool that automatically generates executable unit tests for Java code (JUnit tests). This paper summarises the results and experiences of EvoSuite participation at the ninth unit testing competition at SBST 2021, where EvoSuite achieved the highest overall score.

S. Vogl, S. Schweikl, G. Fraser, A. Arcuri, J. Campos, A. Panichella

Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

Software testing is an important and time-consuming task that is often done manually. In the last decades, researchers have come up …

Mitchell Olsthoorn, Arie van Deursen, Annibale Panichella

Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfalls, and Opportunities

Test smells attempt to capture design issues in test code that reduce their maintainability. Previous work found such smells to be highly common in automatically generated test-cases, but based this result on specific static detection rules; although these are based on the original definition of “test smells”, a recent empirical study showed that developers perceive these as overly strict and non-representative of the maintainability and quality of test suites. This leads us to investigate how effective such test smell detection tools are on automatically generated test suites. In this paper, we build dataset of 2,340 test cases automatically generated by EVOSUITE for 100 Java classes. We performed a multi-stage, cross-validated manual analysis to identify six types of test smells and label their instances. We benchmark the performance of two test smell detection tools: one widely used in prior work, and one recently introduced with the express goal to match developer perceptions of test smells. Our results show that these test smell detection strategies poorly characterized the issues in automatically generated test suites; the older tool’s detection strategies, especially, misclassified over 70% of test smells, both missing real instances (false negatives) and marking many smell- free tests as smelly (false positives). We identify common patterns in these tests that can be used to improve the tools, refine and update the definition of certain test smells, and highlight as of yet uncharacterized issues. Our findings suggest the need for (i) more appropriate metrics to match development practice; and (ii) more accurate detection strategies, to be evaluated primarily in industrial contexts.

Annibale Panichella, Sebastiano Panichella, Gordon Fraser, Anand Ashok Sawant, Vincent Hellendoorn

Good Things Come In Threes: Improving Search-based Crash Reproduction With Helper Objectives

Evolutionary intelligence approaches have been successfully applied to assist developers during debugging by generating a test case reproducing reported crashes. These approaches use a single fitness function called CrashFunction to guide the search process toward reproducing a target crash. Despite the reported achievements, these approaches do not always successfully reproduce some crashes due to a lack of test diversity (premature convergence). In this study, we introduce a new approach, called MO-HO, that addresses this issue via multi-objectivization. In particular, we introduce two new Helper-Objectives for crash reproduction, namely test length (to minimize) and method sequence diversity (to maximize), in addition to CrashFunction. We assessed MO-HO using five multi-objective evolutionary algorithms (NSGA-II, SPEA2, PESA-II, MOEA/D, FEMO) on 124 hard-to-reproduce crashes stemming from open-source projects. Our results indicate that SPEA2 is the best-performing multi-objective algorithm for MO-HO. We evaluated this best-performing algorithm for MO-HO against the state-of-the-art: single-objective approach (SGGA) and decomposition-based multi-objectivization approach (decomposition). Our results show that MO-HO reproduces five crashes that cannot be reproduced by the current state-of-the-art. Besides, MO-HO improves the effectiveness (+10% and +8% in reproduction ratio) and the efficiency in 34.6% and 36% of crashes (i.e., significantly lower running time) compared to SGGA and decomposition, respectively. For some crashes, the improvements are very large, being up to +93.3% for reproduction ratio and -92% for the required running time.

Pouria Derakhshanfar, Xavier Devroey, Andy Zaidman, Arie van Deursen, A. Panichella

Crash Reproduction Using Helper Objectives

Evolutionary-based crash reproduction techniques aid developers in their debugging practices by generating a test case that reproduces a crash given its stack trace. In these techniques, the search process is typically guided by a single search objective called Crash Distance. Previous studies have shown that current approaches could only reproduce a limited number of crashes due to a lack of diversity in the population during the search. In this study, we address this issue by applying Multi-Objectivization using Helper-Objectives (MO-HO) on crash reproduction. In particular, we add two helper-objectives to the Crash Distance to improve the diversity of the generated test cases and consequently enhance the guidance of the search process. We assessed MO-HO against the single-objective crash reproduction. Our results show that MO-HO can reproduce two additional crashes that were not previously reproducible by the single-objective approach.

Pouria Derakhshanfar, Xavier Devroey, Annibale Panichella, Andy Zaidman, Arie van Deursen

EvoSuite at the SBST 2020 Tool Competition

EvoSuite is a search-based tool that automatically generates executable unit tests for Java code (JUnit tests). This paper summarizes the results and experiences of EvoSuite’s participation at the eighth unit testing competition at SBST 2020, where EvoSuite achieved the highest overall score (406.14 points) for the seventh time in eight editions of the competition.

A. Panichella, J. Campos, G. Fraser