2009 International Conference on Knowledge and Systems Engineering

A Parameterized Unit Test Framework Based on Symbolic Java PathFinder

Anh-Hoang Truong and Thanh-Nhan Vu
College of Technology, Vietnam National University
144 Xuan Thuy, Hanoi, Vietnam
Email: {hoangta, nhanvt.mcs07}@vnu.edu.vn

978-0-7695-3846-4/09 $26.00 © 2009 IEEE. DOI 10.1109/KSE.2009.47

Abstract – Parameterized unit testing has recently gained a lot of attention because it reduces testing cost and is more efficient in terms of code coverage. We present a framework for running parameterized unit tests (PUTs) based on Java PathFinder (JPF) and JUnit. Our approach builds on the model checking and symbolic execution of JPF to generate standard unit tests. As a result, we achieve high structural and path coverage. The generated unit tests are executed automatically by JUnit, so programmers are notified of assertion failures immediately. Currently, our approach mainly works with numeric and boolean data types, but the framework can be extended to other data types such as strings.

Keywords – Testing, Parameterized Unit Test

I. INTRODUCTION

There are many examples of the damage caused by software errors, especially now that software is ubiquitous. According to a report by the US National Institute of Standards and Technology, software failures cost the US economy $60 billion per annum. Improvements in software testing infrastructure remain limited, although they could save one-third of this cost [21]. Software testing [1], the most commonly used technique for validating the quality of software, takes about half of the total cost of software development. One reason is that testing is still performed manually in various phases. Automated testing helps developers reduce the cost of producing software and increase its reliability.

Numerous methods have been proposed to support and automate parts of software testing. To produce test data, random testing generates a random stream of bits and passes it to the program as input parameters. The main advantage of this method is its simplicity. It also requires no special machines or computing resources. However, this approach is inefficient, because one path of a program may be executed many times while other paths are rarely executed. In other words, it is difficult to achieve the adequacy criterion [5].

Symbolic execution can resolve the drawbacks of random testing. In this method, the variables that normally contain concrete values are replaced by their symbolic counterparts, which express a range of possible values using symbolic expressions. Based on these symbolic data, the model checker generates an input data set that covers all possible executions of the program. It usually requires an external solver [3] to find solutions for the symbolic expressions.

Model checking has been a popular topic for the last two decades and has recently become widely used to analyse software programs. However, model checking is hard because of the complexity of the code, and it often cannot analyse the program's state space completely because of the large amount of memory required and the path explosion problem. For these reasons, many popular model checkers rely on abstraction to reduce the size of the state space [20]. However, these techniques are not well suited to code that manipulates complex data, as they introduce too many predicates, making the abstraction process inefficient.

Java PathFinder [2] uses model checking [4] without relying on abstraction (which cannot always achieve good code coverage), augmenting it with symbolic execution instead. It can be extended by listening to the events raised whenever a bytecode instruction is executed. We build on this feature to add an extension that runs parameterized unit tests and generates standard JUnit unit tests from them. We then rely on the JUnit test framework to run these generated unit tests. As a result, programmers do not have to write unit tests by hand, and the generated unit tests ensure high path coverage. This is the main contribution of this paper.

The rest of the paper is organized as follows. Section II discusses the symbolic execution of JPF and other background used in the later sections. Section III presents related work. Section IV details our approach. We show experimental results in Section V and conclude in Section VI.

II. BACKGROUND

A. Unit Test and Parameterized Unit Test

Unit Test (UT) [11, 12] is a concept from traditional testing techniques. A unit test is a method that has no parameters and returns void. A UT is used to test a single unit of code. Each UT contains three parts: input values, a sequence of instructions and assertions. A UT fails when any of its assertions is violated or an exception is thrown. The disadvantage of a UT is that it checks only some specific execution paths of a program.

Parameterized Unit Test (PUT) [7, 11, 12] is an improvement on the unit test. A PUT is a unit test with parameters; it accepts different input values passed via those parameters. Usually these input values are generated automatically by a tool.

The relationship between UT and PUT is shown in Figure 1. Traditional UTs can be generalized to PUTs, and PUTs can be instantiated back into UTs.

A symbolic execution tree (SET) can be used to characterize all execution paths of a program. The SET of a program is a (possibly infinite) tree whose nodes are program states during symbolic execution and whose arcs are possible transitions between states. The leaf nodes of a SET whose path condition (PC) is satisfiable represent the final states of the program, and the paths from the root to these nodes represent the different execution paths. Moreover, all feasible execution paths of the program are represented in the SET. Every satisfying valuation of the PC in a leaf node gives a real input; all such inputs drive execution along the same path, and the number of such concrete executions may be infinite.
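To make the relationship between UTs, PUTs and path conditions concrete, the following minimal sketch uses plain Java without JUnit. The class PutVersusUt, the unit under test abs and all method names are hypothetical and are not part of the framework described in this paper. The PUT putAbs receives its input as a parameter, the two UTs are instantiations of that PUT, and pathCondition names the leaf of the symbolic execution tree that a concrete input reaches.

```java
class PutVersusUt {
    // Hypothetical unit under test (not from the paper): two execution paths.
    static int abs(int v) {
        if (v < 0) {
            return -v;
        }
        return v;
    }

    // Path condition of the SET leaf that the concrete input v reaches.
    static String pathCondition(int v) {
        return (v < 0) ? "v < 0" : "!(v < 0)";
    }

    // PUT: a unit test with a parameter; the assertion must hold for every input.
    static void putAbs(int v) {
        int r = abs(v);
        if (r != Math.max(v, -v)) {
            throw new AssertionError("abs(" + v + ") returned " + r);
        }
    }

    // UTs: parameterless instantiations of the PUT, one per path condition.
    static void testAbsNegative() {
        putAbs(-5);
    }

    static void testAbsNonNegative() {
        putAbs(7);
    }

    public static void main(String[] args) {
        testAbsNegative();
        testAbsNonNegative();
        System.out.println(pathCondition(-5) + " and " + pathCondition(7));
    }
}
```

The inputs -5 and -100 satisfy the same path condition and therefore exercise the same path, so in this sketch one UT per path condition is enough to cover every path of abs.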
Figure 1: The relationship between UT and PUT

B. Symbolic Execution

Symbolic execution [6] is a way of executing a program in which the program variables that contain concrete values are replaced by symbolic counterparts that express a range of possible values using symbolic expressions. In symbolic execution, the values of variables and the return values of programs are symbolic expressions over the symbolic inputs. During the execution of a program P, if the value of a variable depends on the input parameters, the machine calculates a symbolic value to replace the concrete value of the variable. Given a variable x, the symbolic value of x takes one of the following forms:
(a) an input symbol;
(b) a formula consisting of symbolic values and operators;
(c) a formula consisting of symbolic values, concrete values and operators.
Operators in symbolic execution include addition (+), subtraction (-), multiplication (*), division (/), etc.

When a program is executed in symbolic mode, concrete types are replaced with the corresponding symbolic types, and concrete operations are replaced with calls to methods that implement the corresponding operations on symbolic expressions. Figure 2 shows an example of symbolic execution. The lower part is the symbolic execution tree of the program above it. In the tree, the numbers outside the boxes are the line numbers of the statements in the program.

In symbolic execution, the state of a single-threaded program consists of three parts: the symbolic values of the expressions; a path condition (PC), which is a set of constraints on the input values that must hold for execution to follow that path; and a program counter, which indicates the next statement to be executed. The PC is a boolean formula over the input variables and describes which conditions must be true in the state. It accumulates the constraints that the inputs must satisfy for an execution to follow the particular path. When working with variables of an array type, the PC needs to include a condition ensuring that there is no out-of-bounds array access.

    public void Swap(int x, int y) {
    1:  if (x > y) {
    2:      x = x + y;
    3:      y = x - y;
    4:      x = x - y;
    5:      if (x - y > 0)
    6:          assert(false);
        }
    }

Figure 2: A simple program and its symbolic execution tree

There are two types of symbolic execution:
• Static symbolic execution: at every branching point, the PC is updated and a constraint solver [3] determines whether the corresponding path is feasible. If a path is not feasible, the execution backtracks to the previous node, so only feasible paths are executed.
• Dynamic symbolic execution: a symbolic execution technique based on dynamic program analysis, in which a program can be executed many times with different values of the input parameters. First, each input parameter is given a random value; the program is executed with these values, constraints are collected during execution, and new constraints are generated automatically based on the collected constraints.

Test input generators use an algorithm to generate a set of input data so that all executable paths are examined. The algorithm is sketched as follows. The program is executed with random parameter values and a given SET depth (the depth bound is useful when the program contains recursion or infinite loops). First, the SET is initialized. Then the symbolic values of variables are calculated from the given values of the input parameters, and path constraints are generated accordingly. When a new constraint is generated, a new node is added to the SET. When the execution finishes, the path in the SET is marked as examined. With concrete values, one concrete path of the SET is examined; a new node is created for each of the other branches and marked as non-examined. The new node stores the corresponding path constraints. After a path is completed, a non-examined node is chosen and a new constraint is built by collecting all the constraints in the nodes on the path from the root of the SET to the chosen node. The new constraint is sent to a constraint solver. The
constraint solver will determine:
• If the constraint is not satisfiable, another non-examined node is chosen and the step above is repeated.
• If the constraint is satisfiable, the constraint solver generates concrete values for the input parameters. These values are used for the next concrete execution.
• The algorithm terminates when all possible paths of the SET have been examined.

All concrete values of the input parameters and the analysis information of the corresponding concrete executions (method summaries) are used for generating UTs and for reporting.

[...] suite with high code and assertion coverage. It performs a systematic white-box program analysis. It learns the program behaviour by monitoring execution traces, and uses a constraint solver to generate new test cases with different behaviours. However, the main drawbacks of Pex are that it can analyse only .NET applications and that only one PUT can be executed in each run.

Some research introduces methods to support symbolic execution in the Java environment by instrumenting Java bytecode [9, 19, 20]. In tools such as JCUTE and CUTE, the original programs are converted to programs in a simpler intermediate language (such as SCIL or Jimple) [17] and augmented with code to support symbolic execution.

TABLE I
EXAMPLES OF CONVERTING A PROGRAM FROM JAVA TO SCIL

    Java             SCIL
    p[i] = q[j];     t1 = p + i; t2 = q + j; *t1 = *t2;

    assert (x > 0); x = 10; if (x > 10) goto L; ERROR; L: x = 10; if (x > 0) x = 1; else x = -1; y = x + y; if (x && y > 0)

Our proposed system has the architecture shown in the figure below. The main component of the system is PUTRunner, which uses a PUTDriver to generate data for the PUTRunner. PUTListener listens to the events generated by PUTRunner to generate unit tests, which are then executed as standard unit tests.

Figure: Architecture of the PUT framework

    // tell JPF to use symbolic bytecode
    +vm.insn_factory.class=gov.nasa.jpf.symbc.SymbolicInstructionFactory
    // use a listener for result reporting
    +jpf.listener=gov.nasa.jpf.symbc.SymbolicListener
    // use Math libraries
    +vm.peer_packages=gov.nasa.jpf.symbc:gov.nasa.jpf.jvm
    // choose a constraint solver
    +symbolic.dp=iasolver
    // name of the symbolic method and type of each parameter: concrete or symbolic
    +symbolic.method=UnitUnderTest(sym#sym#con)
    // name of the class that contains the symbolic method
    Main

Symbolic JPF is a virtual machine, so the class under test must have a main method to start symbolic execution. In the main method, the method to be executed symbolically is called with concrete values. Test classes have no main method, so we add the PUTDriver component so that these classes can be executed in JPF directly. The PUTListener class allows the system to control the execution process, report results and generate UTs.

V. EXPERIMENT

We show here a simple method, Euclid_GCD (Euclid's algorithm for finding the greatest common divisor of two integers), in the class NeedTest that we need to test.

    public int Euclid_GCD (int x, int y) {
        while (x != y) {
            if (x > y) x = x - y;
            if (x [...]

[...] 0 && x [...] 0 && y [...] and avoid the path explosion in symbolic execution (x < 100 && y < 100). Executing this PUT, we obtain the following generated unit tests:

    import junit.framework.TestCase;

    public class GeneratedJUnits_by_PUT extends TestCase {
        private PUT cls_PUT = new PUT();

        public GeneratedJUnits_by_PUT(String name) {
            super(name);
        }

        protected void setUp() throws Exception {
            super.setUp();
        }

        protected void tearDown() throws Exception {
            super.tearDown();
        }

        public void testNeedTest_1_100() { cls_PUT.testEuclid(1, 100); }
        public void testNeedTest_1_1() { cls_PUT.testEuclid(1, 1); }
        public void testNeedTest_3_2() { cls_PUT.testEuclid(3, 2); }
        public void testNeedTest_8_5() { cls_PUT.testEuclid(8, 5); }
        public void testNeedTest_21_13() { cls_PUT.testEuclid(21, 13); }
        public void testNeedTest_55_34() { cls_PUT.testEuclid(55, 34); }
    }

JUnit automatically executes these unit tests, so we immediately see any assertion errors. This is really useful because the generated unit tests cover all paths, and we save the considerable manual work of writing them all by hand, especially for large functions and projects.

VI. CONCLUSION

Based on Symbolic JPF and JUnit, we have proposed a parameterized testing framework for Java and implemented it as a tool. Our experiments show a working tool that saves the manual work of creating test cases and increases coverage while maintaining a compact test suite, in the sense that no two test cases drive the program along the same path. More importantly, our approach is extensible to handle branching conditions over other data types such as strings. We plan to add this functionality in the next release of the tool.

ACKNOWLEDGEMENT

This research was partly supported by Vietnam National University, Hanoi, under the project QGTD.09.02. We thank Binh-Duong Tran (K50) for the initial implementation of the tool.
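As a supplementary illustration of the Euclid_GCD experiment, the following sketch shows how a PUT over Euclid's algorithm can be reduced to one concrete test per execution path. Several parts are assumptions rather than the authors' code: the body of the second branch (y = y - x) and the return statement complete the truncated listing, the assertion inside testEuclid is hypothetical, and the class name, helper names and input bound are chosen for illustration. The sketch does not use symbolic execution or a constraint solver; it enumerates small concrete inputs by brute force and keeps the first input seen for each path, which only mimics the property that no two generated tests follow the same path.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EuclidPutSketch {
    // Unit under test; the second branch body and the return are assumed.
    static int euclidGcd(int x, int y) {
        while (x != y) {
            if (x > y) x = x - y;
            if (x < y) y = y - x;
        }
        return x;
    }

    // Records which branches are taken: 'a' for x > y, 'b' for x < y,
    // and '|' at the end of each loop iteration.
    static String pathOf(int x, int y) {
        StringBuilder path = new StringBuilder();
        while (x != y) {
            if (x > y) { x = x - y; path.append('a'); }
            if (x < y) { y = y - x; path.append('b'); }
            path.append('|');
        }
        return path.toString();
    }

    // Hypothetical PUT: inputs outside the assumed range are skipped.
    static void testEuclid(int x, int y) {
        if (!(x > 0 && y > 0 && x < 100 && y < 100)) {
            return;
        }
        int g = euclidGcd(x, y);
        if (g <= 0 || x % g != 0 || y % g != 0) {
            throw new AssertionError("bad gcd " + g + " for (" + x + ", " + y + ")");
        }
    }

    // Brute-force stand-in for the solver: one representative input per path.
    static Map<String, int[]> onePerPath(int bound) {
        Map<String, int[]> representatives = new LinkedHashMap<>();
        for (int x = 1; x <= bound; x++) {
            for (int y = 1; y <= bound; y++) {
                representatives.putIfAbsent(pathOf(x, y), new int[] {x, y});
            }
        }
        return representatives;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, int[]> entry : onePerPath(10).entrySet()) {
            int[] in = entry.getValue();
            testEuclid(in[0], in[1]);
            System.out.println("testEuclid(" + in[0] + ", " + in[1] + ") path=" + entry.getKey());
        }
    }
}
```

Inputs such as (2, 1) and (4, 2) take the same branches and therefore collapse into a single representative, which keeps the resulting test suite compact. A real run of the framework would obtain its inputs from the constraint solver instead of enumeration.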