
DSpace at VNU: Goal-oriented dynamic test generation


DOCUMENT INFORMATION

Structure

  • Goal-oriented dynamic test generation

    • 1 Introduction

    • 2 The chaining mechanism

    • 3 The extended chaining approach

      • 3.1 Limitations of the chaining approach

      • 3.2 Extended event sequence generation

    • 4 The chaining guided search process

    • 5 Buffer overflow testing

      • 5.1 Buffer overflow checking

      • 5.2 Dynamic symbolic execution-based test generation

      • 5.3 Goal-oriented testing

    • 6 Evaluation

      • 6.1 Subjects

      • 6.2 Methodology

      • 6.3 Experimental results

      • 6.4 Threats to validity

    • 7 Related work

      • 7.1 Symbolic execution

      • 7.2 The chaining approach

      • 7.3 Search-based testing

      • 7.4 Buffer overflow checking

    • 8 Conclusion

    • Acknowledgements

    • References

Content

Information and Software Technology 66 (2015) 40–57

Contents lists available at ScienceDirect
Information and Software Technology
Journal homepage: www.elsevier.com/locate/infsof

Goal-oriented dynamic test generation

TheAnh Do (a), Siau-Cheng Khoo (b), Alvis Cheuk Ming Fong (a), Russel Pears (a,*), Tho Thanh Quan (c)

(a) Auckland University of Technology, 2-14 Wakefield St, Auckland 1010, New Zealand
(b) National University of Singapore, COM1, 13 Computing Drive, Singapore 117417, Singapore
(c) Ho Chi Minh City University of Technology, 268 Ly Thuong Kiet St, Ho Chi Minh City, Viet Nam

Article info

Article history: Received July 2014. Received in revised form 22 April 2015. Accepted 30 May 2015. Available online June 2015.

Keywords: Buffer overflow vulnerabilities; Dynamic symbolic execution; Data and control dependence analysis; Type inference analysis

Abstract

Context: Memory safety errors such as buffer overflow vulnerabilities are one of the most serious classes of security threats. Detecting and removing such security errors are important tasks of software testing for improving the quality and reliability of software in practice.

Objective: This paper presents a goal-oriented testing approach for effectively and efficiently exploring security vulnerability errors. A goal is a potential safety violation, and the testing approach is to automatically generate test inputs to uncover the violation.

Method: We use type inference analysis to diagnose potential safety violations and dynamic symbolic execution to perform test input generation. A major challenge facing dynamic symbolic execution in such an application is the combinatorial explosion of the path space. To address this fundamental scalability issue, we employ data dependence analysis to identify a root cause leading to the execution of the goal and propose a path exploration algorithm to guide dynamic symbolic execution for effectively discovering the goal.

Results: To evaluate the effectiveness of our proposed approach, we
conducted experiments against 23 buffer overflow vulnerabilities. We observed a significant improvement of our proposed algorithm over two widely adopted search algorithms. Specifically, our algorithm discovered security vulnerability errors within a matter of a few seconds, whereas the two baseline algorithms failed even after 30 minutes of testing on a number of test subjects.

Conclusion: The experimental results highlight the potential of utilizing data dependence analysis to address the combinatorial path space explosion issue faced by dynamic symbolic execution for effective security testing.

© 2015 Published by Elsevier B.V.

* Corresponding author. Tel.: +64 921 9999x5344. E-mail address: russel.pears@aut.ac.nz (R. Pears).
http://dx.doi.org/10.1016/j.infsof.2015.05.007
0950-5849/© 2015 Published by Elsevier B.V.

1 Introduction

Automated software testing is increasingly being seen as an important means for improving the quality and reliability of software in industry. It mitigates the hardship of manual testing, which is labor-intensive and error-prone, and alleviates the cost of software testing, which often accounts for around half of the total software development costs. One way of enhancing automated software testing is to automate the process of test input generation. Over the last three decades, considerable research effort has attempted to achieve this goal, ranging from random testing [21], symbolic execution [39], search-based testing [28], and the chaining approach [20, 37] to dynamic symbolic execution [9, 23, 45].

Among these proposed techniques, dynamic symbolic execution has been gaining a considerable amount of attention in current industrial practice [11]. Through the power of the underlying constraint solver, it intertwines the strengths of random testing and symbolic execution to achieve the scalability and high precision of dynamic analysis. One of the most important insights of dynamic symbolic execution is the ability to reduce the execution into a mix
of concrete and symbolic execution when facing complicated pieces of code, which are the real obstacle to classical symbolic execution. The technique has been applied to the testing of many industrial software systems and has uncovered "million-dollar" bugs [5, 26].

While effective, the fundamental scalability issue limiting the capability of dynamic symbolic execution is the combinatorial explosion of the path space, which can be extremely large or even infinite in sizable and complex programs. This phenomenon has been highlighted in several research studies:

• path explosion represents one of the biggest challenges facing symbolic execution, and given a fixed time budget, it is critical to explore the most relevant paths first [14].
• A significant scalability challenge for symbolic execution is how to handle the exponential number of paths in the code [11].
• In theory, systematic dynamic test generation can lead to full program path coverage, i.e., program verification. In practice, however, the search is typically incomplete both because the number of execution paths in the program under test is huge [25].

The impact of this particular limitation of dynamic symbolic execution on the efficiency of software testing is significant. If dynamic symbolic execution is carried out in a way that exhaustively and systematically explores all feasible paths of the program under test, then it often ends up with only small regions of the code explored. Consequently, in practice the objective of achieving high structural coverage is hard to realize using dynamic symbolic execution. More importantly, the capability of detecting errors can be limited since the code harboring errors may not even be exercised. The CheckArray function in Fig. 1 is a good example of this phenomenon. It takes as input an array of 20 elements and checks if all elements equal 25. This yields $2^{20}$ (= 1,048,576)
paths with just 20 symbolic predicates. In practice, this path space explosion problem becomes worse as the input of programs can be a stream of data of very large (or unknown) size [24].

In the attempt to "explore the most relevant paths first" [14], a major challenge arising from path exploration is how, among the far too many program paths, to mine for appropriate paths that quickly achieve the desired testing criteria. Consider the execution of branch (5, 6) in the CheckArray function, for example. The first observation is that this branch does not form any symbolic predicate, as its conditional expression depends on the locally declared variable success; any attempt to flip its alternative branch to trigger its execution will fail. The second observation is that among the 1,048,576 paths, there is only one path that executes all "else" branches at the conditional statement 3 to propagate the desired true value of success down to statement 5 to execute branch (5, 6).

These observations demonstrate the difficulty of developing path exploration algorithms where the execution of code does not depend directly on the symbolic input. Such code is nevertheless common in programming practice. For instance, Cadar et al. [12], when testing a number of medium-sized applications, found that less than 42% of the executed statements depend on the symbolic input. Independently, Binkley et al. [6] studied the testability transformation problem in search-based testing, and observed that the varied use of Boolean-typed variables complicates test input generation and degrades program testability. Of the 23 buffer overflow vulnerabilities in our experimental study, none depends directly on the symbolic input.

To cope with such challenges, we present in this paper an approach to improve the dynamic symbolic execution-based path exploration process in the context of goal-oriented testing. Stated formally: Given a test goal g (e.g., a statement or branch) in the program P, the goal is to find a test input
t with which g is executed.

To begin with, we utilize the chaining approach [20, 37] to form a search mechanism. Particularly, given a test goal to explore, the chaining approach first performs data dependence analysis to identify statements that affect the execution of the test goal, and then uses these statements to create sequences of events that are to be executed prior to the execution of the test goal. The advantage of doing this is twofold: (1) it precisely focuses on the cause of getting the test goal to be executed and (2) it slices away code segments that are irrelevant to the execution of the test goal.

Next, we propose a search algorithm, named Guider, which is driven by the chaining mechanism and utilizes dynamic symbolic execution to perform path exploration for exploring the test goal. Guider distinguishes itself from existing search algorithms in three major aspects: (1) it mitigates the path explosion problem by concentrating on data dependences which truly affect the executability of the test goal; (2) it is able to refine path exploration when the local search space is saturated; and (3) it determines control dependences on the fly and exploits the static program structure to optimize path exploration.

Lastly, we develop a dynamic symbolic execution-based buffer overflow testing framework, named Sebo. Sebo works in two phases. In the first phase, it uses Deputy [13], an advanced type system for pointers, to diagnose potential runtime violations on buffer operations in the program under test. In the second phase, it uses Crest [7], an extensible symbolic execution engine, to perform dynamic symbolic execution for test input generation.

We implemented our proposed algorithm, Guider, in the Sebo framework and conducted experiments against 23 buffer overflow vulnerabilities to evaluate its effectiveness. We observed a significant improvement of Guider over two widely adopted search algorithms in dealing with the path explosion problem to uncover buffer overflow
vulnerabilities.

Fig. 1. The function CheckArray checks if all elements of an input array equal 25. This example is used to illustrate the path explosion problem facing dynamic symbolic execution and the difficulty of developing path exploration algorithms in which the code under test does not directly depend on the symbolic input. It is also used to illustrate the search mechanism in the chaining approach.

This paper makes the following primary contributions:

• A path exploration algorithm that exploits data dependences to perform dynamic symbolic execution for effectively and efficiently uncovering desired test goals.
• A buffer overflow testing framework which incorporates the proposed path exploration algorithm with type inference analysis to detect buffer overflow vulnerabilities in low-level C programs.
• Study results on a benchmark of 23 buffer overflow vulnerabilities to evaluate the capability of the proposed algorithm. The evaluation provides observations about the path exploration characteristics of different algorithms under different time settings and numbers of paths explored in searching for buffer overflow defects.

The rest of the paper is organized as follows. Section 2 introduces the chaining approach. Section 3 describes the extended chaining approach, an extension of the chaining approach to improve the efficiency of data propagation toward exploring test goals. Section 4 presents our proposed search algorithm Guider. Section 5 describes the Sebo testing framework and Section 6 discusses the experimental study. We review related work in Section 7 and conclude the paper in Section 8.

2 The chaining mechanism

The chaining approach [20] was proposed to make use of data dependency to guide the search process. The basic idea is to identify statements leading up to the goal structure, which may influence the outcome of the test goal. These statements are sequences of events that the search process must walk
along to target the test goal. The chaining approach hence can be considered as a slicing technique [49] which simplifies the program by focusing on selected aspects of semantics. What distinguishes the chaining approach from program slicing is the way it projects the execution of a test goal. Slicing takes into account both data and control dependences, which often yields a slice too large to explore. The chaining approach focuses only on data dependences and addresses control dependences on the fly.

We illustrate the core of the chaining approach using again the CheckArray function in Fig. 1, in which the test goal is to explore node 6. For this, the chaining approach first generates an initial event sequence:

$E_0 = \langle (s, \emptyset), (6, \emptyset) \rangle$

where each event is a tuple $e_i = (n_i, C_i)$, where $n_i$ is a program node and $C_i$ is a set of variables referred to as a constraint set. Notation-wise, we refer to the node $n_i$ by $e_i \to n$ and the constraint set $C_i$ by $e_i \to C_i$.

Now suppose the search process fails to find an input array with all elements equal to 25 to execute the test goal, thus failing to move from node 5 to node 6. Node 5 is hence considered to be a problem node. Formally, a problem node refers to a conditional statement for which the search process, within a fixed testing limit, cannot find inputs to execute an intended branch from this node. The chaining approach then performs data dependence analysis with respect to this problem node to identify definition statements that define data for variables used in the conditional expression. In this case, the conditional expression consists of the variable success, which is defined at nodes 1 and 4. Two event sequences, $E_1$ and $E_2$, are constructed accordingly from the initial event sequence $E_0$:

$E_1 = \langle (s, \emptyset), (1, \{success\}), (5, \emptyset), (6, \emptyset) \rangle$
$E_2 = \langle (s, \emptyset), (4, \{success\}), (5, \emptyset), (6, \emptyset) \rangle$

Notice that for every two adjacent events in an event sequence, $e_i = (n_i, C_i)$ and $e_{i+1} = (n_{i+1}, C_{i+1})$, there must exist a path from $n_i$ to $n_{i+1}$ along which all variables in $C_i$ are not redefined.
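The adjacency condition on event sequences can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the control flow graph below is an assumption reconstructed from the node numbers that appear in the event sequences (Fig. 1 itself is not reproduced here), and the names `CFG`, `DEFS`, `has_def_clear_path` and `is_consistent` are hypothetical.

```python
from collections import deque

# Assumed control flow graph of CheckArray, inferred from the text:
#   1: success = true      2: loop condition      3: if (a[i] != 25)
#   4: success = false     5: if (success)        6: test goal
CFG = {"s": [1], 1: [2], 2: [3, 5], 3: [4, 2], 4: [2], 5: [6, "exit"], 6: ["exit"]}
DEFS = {1: "success", 4: "success"}  # node -> variable defined at that node

def has_def_clear_path(cfg, defs, src, dst, constraint):
    """True if some path src -> dst exists on which no node strictly
    between src and dst redefines a variable in `constraint`."""
    blocked = {node for node, var in defs.items() if var in constraint}
    seen = set()
    queue = deque(cfg.get(src, ()))
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        if node in seen or node in blocked:
            continue  # already explored, or this node kills a constrained variable
        seen.add(node)
        queue.extend(cfg.get(node, ()))
    return False

def is_consistent(cfg, defs, sequence):
    """Check the adjacency condition for every pair of neighbouring events."""
    return all(
        has_def_clear_path(cfg, defs, n_i, n_next, c_i)
        for (n_i, c_i), (n_next, _) in zip(sequence, sequence[1:])
    )

E1 = [("s", set()), (1, {"success"}), (5, set()), (6, set())]
E2 = [("s", set()), (4, {"success"}), (5, set()), (6, set())]
```

Under these assumptions both `E1` and `E2` pass the structural check, because a path from node 1 (or node 4) to node 5 can always avoid node 4 by taking the "else" edge at node 3. The check says nothing about the value that is carried, which is why only one of the two sequences actually helps reach the goal.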
Such a path allows the effect of a definition statement to be transferred up to the goal structure. Obviously, sequence $E_2$ cannot help to explore the test goal, as the value of the success variable is false, which leads to the execution of the "else" branch instead. Sequence $E_1$, on the other hand, guides the search process to first reach node 1 from the function entry, which sets the value of the success variable to the desired true value, and then to continue from node 1 to node 5. When moving to node 5, the value of the success variable may be killed at node 4 if branch (3, 4) is executed. If so, the search process is once again guided to change the flow of control at node 3 to execute the "else" branch, which prevents the success variable from being set to the unwanted false value. This guidance is continuously refined throughout the for loop to preserve the constraint set {success} of event (1, {success}) while reaching event (5, $\emptyset$). Eventually, the value of all elements in the input array is altered to 25, providing the desired input to expose the test goal, node 6.

We now formalize the process of creating a new event sequence from an existing sequence E. Let $E = \langle e_1, \ldots, e_{i-1}, e_i, e_{i+1}, \ldots, e_m \rangle$ be an event sequence. Suppose the search process driven by this event sequence guides the execution up to event $e_i$ and a problem node p is encountered between events $e_i$ and $e_{i+1}$. Let d be a definition statement node of problem node p. Two events are generated, $e_p = (p, \emptyset)$ and $e_d = (d, D(d))$, corresponding to the problem node and its definition. A new event sequence is now created by inserting these two events into sequence E. Event $e_p$ is always inserted between $e_i$ and $e_{i+1}$. Event $e_d$, however, may in general be inserted in any position between $e_1$ and $e_p$. Suppose the insertion of event $e_d$ is between events $e_k$ and $e_{k+1}$. The following sequence is then created:

$E' = \langle e_1, \ldots, e_{k-1}, e_k, e_d, e_{k+1}, \ldots, e_{i-1}, e_i, e_p, e_{i+1}, \ldots, e_m \rangle$

Since new events are added to the sequence, the implication of data propagation may be
violated. This results in modifications of the associated constraint sets of the involved events. The update is done in the following three steps:

(1) $C_d = C_k \cup D(d)$
(2) $C_p = C_i$
(3) $\forall j,\ k+1 \le j \le i,\ C_j = C_j \cup D(d)$

In the first step, the constraint set $C_d$ of event $e_d$ is initialized to the union of $D(d)$ and the constraint set of the preceding event $e_k$. This modification ensures that the constraint set $C_k$ of event $e_k$ is preserved up to event $e_{k+1}$ while going through the newly inserted event $e_d$. The second step imposes the same requirement on event $e_p$ by assigning $C_i$ to its constraint set. In the final step, all constraint sets of events between $e_{k+1}$ and $e_i$ are modified by including the variable defined at d. Thus the chaining approach guarantees to propagate the effect of the definition at node d up to the problem node p.

Given this formalization, the search process, when following an event sequence, attempts to adjust the execution to move from one event to another without violating the constraint set in the preceding event. This implies a systematic mechanism to propagate the effect of all possible data flows up to the goal structure. Unfortunately, this implication does not hold. We investigate this phenomenon in depth in the next section by assessing the event sequence generation process employed in the chaining approach.

3 The extended chaining approach

The chaining approach was utilized in this work when we observed that the control dependence information of the program may not be sufficient to guide the search process for finding test inputs to explore high complexity code [20]. By recognizing that search failure may be due to data dependences, the chaining approach employs a backup strategy through the construction of event sequences which may guide the search process to propagate desired data flows to trigger the execution of test goals. The construction of event sequences is achieved by inserting new events which
navigate the search process to take into account last definitions of variables used at problem nodes. Last definitions alone, however, may not be able to provide sufficient guidance toward influencing the outcome at problem nodes.

3.1 Limitations of the chaining approach

To understand the limitations of the chaining approach, consider the function CheckCounter in Fig. 2. The input to this function is an array of 20 elements and the test goal is to explore node 14. Note that the execution of the test goal node 14 requires only half of the array elements to equal 25. Using the chaining approach, the initial event sequence is $E_0 = \langle (s, \emptyset), (14, \emptyset) \rangle$. The search process is unlikely, in its first attempts, to find a test input with only 10 array elements equal to 25 to explore the test goal. The search failure activates the chaining approach to determine node 13 to be the problem node and attempt to refine event sequence $E_0$. By performing data dependence analysis with respect to the counter variable used in the conditional expression of the problem node, the chaining approach appends the two definitions of counter at nodes 8 and 12 to sequence $E_0$ to create the following two event sequences:

$E_1 = \langle (s, \emptyset), (8, \{counter\}), (13, \emptyset), (14, \emptyset) \rangle$
$E_2 = \langle (s, \emptyset), (12, \{counter\}), (13, \emptyset), (14, \emptyset) \rangle$

Event sequence $E_1$ is obviously infeasible since the value of the counter variable being carried by $E_1$ is zero while the desired value to execute branch (13, 14) is 10. Event sequence $E_2$ is feasible, but requires only that the variable counter be incremented once. This is not sufficient to guarantee that an array input can be found with 9 out of the remaining 19 elements equal to 25 to execute the test goal. Consequently, node 13 is again identified as the problem node. The original chaining approach is not designed to deal with this situation; it therefore terminates and reports that node 14 could not be explored.

Furthermore, consider the example in Fig. 3. The CheckArray and CheckCounter functions used in this example are
from Figs. 1 and 2, respectively. Suppose that the test goal is to explore node 19. Initially, the chaining approach creates the event sequence $E_0 = \langle (s, \emptyset), (19, \emptyset) \rangle$. When the search process fails to explore node 19, the chaining approach identifies node 18 to be the problem node and refines $E_0$ by inserting nodes 16 and 17, the only two definition statements of variables valid01 and valid02 used in node 18, into $E_0$ to create the following event sequences:

$E_1 = \langle (s, \emptyset), (16, \{valid01\}), (18, \emptyset), (19, \emptyset) \rangle$
$E_2 = \langle (s, \emptyset), (17, \{valid02\}), (18, \emptyset), (19, \emptyset) \rangle$

Intuitively, when following $E_1$ or $E_2$, the search process can always propagate the definitions at nodes 16 and 17 down to the problem node 18. However, by simply doing so, branch (18, 19) can rarely be executed, since both the valid01 and valid02 variables can carry the false value. This is because there is no guidance encoded in sequences $E_1$ and $E_2$ of what value the CheckArray function at node 16 and the CheckCounter function at node 17 must return. Consequently, the search process, after following either $E_1$ or $E_2$, encounters the same problem node 18 again. It terminates and reports that node 19 could not be explored.

Fig. 2. Code snippet illustrating the limitation of the chaining approach proposed in the work of Ferguson and Korel [20], where the approach is not able to propagate transitive data dependences to influence the execution of the test goal. It is later used to describe the event sequence generation proposal of the extended chaining approach by McMinn and Holcombe [37].

Fig. 3. Code snippet illustrating the limitation of the chaining approach proposed in the work of Ferguson and Korel [20], where the approach is not able to incorporate data dependences to influence the execution of the test goal. It is later used to describe the event sequence generation proposal of the extended chaining approach by McMinn and Holcombe [37].

The failure of the chaining approach in the two examples
above originates from the following two main reasons:

• The chaining approach resolves only one level of data dependences. Obviously, if following an event sequence results in failure, this sequence should be extended to include further transitive data dependences to the problem node to continue exploring the test goal.
• The chaining approach takes each data dependency result in isolation. It is obvious in the second example that event sequence $E_1$ can carry the true value of the valid01 variable to node 18 while the value of valid02 can be false, and vice versa. To address this issue, the sequence generation process should consider all possible combinations of data dependences when creating new event sequences.

These limitations of the chaining approach have been addressed in the work of McMinn and Holcombe [37], called the extended chaining approach. We describe this extended approach in detail in the following section.

3.2 Extended event sequence generation

The key idea behind the extended chaining approach is to take into account the effect of not only direct data dependences but also indirect data dependences in guiding the search process toward exploring a given problem node [37]. Specifically, an extension is made to the event sequence generation process to consider definitions for all variables that can potentially affect the outcome at the problem node. The extended chaining approach enables this through the concept of influencing sets, which capture all variables whose definitions can either directly or indirectly influence the problem node.

The extended event sequence generation process is sketched in Fig. 4 through the GenerateEventSequences procedure. The input to this procedure includes event sequence E, two events e1 and e2, and problem node pb, or GenerateEventSequences(E, e1, e2, pb). The context of calling this procedure is that the search process, while attempting to traverse event sequence E from event e1 to event e2, encountered problem node pb. The procedure performs the event sequence generation process in the following manner.

Initially, at the given input problem node, the influencing set is simply the set of variables used in the conditional expression of the problem node. Paths are traversed backward from the problem node. The influencing set is adapted according to the path taken. So, starting from the current problem node pb, its initial influencing set $I_{pb}$, and the event $e_1 = (n_1, C_1)$ prior to the problem node of the event sequence E, the procedure invokes the CreateEventSequences procedure to traverse the control flow graph of the program under test in a backward manner. In procedure CreateEventSequences, it visits each node pn in prev_nodes, which is simply the set of program nodes connected to the current node by an outgoing edge. For each visited node pn, there are the following four possible scenarios.

Scenario 1: The currently visited node pn is the same as the node n of the prior event e = (n, C) (line 5). If so, the procedure checks if pn is a definition node defining a variable in the influencing set, or $def(pn) \in I$. If this is the case, the influencing set is modified by removing the variable defined at pn (def(pn)) and by adding the variables used at pn (uses(pn)). This is because the effect of def(pn) can no longer affect the problem node as the traversal passes over node pn. However, the variables used at pn can affect the problem node since they are used to compute the value assigned to def(pn). By doing this, the extended chaining approach enables the event sequence generation process to incorporate the effect of transitive data dependences when creating new event sequences. CreateEventSequences then recurses using the visited node pn, the updated influencing set, and the event prior to the input event e.

Scenario 2: The currently visited node pn is a definition node that defines a variable in the constraint set C of the preceding event e. In this case, any path passing through node pn to the next event violates the constraint set C, and therefore the procedure stops going further backward from this node.

Scenario 3: The currently visited node pn does not define any variable in the constraint set, but instead defines a variable in the current influencing set (lines 11–12). In this case, node pn represents either a direct or an indirect data dependence that should be propagated up to the problem node. The procedure then checks if there exists a path from event e to node pn that does not violate the constraint set C. If so, a new event sequence is generated using the original event sequence generation process, as described in Section 2, for the definition node pn. The procedure then recurses using the new influencing set $I' = I \setminus \{def(pn)\}$.

Scenario 4: In this scenario, the procedure simply recurses using the currently visited node pn along with the current influencing set I and event e (line 18).

Note that during the traversal of program nodes in the extended event sequence generation process, a global data structure of search points is used to ensure that the traversal terminates when traversing cyclic paths in the program.

Intuitively, the recursive algorithm for the event sequence generation process presented above is similar to the program slicing algorithm proposed by Weiser [49] in 1981. The key difference between the two techniques is that the event sequence generation process considers only data dependence information, while Weiser's slicing algorithm takes into account both data and control dependences.

We now demonstrate how the use of influencing sets and the extended chaining approach can tackle the limitations encountered by the original chaining approach. Consider again the CheckCounter function in Fig. 2, from which we evaluate how the extended event sequence generation process can help unroll the loop for uncovering the test goal at node 14. Previously, the search process, when traversing event sequence $E_2$, failed to explore the test goal and encountered the problem node 13 again:

$E_2 = \langle (s, \emptyset), (12, \{counter\}), (13, \emptyset), (14, \emptyset) \rangle$

Using the extended event sequence generation process, the influencing set initially is I = {counter}. In traversing the control flow graph backward from the problem node 13, node 12 is encountered. This corresponds to the first scenario in the algorithm, where the path traversal has reached the event (12, {counter}) in event sequence $E_2$. An update is made to the influencing set by removing the variable counter defined at node 12:

$I \leftarrow I \setminus \{def(12)\} = \{counter\} \setminus \{counter\} = \emptyset$

resulting in an empty set, and by adding the used variable counter:

$I \leftarrow I \cup uses(12) = \emptyset \cup \{counter\} = \{counter\}$

This yields the same influencing set I = {counter}. As the path traversal goes further backward from node 12, the presence of the variable counter in the influencing set forces the consideration of its last definitions at nodes 8 and 12 when these nodes are visited. This corresponds to scenario 3, where new event sequences are generated to propagate the effect of indirect data dependences down to the problem node. In particular, the following event sequences are created to guide the search process:

$E_{21} = \langle (s, \emptyset), (8, \{counter\}), (12, \{counter\}), (13, \emptyset), (14, \emptyset) \rangle$
$E_{22} = \langle (s, \emptyset), (12, \{counter\}), (12, \{counter\}), (13, \emptyset), (14, \emptyset) \rangle$

Event sequence $E_{22}$ directs the search process to unroll the loop two times, giving a better value for the counter variable to satisfy the condition counter == THRESHOLD. This process is repeated until event (12, {counter}) occurs THRESHOLD (or 10) times in an event sequence, which eventually executes node 14.

Fig. 4. Recursive procedure for generating event sequences using influencing sets in the extended chaining approach.

Consider again the example in Fig. 3, where previously the search process, when following event sequence $E_1$, failed to explore the test goal at node 19 and encountered the problem node 18 again:

$E_1 = \langle (s, \emptyset), (16, \{val01\}), (18, \emptyset), (19, \emptyset) \rangle$

Now
by using the extended event sequence generation process, the initial influencing set includes all variables being referenced at the problem node, or I = {val01, val02}. In traversing the control flow graph backward from the current problem node, node 17 is encountered. This corresponds to the third scenario, where the currently visited node is a definition that defines the variable val02 in the influencing set. A new event sequence is generated to incorporate the effect of this data dependence into the current event sequence $E_1$ to influence the outcome at the problem node:

$E_{11} = \langle (s, \emptyset), (16, \{val01\}), (17, \{val01, val02\}), (18, \emptyset), (19, \emptyset) \rangle$

and the influencing set is modified by removing the defined variable val02:

$I \leftarrow I \setminus \{def(17)\} = \{val01, val02\} \setminus \{val02\} = \{val01\}$

As the path traversal continues with the influencing set I = {val01}, it finally reaches event (16, {val01}) at node 16. This corresponds to the first scenario, where the influencing set is again updated by removing the variable val01 and by adding the local variable success of the CheckArray function, or I = {success}. The presence of this success variable enforces the consideration of its definitions at nodes 1 and 4 to refine the current event sequence $E_1$ when the traversal visits the function CheckArray. Specifically, the following two event sequences are generated as the result of the third scenario:

$E_{12} = \langle (s, \emptyset), (1, \{success\}), (16, \{val01\}), (18, \emptyset), (19, \emptyset) \rangle$
$E_{13} = \langle (s, \emptyset), (4, \{success\}), (16, \{val01\}), (18, \emptyset), (19, \emptyset) \rangle$

It is important to mention at this stage that the creation of event sequence $E_{11}$ reflects one enhancement of the extended chaining approach, where possible combinations of data dependences are considered to influence the execution of the problem node. Event sequences $E_{12}$ and $E_{13}$ meanwhile represent the other enhancement, where transitive data dependences are propagated to the problem node to trigger its execution. The
above two important features of the extended chaining approach allow event sequences to be continuously refined to obtain more accurate data dependence information for guiding the search process to effectively and efficiently explore test goals. In this specific example, when following either event sequence E11 or E12, the extended event sequence generation process can finally generate the following desired event sequence:

$$\langle (s, \emptyset), (1, \{\mathit{success}\}), (16, \{\mathit{val01}\}), (14, \{\mathit{val01}, \mathit{success}\}), (17, \{\mathit{val01}, \mathit{val02}\}), (18, \emptyset), (19, \emptyset) \rangle$$

The search process, with the guidance of this event sequence, will explore the program under test in the following pattern. Starting from the entry node, the search process is guided to reach node 1, where the Boolean value true is assigned to the local success variable in the CheckArray function. This true value of the success variable is then carried to node 16 of event (16, {val01}), where it is assigned to the val01 variable. From node 16, the search process is guided to explore node 14, where the local variable success of the CheckCounter function is defined as true. This value is then propagated to the assignment statement at node 17 to define the value of the val02 variable. At this stage, both variables val01 and val02 are true, which satisfies the predicate val01 && val02 and executes the test goal node 19. Note that the propagation of the definition at node 1 to node 16 is already presented in Section while the exploration of node 14 requires the search process to unroll the loop in the CheckCounter function THRESHOLD (or 10) times.

The extended chaining approach described in this section forms a core part of our goal-oriented testing approach. The back-and-forth interaction between the search process and the extended event sequence generation procedure GenerateEventSequences(E, e1, e2, pn) provides accurate data dependence information to guide dynamic symbolic execution for effectively and efficiently exploring high
complexity test goals.

4 The chaining guided search process

In this section, we present our proposed search algorithm, Guider, for carrying out goal-oriented test input generation. Guider employs dynamic symbolic execution to perform test input generation and utilizes the extended chaining approach to guide path exploration toward effectively and efficiently exploring given test goals. The entry point of the algorithm is given in Fig. 5. It takes as input a program under test P, a test goal g, and a testing limit limit. The output is a test input t such that the execution of P with t executes g, or null, implying that such a t was not found or that limit expired.

Fig. 5. The proposed goal-oriented dynamic test generation algorithm, Guider. Guider utilizes dynamic symbolic execution to perform test input generation while it explores the path space under the guidance of the (extended) chaining approach as well as the static structure of the program under test.

Fig. 6. Procedure ExploreEventSequence takes as input an event sequence E and a program execution p. It attempts to adjust p in order to find a new program execution p′ such that p′ traverses event sequence E.

The search algorithm uses worklist to keep all unexplored event sequences generated during the search process. Associated with each event sequence is a program execution, which is used to perform path exploration toward traversing the event sequence. Note that traversing an event sequence completely implies that the test goal g was executed, since the last event of every event sequence always refers to g (Section 2). The event sequence traversal is done in each iteration of the while loop by calling the ExploreEventSequence procedure (Fig. 6). This process is repeated until either an input is found that explores the test goal g or the testing limit limit expires.

The core functionality of the algorithm lies in the ExploreEventSequence procedure. Simply put, this procedure performs
a pattern concretization algorithm, where the input event sequence E can be considered to be a target pattern and the input execution p is to be adjusted in order to concretize E. Stated formally:

- Given an event sequence $E = \langle e1, e2, e3, \ldots, em \rangle$ and a program execution p, the goal is to find a program execution p′ on which E is concretized.

This problem can be further reduced to the problem of concretizing every two adjacent events in the event sequence E. To do this, we use e1 and e2 to capture every two adjacent events on E (lines 14–15) and s to iterate over every executed statement on the executed program path PP to inspect a concretization of e1 and e2. We call e2 the target event. An invariant maintained during the concretization inspection is that event e1 has already been concretized (i.e., node n of e1 has been found). The inspection thus aims to reach event e2 (i.e., to find node n of e2) without modifying any variable in the constraint set C of e1. Note that this invariant is satisfied in the beginning, as e1 points to the first event of E, which is the program entry (Section 2), and the path iteration is started at the statement right after the program entry (lines 17 and 19). The constraint set C of event e1 is preserved in order to propagate data definitions up to event e2 and so influence the execution of node n of e2.

Now, going down along the executed path, we inspect every executed statement s and consider the following four possible scenarios.

Scenario 1: The target event e2 is discovered (line 20). This is found by checking if the currently inspected statement s is the program node of e2. If so, we update e1 and e2 to the next two adjacent events of E to continue the concretization process (lines 34–35). In case e2 is the last event of E, the concretization of E has been accomplished on the executed path PP. The algorithm terminates by returning the input executing p, i.e., p(p) (line 32).

Scenario 2: The currently inspected statement s violates the constraint set C of event e1 (line 36). This is found by checking if s is a definition statement that redefines any variable in C. If so, the data propagation encoded in constraint C of event e1 is no longer valid. In this case, we attempt to adjust the current program execution p to avoid the execution of this violating statement s by calling the AdjustWhenViolated procedure (Fig. 7). The AdjustWhenViolated procedure takes as input two events e1 and e2, sequence E, violating statement vs, and program execution p. It goes backward along the executed path PP, from the violating statement vs to the statement where event e1 was discovered, and examines each statement encountered as a candidate for an execution adjustment. Specifically, for each statement b, it checks if the following four conditions hold (lines 64–67): (1) if b is a branch statement, (2) if b is a symbolic predicate, (3) if the violating statement vs is transitively control dependent on branch b, and (4) if the alternative branch of b can reach event e2. The first two conditions, (1) and (2), ensure that b can be flipped, and the last two conditions, (3) and (4), ensure that flipping b to its alternative branch avoids the execution of vs and reaches event e2. If these four conditions together are satisfied, then the flipping is computed by invoking the
SolveAtBranch procedure (Fig. 9). The satisfiability of the flipping yields a new program execution p′ on which the sequence concretization can safely proceed further from the conditional statement c of branch b. The soundness of doing so is guaranteed by two properties: (1) the executed paths PP′ of p′ and PP of p match identically from the program entry up to statement c, and (2) the flipping is restricted to (branch) statements below the statement where event e1 was discovered. These two properties ensure that the sequence concretization result up to c is preserved. In case the path adjustment fails, the RefineEventSequence procedure (Fig. 8) is invoked to refine the current event sequence E (line 76). We describe the sequence refinement procedure below.

Scenario 3: The currently inspected statement s is a branch statement and its alternative branch b has a minimal distance to reach event e2 (lines 44–46). This scenario results from the observation that if the control flow of the current execution p can be changed to execute this minimal distance branch b, it may reach event e2 quickly. For this, the algorithm checks if s is also a symbolic predicate and hence can be flipped (line 47). If so, the flipping is performed by invoking the SolveAtBranch procedure to change the execution of p from s to b (lines 48–53). This scenario represents our algorithm's attempt to optimize path exploration.

Scenario 4: The currently inspected statement s is a branch statement and s cannot reach event e2 (line 55). This is determined by confirming that there does not exist a program path from s to node n of event e2 in the static control flow graph. In this case, procedure RefineEventSequence is invoked to refine the current event sequence E.

Fig. 7. Procedure AdjustWhenViolated takes as input two events e1 and e2, event sequence E, violating statement vs, and program execution p. It attempts to adjust p in order to avoid the execution of the violating statement vs. In the case the path
adjustment fails, it refines the current event sequence E by calling the RefineEventSequence procedure.

Fig. 8. Procedure RefineEventSequence takes as input two events e1 and e2, event sequence E, statement s, and program execution p. It attempts to incorporate direct and indirect data dependences into event sequence E from program nodes that prevented program execution p from reaching event e2. RefineEventSequence calls the GenerateEventSequences procedure to perform the (extended) chaining approach for creating refined event sequences.

Fig. 9. Procedure SolveAtBranch takes as input a branch statement b and a program execution p. It attempts to change the control flow from executing branch b to executing its alternative branch instead. This is achieved by negating the symbolic predicate up to branch b and then solving the corresponding constraint system with the underlying constraint solver (SolveConstraintSystem). The ExecuteProgram procedure performs dynamic symbolic execution to return a new program execution.

Next, we illustrate how the sequence refinement procedure RefineEventSequence works to refine a given event sequence. The input to RefineEventSequence includes two events e1 and e2, event sequence E, statement s, and program execution p. The context of calling this procedure is that the sequence concretization process, while attempting to reach event e2, encountered a statement s where:

• s violated the constraint C of event e1 and the algorithm failed to adjust the program execution p in order to avoid the execution of s (scenario 2), or
• the program execution p, if following s, can no longer reach event e2 (scenario 4).

The main functionality of the RefineEventSequence procedure is to identify problem nodes causing the failure of concretizing sequence E on the executed path PP of execution p. To do this, it inspects all branch statements from node n of event e1 down to statement s, and at each branch b checks its
alternative branch for the following conditions: (1) if event e2 is transitively control dependent on the alternative branch and (2) if the alternative branch has a minimal distance to event e2. The first condition ensures that changing the execution of p from b to its alternative branch can eventually reach e2. The second condition takes into consideration that the execution of the alternative branch can potentially reach event e2 quickly. If these two conditions together are satisfied, then the conditional statement c of the branch is considered to be a problem node used to refine sequence E. The GenerateEventSequences procedure (Section 3) is then invoked to incorporate direct (and indirect) data dependences of c into sequence E to create more refined event sequences. These created sequences are associated with the current execution p before being added into worklist.

Finally, the SolveAtBranch procedure is given in Fig. 9. This procedure attempts to change the input program execution p from branch b to its alternative branch.

5 Buffer overflow testing

In this section, we present Sebo, a dynamic Symbolic Execution-based Buffer Overflow testing framework. The main focus of Sebo is to explore buffer overflow vulnerabilities in C programs by combining type inference analysis and dynamic symbolic execution techniques. The challenging problems motivating the development of Sebo can be summarized as follows:

• Type inference analysis can ensure memory safety by inserting runtime checks to verify potential overflow vulnerabilities on buffer operations. However, it lacks the ability to uncover circumstances under which buffer operations are vulnerable.
• Dynamic symbolic execution is an effective technique to automate test input generation in software testing. However, it suffers significantly from the combinatorial explosion of the path space and is therefore problematic to apply.

Fig. 10. Code snippet illustrating a buffer overflow error. The error happens if the value of the index variable is equal to the size of the array buf. This is known as a one-off-bound array access error.

Fig. 11. Deputy's output inserts a runtime check right before the write operation on the array buf. The inserted check ensures that the value of the index variable must be in the range of [0, 1000) before executing the assignment statement.

The execution mechanism of Sebo is therefore designed to first diagnose potential buffer overflow vulnerabilities using Deputy [13], a type system for pointers, and then to explore actual vulnerabilities using Crest [7], an extensible symbolic execution engine, to perform dynamic symbolic execution for test input generation. To enable this mechanism, Sebo consists of the following design components: buffer overflow checking, dynamic symbolic execution-based test input generation, and goal-oriented testing. In the following subsections, we describe these design components that comprise our proposed framework Sebo for buffer overflow vulnerability detection.

5.1 Buffer overflow checking

Given a C program under test, Sebo uses Deputy [13], a type system for pointers, to enforce memory safety in the program. Memory safety means that all memory accesses are within the bounds of an allocated memory region. Enforcing these safety properties is an important step toward improving C programs, as this eliminates many common security vulnerabilities, such as buffer overflows. In Deputy, all buffer operations are verified by a hybrid type-checking approach. In the first place, Deputy verifies buffer operations by statically reasoning about runtime values of expressions in the program. For buffer operations where static verification is not sufficient, Deputy enforces safety policies by inserting runtime checks. Any violation of a runtime check reveals an error in the program. Programmers can use runtime checks to determine under which circumstances violations occur.

Sebo translates runtime checks into test goals. Each test goal expresses the opposite semantics of a runtime check, e.g., by negating runtime check
conditions. A test goal encodes conditions under which buffer operations are vulnerable. As a result, the goal of exploring buffer overflow errors can now be reduced to finding test inputs to explore test goals.

To illustrate, consider the piece of code given in Fig. 10, which assigns the '?' symbol to the buf array of size N = 1000 at the index index. Deputy verifies that this memory write operation may not be safe and inserts a runtime check right before this write operation (Fig. 11). This check, CLeq, is a pointer access check with the condition (buf + index) + = 1000), then there is a memory access error SEBO_ERROR(). Fig. 12 shows Sebo's output.

Fig. 12. Sebo's output interprets the Deputy runtime check for buffer overflow as follows: if the value of the index variable is equal to or greater than 1000, then there is an array access error.

5.2 Dynamic symbolic execution-based test generation

Sebo uses Crest [7], an extensible symbolic execution engine, to perform test input generation. After the buffer overflow checking phase, the output program (together with a set of test goals) is instrumented for concrete and symbolic execution. In Crest, one can configure search algorithms to carry out path exploration. The output is the test goals explored and the corresponding test inputs.

We now examine path exploration of dynamic symbolic execution with respect to the ability to trigger program defects. In Fig. 10, the given code adds three more paths into the path space, these being:

Path 1: index <
Path 2: index >= ∧ index > 1000
Path 3: index >= ∧ index = ∧ index = 1000

This condition, when associated with the path constraint of path 3, forces the constraint solver to return the solution index = 1000, which uncovers the overflow vulnerability. The use of Deputy is hence important in enabling Sebo to detect buffer overflow vulnerabilities in C programs.

Table 1. An overview of the benchmark selected in the evaluation of the Sebo framework for buffer overflow testing. The benchmark includes 23 test subjects that correspondingly represent 23 buffer overflow vulnerabilities. Column # Loc shows the number of lines of code of each vulnerability while column # Statement gives the number of statements.

Program | Subject | Vulnerability | Test subject | # Loc | # Statement
apache | 1 | CVE-2004-0940 | iter1_prefixLong_arr_bad | 167 | 292
apache | 2 | CVE-2006-3747 | full_bad | 184 | 270
edbrowse | 3 | CVE-2006-6909 | strcmp_bad | 161 | 179
gxine | 4 | CVE-2007-0406 | simp_bad | 58 | 47
libgd | 5 | CVE-2007-0455 | gd_full_bad | 318 | 386
MADWiFi | 6 | CVE-2006-6332 | interproc_bad | 134 | 108
NetBSD-libc | 7 | CVE-2006-6652 | glob3_int_bad | 166 | 200
OpenSER | 8 | CVE-2006-6749 | parse_config_bad | 160 | 266
OpenSER | 9 | CVE-2006-6876 | istrstr_loops_bad | 111 | 119
samba | 10 | CVE-2007-0453 | nonsimp_bad | 102 | 68
SpamAssassin | 11 | BID-6679 | loop_bad | 57 | 91
bind | 12 | CA-1999-14 | simp_bad | 166 | 119
bind | 13 | CVE-2001-0011 | big_bad | 362 | 299
wu-ftpd | 14 | CVE-1999-0368 | prefix_bad | 278 | 383
wu-ftpd | 15 | CVE-1999-0878 | small_invalid | 89 | 86
wu-ftpd | 16 | CVE-2003-0466 | symlinks_bad | 257 | 351
sendmail | 17 | CVE-1999-0047 | mime2_bad | 481 | 544
sendmail | 18 | CVE-1999-0206 | mime_fromqp_ptr_bad | 108 | 97
sendmail | 19 | CVE-2001-0653 | tTflag_bad | 269 | 200
sendmail | 20 | CVE-2002-0906 | parse_dns_reply_cast_bad | 66 | 71
sendmail | 21 | CVE-2002-1337 | crackaddr_bad | 611 | 749
sendmail | 22 | CVE-2003-0161 | prescan_arr_med_test_bad | 115 | 104
sendmail | 23 | CVE-2003-0681 | both_bad | 92 | 104
Total | - | - | - | 4512 | 5133

5.3 Goal-oriented testing

The overall objective of Sebo is to strengthen the buffer overflow detection capability by combining type inference analysis and dynamic symbolic execution techniques. In Sebo, any search algorithm can be employed to perform path exploration for exploring buffer overflow errors. Recognizing the combinatorial explosion problem of the path space experienced by dynamic symbolic execution, the Sebo design reduces the buffer overflow testing task to exploring given test goals in the code rather than exhaustively exploring all feasible program paths. In this context, the development of search algorithms to enhance the efficiency of path exploration over the whole program path space for effectively and
efficiently exploring given test goals is central in the Sebo testing framework. Our search algorithm Guider, proposed in Section 4, is an attempt toward achieving this goal.

6 Evaluation

To evaluate the capability of our proposed search algorithm, Guider, in exploring buffer overflow vulnerabilities, we consider the following research questions:

• In buffer overflow detection, how does Guider perform in comparison with systematic depth-first search?
• In buffer overflow detection, how does Guider perform in comparison with coverage-optimized search?
• How does the time setting influence the capability of search algorithms in buffer overflow detection?
• How does the number of paths explored influence the capability of search algorithms in buffer overflow detection?

In the following subsections, we describe (1) the set of selected test subjects, (2) our evaluation methodology, (3) the experimental results, and (4) the threats to validity.

6.1 Subjects

We use the buffer overflow benchmark developed by Ku et al. [31] to conduct the evaluation. The benchmark consists of 23 buffer overflow vulnerabilities, extracted from historic vulnerabilities in 12 widely used applications written in the C programming language. Most of the vulnerabilities come from the Common Vulnerabilities and Exposures (CVE) database [15]. They cover a wide variety of overflow errors with sufficient complexity to make analysis challenging. Each vulnerability comes with a set of different types of programs, representing different levels of difficulty of vulnerability exploration. We selected programs ranging in size from 50 to 650 lines of C code to perform the experiments. Table 1 gives an overview of these selected test subjects. Note that an application may have several test subjects, and each test subject represents a particular buffer overflow vulnerability. For example, the FTP server daemon wu-ftpd has three test subjects, these being 14, 15 and 16; and these test subjects correspond to vulnerabilities CVE-1999-0368,
CVE-1999-0878 and CVE-2003-0466, respectively.

6.2 Methodology

Using the same set of test subjects, we compared the buffer overflow detection capability of our proposed search algorithm Guider with that of two widely adopted search algorithms: depth-first search (or Dfs) and control-flow graph directed search (or CfgDirected). Dfs [23] offers a systematic search mechanism to exhaustively explore all feasible program paths, and much research has confirmed its ineffectiveness in practical situations due to the combinatorial explosion of the path space. We therefore evaluated to what extent Guider can mitigate the path explosion problem suffered by Dfs when exploring overflow errors. In contrast to Dfs, CfgDirected [7] is an example of a coverage-optimized approach. This search algorithm makes use of the static control flow graph of the program to guide the search down short static paths to unexplored code. Burnim and Sen [7] showed that CfgDirected was the best search algorithm to maximize structural coverage. We therefore evaluated Guider against CfgDirected, since recent approaches tend to strengthen error detection capabilities by improving structural coverage, for instance Klee [9] and Sage [24]. Notice that this comparison makes sense in the context of our Sebo testing framework, where a buffer overflow error is encoded as a statement in the program.

For the time setting, we used four time limits (1 min, 5 min, 10 min, and 30 min) and conducted the experiments using the following strategy. With a given search algorithm, we started with the first time limit (1 min) and ran it on each test subject. For each test subject, we captured: (1) if the buffer overflow vulnerability was uncovered; (2) the number of test goals explored; (3) the amount of time spent; and (4) the number of paths explored. For all the test subjects on which the search algorithm could not uncover vulnerabilities, we conducted the
experiments again but on the next time limit (5 min). This procedure was repeated for the 10 min and 30 min time limits, respectively. By doing this, we could study the impact of time on the capability of search algorithms in exploring buffer overflow errors. Note that we stopped a search algorithm when it had explored all test goals of the current test subject and recorded the time used. If at least one test goal was explored, we considered the vulnerability to have been uncovered and did not move to larger time limits. This was done with the understanding that (1) if an error is detected, fixing it may prevent other errors from happening; and (2) due to the limitation of type inference analysis, not all test goals in Sebo represent real buffer overflow errors (i.e., they can be false positives and therefore unreachable).

All experiments were run on GHz Core™2 Duo CPU with GB of RAM, running Ubuntu GNU/Linux with a 32-bit kernel version 3.2.0–38. In the Sebo approach, a test subject is first analyzed to enforce memory safety by means of runtime checks. These runtime checks were then interpreted in the form of test goals, and the testing task was to run search algorithms that perform dynamic symbolic execution to expose test goals.

6.3 Experimental results

Table 2 summarizes the experimental results we obtained after the first testing time limit (1 min). Dfs was able to successfully uncover buffer overflow vulnerabilities for 13 of the 23 test subjects. Remarkably, for 10 of these uncovered vulnerabilities, Dfs accomplished the path exploration process within 0 or 1 s. A manual investigation of the 10 subjects showed that all the test subjects had only one test goal and the structure of the program leading to the execution of the test goal was amenable to depth-first orders. This enabled Dfs to uncover the test goal within a small number of path explorations. For the other 3 cases, Dfs timed out while attempting to uncover all the test goals of each test subject. For the remaining cases, 10 out
of 23, Dfs failed to uncover the vulnerability within the given testing time limit.

CfgDirected uncovered 7 of 23 buffer overflow vulnerabilities, demonstrating a less effective performance than Dfs for the first testing time limit. Amongst these cases, CfgDirected was able to terminate the path exploration process within 5 s for 5 cases; for the other 2 cases, it timed out. Noticeably, for several test subjects where Dfs could discover overflow errors within a matter of a few seconds, CfgDirected could not, even though it explored a considerable number of paths. The CfgDirected algorithm failed to uncover 16 cases within 1 min of testing.

Guider demonstrated a significantly improved performance over both Dfs and CfgDirected. It discovered in total 22 buffer overflow vulnerabilities. The only case that Guider failed to uncover after one minute of testing was sendmail/CVE-2001-0653. Amongst the 22 cases uncovered, Guider timed out in 4 cases while attempting to explore all test goals. For the other 18 cases, Guider optimized path exploration to uncover vulnerabilities within relatively small numbers of paths explored. In the case of wu-ftpd/CVE-2003-0466, Dfs took 1 s and explored 56 paths to discover the vulnerability, while Guider spent 3 s and explored 661 paths to achieve the same result.

Table 2. Testing results of Dfs, CfgDirected, and Guider when performed on the benchmark of 23 buffer overflow vulnerabilities in the 1 min testing limit. The testing result has the format: [Y|N]/(X/Y)/T/P, where Y means the vulnerability was uncovered and N otherwise, X is the number of test goals explored, Y is the total number of test goals of the test subject, T is the amount of time spent (measured in seconds), and P is the number of paths explored.

Program | Subject | Dfs | CfgDirected | Guider
apache | 1 | N/(0/3)/60/2741 | Y/(3/3)/5/1587 | Y/(3/3)/1/51
apache | 2 | N/(0/1)/60/997 | N/(0/1)/60/22,041 | Y/(1/1)/0/29
edbrowse | 3 | N/(0/1)/60/668 | N/(0/1)/60/5874 | Y/(1/1)/1/36
gxine | 4 | Y/(1/1)/0/17 | N/(0/1)/60/21,747 | Y/(1/1)/0/17
libgd | 5 | N/(0/1)/60/15,843 | Y/(1/1)/5/1661 | Y/(1/1)/6/1050
MADWiFi | 6 | Y/(1/2)/0/1 | Y/(1/2)/60/18,057 | Y/(1/2)/0/1
NetBSD-libc | 7 | N/(0/1)/60/1477 | N/(0/1)/60/19,186 | Y/(1/1)/1/37
OpenSER | 8 | N/(0/1)/60/13,240 | N/(0/1)/60/19,810 | Y/(1/1)/0/53
OpenSER | 9 | N/(0/1)/60/17,792 | N/(0/1)/60/21,221 | Y/(1/1)/0/23
samba | 10 | Y/(1/1)/0/17 | N/(0/1)/60/21,515 | Y/(1/1)/0/18
SpamAssassin | 11 | Y/(1/1)/0/199 | Y/(1/1)/0/22 | Y/(1/1)/0/15
bind | 12 | Y/(1/1)/0/2 | Y/(1/1)/0/6 | Y/(1/1)/0/4
bind | 13 | Y/(1/1)/0/30 | N/(0/1)/60/20,820 | Y/(1/1)/0/33
wu-ftpd | 14 | Y/(1/1)/0/40 | N/(0/1)/60/19,780 | Y/(1/1)/1/50
wu-ftpd | 15 | Y/(1/1)/1/28 | N/(0/1)/60/21,688 | Y/(1/1)/0/28
wu-ftpd | 16 | Y/(1/1)/1/56 | N/(0/1)/60/20,683 | Y/(1/1)/3/661
sendmail | 17 | Y/(2/7)/60/8693 | Y/(4/7)/60/12,855 | Y/(4/7)/60/5254
sendmail | 18 | Y/(2/3)/60/3270 | N/(0/3)/60/21,700 | Y/(2/3)/60/7003
sendmail | 19 | N/(0/1)/60/11,860 | N/(0/1)/60/18,239 | N/(0/1)/60/3360
sendmail | 20 | Y/(1/1)/0/4 | Y/(1/1)/0/4 | Y/(1/1)/0/5
sendmail | 21 | N/(0/28)/60/647 | N/(0/28)/60/2841 | Y/(1/28)/60/1483
sendmail | 22 | N/(0/3)/60/1908 | N/(0/3)/60/11,417 | Y/(2/3)/60/2620
sendmail | 23 | Y/(1/2)/60/10,685 | N/(0/2)/60/20,342 | Y/(2/2)/0/42

In summary, after the first testing time limit, Dfs failed in 10 cases, CfgDirected in 16 cases, and Guider in just 1 case to uncover buffer overflow vulnerabilities.

We continued the experiments on these failing cases with the extended testing time limits. The purpose of doing this was to determine if the capability of search algorithms in detecting buffer overflow vulnerabilities is improved through the ability to explore many more program paths when the testing time is expanded. We started with 5 min, moved to 10 min, and finally 30 min. In consideration of the relatively small size of the selected test subjects, we decided to stop after 30 min. The experimental results are given in Tables 3–5. Note that subjects in the "Subject" columns refer to the corresponding vulnerabilities in Table 1, and the figures are the numbers of paths explored by search algorithms within the given testing time limits.

Table 3. Numbers of explored paths by Dfs on 10 buffer overflow vulnerabilities after 5 min, 10 min, and 30 min of testing.

Program | Subject | 5 min | 10 min | 30 min
apache | 1 | 8229 | 16,348 | 47,779
apache | 2 | 1805 | 2568 | 5705
edbrowse | 3 | 3159 | 6114 | 18,299
libgd | 5 | 74,053 | 144,913 | 146,631
NetBSD-libc | 7 | 4985 | 9354 | 25,168
OpenSER | 8 | 63,037 | 123,228 | 357,298
OpenSER | 9 | 82,795 | 162,538 | 466,363
sendmail | 19 | 56,872 | 108,805 | 313,130
sendmail | 21 | 2497 | 4522 | 25,496
sendmail | 22 | 2999 | - | -

Table 4. Numbers of explored paths by CfgDirected on 16 buffer overflow vulnerabilities after 5 min, 10 min, and 30 min of testing.

Program | Subject | 5 min | 10 min | 30 min
apache | 2 | 105,970 | 209,983 | 621,714
edbrowse | 3 | 9793 | 19,561 | 108,802
gxine | 4 | 105,717 | 208,689 | 621,818
NetBSD-libc | 7 | 91,535 | 179,648 | 549,749
OpenSER | 8 | 93,380 | 185,361 | 587,953
OpenSER | 9 | 98,522 | 200,057 | 598,077
samba | 10 | 105,025 | 211,813 | 623,439
bind | 13 | 96,970 | 198,744 | 574,827
wu-ftpd | 14 | 92,692 | 186,849 | 563,949
wu-ftpd | 15 | 101,617 | 205,735 | 614,851
wu-ftpd | 16 | 94,752 | 196,361 | 561,692
sendmail | 18 | 99,612 | 204,932 | 607,686
sendmail | 19 | 85,548 | 172,897 | 545,163
sendmail | 21 | 19,227 | - | -
sendmail | 22 | 56,374 | 114,811 | 350,075
sendmail | 23 | 97,270 | 194,751 | 597,552

Table 5. Numbers of explored paths by Guider on 1 buffer overflow vulnerability after 5 min, 10 min, and 30 min of testing.

Program | Subject | 5 min | 10 min | 30 min
sendmail | 19 | 8424 | 10,244 | 10,348

After 5 min of testing, Dfs (Table 3) timed out in all 10 cases. On the sendmail/CVE-2003-0161 subject, however, it was able to uncover the buffer overflow vulnerability. When the remaining failing cases were tested on the extended time limit (10 min), none could be explored by this search algorithm. When the time limit was set to 30 min, Dfs discovered the buffer overflow error of libgd/CVE-2007-0455 only after 606 s, with 146,631 paths explored. For CfgDirected (Table 4), of the 16 failing cases, only one, sendmail/CVE-2002-1337, was uncovered after 5 min. For the remaining 15 cases, on both the 10 min and 30 min time limits, CfgDirected was not able to uncover any cases. Remarkably, with increases in the testing time we witnessed a massive increase in the number of program paths explored by both Dfs and CfgDirected. However, looking for a path to trigger the execution of buffer overflow errors is problematic for these search algorithms even though we
were testing programs with a few hundred lines of code.

Guider (Table 5) failed to uncover the buffer overflow vulnerability of its only failed case in all time limits. Note, however, that on this test subject both Dfs and CfgDirected also failed. This is because on this test subject, sendmail/CVE-2001-0653, Deputy is able to trigger the buffer overflow error but it is not able to trigger the integer underflow error, which is the root cause leading to the buffer overflow. Dynamic symbolic execution that only focuses on path exploration is not sufficient to uncover this buffer overflow vulnerability within the time limits allocated.

We conclude the experiments with the following observations. First, our proposed search algorithm Guider demonstrated very encouraging results over both Dfs and CfgDirected in both the capability to uncover buffer overflow errors and the capability to optimize the path exploration. For several test subjects, Guider could explore the buffer overflow errors within a matter of a few seconds, while Dfs and CfgDirected both failed even after 30 min of testing. This demonstrates the efficiency of using data dependence analysis in guiding dynamic analysis. Second, although the time setting does influence the capability of search algorithms in buffer overflow detection, its impact is relatively small. Finally, exploring more paths has very little impact on improving the buffer overflow detection capability.

6.4 Threats to validity

We evaluated our search algorithm using only one benchmark of 23 relatively small-sized programs. The evaluation setting used two baseline search algorithms and four testing time limits. It is possible that other programs and other evaluation settings would exhibit different results. We acknowledge that the search algorithm is currently an experimental study and that we should extend the proposed approach and conduct experiments on large programs to properly assess the validity of our proposal and observations. We, however, believe that when testing
sizeable and complex programs, where the path space is too large to explore exhaustively, the ability of our proposed approach to break down the path space and to guide path exploration precisely by focusing on selected aspects of program semantics is essential for reducing the very high cost of dynamic symbolic execution and thereby strengthening security vulnerability detection.

Also, as part of future work, we will attempt to enhance the evaluation methodology to analyse how the different aspects of the proposed approach contribute to the obtained results. For example, an assessment of how effectively indirect data dependences affect the exploration of buffer overflow vulnerabilities, in comparison with direct data dependences, would give insight into the proposed search technique. Additionally, measurements of how quickly the use of minimal distance helps reach and uncover vulnerabilities may give useful insight into the use of the static program structure.

7 Related work

The work presented in this paper is a further development of our previous work [19], where an initial study was conducted to examine the potential of using data dependency analysis to guide dynamic symbolic execution. The previous work aimed to improve structural program coverage. This work improves upon it with several core enhancements. Firstly, we have now implemented whole-program data dependency analysis [49] and, notably, whole-program dynamic control dependency analysis [52] to enable our path exploration algorithm Guider to deal with more complex test subjects. Secondly, the current working version of Guider is the result of further developing, refining, and formalizing the previous approach to enable unfolding deep and nested loop structures for testing challenging defects such as buffer overflow vulnerabilities. Finally, considerable effort has
been made to finalize the buffer overflow testing framework, in which potential memory safety violations identified by Deputy [13] are transformed into test goals and Guider performs path exploration based on dynamic symbolic execution to actively search for these test goals. The rest of this section compares the current Guider approach with other related work.

7.1 Symbolic execution

Symbolic execution was introduced in the 1970s [8, 30] and has received increasing industrial and academic interest during the last decade [11, 39]. A significant scalability challenge when applying symbolic execution is how to handle the combinatorial explosion of the path space. Approaches that favor depth-first exploration, such as Dart [23] and Cute [45], dive deeply into the program path space but lack the ability to steer execution toward further unexplored control flow points. When run on large programs for a finite time, these approaches often end up exploring only small regions of the code, failing to uncover errors lurking in the unexplored code.

In the attempt to deal with the path explosion problem, novel techniques have been proposed, including abstraction [3], compositional [22, 32], and parallel [46] techniques. The aim is to prune the path space that must be systematically explored [3, 22, 32] or to enlarge path exploration by exploiting the increased availability of computational power [46]. A large body of research, however, explores practical tradeoffs by developing search algorithms that improve the efficiency of path exploration over the path space. Most proposed algorithms focus on achieving high statement and branch coverage.

Pex [47] is an automated structural testing tool developed at Microsoft Research; it integrates a rich set of basic search strategies and gives a fair choice among them. While the integration does help improve code coverage by attempting different program control flows, exploring particular code elements may require specific guidance from control and data
dependencies. Fitnex [51] makes Pex more guided by using fitness functions to measure the improvement of path exploration. The main obstacle of this approach is the flag problem [6], where fitness functions face a flat fitness landscape, giving no guidance to the search process. Flags, unfortunately, are widely used in real-world programs [6, 12].

Klee [9] is open-source and has been used by a variety of researchers in academia and industry. Like Pex, Klee implements a number of search heuristics and activates each heuristic in a round-robin fashion for high coverage. Baluda et al. [4] showed, however, that when the executability of code elements requires data dependences computed inside loop or nested loop structures, Klee achieves very poor coverage.

Crest [7] is an extensible platform for building and experimenting with search heuristics for achieving high structural coverage. Among the heuristics implemented in Crest, control-flow graph directed search (or CfgDirected) is shown by the reported experimental data to be more effective than the others. This search strategy leverages the static control flow of the program to guide the search down short static paths to unexplored code. Theoretically, the control flow guidance may be imprecise, since the execution of code elements may require data dependences that go beyond short static paths and/or are calculated in dynamic paths. Practically, CfgDirected goes k steps backward on the currently explored path to continue path exploration.

Sage [24] implements a generational search for multiple path exploration, with a coverage-optimized heuristic to improve coverage. Thummalapenta et al. [48] and Ma et al. [38] both confirmed that approaches such as Pex, Klee, and Sage, without the ability to focus precisely on a particular code element, may not be sufficient to improve coverage and to enhance error detection capabilities. It is worth mentioning that the practical impact of the path space explosion limitation on the
applicability of dynamic symbolic execution is tremendous. Microsoft Research, for example, has 100+ machines running the Sage system performing dynamic symbolic execution [26]. Coping with this fundamental scalability problem therefore emerged as a primary objective during this research. Compared with these approaches, our proposed search algorithm differs by directing path exploration to focus on data dependencies in order to influence the execution of given code elements. This feature is important because dynamic symbolic execution faces the path explosion problem and the execution of considerable portions of the code does not depend directly on the symbolic input.

In regression testing, program slicing has been utilized to direct path exploration toward exposing changes between different versions of the program for test suite augmentation [40, 41, 44]. Person et al. [40] presented an intra-procedural analysis technique that takes advantage of control and data dependence information to guide symbolic execution toward exploring program paths that characterize the changes of the modified program as compared to its original version. The technique has been evaluated using three Java programs with no loops or recursion and shown to be effective for augmenting the test suite for regression testing. Program slicing was also used to construct abstract models for exploring feasible execution traces to defined locations, e.g., given reachability properties, potential deadlocks, and race conditions [43]. However, since both control and data dependencies are taken into account, the resulting slice may be too large to explore. Note that a single small piece of code can yield a number of paths too large to be explored exhaustively. Compared with these techniques, our proposed search algorithm focuses only on data dependency.

7.2 The chaining approach

The chaining approach that we utilized in this work is a test input generation technique [20], which relies on a local search
method called the alternating variable method to find test inputs, but this is performed largely randomly. McMinn and Holcombe [37] extended the chaining approach and proposed combining it with search-based testing to improve the efficiency of test input generation. In this work, we leveraged the extended chaining approach [37] to improve the efficiency of dynamic symbolic execution-based path exploration toward exploring buffer overflow vulnerabilities. This is implemented in our testing framework, Sebo, in integration with type inference analysis.

Note that in the benchmark used in this work, all the buffer overflow defects require unrolling loop structures and propagating indirect data dependences to be exposed. These conditions are beyond the capability of the original chaining approach. Besides, minimal distance had very little impact in our experiments. For most buffer overflow vulnerabilities, the refinement of event sequences in the extended chaining approach, which carries indirect data dependences and combines them, is crucial to uncovering the buffer overflows. CfgDirected in Crest uses minimal distance to guide path exploration; in its actual implementation, it goes backward along the current program execution to explore the path space.

More recently, Thummalapenta et al. [48] proposed Seeker, implemented on top of Pex [47]. Seeker introduces a method sequence synthesis approach for achieving high coverage in object-oriented programs. This approach exploits data dependence analysis to synthesize method sequences that produce desired object states to reach target locations. Seeker explores only one level of data dependences and thus faces the same limitations as the chaining approach.

In the mobile application domain, data dependence analysis has been exploited to guide automated software testing for event-driven applications [1, 2, 29]. Jensen et al. [29] presented
a system testing approach that combines symbolic execution and event sequence generation. Symbolic execution computes event handler summaries and the data and control flow information of each event handler. Event sequence generation finds valid event sequences and test inputs to explore desired code elements. The approach works under the assumption that the complexity of the application under test lies not in the code base of the event handlers but in the interaction between them. Guider, in contrast, attempts to mitigate the path space explosion that arises from the high complexity of the application code. Guider currently does not work with event-driven applications, however.

7.3 Search-based testing

Search-based testing [28] formulates the process of finding test inputs as a search problem [10, 27]. It uses fitness functions to measure the improvement of the search process, and meta-heuristic search techniques such as Hill Climbing, Simulated Annealing, and Evolutionary Algorithms to find test inputs. The search space is the space of possible inputs of the program under test. Most research on search-based testing focuses on automated test input generation [28]. Its most distinguishing feature is that, unlike symbolic execution, which depends mostly on the capability of constraint solvers to find test inputs, search-based testing performs the process of finding test inputs itself. As a result, this approach can handle primitive input types such as floating-point numbers that non-linear constraint solvers cannot handle. Furthermore, the use of fitness functions provides a suitable measurement to guide the search process for exploring code elements. These features of search-based testing have been exploited to improve dynamic symbolic execution toward achieving high structural coverage by guiding path exploration [51] and by solving floating-point computations [35].

The effectiveness of search-based testing in automated test input generation is limited in several respects, however. Firstly, the cost
of finding test inputs can be considerable, since computing fitness functions requires concretely executing the program under test every time an exploratory move is performed [34]. Note that for complex and sizable programs, a program execution can be expensive. In symbolic execution, by contrast, test inputs are generated by solving constraint systems. Finally, the use of fitness functions suffers significantly from the flag problem [6]. Flags, however, are commonly used in programming practice [6, 12].

Recently, Del Grosso et al. [18] proposed combining search-based testing with type inference analysis for detecting buffer overflow errors. This approach utilizes program slicing to identify statements influencing the execution of potential buffer overflow errors for search optimisation. Our approach differs by considering only data dependences and addressing control dependences on the fly. Furthermore, our approach exploits the static program structure to quickly explore overflow errors. Del Grosso et al. [18] also conducted experiments against the vulnerabilities bind/CA-1999-14, wu-ftpd/CVE-1999-0368, wu-ftpd/CVE-1999-0878, and wu-ftpd/CVE-2003-0466. In terms of execution performance, their approach required a considerable amount of time, while our approach Guider terminated the search process within a few seconds to detect the buffer overflow errors.

7.4 Buffer overflow checking

Buffer overflows account for nearly half of all known security vulnerabilities in real-world software [50]. The severity of buffer overflow attacks has been ranked high among software security vulnerabilities, since an attack may let the attacker gain control of the host system and then execute the injected code. Not surprisingly, buffer overflow vulnerabilities dominate the area of remote network penetration vulnerabilities [17] and represent one of the most serious classes of security threats [16]. Despite the considerable effort to statically and
automatically tackle buffer overflow defects, it has been recognized that in many cases concretely executing the program under test is the only way to address this problem [18, 42]. Dynamic symbolic execution-based techniques such as Exe [12] and Sage [24] systematically inject assertions into the program during test input generation. This may place a performance burden on the entire testing system, which is already slow because of the expense of performing dynamic symbolic execution and the unpredictable effectiveness of constraint solvers. Pex [47] and Klee [9] work on the .NET Framework and LLVM [33], respectively, where memory safety is verified by the underlying compiler. In our Sebo testing framework, Deputy [13] was utilized to provide a goal-oriented testing approach, so that dynamic symbolic execution was used for uncovering buffer overflow vulnerabilities instead of systematically exploring all possible program paths.

8 Conclusion

Automation of software testing is important in improving the quality and reliability of software applications. This paper explored and developed automated methods to improve the efficiency of software testing. Starting from the test input generation technique based on dynamic symbolic execution, we studied the use of program control and data dependence to strengthen path exploration over the potentially infinite path space for effective security vulnerability testing.

The main contribution of this research is the path exploration algorithm Guider. Guider utilizes data dependence analysis to identify a root cause leading to the execution of a given test goal. It performs dynamic symbolic execution-based path exploration in a way that propagates data dependences down to the goal structure in order to influence its execution. Guider observes control dependences on the fly and optimizes path exploration by exploiting the static control structure of the program under test.

Based on the proposed path exploration algorithm, we formed a
goal-oriented testing approach for automatically exploring safety violations in the program. The proposed testing framework Sebo integrates ideas from symbolic analysis, constraint solving, dynamic program analysis, control and data dependence analysis, and type inference analysis, and applies them to strengthen buffer overflow vulnerability detection. A preliminary evaluation conducted against 23 buffer overflow vulnerabilities shows a significant improvement of our search algorithm over two widely adopted search algorithms in both the capability to uncover vulnerabilities and the capability to optimize path exploration. The evaluation also provides valuable observations in the context of developing techniques to perform dynamic symbolic execution toward uncovering program errors.

Acknowledgements

We would like to thank the reviewers for their helpful comments and Suresh Thummalapenta and Tao Xie for their advice on the method sequence synthesis approach in Seeker [48]. This work has been conducted under the post submission development scheme from the School of Computing and Mathematical Sciences, Auckland University of Technology, New Zealand. It was partially done when the first author interned at the National University of Singapore (NUS), under the support of two NUS research grants, R-252-000-403-112 and R-252-000-484-112.

References

[1] S. Anand, M. Naik, M.J. Harrold, H. Yang, Automated concolic testing of smartphone apps, in: Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, 2012.
[2] S. Arlt, A. Podelski, C. Bertolini, M. Schaf, I. Banerjee, A.M. Memon, Lightweight static analysis for GUI testing, in: Proceedings of the IEEE 23rd International Symposium on Software Reliability Engineering, 2012, pp. 301–310.
[3] S. Anand, C.S. Păsăreanu, W. Visser, Symbolic execution with abstract subsumption checking, in: Proceedings of the 13th International SPIN Workshop
on Model Checking Software, 2006, pp. 163–181.
[4] M. Baluda, P. Braione, G. Denaro, M. Pezzè, Enhancing structural software coverage by incrementally computing branch executability, Software Qual. J. 19 (Dec 2011) 725–751.
[5] E. Bounimova, P. Godefroid, D.A. Molnar, Billions and billions of constraints: whitebox fuzz testing in production, in: Proceedings of the 2013 International Conference on Software Engineering, 2013, pp. 122–131.
[6] D. Binkley, M. Harman, K. Lakhotia, FlagRemover: a testability transformation for transforming loop-assigned flags, ACM Trans. Softw. Eng. Methodol. 20 (12) (2011).
[7] J. Burnim, K. Sen, Heuristics for scalable dynamic test generation, in: Proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering, 2008, pp. 443–446.
[8] L.A. Clarke, A system to generate test data and symbolically execute programs, IEEE Trans. Softw. Eng. (May 1976) 215–222.
[9] C. Cadar, D. Dunbar, D.R. Engler, KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs, in: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, 2008, pp. 209–224.
[10] J. Clarke, J.J. Dolado, M. Harman, R. Hierons, B. Jones, M. Lumkin, B. Mitchell, S. Mancoridis, K. Rees, M. Roper, M. Shepperd, Reformulating software engineering as a search problem, IEE Proc. – Softw. 150 (3) (2003) 161–175.
[11] C. Cadar, P. Godefroid, S. Khurshid, C.S. Păsăreanu, K. Sen, N. Tillmann, W. Visser, Symbolic execution for software testing in practice: preliminary assessment, in: Proceedings of the 33rd International Conference on Software Engineering, 2011, pp. 1066–1071.
[12] C. Cadar, V. Ganesh, P.M. Pawlowski, D.L. Dill, D.R. Engler, EXE: automatically generating inputs of death, ACM Trans. Inform. Syst. Secur. 12 (10) (Dec 2008).
[13] J. Condit, M. Harren, Z.R. Anderson, D. Gay, G.C. Necula, Dependent types for low-level programming, in: Proceedings of the 16th European Conference on Programming, 2007, pp. 520–535.
[14] C. Cadar, K. Sen, Symbolic execution for software
testing: three decades later, Commun. ACM 56 (Feb 2013) 82–90.
[15] Common Vulnerabilities and Exposures,
[16] Common Weakness Enumeration, The 2011 CWE/SANS Top 25 Most Dangerous Software Errors
[17] C. Cowan, P. Wagle, C. Pu, S. Beattie, J. Walpole, Buffer overflows: attacks and defenses for the vulnerability of the decade, in: Proceedings of DARPA Information Survivability Conference and Exposition, 2000, pp. 119–129.
[18] C. Del Grosso, G. Antoniol, E. Merlo, P. Galinier, Detecting buffer overflow via automatic test input data generation, Comput. Oper. Res. 35 (10) (2008) 3125–3143.
[19] T.A. Do, A.C.M. Fong, R. Pears, Dynamic symbolic execution guided by data dependency analysis for high structural coverage, in: L.A. Maciaszek, J. Filipe (Eds.), Communications in Computer and Information Science, vol. 410, Springer, Heidelberg, 2013, pp. 3–15.
[20] R. Ferguson, B. Korel, The chaining approach for software test data generation, ACM Trans. Softw. Eng. Methodol. (1996) 63–86.
[21] J.E. Forrester, B.P. Miller, An empirical study of the robustness of Windows NT applications using random testing, in: Proceedings of the 4th Conference on USENIX Windows Systems Symposium, 2000, pp. 6–6.
[22] P. Godefroid, Compositional dynamic test generation, in: Proceedings of the 34th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007, pp. 47–54.
[23] P. Godefroid, N. Klarlund, K. Sen, DART: directed automated random testing, in: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2005, pp. 213–223.
[24] P. Godefroid, M.Y. Levin, D.A. Molnar, Automated whitebox fuzz testing, in: Proceedings of Network and Distributed Systems Security, 2008.
[25] P. Godefroid, M.Y. Levin, D.A. Molnar, Active property checking, in: Proceedings of the 8th ACM International Conference on Embedded Software, 2008, pp. 207–216.
[26] P. Godefroid, M.Y. Levin, D.A. Molnar, SAGE: whitebox fuzzing for security testing, Commun. ACM 55 (3) (2012) 40–44.
[27] M. Harman, The current state and
future of search based software engineering, in: Future of Software Engineering, 2007, pp. 342–357.
[28] M. Harman, S.A. Mansouri, Y. Zhang, Search-based software engineering: trends, techniques and applications, ACM Comput. Surv. 45 (1) (2012).
[29] C.S. Jensen, M.R. Prasad, A. Møller, Automated testing with targeted event sequence generation, in: Proceedings of the 2013 International Symposium on Software Testing and Analysis, 2013, pp. 67–77.
[30] J.C. King, Symbolic execution and program testing, Commun. ACM 19 (July) (1976) 385–394.
[31] K. Ku, T.E. Hart, M. Chechik, D. Lie, A buffer overflow benchmark for software model checkers, in: Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, 2007, pp. 389–392.
[32] V. Kuznetsov, J. Kinder, S. Bucur, G. Candea, Efficient state merging in symbolic execution, in: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012, pp. 193–204.
[33] C. Lattner, V.S. Adve, LLVM: a compilation framework for lifelong program analysis and transformation, in: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization, 2004, pp. 75–88.
[34] K. Lakhotia, M. Harman, H. Gross, AUSTIN: an open source tool for search based software testing of C programs, Inf. Softw. Technol. 55 (1) (2013).
[35] K. Lakhotia, N. Tillmann, M. Harman, J. de Halleux, FloPSy: search-based floating point constraint solving for symbolic execution, in: Proceedings of the 22nd IFIP WG 6.1 International Conference on Testing Software and Systems, 2010, pp. 142–157.
[36] D.A. Molnar, Dynamic Test Generation for Large Binary Programs, Ph.D. Dissertation, University of California, Berkeley, 2009.
[37] P. McMinn, M. Holcombe, Evolutionary testing using an extended chaining approach, Evol. Comput. 14 (1) (2006).
[38] K.K. Ma, Y.P. Khoo, J.S. Foster, M. Hicks, Directed symbolic execution, in: Proceedings of the 18th International Conference on Static Analysis, 2011, pp. 95–111.
[39]
C.S. Păsăreanu, W. Visser, A survey of new trends in symbolic execution for software testing and analysis, Softw. Tools Technol. Transfer 11 (October) (2009) 339–353.
[40] S. Person, G. Yang, N. Rungta, S. Khurshid, Directed incremental symbolic execution, in: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011, pp. 504–515.
[41] D. Qi, H.D.T. Nguyen, A. Roychoudhury, Path exploration based on symbolic output, in: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, 2011, pp. 278–288.
[42] O. Ruwase, M.S. Lam, A practical dynamic buffer overflow detector, in: Proceedings of the 11th Annual Network and Distributed System Security Symposium, 2004.
[43] N. Rungta, E.G. Mercer, W. Visser, Efficient testing of concurrent programs with abstraction-guided symbolic execution, in: Proceedings of the 16th International SPIN Workshop on Model Checking Software, 2009, pp. 174–191.
[44] R.A. Santelices, M.J. Harrold, Exploiting program dependencies for scalable multiple-path symbolic execution, in: Proceedings of the 19th International Symposium on Software Testing and Analysis, 2010, pp. 195–206.
[45] K. Sen, D. Marinov, G. Agha, CUTE: a concolic unit testing engine for C, in: Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2005, pp. 263–272.
[46] M. Staats, C.S. Păsăreanu, Parallel symbolic execution for structural test generation, in: Proceedings of the 19th International Symposium on Software Testing and Analysis, 2010, pp. 183–194.
[47] N. Tillmann, J. de Halleux, Pex – white box test generation for .NET, in: Proceedings of the 2nd International Conference on Tests and Proofs, 2008, pp. 134–153.
[48] S. Thummalapenta, T. Xie, N. Tillmann, J. de Halleux, Z. Su, Synthesizing method sequences for high-coverage testing, in: Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, 2011, pp. 189–206.
[49] M. Weiser, Program slicing, in: Proceedings of the 5th International Conference on Software Engineering, 1981, pp. 439–449.
[50] D. Wagner, J.S. Foster, E.A. Brewer, A. Aiken, A first step towards automated detection of buffer overrun vulnerabilities, in: Proceedings of the Network and Distributed System Security Symposium, 2000.
[51] T. Xie, N. Tillmann, J. de Halleux, W. Schulte, Fitness-guided path exploration in dynamic symbolic execution, in: Proceedings of the 39th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2009, pp. 359–368.
[52] B. Xin, X. Zhang, Efficient online detection of dynamic control dependence, in: Proceedings of the 2007 International Symposium on Software Testing and Analysis, 2007, pp. 185–195.

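As an illustration of the flag problem that Sections 7.1 and 7.3 attribute to fitness-guided search [6], the following Python sketch contrasts two fitness functions for the same target branch. It is only a toy under stated assumptions: the function names, the target value 42, and the simple neighbourhood search are all hypothetical and are not taken from Fitnex, from search-based testing tools, or from Guider.

```python
# Toy illustration of the flag problem (all names and values are hypothetical).

def branch_distance_direct(x: int) -> int:
    """Fitness for the branch `if x == 42`: 0 means the branch is taken."""
    return abs(x - 42)


def branch_distance_flag(x: int) -> int:
    """Fitness for the branch `if flag`, where flag was assigned `x == 42`.

    The boolean flag hides how close x is to 42, so the landscape is flat.
    """
    flag = (x == 42)
    return 0 if flag else 1


def hill_climb(fitness, start: int, max_steps: int = 1000) -> int:
    """Move to a strictly better neighbour (x - 1 or x + 1) until none exists."""
    x = start
    for _ in range(max_steps):
        if fitness(x) == 0:
            break  # target branch reached
        best = min((x - 1, x + 1), key=fitness)
        if fitness(best) >= fitness(x):
            break  # plateau or local optimum: no guidance available
        x = best
    return x


if __name__ == "__main__":
    print(hill_climb(branch_distance_direct, start=0))  # guided to the target
    print(hill_climb(branch_distance_flag, start=0))    # stuck at the start
```

With the direct predicate, every step reduces the distance and the search reaches the input that takes the branch. With the flag, every neighbour of the start has the same fitness, so the search stops immediately unless it happens to start next to the target. This is the situation in which guidance from data dependences, rather than from a fitness value, becomes useful.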