Chapter 13 ■ Refactoring

One way in which a class can become very small is when it has been the subject of the Move Method and the Move Variable refactorings, so that it has become sucked dry. This illustrates how many of the refactorings are interconnected: using one leads to using another, and so on.

SELF-TEST QUESTION
13.1 Compare the refactoring Inline Class with the refactoring Extract Class.

13.7 ● Identify composition or inheritance

Once we have identified the classes within a software system, the next step is to review the relationships between the classes. The classes that make up the software collaborate with each other to achieve the required behavior, but they use each other in different ways. There are two ways in which classes relate to each other:

1. composition: one object creates another object from a class using new. An example is a window object that creates a button object.
2. inheritance: one class inherits from another. An example is a class that extends the library Frame class.

The important task of design is to distinguish these two cases, so that inheritance can be successfully applied or avoided. One way of checking that we have correctly identified the appropriate relationships between classes is to use the “is-a” or “has-a” test:

■ the use of the phrase “is-a” in the description of an object (or class) signifies that it is probably an inheritance relationship.
■ the use of the phrase “has-a” indicates that there is no inheritance relationship. Instead the relationship is composition. (An alternative phrase that has the same meaning is “consists-of”.)

We return to the cyberspace invaders game, designed in Chapter 11, seeking to find any inheritance relationships. If we can find any such relationships, we can simplify and shorten the program, making good use of reuse. In the game, several of the classes (Defender, Alien, Laser and Bomb) incorporate the same methods. These methods are getX, getY, getHeight and getWidth, which obtain the position and size of the graphical objects. We will remove these ingredients from each class and place them in a superclass. We will name this class Sprite, since the word sprite is a commonly used term for a moving graphical object in games programming.

The UML class diagram for the Sprite class is:

    class Sprite
    Instance variables: x, y, height, width
    Methods: getX, getY, getHeight, getWidth

Readers might see this more clearly if we look at the code. The Java code for the class Sprite is as follows:

    public class Sprite {
        // position and size, visible to subclasses
        protected int x, y, width, height;

        public int getX() {
            return x;
        }

        public int getY() {
            return y;
        }

        public int getWidth() {
            return width;
        }

        public int getHeight() {
            return height;
        }
    }

The classes Defender, Alien, Laser and Bomb now inherit these methods from this superclass Sprite. Checking the validity of this design, we say “each of the classes Defender, Alien, Laser and Bomb is a Sprite”. Figure 13.1 shows these relationships in a class diagram. Remember that an arrow points from a subclass to a superclass.

Figure 13.1 Class diagram for inherited components in the game (superclass Sprite; subclasses Defender, Alien, Laser and Bomb)

We have now successfully identified inheritance relationships between classes in the game program. This concludes the refactoring: we have transformed the design into a better design.

When we see common methods or variables in two or more classes, they become candidates for inheritance, but we need to be careful because delegation may be more appropriate. So we need to distinguish between the two.

To sum up, the two kinds of relationship between classes are as follows.

    Relationship between classes    Test                    Java code involves
    Inheritance                     is-a                    extends
    Composition                     has-a or consists-of    new

SELF-TEST QUESTION
13.2 Analyze the relationships between the following groups of classes (are they is-a or has-a):
1. house, door, roof, dwelling
2. person, man, woman
3. car, piston, gearbox, engine
4. vehicle, car, bus.

13.8 ● Use polymorphism

Polymorphism enables objects that are similar to be treated in the same way. Here, similar means that the classes share a common superclass. A section of code that uses a number of if statements (or a case statement) should be subject to scrutiny because it may be that it is making poor use of polymorphism. The purpose of the if statements may be to distinguish the different classes and thereby take appropriate action. But it may be simpler to refactor the class, eliminate the if statements and exploit polymorphism.

In the game program, we identified the commonalities in a number of classes: Alien, Defender, Bomb and Laser. We placed the common factors in a superclass called Sprite. Now we can treat all the objects uniformly. We place the game objects in an array list named game and write the following code to display them:

    for (int s = 0; s < game.size(); s++) {
        Object item = game.get(s);
        Sprite sprite = (Sprite) item;
        sprite.display(paper);
    }

which is much neater than a whole series of if statements.

13.9 ● Discussion

The idea of taking a design and changing it can be a surprise. It may seem akin to creating an ad hoc design and then experimenting with it. It has the flavor of hacking. Some people argue that a good design method should produce a good design, one that should not need improvement. Equally, many developers are reluctant to tinker with an architectural design that has been created according to sound principles. However, refactoring has a respectable pedigree. It recognizes that a perfect initial design is unlikely and it offers a number of possible strategies for improving a structure. A refactoring such as Extract Method gives the developer the green light to modify an initial design.

Note that refactoring implies that iteration is commonly used during OOD.

Summary

Refactoring means improving the architectural structure of a piece of software. This can be done at the end of design or during design.

A number of useful refactorings have been identified, given names and cataloged. The refactorings described in this chapter are:

■ Encapsulate Data
■ Move Method
■ Extract Class
■ Inline Class
■ identify composition or inheritance
■ use polymorphism.

Answers to self-test questions

13.1 Inline Class is the opposite of Extract Class.
13.2 1. a house has-a roof and a door. A house is-a dwelling
     2. a man (and a woman) is-a person
     3. a car has-a piston, a gearbox and an engine
     4. a car (and a bus) is-a vehicle.

Exercises

13.1 In the cyberspace invaders game, we have already carried out a refactoring, identifying a superclass Sprite and applying inheritance. Some of the graphical objects in the game move vertically (bombs, lasers) while some move horizontally (alien, defender). Consider new superclasses MovesVertically and MovesHorizontally and draw the class diagrams for this new inheritance structure. Assess whether this refactoring is useful.
13.2 At what stage do you stop the process of refactoring?
13.3 Examine your architectural designs for the software case studies (Appendix A) and see if refactoring is achievable and desirable.

PART C PROGRAMMING LANGUAGES

CHAPTER 14 The basics

This chapter reviews the basic features of a programming language suitable for software engineering, including:

■ design principles
■ syntax
■ control structures
■ methods and parameters
■ data typing
■ simple data structures.

14.1 ● Introduction

Everyone involved in programming has their favorite programming language, or language feature they would like to have available. There are many languages, each with their proponents. So this chapter is probably the most controversial in this book. This chapter is not a survey of programming languages, nor is it an attempt to recommend one language over another. Rather, we wish to discuss the features that a good programming language should have from the viewpoint of the software engineer. We limit our discussion to “traditional” procedural languages, such as Fortran, Cobol, Ada, C++, Visual Basic, C# and Java. (Other approaches to programming languages are functional programming and logic programming.) The main theme of this chapter is a discussion of the basic features a language should provide to assist the software development process. That is, what features encourage the development of software which is reliable and maintainable?

A significant part of the software engineer’s task is concerned with how to model, within a program, objects from some problem domain. Programming, after all, is largely the manipulation of data. In the words of Niklaus Wirth, the designer of Pascal, “Algorithms + Data Structures = Programs”, which asserts the symbiosis between data and actions. The data description and manipulation facilities of a programming language should therefore allow the programmer to represent “real-world” objects easily and faithfully. In recent years, increasing attention has been given to the problem of providing improved data abstraction facilities for programmers. We discuss this in Chapter 15 on programming language features for OOP.

As we shall see, most mainstream programming languages have a small core, and the rest of the functionality of the language is provided by libraries. This chapter addresses this core. Facilities for programming in the large are reviewed in Chapter 16. Other features of languages, namely exceptions and assertions, are dealt with in Chapter 17.
14.2 ● Classifying programming languages and features

It is important to realize that programming languages are very difficult animals to evaluate and compare. For example, although it is often claimed that language X is a general purpose language, in practice languages tend to be used within particular communities. Thus, Cobol has been the preferred language of the information systems community; Fortran, the language of the scientist and engineer; C, the language of the systems programmer; and Ada, the language for developing real-time or embedded computer systems. Cobol is not equipped for applications requiring complex numerical computation, just as the data description facilities in Fortran are poor and ill suited to information systems applications.

Programming languages are classified in many ways, for example as “high-level” or “low-level”. A high-level language, such as Cobol, Visual Basic or C#, is said to be problem-oriented and to reduce software production and maintenance costs. A low-level language, such as assembler, is said to be machine-oriented, facilitating the programmer’s complete control over the efficiency of their programs. Between high- and low-level languages, another category, the systems implementation language or high-level assembler, has emerged. Languages such as C attempt to bind into a single language the expressive power of a high-level language and the ultimate control which only a language that provides access at the register and primitive machine instruction level can provide. Languages may also be classified using other concepts, such as whether they are weakly or strongly typed. This is discussed below.

14.3 ● Design principles

Simplicity, clarity and orthogonality

One school of thought argues that the only way to ensure that programmers will consistently produce reliable programs is to make the programming language simple. For programmers to become truly proficient in a language, the language must be small and simple enough that it can be understood in its entirety. The programmer can then use the language with confidence, probably without recourse to a language manual.

Cobol and PL/1 are examples of languages which are large and unwieldy. For example, Cobol currently contains about 300 reserved words. Not surprisingly, it is a common programming error mistakenly to choose a reserved word as a user-defined identifier. What are the problems of large languages? Because they contain so many features, some are seldom used and, consequently, rarely fully understood. Also, since language features must not only be understood independently, but also in terms of their interaction with each other, the larger the number of features, the more complex the language will be and the harder it is to understand their interactions. Although smaller, simpler languages are clearly desirable, the software engineer of the near future will often have to wrestle with existing large, complex languages. For example, to meet the requirements laid down by its sponsors, the US Department of Defense, the programming language Ada is a large and complex language requiring a 300-page reference manual to describe it.

The clarity of a language is also an important factor. In recent years, there has been a marked and welcome trend to design languages for the programmers who program in them rather than for the machines the programs are to run on. Many older languages incorporate features that reflect the instruction sets of the computers they were originally designed to be executed on. As one example, consider the Fortran arithmetic if statement, which has the following form:

    if (expression) label1,label2,label3

This statement evaluates expression and then branches to one of the statements labeled label1, label2, or label3 depending on whether the result is negative, zero, or positive.
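To make the three-way branch concrete, the following is a minimal Java sketch of the decision the arithmetic if makes. In standard Fortran the first label is taken for a negative value, the second for zero and the third for a positive value. The class name ArithmeticIfSketch and the method branchTarget are hypothetical names invented for this illustration; Java has no labels to jump to, so the sketch simply returns the name of the label that would be chosen.

```java
// A sketch only: models which label a Fortran arithmetic if would select.
// ArithmeticIfSketch and branchTarget are hypothetical names for illustration.
class ArithmeticIfSketch {

    // Returns the label chosen by: if (expression) label1,label2,label3
    public static String branchTarget(int expression) {
        if (expression < 0) {
            return "label1";   // negative result
        } else if (expression == 0) {
            return "label2";   // zero result
        } else {
            return "label3";   // positive result
        }
    }

    public static void main(String[] args) {
        // Show the chosen label for one value of each sign
        int[] samples = { -5, 0, 7 };
        for (int value : samples) {
            System.out.println(value + " -> " + branchTarget(value));
        }
    }
}
```

Written this way, the reader can see at a glance which case leads where, whereas the Fortran form relies on the programmer remembering the order of the three labels.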
The reason for the existence of this peculiar statement is that early IBM machines had a machine instruction which compared the value of a register to a value in memory and branched to one of three locations. The language designers of the 1960s were motivated to prove that high-level languages could generate efficient code. Although we will be forever grateful to them for succeeding in proving this point, they introduced features into languages, such as Cobol and Fortran, which are clumsy and error-prone from the programmer’s viewpoint. Moreover, even though the languages have subsequently been enhanced with features reflecting modern programming ideas, the original features still remain.

A programming language is the tool that programmers use to communicate their intentions. It should therefore be a language which accords with what people find natural, unambiguous and meaningful; in other words, clear. Perhaps language designers are not the best judges of the clarity of a new language feature. A better approach to testing a language feature may be to set up controlled experiments in which subjects are asked to answer questions about fragments of program code. This experimental psychology approach is gaining some acceptance and some results are discussed in the section on control abstractions.

A programmer can only write reliable programs if he or she understands precisely what every language construct does. The quality of the language definition and supporting documentation is critical. Ambiguity or vagueness in the language definition erodes a programmer’s confidence in the language. It should not be necessary to have to write and run a program fragment to confirm the semantics of some language feature.