Chapter 14 ■ The basics

Programming languages should also display a high degree of orthogonality. This means that it should be possible to combine language features freely; special cases and restrictions should not be prevalent. Java and similar languages distinguish between two types of variables – built-in primitive types and proper objects. This means that these two groups must be treated differently, for example, when they are inserted into a data structure. A lack of orthogonality in a language has an unsettling effect on programmers; they no longer have the confidence to make generalizations and inferences about the language.

It is no easy matter to design a language that is simple, clear and orthogonal. Indeed, in some cases these goals would seem to be incompatible with one another. A language designer could, for the sake of orthogonality, allow combinations of features that are not very useful. Simplicity would be sacrificed for increased orthogonality! While we await the simple, clear, orthogonal programming language of the future, these concepts remain good measures with which the software engineer can evaluate the programming languages of today.

14.4 ● Language syntax

The syntax of a programming language should be consistent, natural and promote the readability of programs. Syntactic flaws in a language can have a serious effect on program development. One syntactic flaw found in languages is the use of begin-end pairs or bracketing conventions, {}, for grouping statements together. Omitting an end or closing bracket is a very common programming error. The use of explicit keywords, such as endif and endwhile, leads to fewer errors and more readily understandable programs. Programs are also easier to maintain. For example, consider adding a second statement with the Java if statement shown below.
    if (integerValue > 0)
        numberOfPositiveValues = numberOfPositiveValues + 1;

We now have to group the two statements together into a compound statement using a pair of braces.

    if (integerValue > 0) {
        numberOfPositiveValues = numberOfPositiveValues + 1;
        numberOfNonZeroValues = numberOfNonZeroValues + 1;
    }

Some editing is required here. Compare this with the explicit keyword approach in the style of Visual Basic. Here the only editing required would be the insertion of the new statement.

    if (integerValue > 0)
        numberOfPositiveValues = numberOfPositiveValues + 1;
    endif

In addition, explicit keywords eliminate the classic “dangling else” problem prevalent in many languages – see the discussion of selection statements below.

Ideally the static, physical layout of a program should reflect as far as is possible the dynamic algorithm which the program describes. There are a number of syntactic concepts which can help achieve this goal. The ability to format a program freely allows the programmer the freedom to use such techniques as indentation and blank lines to highlight the structure and improve the readability of a program. For example, prudent indentation can help convey to the programmer that a loop is nested within another loop. Such indentation is strictly redundant, but assists considerably in promoting readability.

Older languages, such as Fortran and Cobol, impose a fixed formatting style on the programmer. Components of statements are constrained to lie within certain columns on each input source line. For example, Fortran reserves columns 1 through 5 for statement labels and columns 7 through 72 for program statements. These constraints are not intuitive to the programmer.
Rather they date back to the time when programs were normally presented to the computer in the form of decks of 80-column punched cards and a program statement was normally expected to be contained on a single card.

The readability of a program can also be improved by the use of meaningful identifiers to name program objects. Limitations on the length of names, as found in early versions of Basic (two characters) and Fortran (six characters), force the programmer to use unnatural, cryptic and error-prone abbreviations. These restrictions were dictated by the need for efficient programming language compilers. Arguably, programming languages should be designed to be convenient for the programmer rather than the compiler, and the ability to use meaningful names, irrespective of their length, enhances the self-documenting properties of a program. More recent languages allow the programmer to use names of unrestricted length, so that program objects can be named appropriately.

Another factor which affects the readability of a program is the consistency of the syntax of a language. For example, operators should not have different meanings in different contexts. The operator “=” should not double as both the assignment operator and the equality operator. Similarly, it should not be possible for the meaning of language keywords to change under programmer control. The keyword if, for example, should be used solely for expressing conditional statements. If the programmer is able to define an array with the identifier if, the time required to read and understand the program will be increased, as we must now examine the context in which the identifier if is used to determine its meaning.

14.5 ● Control structures

A programming language for software engineering must provide a small but powerful set of control structures to describe the flow of execution within a program unit. In the late 1960s and 1970s there was considerable debate as to what control structures were required.
The advocates of structured programming have largely won the day and there is now a reasonable consensus of opinion as to what kind of primitive control structures are essential. A language must provide primitives for the three basic structured programming constructs: sequence, selection and repetition. There are, however, considerable variations both in the syntax and the semantics of the control structures found in modern programming languages.

Early programming languages, such as Fortran, did not provide a rich set of control structures. The programmer used a set of low-level control structures, such as the unconditional branch or goto statement and the logical if, to express the control flow within a program. For example, the following Fortran program fragment illustrates the use of these low-level control structures to simulate a condition-controlled loop.

          n = 10
       10 if (n .eq. 0) goto 20
          write (6,*) n
          n = n - 1
          goto 10
       20 continue

These low-level control structures provide the programmer with too much freedom to construct poorly structured programs. In particular, uncontrolled use of the goto statement for controlling program flow leads to programs which are, in general, hard to read and unreliable.

There is now general agreement that higher level control abstractions must be provided and should consist of:

■ sequence – to group together a related set of program statements
■ selection – to select whether a group of statements should be executed or not based on the value of some condition
■ repetition – to execute repeatedly a group of statements.

This basic set of primitives fits in well with the top-down philosophy of program design; each primitive has a single entry point and a single exit point. These primitives are realized in similar ways in most programming languages.
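For comparison, the Fortran fragment above can be expressed with a single structured loop. The following is a minimal sketch in Java, not taken from the original text: the class name CountdownDemo and the method countdown are hypothetical, and the values are collected in a list rather than written to an output unit so that the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: the goto-based Fortran countdown rewritten
// with one condition-controlled loop (single entry, single exit).
final class CountdownDemo {

    // Returns the values the Fortran fragment would write: start, start-1, ..., 1.
    // Assumes start >= 0, as in the original fragment (n = 10).
    static List<Integer> countdown(int start) {
        List<Integer> written = new ArrayList<>();
        int n = start;
        while (n != 0) {        // replaces "if (n .eq. 0) goto 20"
            written.add(n);     // replaces "write (6,*) n"
            n = n - 1;
        }                       // the loop's closing brace replaces "goto 10"
        return written;
    }

    public static void main(String[] args) {
        System.out.println(countdown(10));
    }
}
```

The labels 10 and 20 disappear entirely: the loop's extent is visible from the braces, and there is no way to jump into its middle.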
For brevity, we will look in detail only at representative examples from common programming languages. For further details on this subject refer to Chapter 7 on structured programming.

14.6 ● Selection

Java, in common with most modern languages, provides two basic selection constructs. The first, the if statement, provides one or two-way selection and the second, the case statement, provides a convenient multiway selection structure.

Dangling else

Does the language use explicit closing symbols, such as endif, thus avoiding the “dangling else” problem? Nested if structures of the form shown below raise the question of how ifs and elses are to be matched. Is the “dangling” else associated with the outer or inner if? Remember that the indentation structure is of no consequence.

    if (condition)
        if (condition)
            statement1
    else
        statement2

Java resolves this dilemma by applying the rule that an else is associated with the most recent non-terminated if lacking an else. Thus, the else is associated with the inner if. If, as the indentation suggests, we had intended the else to be associated with the outer if, we have to resort to clumsy fixes. But the clearest and cleanest solution is afforded by the provision of explicit braces (or key words) as follows.

    if (condition) {
        if (condition) {
            statement1
        }
    }
    else {
        statement2
    }

Nesting

Nested if statements can quite easily become unreadable. Does the language provide any help? For example, the readability of “chained” if statements can be improved by the introduction of an elsif clause. In particular, this eliminates the need for multiple endifs to close a series of nested ifs. Consider the following example, with and without the elsif form. Java does not provide an elsif facility, but some languages do, for example, Visual Basic.Net.
    Without elsif:                   With elsif:

    if condition1 then               if condition1 then
        statement1                       statement1
    else if condition2 then          elsif condition2 then
        statement2                       statement2
    else if condition3 then          elsif condition3 then
        statement3                       statement3
    else if condition4 then          elsif condition4 then
        statement4                       statement4
    else                             else
        statement5                       statement5
    endif                            endif
    endif
    endif
    endif

Case

Like other languages, Java provides a case or switch statement. Here it is used to find the number of days in each month:

    switch (month) {
        case 1: case 3: case 5: case 7: case 8: case 10: case 12:
            days = 31;
            break;
        case 4: case 6: case 9: case 11:
            days = 30;
            break;
        case 2:
            days = 28;
            break;
        default:
            days = 0;
            break;
    }

The break statement causes control to be transferred to the end of the switch statement. If a break statement is omitted, execution continues onto the next case and generally this is not what you would want to happen. So inadvertently omitting a break statement creates an error that might be difficult to locate. If the default option is omitted, and no case matches, nothing is done.

The expressiveness of the case statement is impaired if the type of the case selector is restricted. It should not have to be an integer (as above), but in most languages it is. Similarly, it should be easy to specify multiple alternative case choices (e.g. 1|5|7 meaning 1 or 5 or 7) and a range of values as a case choice (e.g. 1..99). But Java does not allow this.

The reliability of the case statement is enhanced if the case choices must specify actions for all the possible values of the case selector. If not, the semantics should, at least, clearly state what will happen if the case expression evaluates to an unspecified choice. The ability to specify an action for all unspecified choices through a default or similar clause is appealing.

There is something of a controversy here.
Some people argue that when a case statement is executed, the programmer should be completely aware of all the possibilities that can occur. So the default statement is redundant and just an invitation to be lazy and sloppy. Where necessary, the argument goes, a case statement should be preceded by if statements that ensure that only valid values are supplied to the case statement.

if-not

It would be reasonable to think that there would no longer be any controversy over language structures for selection. The if-else is apparently well established. However, the lack of symmetry in the if statement is open to criticism. While it is clear that the then part is carried out if the condition is true, the else part is rather tagged on at the end to cater for all other situations. Experimental evidence suggests that significantly fewer bugs arise if the programmer is required to restate the condition (in its negative form) prior to the else as shown below:

    if condition
        statement1
    not condition else
        statement2
    endif

14.7 ● Repetition

Control structures for repetition traditionally fall into two classes. There are loop structures where the number of iterations is fixed, and those where the number of iterations is controlled by the evaluation of some condition. Fixed length iteration is often implemented using a form similar to that shown below:

    for control_variable = initial_expression to final_expression step step_expression do
        statement(s)
    endfor

The usefulness and reliability of the for statement can be affected by a number of issues, as now discussed.

Should the type of the loop control variable be limited to integers? Perhaps any ordinal type should be allowed. However, reals (floats) should not be allowed.
For example, consider how many iterations are specified by the following:

    for x = 0.0 to 1.0 step 0.33 do

Here it is not at all obvious exactly how many repetitions will be performed, and things are made worse by the fact that computers represent real values only approximately. (Note how disallowing the use of reals as loop control variables conflicts with the aim of orthogonality.)

The semantics of the for is greatly affected by the answers to the following questions. When and how many times are the initial expression, final expression and step expressions evaluated? Can any of these expressions be modified within the loop? What is of concern here is whether or not it is clear how many iterations of the loop will be performed. If the expressions can be modified and the expressions are recomputed on each iteration, then there is a distinct possibility of producing an infinite loop. Similar problems arise if the loop control variable can be modified within the loop.

The scope of the loop control variable is best limited to the for statement, as in Java. If it is not, then what should its value be on exit from the loop, or should it be undefined?

Condition-controlled loops are simpler in form. Almost all modern languages provide a leading decision repetition structure (while-do) and some, for convenience, also provide a trailing decision form (repeat-until).

    while condition do           repeat
        statement(s)                 statement(s)
    endwhile                     until condition

The while form continues to iterate while a condition evaluates to true. Since the test appears at the head of the form, the while performs zero or many iterations of the loop body. The repeat, on the other hand, iterates until a condition is true. The test appears following the body of the loop, ensuring that the repeat performs at least one iteration.
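The difference between leading and trailing decisions can be made concrete by counting iterations. The sketch below is an illustration in Java rather than part of the original text; the class LoopFormsDemo and its two methods are hypothetical names. Note that Java writes its trailing test as a continuation condition (do ... while), which is the logical negation of the repeat ... until condition used in the pseudocode above.

```java
// Hypothetical illustration of leading- versus trailing-decision loops.
final class LoopFormsDemo {

    // Leading decision: the test precedes the body, so zero iterations are possible.
    static int countWithWhile(int n) {
        int iterations = 0;
        while (n > 0) {
            iterations++;
            n--;
        }
        return iterations;
    }

    // Trailing decision: the body runs once before the test is first evaluated.
    static int countWithDoWhile(int n) {
        int iterations = 0;
        do {
            iterations++;
            n--;
        } while (n > 0);
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(countWithWhile(0));    // body never runs
        System.out.println(countWithDoWhile(0));  // body runs once regardless
        System.out.println(countWithWhile(3));
        System.out.println(countWithDoWhile(3));
    }
}
```

For a starting value of 0 the two forms differ (no iterations against one), while for any positive starting value they perform the same number of iterations.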
Thus the while statement is the more general looping mechanism of the two, so if a language provides only one looping mechanism, it should be the while. However, the repeat is more appropriate in some programming situations.

SELF-TEST QUESTION
14.1 Identify a situation where repeat is more appropriate than while.

Some languages provide the opposites of these two loops:

    do
        statement(s)
    while condition

and:

    until condition do
        statement(s)
    end until

C, C++, C# and Java all provide while-do and do-while structures. They also provide a type of for statement that combines together several commonly used ingredients. An example of this loop structure is:

    for (i = 0; i < 10; i++) {
        statement(s)
    }

in which:

■ the first statement within the brackets is done once, before the loop is executed
■ the second item, a condition, determines whether the loop will continue
■ the third statement is executed at the end of each repetition.

We will meet yet another construct for repetition – the foreach statement – in the chapter on object-oriented programming language features (Chapter 15). This is convenient for processing all the elements of a data structure.

The while and repeat structures are satisfactory for the vast majority of iterations we wish to specify. For the most part, loops which terminate at either their beginning or end are sufficient. However, there are situations, notably when encountering some exceptional condition, where it is appropriate to be able to branch out of a repetition structure at an arbitrary point within the loop. Sometimes it is necessary to break out of a series of nested loops rather than a single loop. In many languages, the programmer is limited to two options.
The terminating conditions of each loop can be enhanced to accommodate the “exceptional” exit, and if statements can be used within the loop to transfer control to the end of the loop should the exceptional condition occur. This solution is clumsy at best and considerably decreases the readability of the code. A second, and arguably better, solution is to use the much-maligned goto statement to branch directly out of the loops. Ideally, however, since there is a recognized need for n and a half times loops, the language should provide a controlled way of exiting from one or more loops. Java provides the following facility where an orderly break may be made but only to the statement following the loop(s).

    while (condition) {
        statement(s)
        if (condition) break;
        statement(s)
    }

In the example above, control will be transferred to the statement following the loop when condition is true. This may be the only way of exiting from this loop.

    here:
    while (condition) {
        while (condition) {
            statement(s)
            if (exitCondition) break here;
            statement(s)
        }
    }

In the second example above, control will be transferred out of both while loops when exitCondition is true. Note how the outer while loop is labeled here: and how this label is used by the if statement to specify that control is to be transferred to the end of the while loop (not the beginning) when exitCondition is satisfied.

SELF-TEST QUESTION
14.2 Sketch out the code for a method to search an array of integers to find some desired integer. Write two versions – one using the break mechanism and one without break.

The languages C, C++, Ada and Java provide a mechanism such as the above for breaking out in the middle of loops.

There is some controversy about using break statements. Some people argue that it is simply too much like the notorious goto statement.
There is a difference, however, because break can only be used to break out of a loop, not enter into a loop. Neither can break be used to break out of an if statement. Thus it might be argued that break is a goto that is under control.

Handling errors or exceptional situations is a common programming situation. In the past, such an eventuality was handled using the goto statement. Nowadays features are built in to programming languages to facilitate the more elegant handling of such situations. We discuss the handling of exceptions in Chapter 17.

14.8 ● Methods

Procedural or algorithmic abstraction is one of the most powerful tools in the programmer’s arsenal. When designing a program, we abstract what should be done before we specify how it should be done. Before OOP, program designs evolved as layers of procedural abstractions, each layer specifying more detail than the layer above. Procedural abstractions in programming languages, such as procedures and functions, allow the layered design of a program to be accurately reflected in the structure of the program text. Even in relatively small programs, the ability to factor a program into small, functional modules is essential; factoring increases the readability and maintainability of programs. What does the software engineer require from a language in terms of support for procedural abstraction? We suggest the following list of requirements:

■ an adequate set of primitives for defining procedural abstractions
■ safe and efficient mechanisms for controlling communication between program units
■ simple, clearly defined mechanisms for controlling access to data objects defined within program units.

Procedures and functions

The basic procedural abstraction primitives provided in programming languages are procedures and functions.
Procedures can be thought of as extending the statements of the language, while functions can be thought of as extending the operators of the language. A procedure call looks like a distinct statement, whereas a function call appears as or within an expression.

The power of procedural abstraction is that it allows the programmer to consider the method as an independent entity performing a well-described task largely independent of the rest of the program. When a procedure is called, it achieves its effect by modifying the data in the program which called it. Ideally, this effect is communicated to the calling program unit in a controlled fashion by the modification of the parameters passed to the procedure. Functions, like their mathematical counterparts, return only a single value and must therefore be embedded within expressions. A typical syntax for writing procedures and functions is shown below:

    void procedureName(parameters) {
        declarations
        procedure body
    }

    resultType functionName(parameters) {
        declarations
        function body
        return value;
    }

It is critical that the interface between program units be small and well defined if we are to achieve independence between units. Ideally both procedures and functions should only accept but not return information through their parameters. A single result should be returned as the result of calling a function.

For example, to place text in a text box, use a procedure call as illustrated by the following code:

    setText("your message here");

and a function call to obtain a value:

    String text = getText();
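To make the distinction concrete, the two calls above can be placed in a small, self-contained class. This is only a sketch under stated assumptions: TextBoxSketch is a hypothetical stand-in written for this illustration, not a class from any GUI library, and it simply stores the text in a field.

```java
// Hypothetical stand-in for a GUI text box; not a real library class.
final class TextBoxSketch {
    private String text = "";

    // Procedure style: called as a statement, returns nothing,
    // and only accepts information through its parameter.
    void setText(String newText) {
        text = newText;
    }

    // Function style: called within an expression and returns a single value.
    String getText() {
        return text;
    }

    public static void main(String[] args) {
        TextBoxSketch box = new TextBoxSketch();
        box.setText("your message here");   // procedure call: a statement on its own
        String text = box.getText();        // function call: part of an expression
        System.out.println(text);
    }
}
```

Neither method returns information through its parameters, so the interface between the caller and the text box stays small: one value goes in through setText and one value comes out as the result of getText.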