Software Engineering For Students: A Programming Approach Part 22

188 Chapter 14 ■ The basics

Unfortunately, most programming languages do not enforce even these simple, logical rules. Thus it is largely the responsibility of the programmer to ensure that procedures and functions do not have side effects. A side effect is any change to information outside a method caused by a call, other than a change to the parameters of a procedure. Most programming languages do not prevent programmers from directly accessing and modifying data objects (global data) defined outside of the local environment of the method. Along with pointers and the goto statement, global data has come to be regarded as a major source of programming problems. We shall see in Chapter 15 (on object-oriented features of programming languages) how, in classes, access to global data is controlled.

Many abstractions, particularly those which manipulate recursive data structures such as lists, graphs, and trees, are more concisely described recursively. Some languages, for example Cobol and Fortran, do not support recursion.

14.9 ● Parameter-passing mechanisms

We have seen that, ideally:

■ parameters are passed to a procedure so that the procedure will accomplish some task. There is no need for information to be passed back to the caller, so there is no need for parameter values to change.
■ functions communicate a value back to the caller as the return value. So again there is no need for parameter values to be changed.

Two major schemes for parameters have emerged:

■ call by value (termed value parameters) – this means that a copy of the information is passed as the parameter. Therefore the method can use the information but cannot change it.
■ call by reference (termed reference parameters) – this means that a pointer to the information is passed as the parameter. Therefore the method can both access and change the information. These pointers are not a problem because the pointers are not themselves accessible to the programmer. (The programmer cannot access or change the pointer, merely the information pointed to.) The pointer is simply the mechanism for communicating the information. We discuss programming using pointers in Chapter 15 on object-oriented programming.

The programming language could enforce a discipline where procedures and functions can only be supplied with value parameters, but most do not. A number of parameter-passing schemes are employed in programming languages but no language provides a completely safe and secure parameter-passing mechanism.

There is a performance consideration for value parameters. Passing by value is inefficient for passing large, aggregate data structures such as an array, as a copy must be made. In such situations, it is commonplace to pass the data structure by reference even if the parameter should not be modified by the method.

BELL_C14.QXD 1/30/05 4:23 PM Page 188

Java provides the following scheme. All primitive data items are passed by value, which is most commendable, but all proper objects are passed by reference (more precisely, a reference to the object is passed, and that reference is itself copied). No distinction is made between procedures and functions. Thus a method of either type (procedure or function) can modify any non-primitive parameter and it is left to the programmer to enforce a discipline over changing parameters. A small concession is that a method cannot change the caller's pointer to an object, for example to make it point to another object.

Fortran employs only a single parameter-passing mode: call by reference. Thus, undesirably, all actual parameters in Fortran may potentially be changed by any subroutine or function. The programmer is responsible for ensuring the safe implementation of input and output parameters. Using call by reference, the location of the actual parameter is bound to the formal parameter. The formal and actual parameter names are thus aliases; modification of the formal parameter automatically modifies the actual parameter.
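The Java scheme described above can be made concrete with a small sketch. It is an illustration only: the class ParameterPassingSketch and its methods are hypothetical names, and nothing beyond standard Java is assumed. It shows that a primitive argument is copied, that the contents of an object (here an array) can be changed through the parameter, and that re-pointing the parameter inside the method does not affect the caller.

```java
class ParameterPassingSketch {
    // A primitive argument is copied: assigning to n changes only the local copy.
    static void increment(int n) {
        n = n + 1;
    }

    // An array is an object. The method receives a reference to the caller's
    // array, so it can change the array's contents.
    static void incrementFirst(int[] values) {
        values[0] = values[0] + 1;
    }

    // Making the parameter refer to a new array is invisible to the caller,
    // because only the method's own copy of the reference is changed.
    static void replace(int[] values) {
        values = new int[] {99};
    }

    public static void main(String[] args) {
        int count = 1;
        increment(count);
        System.out.println(count);    // prints 1

        int[] data = {1};
        incrementFirst(data);
        System.out.println(data[0]);  // prints 2

        replace(data);
        System.out.println(data[0]);  // still prints 2
    }
}
```

The discipline of not modifying an object parameter therefore has to come from the programmer; nothing in the method header of incrementFirst warns the caller that the array will change.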
Call by reference is what you might expect of a language such as Fortran, where arrays are often used and the performance hit of copying arrays is unacceptable. Fortran also, unfortunately, restricts the type of result that may be returned from functions to scalar types only (i.e. not arrays etc.).

Visual Basic.Net provides the choice of value or reference parameters, described by the key words ByVal (the default) and ByRef in the method header. But when objects are passed, they are always passed by reference.

In C#, by default, primitive data items are passed by value and objects are passed by reference. But you can pass a primitive data item by reference if the parameter is preceded by the key word ref in both the method header and the method call. You can also precede an object name by ref, in which case you are passing a pointer to a pointer. This means that the method can return an entirely different object.

Call by value-result is often used as an alternative to call by reference for input-output parameters. It avoids the use of aliases at the expense of copying. Parameters passed by value-result are initially treated as in call by value; a copy of the value of the actual parameter is passed to the formal parameter, which again acts as a local variable. Manipulation of the formal parameter does not immediately affect the actual parameter. On exit from the procedure, the final value of the formal parameter is assigned into the actual parameter.

Call by result may be used as an alternative to call by reference for output parameters. Parameters passed by result are treated exactly as those passed by value-result except that no initial value is assigned to the local formal parameter.

Ada identifies three types of parameter:

■ input parameters to allow a method read-only access to an actual parameter.
The actual parameter is purely an input parameter; the method should not be able to modify the value of the actual parameter.
■ output parameters to allow a procedure write-only access to an actual parameter. The actual parameter is purely an output parameter; the procedure should not be able to read the value of the actual parameter.
■ input-output parameters to allow a procedure read-and-write access to an actual parameter. The value of the actual parameter may be modified by the procedure.

Ada only allows input parameters to functions. The parameter-passing mechanisms used in Ada (described as in, out and in out) would therefore seem to be ideal. However, Ada does not specify whether they are to be implemented using sharing or copying. Though beneficial to the language implementer, since the space requirements of the parameter can be used to determine whether sharing or copying should be used, this decision can be troublesome to the programmer. In the presence of aliases, call by value-result and call by reference may return different results.

14.10 ● Primitive data types

Programmers are accustomed to being provided with a rudimentary set of primitive data types. These are provided built in and ready made by the programming language. They usually include:

■ Boolean
■ char
■ integer
■ real or floating point.

These data types are accompanied by a supporting cast of operations (relational, arithmetic, etc.). For each type, it should be possible to clearly define the form of the literals or constants which make up the type. For example, the constants true and false make up the set of constants for the type Boolean. Similarly, we should be able to define the operations for each type. For the type Boolean, these might include the operations =, <>, not, and, and or.

In most languages the primitive data types are not true objects (in the sense of objects created from classes). But in Eiffel and Smalltalk, every data type is a proper object and can be treated just like any other object.

For certain application domains, advanced computation facilities, such as extended precision real numbers or long integers, are essential. The ability to specify the range of integers and reals and the precision to which reals are represented reduces the dependence on the physical characteristics, such as the word size, of a particular machine. This increases the portability of programs. However, some languages (for example C and C++) leave the issue of the precision and range of numbers to the compiler writer for the particular target machine. Java gets around this sloppiness by precisely defining the representation of all its built-in data types. Whatever machine a program is executed on, the expectation is that data is represented in exactly the same manner. Thus the program will produce exactly the same behavior, whatever the machine.

14.11 ● Data typing

A data type is a set of data objects and a set of operations applicable to all objects of that type. Almost all languages can be thought of as supporting this concept to some extent. Many languages require the programmer to define explicitly the type (e.g. integer or character) of all objects to be used in a program, and, to some extent or another, depending on the individual language, this information prescribes the operations that can be applied to the objects. Thus, we could state, for example, that Fortran, Cobol, C, C++, Ada, C#, Visual Basic.Net and Java are all typed languages. However, only Ada, C#, Visual Basic.Net and Java would be considered strongly typed languages.

A language is said to be strongly typed if it can be determined at compile-time whether or not each operation performed on an object is consistent with the type of that object.
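As a rough illustration of compile-time checking, consider the following Java sketch. The class StrongTypingSketch and its method total are hypothetical names chosen for this example. The commented-out lines are ones that a Java compiler rejects, which is why they are left as comments; the program never has to run for these inconsistencies to be detected.

```java
class StrongTypingSketch {
    // Both operands of + are of type int, so the compiler accepts the expression.
    static int total(int count, String name) {
        return count + name.length();
    }

    public static void main(String[] args) {
        int count = 3;
        String name = "Ada";

        System.out.println(total(count, name));  // prints 6

        // Each line below is inconsistent with the declared types and is
        // rejected at compile time if the comment marker is removed:
        // int bad = name * 2;      // * is not defined for a String operand
        // boolean flag = count;    // an int is not a boolean
    }
}
```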
Operations inconsistent with the type of an object are considered illegal. A strongly typed language therefore forces the programmer to consider more closely how objects are to be defined and used within a program. The additional information provided to the compiler by the programmer allows the compiler to perform automatic type checking operations and discover type inconsistencies. Studies have shown that programs written in strongly typed languages are clearer, more reliable, and more portable. Strong typing necessarily places some restrictions on what a programmer may do with data objects. However, this apparent decrease in flexibility is more than compensated for by the increased security and reliability of the ensuing programs.

Languages such as Lisp, APL, and POP-2 allow a variable to change its type at run-time. This is known as dynamic typing as opposed to the static typing found in languages where the type of an object is permanently fixed. Where dynamic typing is employed, type checking must occur at run-time rather than compile-time. Dynamic typing provides additional freedom and flexibility but at a cost. More discipline is required on the part of the programmer so that the freedom provided by dynamic typing is not abused. That freedom is often very useful, even necessary, in some applications. For example, problem-solving programs which use sophisticated artificial intelligence techniques for searching complex data structures would be very difficult to write in languages without dynamic typing.

What issues need to be considered when evaluating the data type facilities provided by a programming language? We suggest the following list:

■ does the language provide an adequate set of primitive data types?
■ can these primitives be combined in useful ways to form aggregate or structured data types?
■ does the language allow the programmer to define new data types? How well do such new data types integrate with the rest of the language?
■ to what extent does the language support the notion of strong typing?
■ when are data types considered equivalent?
■ are type conversions handled in a safe and secure manner?
■ is it possible for the programmer to circumvent automatic type checking operations?

14.12 ● Strong versus weak typing

The debate as to whether strongly typed languages are preferable to weakly typed languages closely mirrors the earlier debate among programming language aficionados about the virtues of the goto statement. The pro-goto group argued that the construct was required and its absence would restrict programmers. The anti-goto group contended that indiscriminate use of the construct encouraged the production of “spaghetti-like” code.

The weakly typed languages group similarly argue that some types of programs are very difficult, if not impossible, to write in strongly typed languages. For example, a program that manipulates graphical images will sometimes need to perform arithmetic on the image and at other times examine the data bit-by-bit.

The strongly typed languages group argue that the increased reliability and security outweigh these disadvantages. A compromise has been struck; strong typing is generally seen as highly desirable but languages provide well-defined escape mechanisms to circumvent type checking for those instances where it is truly required.

Weakly typed languages such as Fortran and C provide little compile-time type checking support. However, they do provide the ability to view the representation of information as different types. For example, using the equivalence statement in Fortran, a programmer is able to subvert typing:

    integer a
    logical b
    equivalence (a, b)

The variable b is a logical, which is the Fortran term for Boolean. The equivalence declaration states that the variables a and b share the same memory.
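For comparison, Java has no equivalence statement. A loosely comparable effect, viewing the same bit pattern first as one type and then as another, is available only through explicit library calls. The sketch below uses the standard methods Float.floatToIntBits and Float.intBitsToFloat; the class ReinterpretSketch and its method names are hypothetical. Note that, unlike equivalence, these calls copy the bits rather than share storage, so this is an analogy and not the same mechanism.

```java
class ReinterpretSketch {
    // View the 32 bits that represent a float as an int.
    static int bitsOf(float f) {
        return Float.floatToIntBits(f);
    }

    // View a 32-bit int pattern as a float.
    static float floatFrom(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int bits = bitsOf(1.0f);
        System.out.println(Integer.toHexString(bits));  // prints 3f800000

        // Integer arithmetic on the raw pattern: this changes the exponent field.
        int changed = bits + 0x00800000;
        System.out.println(floatFrom(changed));         // prints 2.0
    }
}
```

Because the reinterpretation is spelled out in the code, a reader can see exactly where the program steps outside the normal type rules.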
While economy of storage is the primary use of the equivalence statement, it also allows the same storage to be interpreted as representing an integer in one case and a logical (Boolean) in the second. The programmer can now apply both arithmetic operations and logical operations on the same storage simply by choosing the appropriate alias (a or b) to reference it. This incredible language feature is dangerous because programs using it are unclear. Moreover, such programs are not portable because the representations used for integers and Booleans are usually machine dependent.

For a small number of programming applications, the ability to circumvent typing to gain access to the underlying physical representation of data is essential. How can this be provided in a language that is strongly typed? The best solution is probably to force the programmer to state explicitly in the code that they wish to violate the type checking operations of the language. This approach is taken by Ada, where an object may be reinterpreted as being of a different type only by using the unchecked conversion facility.

The question of conversion between types is inextricably linked with the strength of typing in a language. Fortran, being weakly typed, performs many conversions (or coercions) implicitly during the evaluation of arithmetic expressions. These implicit conversions may result in a loss of information and can be dangerous to the programmer. As we saw earlier, Fortran allows mixed mode arithmetic and freely converts reals to integers on assignment.

Java and other strongly typed languages perform implicit conversions only when there will be no accompanying loss of information. Thus, an assignment of an integer to a real variable results in implicit conversion of the integer to a real – the programmer does nothing. However, an attempt to assign a real value to an integer variable will result in a type incompatibility error.
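The two cases can be sketched in Java as follows. The class ConversionSketch and its methods are hypothetical names used only for illustration. The sketch shows an integer being converted to a real with no action by the programmer, and the information that is lost when a real is deliberately converted to an integer.

```java
class ConversionSketch {
    // Integer to real: the conversion is implicit.
    static double widen(int i) {
        double d = i;
        return d;
    }

    // Real to integer: the cast must be written out, and the fractional
    // part is discarded (the value is truncated toward zero).
    static int narrow(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(widen(7));      // prints 7.0
        System.out.println(narrow(1.9));   // prints 1
        System.out.println(narrow(-1.9));  // prints -1

        // Without the cast the compiler rejects the assignment:
        // int i = 1.9;
    }
}
```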
An assignment of a real value to an integer variable must be carried out using an explicit conversion function. That is, the programmer is forced by the language to consider explicitly the loss of information implied by the use of the conversion function. In Java, for example, a real can be converted to an integer, but only by using an explicit casting operator:

    float f = 1.2345f;
    int i = (int) f;

The casting operator is the name of the destination type, enclosed in brackets – in this case (int). When this is used, the compiler accepts that the programmer is truly asking for a conversion and is responsibly aware of the possible consequences.

SELF-TEST QUESTION
14.3 Java provides shift and Boolean operations for integers and reals. Does this violate strong typing?

14.13 ● User-defined data types (enumerations)

The readability, reliability, and data abstraction capabilities of a language are enhanced if the programmer can extend the primitive data types provided by the language. The ability to define user-defined types separates the languages C, C++ and Ada from their predecessors. For example, consider the following definition of a C++ enumerated type, which is introduced by the key word enum:

    enum Day {Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday};

The type Day is a new type. Variables of this type may only take on values that are the literals of that type (that is Monday, Tuesday, etc.). Now we can declare a variable of this type, as follows:

    Day today;

And we can perform such operations as

    today = Monday;
    if (today == Saturday) etc

We also get some type checking carried out by the compiler. Assignments such as the following will be flagged as type errors by the compiler.

    today = January;
    today = 7;

In a language without this facility, we are forced to map the days of the week onto integers, so that 1 means Monday etc.
But then we get no help from the compiler when we write (correct) statements, such as:

    int today;
    today = 2;

or even the “illegal”

    today = 0;

since today is an integer variable and therefore may be assigned any integer value.

SELF-TEST QUESTION
14.4 Make the case for user-defined types.

Enumerated types, such as the C++ facility described above, have their limitations. An enumerated type can be declared, variables created, assignments and comparisons carried out, but these are the only operations and we cannot create any more. For example, in the above example one cannot write a method nextDay. Moreover, different enums cannot contain identical names. For example, we are prevented from writing:

    enum Weekend {Saturday, Sunday};

because the names clash with those already in enum Day.

Arguably, if the language provides classes (Chapter 15) it does not need enums. In fact the Java enum facility is almost a class.

14.14 ● Arrays

Composite data types allow the programmer to model structured data objects. The most common aggregate data abstraction provided by programming languages is the array: a collection of homogeneous elements (all elements of the same type) which may be referenced through their positions (usually an integer) within the collection. Arrays are characterized by the type of their elements and by the index or subscript range or ranges which specify the size, number of dimensions and how individual elements of the array may be referenced.

For example, the Java array definition shown below defines an array named table. It is a one-dimensional array of integers with the subscript varying from 0 through 9. In Java, subscripts always start at 0, betraying the C origins of the language as a language close to machine instructions.

    int table[] = new int[10];

Individual elements of the array are referenced by specifying the array name and an expression for each subscript, for example, table[2].
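The array table defined above can be exercised with a short sketch. The class ArraySketch and its methods are hypothetical names introduced for this illustration. It fills the ten elements, reads one element by subscript, and shows that Java rejects a subscript outside the range 0 through 9 by throwing the standard exception ArrayIndexOutOfBoundsException.

```java
class ArraySketch {
    // Create the ten-element array and store i * i in element i.
    static int[] makeTable() {
        int table[] = new int[10];
        for (int i = 0; i < table.length; i++) {
            table[i] = i * i;
        }
        return table;
    }

    // Add up every element of the array.
    static int sum(int[] table) {
        int total = 0;
        for (int value : table) {
            total = total + value;
        }
        return total;
    }

    // Report whether index is a legal subscript, relying on Java's own check.
    static boolean isValidSubscript(int[] table, int index) {
        try {
            int ignored = table[index];
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] table = makeTable();
        System.out.println(table[2]);                     // prints 4
        System.out.println(sum(table));                   // prints 285
        System.out.println(isValidSubscript(table, 10));  // prints false
    }
}
```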
The implementation of arrays in programming languages raises the following considerations:

■ what restrictions are placed on the element type? For complete freedom of expression there should be no restrictions.
■ valid indices should be any subrange of numbers (e.g. 2010 to 2020)
■ at what time must the size of an array be known? The utility of arrays in a programming language is governed by the time (compile-time or run-time) at which the size of the array must be known.
■ what operations may be applied to complete arrays? For example, it is very convenient to be able to carry out array assignment or comparison between compatible arrays using a single concise statement.
■ are convenient techniques available for the initialization of arrays?

The time at which a size must be specified for an array has implications on how the array may be used. In Java, as in most languages, the size of an array must be defined statically – the size and subscript ranges are required to be known at compile-time. This has the advantage of allowing the compiler to generate code automatically to check for out-of-range subscripts. However, the disadvantage of this simple scheme is that, to allow the program to accommodate data sets of differing sizes, we would like to delay deciding the size of the array until run-time. Most languages provide arrays whose size is fixed at compile-time, so if variable size is needed, a dynamic data structure is the answer (see Chapter 15).

SELF-TEST QUESTION
14.5 Argue for and against the language making array subscripts start at 0.

14.15 ● Records (structures)

Data objects in problem domains are not always simply collections of homogeneous objects (same types). Rather, they are often collections of heterogeneous objects (different types). Although such collections can be represented using arrays, many programming languages provide a record data aggregate. Records (or structures as they are termed in C and C++) are generalizations of arrays where the elements (or fields) may be of different types and where individual components are referenced by (field) name rather than by position.

For example, the C++ struct definition shown below describes information relating to a time. Each object of type Time has three components named hour, minute and second.

    struct Time {
        int hour;
        int minute;
        int second;
    };

We can now declare a variable of this type:

    Time time;

Components of records are selected by name. The method used by Ada, PL/1 and C++ first specifies the variable and then the component. For example,

    time.minute = 46;

Each component of a record may be of any type – including aggregate types, such as arrays and records. Similarly, the element type of an array might be a record type. Programming languages which provide such data abstractions as arrays and records and allow them to be combined orthogonally in this fashion allow a wide range of real data objects to be modeled in a natural fashion.

The languages Cobol, PL/1, C, C++, C# and Ada support records. (In C, C++ and C# a record is termed a struct.) The Java language does not provide records as described above because this facility can simply be implemented as a class, using the object-oriented features of the language (see Chapter 15). Simply declare a class, with the requisite fields within it.

SELF-TEST QUESTION
14.6 Make the case for arrays and records.

Summary

In this chapter we have surveyed the basic characteristics that a programming language should have from the viewpoint of the software engineer. It seems that small things – like syntax – can affect software reliability and maintenance. Some people think that a language should be rich in features – and therefore powerful.
Other people think that a language should be small but elegant so that it can be mastered completely by the programmer.

The following issues are considered to be important:

■ matching the language to the application area of the project
■ clarity, simplicity, and orthogonality
■ syntax
■ control abstractions
■ primitive data types
■ data typing
■ enumerations
■ arrays
■ records (structures).

Exercises

14.1 Suppose that you were asked to design a new programming language for software engineering.
■ select and justify a set of control structures
■ select and justify a set of primitive data types.

14.2 Argue either for or against strong typing in a programming language.

14.3 How many kinds of looping structure do we need in a programming language? Make suggestions.

14.4 From the discussion in this chapter, list the possible problem features with either programming languages in general or a programming language of your choice.

14.5 “In language design, small is beautiful.” Discuss.

14.6 Argue for or against the inclusion of the break statement in a programming language.

14.7 The language LISP has the ultimate simple syntax. Every statement is a list. For example:

    (+ 1 2)

returns the sum of the parameters. Investigate the syntax of Lisp and discuss whether every language could and should have syntax that is as simple.