HandBooks Professional Java-C-Scrip-SQL part 155 potx

Footnotes

1. Obviously, this is a simplification: normally, we would want to be able to find a customer's record by his name or other salient characteristic. However, that part of the problem can be handled by, for example, a hash-coded lookup from the name into a record number, as we will see in the next chapter. Here we are concerned with what happens after we know the record number.
2. This figure could just as easily be considered a layout for a single record with variable-length fields; however, the explanation is valid either way.
3. The problem of changing the length of an existing record can be handled by deleting the old version of the record and adding a new version having a different length.
4. I am indebted for this extremely valuable algorithm to its inventor, Henry Beitz, who generously shared it with me in the mid-1970s.
5. In general, I use the terms "quantum" and "block" interchangeably; in the few cases where a distinction is needed, I will note it explicitly.
6. In the current implementation, the default block size is 16K. However, it is easy to change that size in order to be able to handle larger individual items or to increase storage efficiency.
7. Some blocks used to store internal tables such as the free space list are not divided into items, but consist of an array of fixed-size elements, sometimes preceded by a header structure describing the array.
8. For simplicity, in our sample application each user record is stored in one item; however, since any one item must fit within a quantum, applications dealing with records that might exceed the size of a quantum should store each potentially lengthy field as a separate item.
9. Four of these bytes are used to hold the index in the IRA to which this item corresponds; this information would be very helpful in reconstructing as much as possible of the file if it should become corrupted. Another two bytes are used to keep track of the type of the item. These entries are for error trapping and file reconstruction if the file should somehow become corrupted.
10. Of course, this assumes that we have set the parameters of the quantum file to values that allow us to expand the file to a size large enough to hold that much data. The header file "blocki.h" contains the constants BlockSize and MaxFileQuantumCount, which together determine the maximum size of a quantum file. The beginning of that file also contains a number of other constants and structures related to this issue; you should be able to modify the capacity and space efficiency of a quantum file fairly easily after examining that header file.
11. The limit is 256 so that an object number can fit into one byte; this reduces the size of the free space list, as we will see later.
12. By the way, there is a possible optimization that could be employed here: sorting the buffers to be rewritten to the disk in order of their quantum numbers (i.e., their positions in the file). This could improve performance in systems where the hard disk controller doesn't already provide this service; however, most (if not all) modern disk systems take care of this for us, so sorting by quantum number would not provide any benefit.
13. The second edition also included this C implementation along with an earlier, less capable, version of the C++ implementation we're examining here.
14. Another restriction in C++ operator overloading is that it's impossible to make up your own operators. According to Bjarne Stroustrup, this facility has been carefully considered by the standards committee and was not adopted due to difficulties with operator precedence and binding strength. Apparently, it was the exponentiation operator that was the deciding factor; it's the first operator that users from the numerical community usually want to define, but its mathematical properties don't match the precedence and binding rules of any of the "normal" C++ operators.
15. I'm not claiming this is a good use for overloading; it's only for tutorial purposes.
16. Warning: do not compile and execute the program in Figure overload2. Although it will compile, it reads from random locations, which may cause a core dump on some systems.
17. By the way, this isn't just a theoretical problem: it happened to me during the development of this program.
18. This is an example of the "handle/body" class paradigm, described in Advanced C++: Programming Styles and Idioms, by James O. Coplien (Addison-Wesley Publishing Company, Reading, Massachusetts, 1992). Warning: as its title indicates, this is not an easy book; however, it does reward careful study by those who already have a solid grasp of C++ fundamentals. For a kinder, gentler introduction to several advanced C++ idioms of wide applicability, see my Who's Afraid of More C++? (AP Professional, San Diego, California, 1998).
19. In order to solve this problem in a more general way, the ANSI standards committee for C++ has approved the addition of "namespaces", which allow the programmer to specify the library from which one or more functions are to be taken in order to prevent name conflicts.
20. Of course, there are other ways to accomplish the goal of protecting the class user from concern about internals of a given class, as we've discussed briefly in the sections titled "Data Hiding" and "Function Hiding".
21. If we had any functions that could change the contents of a shared object, they would also have to be modified to prevent undesirable interactions between "separate" handle objects that share data. However, we don't have any such functions in this case.
22. Actually, the reference count should never be less than 0, but I'm engaging in some defensive programming here.
23. To reduce the length of the function names in this class, I'm going to omit the MainObjectArrayPtr qualifier at the beginning of those names.
24. We'll see exactly how this block access works when we cover the MainObjectBlock class, but for now it's sufficient to note that the main object array is potentially divided into blocks which are accessed via the standard virtual memory system.
25. Actually, the name of the "lowest free object" variable should be something like m_StartLookingHere, but I doubt it will cause you too much confusion after you see how it is used.
26. If a preprocessor variable called DEBUG is defined, then the action of this macro will be to terminate the program if the condition is not met; otherwise, it will do nothing. You can find the implementation of this macro and its underlying function in qfassert.h and qfassert.cpp.
27. We often will step through an array assigning values to each element in turn, for example when importing data from an ASCII file; since we are not modifying previously stored values, the quantum we used last is the most likely to have room to add another item. In such a case, the most recently written-to quantum is half full on the average; this makes it a good place to look for some free space. In addition, it is very likely to be already in memory, so we won't have to do any disk accesses to get at it.
28. By the way, this function was originally named GetFreeSpace, and its return type was called FreeSpaceEntry, but I had to change the name of the return type to avoid a conflict with a name that Microsoft had used once upon a time in their MFC classes and still had some claim on; I changed the name of the function to match. This is a good illustration of the need to avoid polluting the global name space; using the namespace construct in the new C++ standard would be a good solution to such a problem.
29. The alert reader will notice that the type of the NewBigPointerBlock variable is BigPointerBlockPtr, not BigPointerBlock. However, the functions that we call through that variable via the operator-> are from the BigPointerBlock class, because that is the type of the pointer that operator-> returns, as explained in the section on overloading operator->.
30. Another possible use is to implement variant arrays, in which the structure of the array is variable. In that case, we might use the type to determine whether we have the item we want or some kind of intermediate structure requiring further processing to extract the actual data.
31. The reason we mark this quantum as being full is twofold: first, there won't be very much (if any) space left in this quantum after we have added the little pointer array; and second, we don't want to store any actual data for our new main object in the same quantum as we are using for a section of the little pointer array, to make the reconstruction of a partially corrupted file easier.
32. There's one exception to this rule, for reasons described above: if the user specifies 0 elements, I change it to 1.
33. It would probably have been better to create a class to contain this function as well as a few others that are global in this implementation.
34. These are FreeSpaceBlockPtr, MainObjectBlockPtr, BigPointerBlockPtr, LittlePointerBlockPtr, and LeafBlockPtr.
35. Of course, while stepping through the program at human speeds in Turbo Debugger, the timestamps were nicely distributed; this is a demonstration of Heisenberg's Uncertainty Principle as it applies to debugging.
36. In a 32-bit implementation, it's entirely possible that the counter will never turn over, as its maximum value is more than four billion. However, if you let the program run long enough, eventually that will occur; at full tilt, such an event might take a few months.
37. At least, it's the smallest interface for a class that actually does anything. As we'll see, this program contains one class that doesn't contribute either data or functions. I'll explain why that is when we get to it.
38. For a detailed example of how this works, see the section entitled "Polite Pointing".
39. I'm not going to prefix the name of each embedded function or class with the name of its enclosing class, to make the explanations shorter; I'll just include it in the title of the main section discussing the class.
40. I'll cover this and the other global functions in the next chapter.
41. It is important to remember that the last item in a quantum is actually at the lowest address of any item in that quantum, because items are stored in the quantum starting from the end and working back toward the beginning of the quantum.
42. We can't delete unused item index entries that aren't at the end of the index because that would change the item numbers of the following items, rendering them inaccessible.
43. Of course, in the real program, we will find the IRA by looking it up in the main object index, but that detail is irrelevant to the current discussion of deleting an element.
44. The statement that calculates the value we should return for an empty quantum may be a bit puzzling. The reason it works as it does is that the routine that looks for an empty block compares the available free space code to a specific value, namely AvailableQuantum. If we did not return a value that corresponded to that free space code, a quantum that was ever used for anything would never be considered empty again.
45. It is important to note that the number of elements that I'm referring to here is not the number of elements in the big pointer array (i.e., the number of quanta that the little pointer arrays occupy) but the actual number of elements in the array itself (i.e., the number of data items that the user has stored in the array).
46. Note that we have to remember to set the modified flag for the big pointer array quantum whenever we change a value in the big pointer array header or the big pointer array itself, so that these changes will be reflected on the disk rather than being lost.
47. Although we will be unable to go into the details of the implementation of the AccessVector type, it is defined in the header file vector.h; if you are familiar with C++ templates, I recommend that you read that header file to understand how this type actually works.
48. The "last quantum added to" variable is stored in the big pointer quantum. When first writing the code to update that variable, I had forgotten to update the "modified" flag for the big pointer quantum when the variable actually changed. As a result, every time the buffer used for the big pointer quantum was reused for a new block, that buffer was overwritten without being written out to the disk. When the big pointer quantum was reloaded, the "last quantum added to" variable was reset to an earlier, obsolete value, with the result that a new quantum was started unnecessarily. This error caused the file to grow very rapidly.
49. For a similar reason, if we were adding an item to a large object with many little pointer arrays, each of which contained only a few distinct quantum number references, we wouldn't be gathering information about very much of the total storage taken up by the object; we might very well start a new quantum when there was plenty of space in another quantum owned by this object.
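Footnotes 18, 21, 22, and 29 all touch on the same idiom: a small "handle" object that shares a reference-counted "body" and forwards calls through operator->. The following is only a minimal sketch of that idea, not the book's code; the class names Body and Handle and the ShareCount function are hypothetical, chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical "body" class: holds the shared data and the reference count.
class Body {
public:
    explicit Body(const std::string& text) : m_Text(text), m_Count(1) {}
    const std::string& Text() const { return m_Text; }
    std::size_t Length() const { return m_Text.size(); }
private:
    friend class Handle;   // only the handle may touch the count
    std::string m_Text;
    int m_Count;           // number of Handle objects sharing this Body
};

// Hypothetical "handle" class: the object the class user actually holds.
class Handle {
public:
    explicit Handle(const std::string& text) : m_Body(new Body(text)) {}

    // Copying a handle shares the body instead of duplicating the data.
    Handle(const Handle& other) : m_Body(other.m_Body) { ++m_Body->m_Count; }

    Handle& operator=(const Handle& other) {
        if (m_Body != other.m_Body) {
            ++other.m_Body->m_Count;   // take the new share first
            Release();                 // then give up the old one
            m_Body = other.m_Body;
        }
        return *this;
    }

    ~Handle() { Release(); }

    // Functions called through -> are Body's functions, because Body* is
    // the pointer type that operator-> returns. Only const access is
    // offered, so no function can change the shared contents.
    const Body* operator->() const {
        assert(m_Body != nullptr);
        return m_Body;
    }

    int ShareCount() const { return m_Body->m_Count; }

private:
    void Release() {
        // The count should never fall below 0; testing <= 0 rather than
        // == 0 is purely defensive.
        if (--m_Body->m_Count <= 0) delete m_Body;
    }
    Body* m_Body;
};
```

With this sketch, copying a Handle raises the count reported by ShareCount, and destroying the copy lowers it again; the body is deleted only when the last handle releases it.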
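Footnotes 41 and 42 describe how items sit inside a quantum: data is packed from the end of the block backward, and an index entry in the middle cannot be removed without renumbering later items. The sketch below illustrates just those two points under heavy simplifying assumptions: the class QuantumSketch and all its members are hypothetical, the index is kept in a separate vector rather than inside the block, and the data space of deleted items is not reclaimed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified quantum (not the book's layout).
class QuantumSketch {
public:
    explicit QuantumSketch(std::size_t size) : m_Data(size), m_FreeEnd(size) {}

    // Stores an item just below the previous one, so each new item lands
    // at a lower address. Returns its 1-based item number, or 0 if the
    // item does not fit.
    std::size_t AddItem(const std::string& item) {
        if (item.size() > m_FreeEnd) return 0;
        m_FreeEnd -= item.size();
        std::copy(item.begin(), item.end(), m_Data.begin() + m_FreeEnd);
        Entry e = { m_FreeEnd, item.size(), true };
        m_Index.push_back(e);
        return m_Index.size();
    }

    // Marks the entry unused instead of erasing it, so later item numbers
    // stay valid; only unused entries at the end of the index are dropped.
    void DeleteItem(std::size_t itemNumber) {
        if (itemNumber == 0 || itemNumber > m_Index.size()) return;
        m_Index[itemNumber - 1].inUse = false;
        while (!m_Index.empty() && !m_Index.back().inUse) m_Index.pop_back();
    }

    // Returns the item's contents, or an empty string if it is not present.
    std::string GetItem(std::size_t itemNumber) const {
        if (itemNumber == 0 || itemNumber > m_Index.size()) return std::string();
        const Entry& e = m_Index[itemNumber - 1];
        if (!e.inUse) return std::string();
        return std::string(m_Data.begin() + e.offset,
                           m_Data.begin() + e.offset + e.length);
    }

    std::size_t OffsetOf(std::size_t itemNumber) const {
        assert(itemNumber >= 1 && itemNumber <= m_Index.size());
        return m_Index[itemNumber - 1].offset;
    }

    std::size_t IndexCount() const { return m_Index.size(); }

private:
    struct Entry { std::size_t offset; std::size_t length; bool inUse; };
    std::vector<char> m_Data;     // the block's bytes
    std::size_t m_FreeEnd;        // lowest address used by item data so far
    std::vector<Entry> m_Index;   // item index, one entry per item number
};
```

After three items are added, OffsetOf for the third is lower than for the first. Deleting the second leaves IndexCount unchanged because its entry is not at the end of the index; deleting the third then trims both trailing unused entries.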
