of less than 32 bytes is 0. Since the original check for sufficient space in a quantum tested whether the space code for the quantum was greater than or equal to the space code for the new item, even blocks with no free space (i.e., those having a free space code of 0) were considered possible storage places for these very small items. This resulted in a large number of unnecessary disk accesses to read those quanta so that their actual free space could be calculated. Changing from ">=" to ">" in the test fixed this; of course, I could also have made the size calculation round up rather than down.

Another problem I encountered with the free space handling in the new algorithm is that blocks and quanta are no longer synonymous, as they were in the old program. That is, the virtual memory system deals not only with quanta (i.e., blocks containing user data) but also with free space blocks and the block(s) occupied by the main object list. At one point, some routines dealing with the free space list were using physical block numbers and others were using logical quantum numbers; however, since only blocks containing user data are controlled by the free space list, the correct solution was to use only logical quantum numbers in all such routines.

The FreeSpaceArray::FindEmptyBlock Function

The last function that we will cover in this class is FindEmptyBlock, which is shown in Figure newquant.27.

The FreeSpaceArray::FindEmptyBlock function (from quantum\newquant.cpp) (Figure newquant.27)

codelist/newquant.27

As you can see, this code uses the member variable called m_CurrentLowestFreeBlock to reduce the amount of time needed to find a free block as the file fills up. Each time that a free block with a block number less than the current value is created or located, the m_CurrentLowestFreeBlock variable is updated to point to that block, and FindEmptyBlock starts at that location when looking for an empty block.
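To make the role of the lowest-free-block hint concrete, here is a minimal sketch of a search of this kind. It is not the code from Figure newquant.27: the names FreeSpaceListSketch, MarkUsed, MarkFree, NO_FREE_BLOCK, and lowestFree are hypothetical, and the free space list is modeled as a simple in-memory table of flags rather than as blocks read through the virtual memory system. The starting element is applied only to the first free space block searched; every later block is scanned from element 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; the real classes in
// newquant.cpp are organized differently.
const int NO_FREE_BLOCK = -1;

struct FreeSpaceListSketch {
    // blocks[b][e] is true when the quantum tracked by element e of
    // free space block b is completely empty.
    std::vector<std::vector<bool> > blocks;
    std::size_t entriesPerBlock;
    std::size_t lowestFree; // plays the role of m_CurrentLowestFreeBlock

    FreeSpaceListSketch(std::size_t blockCount, std::size_t perBlock)
        : blocks(blockCount, std::vector<bool>(perBlock, true)),
          entriesPerBlock(perBlock), lowestFree(0) {
        assert(perBlock > 0);
    }

    // Returns the lowest empty quantum number, or NO_FREE_BLOCK.
    int FindEmptyBlock() {
        const std::size_t startBlock = lowestFree / entriesPerBlock;
        const std::size_t startElement = lowestFree % entriesPerBlock;
        for (std::size_t b = startBlock; b < blocks.size(); ++b) {
            // Only the first block searched starts in mid-block;
            // all later blocks are scanned from element 0.
            for (std::size_t e = (b == startBlock ? startElement : 0);
                 e < entriesPerBlock; ++e) {
                if (blocks[b][e]) {
                    lowestFree = b * entriesPerBlock + e;
                    return static_cast<int>(lowestFree);
                }
            }
        }
        return NO_FREE_BLOCK;
    }

    void MarkUsed(std::size_t q) {
        blocks[q / entriesPerBlock][q % entriesPerBlock] = false;
    }

    // Freeing a quantum below the hint moves the hint down to it.
    void MarkFree(std::size_t q) {
        blocks[q / entriesPerBlock][q % entriesPerBlock] = true;
        if (q < lowestFree) lowestFree = q;
    }
};
```

Because the hint never sits above a free quantum, each search can skip everything below it, which is where the time saving comes from as the file fills up.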
FindEmptyBlock has an understandable resemblance to FindSpaceForItem, since both of them search the free space list looking for a quantum with a certain amount of space available; however, this resemblance misled me into introducing a bug in FindEmptyBlock. The problem came from the fact that I was using m_CurrentLowestFreeBlock to determine the starting block and element for searching the free space list before entering the outer loop, rather than initializing the loop indices in the for statements. This worked properly on the first pass through the outer loop, which is executed once for each block in the free space list. However, when all the elements of the first block had been read and the outer loop index was incremented to refer to the next block, the inner index was left unchanged rather than being reset to 0 to address the first element of the new block. This would have shown itself in some unpleasant way after the blocks controlled by the first block of the free space list were allocated to objects, but luckily I found it by examination before that occurred.

The FreeSpaceArrayPtr class

The last class involved in the maintenance of the free space list is FreeSpaceArrayPtr, whose interface is shown in Figure newquant.28.

The interface for the FreeSpaceArrayPtr class (from quantum\newquant.h) (Figure newquant.28)

codelist/newquant.28

Because this is a perfectly typical handle class, similar in every way to the ones that we've discussed already, you should be able to figure it out without further explanation.

The Global Functions

There are four functions that we haven't covered yet: BlockMove, AdjustOffset, FindUnusedItem, and FindBuffer. What they have in common is that they're all used quite a few times during the execution of the program and therefore would be good candidates for rewriting in assembly language. In the previous edition of this book I did just that, so why am I not doing it here? There are actually two reasons.
First, in the several years that have elapsed since the publication of the second edition, the use of processors other than Intel's has become much more common in high-end applications. Obviously, any assembly language enhancements would be useless to readers who are using, for example, DEC's Alpha processor. However, an equally good reason is the lack of a standard for creating assembly language functions that interface with C++ programs, or for using assembly language in a C++ function. This would limit the utility of any such performance enhancements to those who were using the same compiler as the one for which I developed them.

Raising the Standard

Happily, compatibility among C++ compilers themselves has improved considerably of late. After several decades of development of the C++ language, which led to a number of similar but not identical dialects, ISO and ANSI (the international and American standards bodies, respectively) have finally approved a C++ standard. The compiler manufacturers now have no excuse for not producing compilers that comply with this standard, and in fact most of the features of the standard have already been incorporated into a number of commercial compilers. The similarity among C++ compilers is already sufficient that I was able to compile all the programs in this book using both Microsoft's Visual C++ 5.0 compiler and the DJGPP 2.8.0 compiler on the CD-ROM in the back of the book. They should also compile successfully on any compiler that complies with the new draft standard for C++.

Unfortunately, there is no such standard for assembly language enhancements to C++ programs. Therefore, any such enhancements that I would write would be useful only with one compiler, and I elected to eliminate such enhancements rather than providing them only for one compiler. However, this does not mean that such enhancements would be useless for you.
If you know what machine and compiler you're going to be using, and if the performance of this program is unacceptable on the machine on which it must run, then you may very well want to write assembly language functions to replace C++ functions that are heavily used at runtime. The four functions whose declarations appear in Figure asmfunc.00 are excellent candidates for replacement with assembly language equivalents, according to the tests that I ran for the second edition of this book.

The declarations of the global functions (quantum\asmfunc.h) (Figure asmfunc.00)

codelist/asmfunc.00

The BlockMove Function

Let's start with the simplest of these functions, BlockMove, shown in Figure asmfunc.01.

The BlockMove global function (from quantum\asmfunc.cpp) (Figure asmfunc.01)

codelist/asmfunc.01

As you can see, this function consists of nothing more than a call to the standard C library function memmove. It might be hard to imagine that such a basic function could be improved upon, but that's what I found in the previous edition of this book: by re-coding that routine in assembly language I was able to speed it up by a factor of approximately two over the version supplied with Borland C++ 3.1. Of course, without doing the same thing for the compilers I'm using in this book, I can't guarantee that such a speedup is possible; however, it is a good place to look for performance improvement because it is used very frequently.

The AdjustOffset Function

The next of these functions, AdjustOffset, is shown in Figure asmfunc.02.

The AdjustOffset global function (from quantum\asmfunc.cpp) (Figure asmfunc.02)

codelist/asmfunc.02

This one also isn't very complicated. It merely steps through each of the items in an item index, starting from the one whose address is passed as its first argument, for the count provided as the second argument.
To each of these items, it adds the adjustment provided as the third argument, which can be either positive or negative, depending on the reason that we're calling this function. If we have deleted something, then the adjustment will be negative because the items referred to by the item index will be moving closer to the end of the quantum. On the other hand, if we've inserted new data into the quantum, then the adjustment will be positive because those items will be moving farther from the end of the quantum.

The FindUnusedItem Function

The next of these functions, FindUnusedItem, is shown in Figure asmfunc.03.

The FindUnusedItem global function (from quantum\asmfunc.cpp) (Figure asmfunc.03)

codelist/asmfunc.03

This function is slightly more complicated than the previous two, but it's still pretty simple. It steps through the item index of a quantum, checking for an entry whose type is UNUSED_ITEM. If it finds such an entry, it returns the index of that entry; otherwise, it returns the value -1 to indicate that there are no unused entries. We use this function when we want to insert a new item in a quantum and need to find an available item index entry for that item.

The FindBuffer Function

The last of these functions, FindBuffer, is shown in Figure asmfunc.04.

The FindBuffer global function (from quantum\asmfunc.cpp) (Figure asmfunc.04)

codelist/asmfunc.04

This function steps through the block number list that keeps track of which block is in each buffer, looking for a particular block number. If it finds that the desired block is in one of the buffers, it returns the index of that buffer in the block number list. If the block isn't in any of the buffers, it returns the special value -1 to indicate that the search failed.

Summary

In this chapter, we have seen how quantum files allow us to gain efficient random access to a large volume of variable-length textual data.
In the next chapter, we will use this algorithm as a building block to provide random access by key to large quantities of variable-length data.

Problems

1. What modifications to the quantum file implementation would be needed to add a "mass load mode" to facilitate the addition of large numbers of records by reducing the free space list search time?

2. What changes to the "Directory" facility would make it more generally useful?

3. Write a QFIX program that employs the redundant information in the block headers and item index to reconstruct as much as possible of the logical structure of a quantum file that has become corrupted: i.e., in which at least one of the blocks is unreadable or logically inconsistent.

(You can find suggested approaches to problems in Chapter artopt.htm.)