is called during the execution of operator-> for any of the BlockPtr classes. 34

Figure mbr1 shows an early implementation of this routine.

Early MakeBlockResident code (Figure mbr1)
codelist/mbr1.00

This is a fairly straightforward implementation of a least recently used (LRU) priority algorithm, but there are a few wrinkles to reduce overhead. First, we use the reference parameter p_OldBufferNumber to retrieve the buffer number (if any) in which the searched-for block was located on our last attempt. If the number of the block held in that buffer matches the one we're looking for, we will be able to avoid the overhead of searching the entire buffer list. The reason that p_OldBufferNumber is a reference parameter is so that we can update the caller's copy when we locate the block in a different buffer; that way, the next time that MakeBlockResident is called to retrieve the same block's address, we can check that buffer number first.

In order to make this work, we can't implement the LRU priority list by moving the most recently used block to the front of the block list; the saved buffer number would be useless if the blocks moved around in the list every time a block other than the most recently used one was referenced. Instead, each slot has an attached timestamp, updated by calling the time function every time the corresponding block is referenced. When we need to free up a buffer, we select the one with the lowest (i.e., oldest) timestamp. If that buffer has the "modified" attribute set, then it has been updated in memory and needs to be written back to the disk before being reused, so we do that. Then we read the new block into the buffer, update the caller's p_OldBufferNumber for next time, and return the address of the buffer.

This seems simple enough, without much scope for vast improvements in efficiency; at least, it seemed that way to me, until I ran Turbo Profiler(TM) on it and discovered that it was horrendously inefficient.
The profiler indicated that 73% of the CPU time of my test program was accounted for by calls to the time routine! To add insult to injury, the timing resolution of that routine is quite coarse (approximately 18 msec), with the result that most of the blocks would often get the same timestamp when running the program normally. 35

Upon consideration, I realized that the best possible "timestamp" would be a simple count of the number of times that MakeBlockResident was called. This would entail almost no overhead and would be of exactly the correct resolution to decide which block was least recently used; this is the mechanism used in the current version.

Punching the Time Clock

One interesting consideration in the original design of this mechanism was what size of counter to use for the timestamp. At first, it seemed necessary (and sufficient) to use a 32-bit type such as an unsigned long, which would allow 4 billion accesses before the counter would "turn over" to 0. However, because the original implementation was compiled with a 16-bit compiler, the natural size to use would be an unsigned (meaning unsigned short) variable. Therefore, I had to decide whether a longer type was necessary. After some thought, I decided that it wasn't.

The question is: what happens when the counter turns over? To figure this out, let's do a thought experiment, using two-byte counters. Suppose that the priority list looks like Figure timestamp1, with the latest timestamp being 65535.

Timestamps before turnover (Figure timestamp1)

Block number    Timestamp
14              65533
22              65535
23              65000
9               65100

Let's suppose that the next reference is to block 9, with the counter turning over to 0. The list will now look like Figure timestamp2.

Timestamps immediately after turnover (Figure timestamp2)

Block number    Timestamp
14              65533
22              65535
23              65000
9               0

The next time that a new block has to be loaded, block 9 will be replaced, instead of block 23, which is actually the least recently used block.
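The thought experiment above can be replayed in a few lines of code. This is only a sketch under stated assumptions: the Slot structure and the FindVictim, Touch, and BlockEvictedAfterTurnover functions are hypothetical names invented for illustration, not part of the book's code, and std::uint16_t stands in for the two-byte counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer slot: which block it holds and when it was last used.
struct Slot {
    unsigned block;
    std::uint16_t stamp;  // two-byte counter, as in the thought experiment
};

// Return the index of the slot with the lowest (oldest-looking) timestamp.
std::size_t FindVictim(const std::vector<Slot>& slots) {
    assert(!slots.empty());
    std::size_t victim = 0;
    for (std::size_t i = 1; i < slots.size(); ++i)
        if (slots[i].stamp < slots[victim].stamp)
            victim = i;
    return victim;
}

// Record a reference to `block`; returns false if no slot holds it.
bool Touch(std::vector<Slot>& slots, std::uint16_t& counter, unsigned block) {
    ++counter;  // unsigned arithmetic wraps from 65535 to 0
    for (Slot& s : slots)
        if (s.block == block) {
            s.stamp = counter;
            return true;
        }
    return false;
}

// Replay Figures timestamp1 and timestamp2, then report which block
// would be evicted by the next load.
unsigned BlockEvictedAfterTurnover() {
    std::vector<Slot> slots = {{14, 65533}, {22, 65535}, {23, 65000}, {9, 65100}};
    std::uint16_t counter = 65535;
    Touch(slots, counter, 9);  // counter turns over to 0
    return slots[FindVictim(slots)].block;
}
```

Before the reference to block 9, FindVictim picks the slot holding block 23; after the turnover, BlockEvictedAfterTurnover returns 9, the most recently used block.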
What effect will this have on performance? At first glance, it doesn't appear that the maximum possible effect could be very large; after all, each turnover would only cause each buffer to be replaced incorrectly once. If we have 100 buffers (a typical number), the worst case would be that the "wrong" buffer is replaced 100 times out of 64K, which is approximately 0.15% of the time; with fewer buffers, the effect is even smaller. There is no danger to the data, since the buffers will be written out if they have changed. I suspected that the cost (under a 16-bit compiler) of handling an unsigned long counter instead of an unsigned short on every call to MakeBlockResident would probably be larger than the cost of this inefficient buffer use, but it didn't appear important either way.

Getting Our Clocks Cleaned

Although the preceding analysis was good enough to convince me not to worry about the counter turning over, unfortunately it wasn't good enough to convince the machine. What actually happened was that after a large number of block references, the program started to run very slowly. I was right that the data wasn't in danger, but performance suffered greatly. Why?

Let's go back to our example. Which buffer would be replaced when the next block needed to be read in? The one currently holding block 9, since it has the "lowest" priority. If the block being read in happened to be number 32, for example, that would leave us with the arrangement in Figure timestamp3.

Timestamps shortly after turnover (Figure timestamp3)

Block number    Timestamp
14              65533
22              65535
23              65000
32              1

The problem should now be obvious: the newest block still has the "lowest" priority! The reason that the program started to run very slowly after turnover was that the "fossil" timestamps on the old blocks were preventing them from being reused for more active blocks, so every block that had to be read in had to share buffers with the ones that had been read in after turnover.
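To see how the "fossil" timestamps pin the old blocks in place, the scenario can be continued past Figure timestamp3. The following is a minimal sketch, not the book's code: Slot, LoadBlock, and SlotsUsedAfterTurnover are hypothetical names, and the block numbers 33 and 34 are made up to extend the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer slot with a two-byte timestamp.
struct Slot {
    unsigned block;
    std::uint16_t stamp;
};

// Read `block` into the slot with the lowest timestamp and
// return the index of the slot that was reused.
std::size_t LoadBlock(std::vector<Slot>& slots, std::uint16_t& counter,
                      unsigned block) {
    assert(!slots.empty());
    ++counter;
    std::size_t victim = 0;
    for (std::size_t i = 1; i < slots.size(); ++i)
        if (slots[i].stamp < slots[victim].stamp)
            victim = i;
    slots[victim] = Slot{block, counter};
    return victim;
}

// Start from Figure timestamp2, load three new blocks, and
// record which slot each of them lands in.
std::vector<std::size_t> SlotsUsedAfterTurnover() {
    std::vector<Slot> slots = {{14, 65533}, {22, 65535}, {23, 65000}, {9, 0}};
    std::uint16_t counter = 0;  // the counter has just turned over
    std::vector<std::size_t> used;
    for (unsigned block : {32u, 33u, 34u})
        used.push_back(LoadBlock(slots, counter, block));
    return used;
}
```

All three loads land in the same slot (index 3, the one that held block 9), because the timestamps 1, 2, and 3 are still lower than the fossil values on the other three slots. In effect, the cache has shrunk to a single usable buffer.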
The solution was fairly simple; on turnover, I set all of the timestamps to 0 to give every buffer the same priority. This isn't really optimal, since it doesn't preserve the relative priority of the blocks already in memory; however, it has the virtue of simplicity, and does reduce the problem to the fairly insignificant level indicated by my first analysis of the turnover problem.

Speed Demon

Is this the end of our concern for the MakeBlockResident routine? Not at all; as befits its central role in the virtual memory mechanism, this routine has undergone quite a few transformations during the development process. One attempt to speed it up took the form of creating a FastResidenceCheck routine that would have the sole purpose of checking whether the old buffer number saved from the previous call to load the same block number was still good; if so, it would return that buffer number after resetting the timestamp. The theoretical advantage of splitting this function off from the more general case was that such a routine might be simple enough to be inlined effectively, which would remove the overhead of one function call from the time needed to make sure that the block in question was memory resident.

Unfortunately, this measure turned out to be ineffective; one reason was that the routines that called MakeBlockResident typically didn't reuse the object where the former buffer number was saved, but had to create another one every time they were called by their client routines. Therefore, the attempt to "remember" the previous buffer number wasn't successful in most cases.

While FastResidenceCheck was in use, it suffered from a bug caused by improperly initializing the old buffer number to 0, a valid buffer number. The result of this error was that when a block happened to be loaded into buffer number 0, operator-> didn't initialize the pointer to the ItemIndex array, since the new buffer number "matched" the old buffer number.
This problem would have been solved anyway by the new versions of operator->, which always initialize any pointers that might need to be updated; after the attempt to avoid apparently redundant initializations of these pointers caused a couple of bugs, I decided that discretion was the better part of optimization. As a result of this change, we no longer have to inform the caller of the number of the buffer that contains this block, so the reference argument to MakeBlockResident for that purpose has been removed.

Virtual Perfection

The acid test of this virtual memory mechanism is to run the program with only one block buffer; unsurprisingly, this test revealed a nasty bug. It seems that I had neglected to initialize the value of the EarliestStamp variable, used to keep track of the buffer with the earliest timestamp. When running with only one buffer, it was possible under some circumstances for a block to be replaced before it was ever used; when this happened, the timestamp on the buffer it had occupied was left set to its initial value of ULONG_MAX. This initial value was significant because the search for the earliest timestamp also starts out by setting the TimeStamp variable to ULONG_MAX, which should be greater than any timestamp found in the search. If there were no blocks in the list with a "real" timestamp, the conditional statement that set the EarliestStamp value in the search loop was never executed. As a result, the EarliestStamp variable was left in an uninitialized state, which caused a wild access to a nonexistent block buffer.

The fix was to initialize EarliestStamp to 0, so the first buffer will be selected under these circumstances; you can see this implemented in the current version of MakeBlockResident (Figure block.05).
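The search loop and its fix can be sketched as follows. Only the names EarliestStamp, TimeStamp, and ULONG_MAX come from the discussion above; the FindEarliest function and the use of a std::vector of timestamps are assumptions made to keep the example self-contained, and the real code in Figure block.05 may be organized differently.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Return the index of the buffer with the earliest timestamp.
// `stamps` holds one timestamp per buffer (hypothetical layout).
std::size_t FindEarliest(const std::vector<unsigned long>& stamps) {
    assert(!stamps.empty());
    std::size_t EarliestStamp = 0;        // the fix: default to the first buffer
    unsigned long TimeStamp = ULONG_MAX;  // lowest timestamp seen so far
    for (std::size_t i = 0; i < stamps.size(); ++i) {
        // This test never succeeds if every stamp is still ULONG_MAX,
        // which is why EarliestStamp needs a valid starting value.
        if (stamps[i] < TimeStamp) {
            TimeStamp = stamps[i];
            EarliestStamp = i;
        }
    }
    return EarliestStamp;
}
```

With a single buffer whose timestamp is still ULONG_MAX, the comparison inside the loop is never true, and the function returns index 0 only because of the initialization on the first line of the search.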
The QuantumFile::MakeBlockResident function

The MakeBlockResident function (from quantum\block.cpp) (Figure block.05)
codelist/block.05

We've already covered most of the tricky parts of this function, but let's go over it one more time just to be on the safe side. We begin by incrementing the counter that serves as a timestamp; if it turns over to 0, we set all of the timestamps to 0 to prevent the newest block from being replaced continually. 36 Then we look up the buffer number for the quantum in question, via the FindBuffer global function. If the quantum is found in a buffer, then we merely reset the timestamp on that buffer to indicate that it is the most recently used, and return a pointer to that buffer for use by the operator-> function that called MakeBlockResident.

However, in the event that the quantum we want isn't yet in a buffer, then we have to figure out which buffer it should be read into. We do this by examining the timestamp for each buffer, and selecting the buffer with the earliest timestamp. Once we have found the buffer that we are going to reuse, we check whether it has been modified; if so, we write it back to the disk. Then we set its block number to the block number of the new quantum that is being read in, set its modified flag to FALSE, set the timestamp on the buffer to the current timestamp, and read the quantum into the buffer. Finally, we return a pointer to the buffer for use by the calling operator-> function.
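As a rough illustration of the steps just described, here is a self-contained sketch. It is not the code in Figure block.05: the Buffer and BlockCache types, the single int of payload per block, the std::map standing in for the disk, the -1 marker for an empty buffer, and the two-byte counter are all assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical buffer: one int of payload stands in for a whole block.
struct Buffer {
    long blockNumber = -1;  // -1 marks an empty buffer (an assumption)
    std::uint16_t timeStamp = 0;
    bool modified = false;
    int data = 0;
};

class BlockCache {
public:
    explicit BlockCache(std::size_t bufferCount) : buffers_(bufferCount) {
        assert(bufferCount > 0);
    }

    // Return a pointer to the buffer holding `block`, loading it if needed.
    Buffer* MakeBlockResident(long block) {
        if (++counter_ == 0)  // counter turned over: flatten all priorities
            for (Buffer& b : buffers_) b.timeStamp = 0;

        for (Buffer& b : buffers_)  // stands in for the FindBuffer lookup
            if (b.blockNumber == block) {
                b.timeStamp = counter_;  // now the most recently used
                return &b;
            }

        std::size_t earliest = 0;  // defaults to the first buffer
        for (std::size_t i = 1; i < buffers_.size(); ++i)
            if (buffers_[i].timeStamp < buffers_[earliest].timeStamp)
                earliest = i;

        Buffer& victim = buffers_[earliest];
        if (victim.modified)  // write back before the buffer is reused
            disk_[victim.blockNumber] = victim.data;

        victim.blockNumber = block;
        victim.modified = false;
        victim.timeStamp = counter_;
        victim.data = disk_[block];  // "read"; unknown blocks read as 0
        ++loads_;
        return &victim;
    }

    std::size_t loads() const { return loads_; }   // number of "disk reads"
    int onDisk(long block) { return disk_[block]; }  // peek at the "disk"

private:
    std::vector<Buffer> buffers_;
    std::map<long, int> disk_;
    std::uint16_t counter_ = 0;
    std::size_t loads_ = 0;
};
```

A caller would use the returned pointer much as operator-> does in the text: modify the data, set the modified flag, and rely on a later eviction to write the block back. With two buffers, for example, a modified block 1 stays in memory until two other blocks have displaced it, at which point its contents appear on the simulated disk.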