HandBooks Professional Java-C-Scrip-SQL part 153 pot

case of the big pointer array itself. The problem is that a big pointer array can have a variable number of elements in it, so it would be all too easy for us to accidentally step off the end of the big pointer array, with potentially disastrous results. To prevent such a possibility, I have created the data type called AccessVector. The purpose of this data type is to combine the safety features of a normal SVector with the ability to specify the address where the data for the SVector should start, rather than relying on the run-time memory allocation library to assign the address. Because this data type is designed to refer to an existing area of memory, the copy constructor is defaulted, as are the assignment operator and the destructor, and there is no SetSize function such as exists for "regular" SVectors. This data type allows us to "map" a predefined structure onto an existing set of data, which is exactly what we need to access a big pointer array safely. As this suggests, the "big array header" is an AccessVector variable, which we can use just as though it were a normal SVector. [47]

The LittlePointerBlock class

The interface for LittlePointerBlock is shown in Figure blocki.10.

The interface for the LittlePointerBlock class (from quantum\blocki.h) (Figure blocki.10) codelist/blocki.10

There's nothing in this class that isn't exactly analogous to the corresponding functions in the big pointer array class. Therefore, I won't waste either your time or mine by repeating the analysis of the big pointer array class here. Instead, let's move along to another class that is somewhat more interesting, if only because it seems to have no purpose for existing.

The LeafBlock class

The interface for LeafBlock is shown in Figure blocki.11.

The interface for the LeafBlock class (from quantum\blocki.h) (Figure blocki.11) codelist/blocki.11

This is a real oddity: a class that defines no new member functions or member variables. Of what value could such a class possibly be?
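To make the question concrete, here is a minimal sketch of what a class of this kind can look like. It is not the code from quantum\blocki.h: the base class name QuantumBlock, its single member, and the body of the handle class are assumptions made only for illustration.

```cpp
// Hypothetical stand-in for the base block class; the real interface
// lives in quantum\blocki.h and is not reproduced here.
class QuantumBlock {
public:
    explicit QuantumBlock(int blockNumber) : m_BlockNumber(blockNumber) {}
    int BlockNumber() const { return m_BlockNumber; }

private:
    int m_BlockNumber;
};

// A class that adds no member functions and no member variables.
// All it contributes is a distinct type name.
class LeafBlock : public QuantumBlock {
public:
    using QuantumBlock::QuantumBlock;  // reuse the base constructor as-is
};

// Hypothetical handle class that is tied to that distinct type.
class LeafBlockPtr {
public:
    explicit LeafBlockPtr(LeafBlock* block) : m_Block(block) {}
    LeafBlock* operator->() const { return m_Block; }

private:
    LeafBlock* m_Block;  // not owned by the handle in this sketch
};

// On mainstream compilers the empty derived class adds no storage.
static_assert(sizeof(LeafBlock) == sizeof(QuantumBlock),
              "an empty derived class should not add storage");
```

In this sketch, a LeafBlockPtr built from the address of a LeafBlock forwards calls such as BlockNumber() to the underlying block, while the type system still distinguishes a leaf block from any other block.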
The answer is that it provides a "hook" for attaching a handle class, namely LeafBlockPtr. This allows us to use a leaf block just as we would any of the other quantum classes. If we did not have this class, we could always create another class called QuantumBlockPtr, which would have much the same effect as creating this class. So why did I create this class in the first place?

The answer is that originally it did have some member functions, but they eventually turned out to be superfluous. At this point in the development of this project, it would probably be unwise for me to go through the code to root out all of the references to this class. And after all, a class that defines no new member functions or member variables certainly can't take up too much extra space; in fact, using this class should have absolutely no effect on the size of the program or its execution time, so I think I'll leave it just as it is, at least for now.

The FreeSpaceArray class

Finally, we're finished with the block classes. Our next target of opportunity will be the classes that maintain and provide access to the free space list. We'll start with FreeSpaceArray, whose interface is shown in Figure newquant.22.

The interface for the FreeSpaceArray class (from quantum\newquant.h) (Figure newquant.22) codelist/newquant.22

The Normal Constructor for FreeSpaceArray

The first function we'll look at is the normal constructor for FreeSpaceArray, whose code is shown in Figure newquant.23.

The normal constructor for FreeSpaceArray (from quantum\newquant.cpp) (Figure newquant.23) codelist/newquant.23

As you can see, all this function does (as is common in the case of constructors) is to initialize a number of member variables. Most of these initializations are fairly straightforward, but we should go over them briefly. First, we set the current lowest free block number to 0, because we have no idea what blocks might be free in the free space list, as we have not looked through it yet.
Then we get the free space list count, the number of blocks in the free space list, and the quantum number adjustment from the quantum file object; this last value is used when we need to convert between block numbers and quantum numbers. Next, we resize the block pointer SVector so it can hold block pointers for all of the free space list blocks in the quantum file. Finally, we assign free space block pointers to all the elements of that SVector. Now we are ready to access the free space list.

The FreeSpaceArray::Get Function

The next function we'll look at is FreeSpaceArray::Get, whose code is shown in Figure newquant.24.

The FreeSpaceArray::Get function (from quantum\newquant.cpp) (Figure newquant.24) codelist/newquant.24

The operation of this function is fairly straightforward. First, we check whether we're trying to access something that is off the end of the free space list. If so, we return a value that indicates that there is no free space in the quantum for which information was requested. However, if the input argument is valid, we calculate which block and which element in that block contains the information we need. We then call the Get function of the block pointer to retrieve that element. Finally, we return the result to the caller.

The FreeSpaceArray::Set Function

The next function we'll look at is FreeSpaceArray::Set, whose code is shown in Figure newquant.25.

The FreeSpaceArray::Set function (from quantum\newquant.cpp) (Figure newquant.25) codelist/newquant.25

This function is very similar to its counterpart, the Get function. However, there is one difference that we should look at: if the entry that we have just found in the array indicates that its quantum is completely empty (i.e., has the maximum available space) and this entry has a lower index than the current value of the "lowest free block" variable, then we reset the "lowest free block" variable to indicate that this is the lowest free block.

Free the Quantum 16K!
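The Get and Set logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from quantum\newquant.cpp: the entry layout, the constants kNoSpace and kAvailableQuantum, the tiny block size, the in-memory std::vector storage (standing in for the block pointers), and the choice to ignore an out-of-range Set are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout of one free space list entry.
struct FreeSpaceEntry {
    unsigned char freeSpaceCode;  // approximate free space in the quantum
    unsigned short objectNumber;  // owning main object (0 = none, assumed)
};

const unsigned char kNoSpace = 0;             // assumed "no free space" code
const unsigned char kAvailableQuantum = 255;  // assumed "completely empty" code
const std::size_t kEntriesPerBlock = 4;       // tiny value, for illustration only

class FreeSpaceArraySketch {
public:
    // lowestFreeHint defaults to 0, matching the constructor described earlier.
    explicit FreeSpaceArraySketch(std::size_t blockCount,
                                  std::size_t lowestFreeHint = 0)
        : m_Blocks(blockCount,
                   std::vector<FreeSpaceEntry>(kEntriesPerBlock,
                                               FreeSpaceEntry{kNoSpace, 0})),
          m_EntryCount(blockCount * kEntriesPerBlock),
          m_LowestFreeBlock(lowestFreeHint) {}

    FreeSpaceEntry Get(std::size_t index) const {
        // Off the end of the list: report that there is no free space.
        if (index >= m_EntryCount) return FreeSpaceEntry{kNoSpace, 0};
        // Otherwise pick the block, then the element within that block.
        return m_Blocks[index / kEntriesPerBlock][index % kEntriesPerBlock];
    }

    void Set(std::size_t index, FreeSpaceEntry entry) {
        if (index >= m_EntryCount) return;  // assumption: ignore bad indexes
        m_Blocks[index / kEntriesPerBlock][index % kEntriesPerBlock] = entry;
        // A completely empty quantum below the current mark lowers the mark.
        if (entry.freeSpaceCode == kAvailableQuantum && index < m_LowestFreeBlock)
            m_LowestFreeBlock = index;
    }

    std::size_t LowestFreeBlock() const { return m_LowestFreeBlock; }

private:
    std::vector<std::vector<FreeSpaceEntry> > m_Blocks;
    std::size_t m_EntryCount;
    std::size_t m_LowestFreeBlock;
};
```

The only behavior that differs between the two functions in this sketch is the last step of Set, which moves the "lowest free block" mark downward when a completely empty quantum is recorded at a lower index.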
This is an optimization whose purpose is to avoid searching the entire free list every time we want to find a block that isn't committed to any particular main object. In both the previous C implementation and the current C++ one, we first check the last quantum to which we added an item; if that has enough space to add the new item, we use it. [48]

In the old implementation, the free space list contained only a "free space code", indicating how much space was available in the quantum but not which object it belonged to. Therefore, when we wanted to find a quantum belonging to the current object that had enough space to store a new item, we couldn't use the free space list directly. As a substitute, the C code went through the current little pointer array, looking up each quantum referenced in that array in the free space list; if one of them had enough space, we used it. However, this was quite inefficient; since each quantum can hold dozens or hundreds of items, this algorithm might require us to look at the same quantum that many times! [49]

Although this wasn't too important in the old implementation, where the free space list was held in memory, it could cause serious delays in the current one if we used the standard virtual memory services to access the free space list. The free space list in the old program took up 16K, one byte for each quantum in the maximum quantum file size allowed. In the new implementation, using 16K blocks of virtual memory, that same free space list would occupy only one block, so searching such a list would not require any extra disk accesses. However, the current implementation can handle much larger quantum files that might contain tens or hundreds of thousands of blocks, with correspondingly larger free space lists.
Using the old method, searching the free space list from beginning to end could take quite a while, because the search routine would not access the list in a linear manner and therefore might require extra disk accesses to access the same free space list entries several times. At the very least, the free space blocks would be artificially promoted to higher levels of activity and would therefore tend to crowd other quanta out of the buffers. Even if the free space blocks were already resident, virtual memory accesses are considerably slower than "regular" accesses; it would be much faster to scan the free space list sequentially by quantum number than randomly according to the entries in the little pointer array. Of course, we could make a list of which quanta we had already examined and skip the check in those cases, but I decided to simplify matters by another method.

The FreeSpaceArray::FindSpaceForItem Function

In the current implementation, the free space list contains not just the free space for each quantum but also which object it belongs to (if any). [50] This lets us write a FreeSpaceArray::FindSpaceForItem routine that finds a place to store a new item by scanning each block of the free list sequentially in memory, rather than using a virtual memory access to retrieve each free space entry; we stop when we find a quantum that belongs to the current object and has enough free space left to store the item (Figure newquant.26). [51]

The FreeSpaceArray::FindSpaceForItem function (from quantum\newquant.cpp) (Figure newquant.26) codelist/newquant.26

However, if there isn't a quantum in the free space list that belongs to our desired main object and also has enough space left to add the new item, then we have to start a new quantum; how do we decide which one to use? One way is to keep track of the first free space block we find in our search and use it if we can't find a suitable block already belonging to our object.
However, I want to bias the storage mechanism to use blocks as close as possible to the beginning of the file, which should reduce head motion, as well as making it possible to shrink the file's allocated size if the amount of data stored in it decreases. My solution is to take a free block whenever it appears; if that happens to be before a suitable block belonging to the current object, so be it. This appears to be a self-limiting problem, since the next time we want to add to the same object, the newly assigned block will be employed if it has enough free space and is the first suitable block in the list.

This approach solves another problem as well, which is how we determine when to stop scanning the free space list in the first place. Of course, we could also maintain the block number of the last occupied block in the file and stop there. However, I felt this was unnecessary, since stopping at the first free block provides a natural shortcut, without contributing any obvious problems of its own. However, as with many design decisions, my analysis could be flawed: there's a possibility that using this algorithm with many additions and deletions could reduce the space efficiency of the file, although I haven't seen such an effect in my testing.

This mechanism did not mature without some growing pains. For example, the one-byte FreeSpaceCode code, used to indicate the approximate space available in a quantum, is calculated by dividing the size by a constant (32 in the case of 16K blocks) and discarding the remainder. As a result, the size code calculated for items ...
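The scan described in the FindSpaceForItem section, together with the free space code calculation, can be sketched as follows. This is a simplified illustration, not the routine from quantum\newquant.cpp: the entry layout, the kNoObject and kNotFound markers, the flat std::vector (the real list is spread across blocks), and the choice to pass the needed space as an already computed code are all assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout of one free space list entry.
struct FreeSpaceEntry {
    unsigned char freeSpaceCode;  // approximate free space in the quantum
    unsigned short objectNumber;  // owning main object, or kNoObject
};

const unsigned short kNoObject = 0;  // assumed marker for an uncommitted quantum
const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Size divided by 32 (the constant given for 16K blocks), remainder discarded.
// Storing the result in one byte, as the text describes, is not modeled here.
inline unsigned FreeSpaceCodeFor(unsigned sizeInBytes) {
    return sizeInBytes / 32;
}

// Scan the list in order and stop at the first entry that is either
//   (a) not committed to any object, or
//   (b) owned by objectNumber with at least neededCode of free space.
// Taking case (a) as soon as it appears biases storage toward the start
// of the file and also bounds the length of the scan.
std::size_t FindSpaceForItem(const std::vector<FreeSpaceEntry>& list,
                             unsigned short objectNumber,
                             unsigned char neededCode) {
    for (std::size_t i = 0; i < list.size(); ++i) {
        const FreeSpaceEntry& entry = list[i];
        if (entry.objectNumber == kNoObject) return i;
        if (entry.objectNumber == objectNumber &&
            entry.freeSpaceCode >= neededCode) {
            return i;
        }
    }
    return kNotFound;  // nothing suitable in this list
}
```

With a list in which entry 2 belongs to object 1 and has a code of 5, entry 3 is uncommitted, and entry 4 belongs to object 1 with a code of 200, a request for object 1 needing a code of 4 stops at entry 2, while a request needing a code of 100 stops at the free entry 3 rather than continuing to entry 4. That second case is the trade-off accepted above in exchange for keeping data near the start of the file.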