O'Reilly Network For Information About's Book part 125

I resolved to improve the speed of this conversion (from 1300 microseconds per 12-character string) as much as was practical before proposing further hardware upgrades to the system. The first problem was to determine which operation was consuming the most CPU time. Examination of the code (Figure ascrad1.cpp) disclosed that the toupper function was being called for every character in the string, every time the character was examined. This seemed an obvious place to start.

First version of ascii_to_Radix40 routine (from intro\ascrad1.cpp) (Figure ascrad1.cpp)

codelist/ascrad1.00

The purpose of writing the loop in this way was to avoid making changes to the input string; after all, it was an input variable. However, a more efficient way to leave the input string unaltered was to make a copy of it and convert the copy to uppercase, as indicated in Figure ascrad2.cpp. This reduced the time to 650 microseconds per 12-character string, but I suspected that more savings were possible.

Second version of ascii_to_Radix40 routine (from intro\ascrad2.cpp) (Figure ascrad2.cpp)

codelist/ascrad2.00
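Neither of the listings referenced above survives in this excerpt (only the codelist placeholders remain), so the following is a minimal sketch of the second version's approach, assuming a 40-character legal set and three character codes packed per 16-bit word. The function name, signature, and the exact ordering of the legal characters are illustrative assumptions, not the book's actual code.

#include <cctype>
#include <cstring>

// Assumed 40-character Radix40 legal set (blank, punctuation, digits,
// uppercase letters); the book defines the real ordering elsewhere.
static const char legal_chars[] = " ,-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Sketch of the second version: uppercase a copy of the input once,
// rather than calling toupper on every character at every comparison.
void ascii_to_radix40_v2(unsigned short *result, const char *input,
                         size_t max_chars)
{
    size_t len = std::strlen(input);
    char *copy = new char[len + 1];       // leave the input string unaltered
    for (size_t i = 0; i <= len; ++i)
        copy[i] = (char)std::toupper((unsigned char)input[i]);

    size_t words = (max_chars + 2) / 3;   // three Radix40 codes per 16-bit word
    for (size_t w = 0; w < words; ++w)    // clear the result array
        result[w] = 0;

    for (size_t i = 0; i < words * 3; ++i) {
        int code = 0;                     // padding and illegal chars -> blank
        if (i < len && i < max_chars) {
            const char *hit = std::strchr(legal_chars, copy[i]);
            if (hit)                      // linear search: ~20 probes on average
                code = (int)(hit - legal_chars);
        }
        result[i / 3] = (unsigned short)(result[i / 3] * 40 + code);
    }
    delete[] copy;
}

The point of the change is simply that toupper now runs once per input character, instead of once per comparison inside the linear search.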
Another possible area of improvement was to reduce the use of dynamic string allocation to get storage for the copy of the string to be converted to uppercase. In my application, most of the strings would be less than 100 characters, so I decided to allocate room for a string of 99 characters (plus the required null at the end) on the stack and to call the dynamic allocation routine only if the string was larger than that. However, this change didn't affect the time significantly, so I removed it.

I couldn't see any obvious way to increase the speed of this routine further, until I noticed that if the data had about the same number of occurrences of each character, the loop to figure out the code for a single character would be executed an average of 20 times per character! Could this be dispensed with? Yes, by allocating 256 bytes for a table of conversion values.[10] Then I could index into the table rather than searching the string of legal values (see Figure ascrad4.cpp). Timing this version revealed an impressive improvement: 93 microseconds per 12-character string. This final version is 14 times the speed of the original.[11]

Fourth version of ascii_to_Radix40 routine (from intro\ascrad4.cpp) (Figure ascrad4.cpp)

codelist/ascrad4.00
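The fourth version's listing is likewise absent here; the sketch below shows the table-lookup idea under the same assumptions as before. A 256-entry table, built once, maps every possible character value straight to its Radix40 code, so both the linear search and the uppercase copy disappear. Folding lowercase letters into the table is one way to make toupper unnecessary, though whether the book's table does exactly this is an assumption; footnote 10 locates the real table in Figure radix40.00, and footnote 11 accounts for the memset.

#include <cstring>

// Assumed 40-character legal set, as in the previous sketch; the book's
// actual conversion table is defined in Figure radix40.00, which this
// excerpt does not include.
static const char legal_chars[] = " ,-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

static unsigned char lookup_table[256];   // the 256 bytes of conversion values

// Build the table once: every character value maps directly to its
// Radix40 code, with illegal characters mapping to blank (code 0).
static void build_lookup_table(void)
{
    std::memset(lookup_table, 0, sizeof lookup_table);
    for (int i = 0; legal_chars[i] != '\0'; ++i) {
        lookup_table[(unsigned char)legal_chars[i]] = (unsigned char)i;
        if (legal_chars[i] >= 'A' && legal_chars[i] <= 'Z')   // fold in lowercase
            lookup_table[(unsigned char)(legal_chars[i] - 'A' + 'a')] =
                (unsigned char)i;
    }
}

// Sketch of the fourth version: one table index per character replaces
// the average of 20 probes through the legal-character string.
void ascii_to_radix40_v4(unsigned short *result, const char *input,
                         size_t max_chars)
{
    size_t words = (max_chars + 2) / 3;
    std::memset(result, 0, words * sizeof *result);  // footnote 11: memset, not a loop
    size_t len = std::strlen(input);
    for (size_t i = 0; i < words * 3; ++i) {
        int code = (i < len && i < max_chars)
                       ? lookup_table[(unsigned char)input[i]] : 0;
        result[i / 3] = (unsigned short)(result[i / 3] * 40 + code);
    }
}

Indexing the table is a constant-time operation per character, which is consistent with the measured drop from 650 to 93 microseconds per 12-character string.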
The use of a profiler would have reduced the effort needed to determine the major causes of the inefficiency. Even without such an aid, attention to which lines were being executed most frequently enabled me to remove the major bottlenecks in the conversion to Radix40 representation. It is no longer a significant part of the time needed to access a record.

Summary

In this chapter, I have given some guidelines and examples of how to determine whether optimization is required and how to apply your optimization effort effectively. In the next chapter we will start to examine the algorithms and other solutions that you can apply once you have determined where your program needs improvement.

Footnotes

1. If you don't have the time to read this book in its entirety, you can turn to Figures ioopt-processoropt in Chapter artopt.htm to find the algorithms best suited to your problem.

2. Actually, you will never be ahead; five minutes saved 23 years from now is not as valuable as five minutes spent now. This is analogous to the lottery in which you win a million dollars, but the prize is paid as one dollar a year for a million years!

3. This is especially true on a multiuser system.

4. My previous computer was also a Radio Shack computer, but it had only a cassette recorder/player for "mass storage"!

5. Microsoft is the most prominent example at the moment; the resource consumption of Windows NT™ is still a matter of concern to many programmers, even though the machines that these programmers own have increased in power at a tremendous rate.

6. This example is actually quite conservative. The program that took one hour to run on a timesharing terminal would probably take much less than that on a current desktop computer; we are also neglecting the time value of the savings, as noted above.

7. Of course, if your old machine is more than two or three years old, you might want to replace it anyway, just to get the benefit of the improved technology available today.

8. Perhaps this could be referred to as optimizing the design.

9. This is worse than it may sound; the actual hardware on which the system runs is much slower than the i386 development machine I was using at the time.

10. This table of conversion values can be found in Figure radix40.00.

11. I also changed the method of clearing the result array to use memset rather than a loop.

A Supermarket Price Lookup System

Introduction

In this chapter we will use a supermarket price lookup system to illustrate how to save storage by using a restricted character set and how to speed up access to records by employing hash coding (or "scatter storage") and caching (or keeping copies of recently accessed records in memory). We will look items up by their UPC (Universal Product Code), which is printed in the form of a "bar code" on virtually all supermarket items other than fresh produce. We will emphasize rapid retrieval of prices, as maintenance of such files is usually done after hours, when speed is less significant.

Algorithms Discussed

Hash Coding, Radix40 Data Representation, BCD Data Representation, Caching

Up the Down Staircase

To begin, let us assume that we can describe each item by the information in the structure definition in Figure initstruct.

Item information (Figure initstruct)

typedef struct {
    char  upc[10];
    char  description[21];
    float price;
} ItemRecord;

One solution to our price-retrieval problem would be to create a file with one record for each item, sorted by UPC code. This would allow us to use a binary search to locate the price of a particular item. How long would it take to find a record in such a file containing 10,000 items?

To answer this question, we have to analyze the algorithm of the binary search in some detail. We start the search by looking at the middle record in the file. If the key we are looking for is greater than the key of the middle record, we know that the record we are looking for must be in the second half of the file (if it is in the file at all). Likewise, if our key is less than the one in the middle record, the record we are looking for must be in the first half of the file (again, if it is there at all). Once we have decided which half of the file to examine next, we look at the middle record in that half and proceed exactly as we did previously. Eventually, either we will find the record we are looking for or we will discover that we can no longer divide the segment we are looking at, as it has only one record (in which case the record we are looking for is not there).

Probably the easiest way to figure out the average number of accesses required to find a record in the file is to start from the other end of the problem: how many records can be found with one access? Obviously, only one, the middle record. With a second access, we could find either the record in the middle of the first half of the file or the record in the middle of the second half. The next access adds another four records, in the centers of the first, second, third, and fourth quarters of the file. In other words, each additional access doubles the number of newly accessible records.

Binary search statistics (Figure binary.search)

Number of      Newly accessible      Total accesses for      Total records
accesses       records               these records           accessible

    1      x          1        =              1                        1
    2      x          2        =              4                        3
    3      x          4        =             12                        7
    4      x          8        =             32                       15
    5      x         16        =             80                       31
    6      x         32        =            192                       63
    7      x         64        =            448                      127
    8      x        128        =           1024                      255
    9      x        256        =           2304                      511
   10      x        512        =           5120                     1023
   11      x       1024        =          11264                     2047
   12      x       2048        =          24576                     4095
   13      x       4096        =          53248                     8191
   14      x       1809        =          25326                    10000
               ______                    ______
                10000                    123631

Average number of accesses per record = 123631 / 10000 = 12.3631 accesses/record

Figure binary.search shows the calculation of the average number of accesses for a 10,000-item file. Notice that each line represents twice the number of records as the one above, with the exception of line 14. The entry for that line (1809) is the number of 14-access records needed to reach the capacity of our 10,000-record file. As you can see, the average number of accesses is approximately 12.4 per record. Therefore, at a typical hard disk speed of 10 milliseconds per access, we would need almost 125 milliseconds to look up an average record using a binary search. While this lookup time might not seem excessive, remember that a number of checkout terminals would probably be attempting to access the database at the same time, and the waiting time could become noticeable. We might also be concerned about the amount of wear on the disk mechanism that would result from this approach.
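As a concrete illustration of the search procedure described above, here is a sketch of a binary search over an in-memory array of ItemRecords sorted by UPC. The book's version would probe a disk file rather than an array, each probe costing one of the roughly 12.4 disk accesses counted in Figure binary.search; the function name and the memcmp comparison are assumptions for this sketch.

#include <cstring>

typedef struct {
    char  upc[10];
    char  description[21];
    float price;
} ItemRecord;

// Binary search over records sorted by UPC; the caller's upc argument
// must supply all 10 bytes of the key, since the field is not
// necessarily null-terminated.
const ItemRecord *find_item(const ItemRecord *items, int count, const char *upc)
{
    int low = 0;
    int high = count - 1;
    while (low <= high) {
        int middle = low + (high - low) / 2;   // look at the middle record
        int cmp = std::memcmp(upc, items[middle].upc, sizeof items[middle].upc);
        if (cmp == 0)
            return &items[middle];             // found it
        else if (cmp > 0)
            low = middle + 1;                  // second half, if it is there at all
        else
            high = middle - 1;                 // first half, if it is there at all
    }
    return 0;                                  // segment exhausted: not in the file
}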

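The totals in Figure binary.search are easy to check mechanically. This short program (not from the book) reproduces the doubling pattern, caps the last level at the 1,809 records needed to reach 10,000, and prints the 123,631 total accesses, or about 12.36 per record.

#include <cstdio>

int main()
{
    const long file_size = 10000;      // records in the file
    long reachable = 0;                // records found so far
    long total_accesses = 0;           // accesses to find every record once
    long at_this_depth = 1;            // 1 record at depth 1, doubling each level

    for (int depth = 1; reachable < file_size; ++depth) {
        long records = at_this_depth;
        if (reachable + records > file_size)
            records = file_size - reachable;   // partial last level: 1809 at depth 14
        reachable += records;
        total_accesses += (long)depth * records;
        at_this_depth *= 2;
    }

    // Prints: total accesses = 123631, average = 12.3631
    std::printf("total accesses = %ld, average = %.4f\n",
                total_accesses, (double)total_accesses / file_size);
    return 0;
}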