Programming: Wrox Professional Xcode 3 for Mac OS, part 67

19 Performance Analysis

WHAT'S IN THIS CHAPTER?

➤ Performance analysis best practices
➤ Analyzing code-level performance with Shark
➤ Analyzing everything else with Instruments
➤ Solving common performance problems

Whether it’s a rollercoaster ride or a visit to the dentist, a lot of times you just want things to go faster, and the applications you build with Xcode are no exception. Speeding up an application entails finding and eliminating inefficient code, making better use of the computer’s resources, or just thinking differently. Neither Xcode nor I can help you with the last one. Sometimes the most dramatic improvements in a program are those that come from completely rethinking the problem. There are, sadly, no developer tools to simulate your creativity. The more mundane approaches are to reduce the amount of waste in your application and to harness more of the computer system’s power. To do that you first need to know where your application is spending its time and learn what resources it’s using. In simple terms, to make your application run faster you first need to know what’s slowing it down. This is a process known as performance analysis, and for that, Xcode offers a number of very powerful tools.

Recent versions of Xcode consolidate a wide variety of performance analysis tools that, in prior versions, were individual applications. In Xcode 3.2, the former rag-tag band of developer gizmos has been gathered together under a single umbrella called Instruments. I do not exaggerate when I say that Instruments is the single greatest advancement in Macintosh debugging ever, but it hasn’t replaced every tool. You’ll still make regular use of the debugger (Chapter 18) and you’ll also want to use Shark, a dedicated code-level analysis tool. This chapter discusses some of the principles of performance analysis, Shark, and Instruments.
The manual for Instruments is excellent (just search the Xcode documentation for “Instruments”), so this chapter won’t go into too much detail about every instrument and display option. Instead, I show you how to use Instruments in an Xcode-centric workflow and walk you through the solution to three very common performance problems.

PERFORMANCE BASICS

The term performance means different things to different people. This chapter concentrates on the simplest and most direct kind of performance enhancement: getting your application to run faster. In general, this means getting your program to work more efficiently, and therefore finish sooner.

In the real world, this is too narrow a definition, and performance solutions are sometimes counterintuitive. A graphics application that draws a placeholder image, and then starts a background thread to calculate successively higher resolution versions of the same image, might actually end up drawing that image two or three times. Measured both by the raw computations expended and by clock time, the application is doing much more work and is actually slower than if it had simply drawn the final image once. From the user’s perspective, however, the perceived performance of the application is better. It appears to be more responsive and usable, and therefore provides a superior experience. These are the solutions that require you to think “different,” not just think “faster.”

When considering the performance of your application, keep the following principles in mind:

➤ Optimize your code only when it is actually experiencing performance problems.
➤ People are notoriously bad at estimating the performance of a complex system.
➤ Combining the first two principles, you probably have no idea what or where your performance problems are.
➤ Set performance goals for your program and record benchmarks before starting any performance enhancements. Continue making, saving, and comparing benchmark results throughout your development.

If It Isn’t Broken . . .

The first principle is key; don’t try to optimize things that don’t need optimizing. Write your application in a straightforward, well structured, easily maintained, and robust manner. This usually means writing your application using the simplest solution, employing high-level routines, objects, and abstractions. Given the choice between a neatly encapsulated class and a collection of C functions that do the same thing, use the class.

Now run your application and test it. If, and only if, its performance is unacceptable should you even consider trying to speed it up. Your current solution might be taking a simple 4-byte integer, turning it into a string object, stuffing that into a collection, serializing that collection into a data stream, and then pushing the whole mess into a pipe, whereupon the entire process is promptly reversed. You might recoil in horror at the absurd inefficiency used to pass a single integer value, but computers today are mind-numbingly fast. All that work, over the lifetime of your application, might ultimately consume less CPU time than it takes you to blink.

In addition, optimization makes more work for you now and in the future. Optimized code is notoriously more fragile and difficult to maintain. This means that future revisions to your project will take longer and you run a higher risk of introducing bugs. So remember, don’t fix what ain’t broke. Every optimization incurs a cost, both today and in the future.

Embrace Your Own Ignorance

Modern computer languages, libraries, and hardware are both complex and subtle.
It is very difficult to guess, with any accuracy, how much time a given piece of code will take to execute. Gone are the days when you could look at some machine code, count the instructions, and then tell how long it would take the program to run. Sure, sometimes it’s easy. Sorting a hundred million of anything is going to take some time. Those kinds of hot spots are the exception, not the rule. Most of the time the performance bottlenecks in your application will occur in completely unexpected places.

Another common fallacy is to assume you can write something that’s faster than the system. Always start by using the system tools, collections, and math classes that you have available to you. Only if those prove to be a burden should you consider abandoning them. You might think you can write a better sorting routine, but you probably can’t. Many of the algorithms provided by Apple and the open source software community have been finely tuned and tweaked by experienced engineers who are a lot smarter than you or me.

Combining the first two principles, you can distill this simple rule: Don’t try to guess where your performance problems are. If you have a performance problem, let the performance analysis tools find it for you.

Improve Performance Systematically

The last bit of advice is to have a goal and measure your progress. Know what the performance goals of your application should be before you even begin development. Your goals can be completely informal. You may simply want your application to feel responsive. If it does, then you’re done. Most of the time you will not need to do any performance analysis or optimization at all.

If your application’s performance seems sub-optimal, begin by measuring its performance before you try to fix anything. If you accept the earlier principles, this is a prerequisite: you must measure the performance of your application before you can begin to know where its problems are.
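The advice to measure first can be made concrete with a very small timing harness. What follows is only a sketch under stated assumptions: it uses the standard C clock() function, which reports CPU time rather than elapsed time, and the names benchmark_cpu_seconds, percent_improvement, and sample_workload are hypothetical, not part of Xcode, Shark, or Instruments.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature of the code under test (hypothetical convention for this sketch). */
typedef void (*workload_fn)(void);

/* Hypothetical workload: stands in for the code you actually want to measure. */
void sample_workload(void)
{
    volatile unsigned long sum = 0;   /* volatile keeps the loop from being optimized away */
    for (unsigned long i = 0; i < 100000UL; i++) {
        sum += i;
    }
}

/* Returns the average CPU seconds consumed per call over `runs` calls. */
double benchmark_cpu_seconds(workload_fn work, int runs)
{
    assert(work != NULL && runs > 0);
    clock_t start = clock();
    for (int i = 0; i < runs; i++) {
        work();
    }
    clock_t end = clock();
    return ((double)(end - start) / CLOCKS_PER_SEC) / runs;
}

/* Percentage improvement of `after` relative to the `before` baseline;
   a positive result means the code got faster. */
double percent_improvement(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

Record the value returned by benchmark_cpu_seconds before changing anything, keep it, and pass it as the baseline to percent_improvement after each round of changes. Averaging over several runs smooths out some of the noise, but a harness like this only tells you whether the code got faster; it takes a profiler to tell you where the time goes.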
Even if you already know, you should still benchmark your application first. You can’t tell how much (or if any) progress has been made if you don’t know where you started from.

Continue this process throughout the lifetime of your application. Applications, like people, tend to get heavier and slower over time. Keep the benchmarks you took when you first began, and again later after optimization. Occasionally compare them against the performance of your application as you continue development. It’s just like leaving a set of scales in the bathroom.

PREPARING FOR ANALYSIS

Most of the performance analysis tools, especially ones that analyze the code of your application, need debug symbol information. They also need to analyze the code you plan to release, not the code you are debugging. If you’ve been building and testing your application using the debugger (covered in Chapter 18), you probably already have a build configuration for debugging. The debugger needs debugging symbols on and all optimization turned off. Your released application should be fully optimized and free of any debugger data. Performance analysis sits right between these two points. It needs debugging symbols, but the code should be fully optimized. Without the debugging symbols, the tools can’t do their job, and there’s no point in profiling or trying to improve unoptimized code.

If you’re using the modern DWARF with dSYM file debug information format, set up your Release build configuration as follows:

➤ Debug Information Format (DEBUG_INFORMATION_FORMAT) = DWARF with dSYM
➤ Generate Debug Symbols (GCC_GENERATE_DEBUGGING_SYMBOLS) = YES

The DWARF with dSYM file format writes the debugging information to a separate dSYM file that’s not part of your final product.
The end result is a finished application that’s (mostly) stripped of any debugging information, along with a companion dSYM file that provides the debugger and performance analysis tools with all of the information they need. This is the build configuration you’ll find if you created your project using any of the modern Xcode project templates.

If you’ve inherited an older project, created one with an earlier version of Xcode, or can’t use the DWARF with dSYM file format for some reason, you’re going to need a special build configuration for performance analysis. Create a new build configuration specifically for profiling by doing the following:

1. In the Configurations pane of the project’s Info window, duplicate your Release build configuration, the configuration used to produce your final application. Name the new configuration “Performance.”
2. In the Performance configuration for all of your targets and/or the project, turn on Generate Debug Symbols. Turn off Strip Debug Symbols During Copy.

Switch to this build configuration whenever you need to do performance analysis and enhancements. If you change the code generation settings in your Release configuration, remember to update your Performance configuration to match.

SHARK

Shark is a code performance analysis tool, and is the premier tool for optimizing your code’s raw execution speed. It works by periodically sampling the state of your application, from which it assembles an approximate overview of its performance. Though Instruments has similar tools, Shark is purpose-built for optimizing code and is my first choice when that’s my singular focus.

Shark can be used on its own in a variety of modes, but you’re going to start it directly from within Xcode. First, build your target using your performance build configuration.
From the menu choose Run ➪ Launch Using Performance Tool ➪ Shark.

Remember that the items in the Run ➪ Launch Using Performance Tool menu do not build your project first. Get into the habit of performing a build (Command+B) before starting any performance tool, or you’ll be analyzing old code.

When Xcode starts Shark, it tells Shark which executable to analyze and describes its environment, saving you a lot of configuration work in Shark. The initial window, shown at the top of Figure 19-1, selects the type of analysis that Shark will perform. The useful choices are:

SHARK CONFIGURATION                 BEST USE
Time Profile                        Optimizing the raw performance of your code
Time Profile (All Thread States)    Speeding up your application by looking for delays caused by either your code or system routines your code calls
Time Profile (WTF)                  Same as Time Profile, but analyzes a fixed time period ending when Shark is stopped

Shark can perform many different kinds of analyses, but only a few make sense when you’re launching a single process from Xcode. Many of Shark’s other modes also overlap those in Instruments, and for those cases you’ll probably be better served by the latter. All the choices in the analysis menu are named, editable configurations supplied by Shark. You can change these configurations (Config ➪ Edit) or add your own (Config ➪ New), much the way you redefine batch find options in the Project Find window.

The Time Profile measures the performance of your application’s code. The All Thread States variant includes time spent waiting for system routines. Think of it as optimizing process time versus optimizing elapsed time.

Normally, Shark captures a fixed number of samples, or samples for a fixed time period (parameters you can change by editing the configuration), stops, and then performs its analysis. The Windowed Time Facility (WTF) mode captures samples indefinitely into a circular buffer. When stopped, it analyzes the latest set of samples.
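The circular buffer behind a windowed capture is easy to picture in code. The sketch below is an illustration of the general technique only, not Shark's actual implementation: sample_window, window_record, and window_snapshot are hypothetical names, and the capacity of four is deliberately tiny so the wrap-around is easy to follow.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_CAPACITY 4   /* hypothetical size; a real sampler would keep far more */

typedef struct {
    unsigned long samples[WINDOW_CAPACITY]; /* e.g. sampled program counters */
    size_t next;    /* slot that the next sample will overwrite */
    size_t count;   /* number of valid samples, never more than WINDOW_CAPACITY */
} sample_window;

/* Starts with an empty window. */
void window_init(sample_window *w)
{
    assert(w != NULL);
    w->next = 0;
    w->count = 0;
}

/* Records one sample; once the buffer is full the oldest sample is overwritten. */
void window_record(sample_window *w, unsigned long sample)
{
    assert(w != NULL);
    w->samples[w->next] = sample;
    w->next = (w->next + 1) % WINDOW_CAPACITY;
    if (w->count < WINDOW_CAPACITY) {
        w->count++;
    }
}

/* Copies the retained samples, oldest first, into `out` (which must have room
   for WINDOW_CAPACITY entries) and returns how many were copied. */
size_t window_snapshot(const sample_window *w, unsigned long *out)
{
    assert(w != NULL && out != NULL);
    size_t oldest = (w->next + WINDOW_CAPACITY - w->count) % WINDOW_CAPACITY;
    for (size_t i = 0; i < w->count; i++) {
        out[i] = w->samples[(oldest + i) % WINDOW_CAPACITY];
    }
    return w->count;
}
```

Recording six samples into this four-slot window and then taking a snapshot yields only the last four, which is the property that lets a windowed capture run indefinitely and still analyze the period that ended when it was stopped.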
In other words, regular analysis starts at some point and analyzes a fixed time period beyond that point. WTF analysis ends at some point and analyzes the time period up to that point.

To profile a Java application, choose one of the Java-specific analysis modes in Shark and pass the -XrunShark argument to the Java command that launches your program. The easiest method of accomplishing this is to add a -XrunShark argument in the Arguments pane of the executable’s Get Info window. You might want to create a custom executable just for Shark profiling, or just disable this argument when you are done.

Java and Xcode have an on-again, off-again romance: Apple alternately breaks Java support in Shark and then fixes it again. Some combinations of the Java run time and Xcode analysis tools don’t work well together, if at all. Consult the Java release notes or inquire on the Xcode-users discussion list if you are having problems.

To start a Shark analysis of your application simply click the Start button, or press Shark’s hot key (Option+Esc), and a launch process configuration dialog box appears, as shown in Figure 19-1. The dialog has been preconfigured by Xcode, so you shouldn’t have much to do. Before beginning your analysis, you can choose to modify the command-line parameters or environment variables passed to your application. This example passes an argument of “150000” to the program for testing. The menu button to the right of the Arguments field keeps a history of recently used argument lists.

FIGURE 19-1

You can also attach Shark to a currently running application.
If you’ve already started your application from Xcode, launch Shark and choose your running process from the menu at the right of Shark’s control window. Click the Start button, or switch back to your application and use Shark’s hot key (Option+Esc) to start sampling at a strategic point.

Profile View

Your program runs while Shark is analyzing it. Shark stops its analysis after your program has terminated, when you stop it yourself using the Stop button or by pressing Shark’s hot key (Option+Esc), or when Shark’s analysis buffer fills up (typically 30 seconds’ worth). Shark then presents its profile of your application’s performance, as shown in Figure 19-2.

FIGURE 19-2

Figure 19-2 shows the Tree (Top-Down) view of your application’s performance in the lower pane. The columns Self and Total show how much time your program spent in each function of your application. The Self column records the amount of time it spent in the code of that particular function. The Total column calculates the amount of time spent in that function and any functions called by that function. The organization of the tree view should be obvious, because it parallels the calling structure of your application. Every function listed expands to show the functions that it called and how much time was spent in each.

The compiler’s optimization can confuse Shark as well as the debugger. Take the following code as an example:

    main()
    {
        calc();
    }

    int calc()
    {
        return (subcalc());
    }

    int subcalc()
    {
        return (x);
    }

The compiler may convert the statement return (subcalc()) into a “chain” rather than a function call. It might pop the existing stack frame before calling subcalc, or jump to subcalc() and let it use the stack frame for calc().
Either way, it has avoided the overhead incurred by creating and destroying a stack frame, but when Shark examines the stack, it thinks that main called subcalc directly, and it includes subcalc in the list of functions called by main.

The Heavy (Bottom-Up) view is shown in the upper pane of Figure 19-2. You can choose to see the Tree, Heavy, or both views using the View menu in the lower-right corner of the window. The Heavy view inverts the tree. Each item now shows the amount of time spent executing the code within each function. This does not include any time spent in functions that this function might have called. When you expand a function group, you see the list of functions that called that function. The times for each break down the amount of time spent in that function while being called from each of the calling functions listed.

The Tree view gives you a bird’s-eye view of where your application is spending its time, starting with the first function called (where it spends all of its time), and subdividing that by the subroutines it calls, the subroutines called by those subroutines, and so on. The Heavy view determines which functions use up the most of your program’s CPU time. This view works backward to determine what functions caused those heavily used functions to be called, and how often.

You can use this information in a variety of ways. The typical approach is to use the Heavy view to see which functions use the most time. Those are prime candidates for optimization. If you can make those functions run faster, anything that calls them will run faster too. Both the Heavy and Tree views are also useful for finding logical optimizations. This usually entails reengineering the callers of heavy functions so that they get called less often or not at all. Making a function faster helps, but not calling the function in the first place is even quicker.

At the bottom of the window are the Process and Thread selections.
These two menus enable you to restrict the analysis to a single process or thread within that process. Set one of these menus to something other than All, and only samples belonging to that thread or process are considered. Because Xcode launched Shark for this particular application, there is only one process. If the application you launched runs in multiple threads, select the thread you want to analyze. The status line above these two controls displays the number of samples being considered.

STATISTICAL SAMPLING

At this point, you might be wondering how Shark obtained all of this information. Shark is one of a class of tools known as samplers. Shark interrupts your application about a thousand times every second and takes a (very) quick “snapshot” of what it’s doing. After these samples have been gathered, Shark combines all of the samples to produce a statistical picture of where your program spends its time.

Think of trying to analyze the habits of a firefly. You set up a high-speed camera in the dark and film the firefly as it flies around. After you’ve got several thousand frames, you could combine all of those frames into a single picture, like a single long exposure. Points in the picture that are very bright are where the firefly spent most of its time, dimmer spots indicate places where it visited briefly, and the black portions are where it (probably) never went at all. This is exactly what the Heavy view shows: the accumulated totals of every snapshot where Shark found your application in a particular function at that moment.

The word “statistical” is important. Shark does not trace your program, nor does it rigorously determine exactly when each subroutine was called and how long it took to run. It relies entirely on taking thousands of (essentially) random samples. Functions that execute very little of the time may not even appear.
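How a pile of snapshots turns into Self and Total figures can be sketched in a few lines of C. This is a simplified model of a sampling profiler in general, not a description of Shark's internals: it assumes each sample is a call stack of small integer function IDs, and the names profile, profile_add_sample, and profile_self_percent are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCTIONS 8   /* hypothetical limit on distinct function IDs */

typedef struct {
    unsigned self[MAX_FUNCTIONS];   /* samples caught executing this function's own code */
    unsigned total[MAX_FUNCTIONS];  /* samples where this function was anywhere on the stack */
    unsigned samples;               /* number of samples accumulated so far */
} profile;

/* Resets every counter to zero. */
void profile_init(profile *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < MAX_FUNCTIONS; i++) {
        p->self[i] = 0;
        p->total[i] = 0;
    }
    p->samples = 0;
}

/* Adds one sample. stack[0] is the outermost caller and stack[depth - 1]
   is the function that was executing when the snapshot was taken. */
void profile_add_sample(profile *p, const int *stack, size_t depth)
{
    assert(p != NULL && stack != NULL && depth > 0);
    int seen[MAX_FUNCTIONS] = {0};  /* avoids counting a recursive function twice */
    for (size_t i = 0; i < depth; i++) {
        int fn = stack[i];
        assert(fn >= 0 && fn < MAX_FUNCTIONS);
        if (!seen[fn]) {
            seen[fn] = 1;
            p->total[fn]++;
        }
    }
    p->self[stack[depth - 1]]++;
    p->samples++;
}

/* Share of all samples, as a percentage, spent in the function's own code. */
double profile_self_percent(const profile *p, int fn)
{
    assert(p != NULL && fn >= 0 && fn < MAX_FUNCTIONS);
    return p->samples ? 100.0 * p->self[fn] / p->samples : 0.0;
}
```

In this model the self counts correspond to what a bottom-up listing ranks, and the total counts to what a top-down tree reports for each function. A function that is never on the stack when a snapshot is taken simply has zero in both columns, which is why rarely executed code can be missing from a sampled profile.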
Shark can only be trusted when it has many samples of a function, which is perfect, because you’re usually only interested in the functions that take up a significant amount of time. The ones that don’t aren’t worth optimizing.

Code View

The profile view is useful for identifying the functions that are candidates for optimization, but what in those functions needs optimizing? Double-click any function name and Shark switches to a code view of that function, as shown in Figure 19-3.

FIGURE 19-3

The code view shows the amount of time your application spent in each line of your function, or, to put it more precisely, it shows the percentage of time that it found your application at that line of source code when it captured a sample. Source view requires that you compiled your application with full debug information, and that the information was not stripped from the executable. If there is no debug information available, you’ll see the machine code disassembly of the function instead. If you really want to get into the nitty-gritty details of your code, change the listing to show the machine code, or both the machine and source code side-by-side, using the buttons in the upper-right corner.

Shark highlights the “hot spots” in your code by coloring heavily used statements in various shades of yellow. This shading continues into the scroll bar area so you can quickly scroll to hot spots above or below the visible listing.

Even though the window will scroll through the entire source file that contains the function you are examining, only the statistics and samples for the function you selected are analyzed. If you want to scrutinize samples in a different function, return to the Profile tab and double-click that function.

The truly amazing power of Shark really becomes evident in the code view.
Shark analyzes the execution flow of your code and makes suggestions for improving it. These suggestions appear as advice buttons (small square buttons with exclamation points) throughout the interface. One is shown in Figure 19-4. Click an advice button and Shark pops up a bubble explaining what it found and what you might do to improve the performance of your code.

FIGURE 19-4

The advice feature is quite sophisticated; many of its suggestions require the analysis of code flow, instruction pipelining, and caching features of various processors. Shark can recognize when you are dividing where you might be better off with a bit shift instead, or where loops should be unrolled, aligned, and so on. It’s like having a CPU performance expert in a box.

Every time you link to a code view from the summary window it creates another tab at the top of Shark’s analysis window. Close the tab or navigate back to the profile pane. Jumping from one [...]

... session. To merge two sessions, choose the File ➪ Merge (Option+Command+M) command. Shark will present two open file dialogs. Choose the first session file in one and the second session file in the other. Shark merges those samples into a new untitled session. This can take a while, so be patient. If you need to merge more than two sessions, save the merged session as a new file, close it, and then merge the ...

... thread than the previous sample. As with the profile view, you can choose to restrict the graph to just those samples from a single process or thread.

FIGURE 19-5

Refining the Analysis

Shark has a number of controls for refining what samples you choose to analyze and how. Unfiltered sample data can be overwhelming ...
... save the sample data in a file for later perusal. When you save a Shark session, it presents the dialog box shown in Figure 19-7. If you choose to embed the source file information, you’ll be able to browse the code view with the original source at a later date. If you strip the information, you will have ... available when you load the session). To load an old session, open the session file from the Finder or use Shark’s File ➪ Open or File ➪ Open Recent command.

You can also have Shark compare two sessions. Choose the File ➪ Compare (Command+Option+C) command and open two Shark session files. Shark presents a single profile browser showing the changes in the profile values instead of absolute values for each ...

... iPhone app or other process running on a remote system. Xcode, however, won’t set this up for you automatically. If you need to do code performance analysis on an iPhone or a remote application, first consider using the CPU Sampler template in Instruments. The CPU Sampler instrument is like a baby version of Shark that runs as an instrument, and Xcode will automatically set up and connect Instruments ...

... performance of individual functions more accurately. The options you’ll probably want to change, and why, are listed in the following table:

CALLSTACK DATA MINING OPTION          EFFECT
Charge System Libraries to Callers    The amount of time calculated for a function will include the time spent executing system calls. Use this to zero in on time-consuming ...
...                                   ... samples. If you don’t have any heavy hotspots, this can hide too much. Turn it off if you can’t find the functions you’re looking for.
Flatten Recursion                     If your functions use recursion, this option applies the cost of the recursive call to the function’s weight.
Remove Supervisor Callstacks          Ignores any samples that occur in the kernel (supervisor) state. This includes interrupt handling.

The Charge System Libraries ...

... is like a baby version of Shark that runs as an instrument, and Xcode will automatically set up and connect Instruments to an iPhone application. Instruments is by far the easiest solution, if not the most powerful. If you really want to use Shark instead, connect to your iPhone or remote process using the Sampling ➪ Network/iPhone Profiling command. See the Shark help document for the details.
