The Expert's Voice® in .NET
For your convenience Apress has placed some of the front
matter material after the index. Please use the Bookmarks
and Contents at a Glance links to access them.
Contents at a Glance

Foreword
About the Authors
About the Technical Reviewers
Acknowledgments
Introduction
Chapter 1: Performance Metrics
Chapter 2: Performance Measurement
Chapter 3: Type Internals
Chapter 4: Garbage Collection
Chapter 5: Collections and Generics
Chapter 6: Concurrency and Parallelism
Chapter 7: Networking, I/O, and Serialization
Chapter 8: Unsafe Code and Interoperability
Chapter 9: Algorithm Optimization
Chapter 10: Performance Patterns
Chapter 11: Web Application Performance
Index
Introduction
This book has come to be because we felt there was no authoritative text that covered all three areas relevant to .NET application performance:

• Identifying performance metrics and then measuring application performance to verify whether it meets or exceeds these metrics.
• Improving application performance in terms of memory management, networking, I/O, concurrency, and other areas.
• Understanding CLR and .NET internals in sufficient detail to design high-performance applications and fix performance issues as they arise.
We believe that .NET developers cannot achieve systematically high-performance software solutions without thoroughly understanding all three areas. For example, .NET memory management (facilitated by the CLR garbage collector) is an extremely complex field and the cause of significant performance problems, including memory leaks and long GC pause times. Without understanding how the CLR garbage collector operates, high-performance memory management in .NET is left to nothing but chance. Similarly, choosing the proper collection class from what the .NET Framework has to offer, or deciding to implement your own, requires comprehensive familiarity with CPU caches, runtime complexity, and synchronization issues.
This book's 11 chapters are designed to be read in succession, but you can jump back and forth between topics and fill in the blanks when necessary. The chapters are organized into the following logical parts:

• Chapter 1 and Chapter 2 deal with performance metrics and performance measurement. They introduce the tools available to you to measure application performance.
• Chapter 3 and Chapter 4 dive deep into CLR internals. They focus on type internals and the implementation of CLR garbage collection—two crucial topics for improving application performance where memory management is concerned.
• Chapter 5, Chapter 6, Chapter 7, Chapter 8, and Chapter 11 discuss specific areas of the .NET Framework and the CLR that offer performance optimization opportunities—using collections correctly, parallelizing sequential code, optimizing I/O and networking operations, using interoperability solutions efficiently, and improving the performance of Web applications.
• Chapter 9 is a brief foray into complexity theory and algorithms. It was written to give you a taste of what algorithm optimization is about.
• Chapter 10 is the dumping ground for miscellaneous topics that didn't fit elsewhere in the book, including startup time optimization, exceptions, and .NET Reflection.
Some of these topics have prerequisites that will help you understand them better. Throughout the course of the book we assume substantial experience with the C# programming language and the .NET Framework, as well as familiarity with fundamental concepts, including:
• Windows: threads, synchronization, virtual memory
• Common Language Runtime (CLR): Just-In-Time (JIT) compiler, Microsoft Intermediate Language (MSIL), garbage collector
• Computer organization: main memory, cache, disk, graphics card, network interface
There are quite a few sample programs, excerpts, and benchmarks throughout the book. In the interest of not making this book any longer, we often included only a brief part—but you can find the whole program in the companion source code on the book's website.
In some chapters we use code in x86 assembly language to illustrate how CLR mechanisms operate or to explain more thoroughly a specific performance optimization. Although these parts are not crucial to the book's takeaways, we recommend that dedicated readers invest some time in learning the fundamentals of x86 assembly language. Randall Hyde's freely available book "The Art of Assembly Language Programming" (http://www.artofasm.com/Windows/index.html) is an excellent resource.
In conclusion, this book is full of performance measurement tools, small tips and tricks for improving minor
areas of application performance, theoretical foundations for many CLR mechanisms, practical code examples,
and several case studies from the authors’ experience. For almost ten years we have been optimizing applications
for our clients and designing high-performance systems from scratch. During these years we trained hundreds of
developers to think about performance at every stage of the software development lifecycle and to actively seek
opportunities for improving application performance. After reading this book, you will join the ranks of
high-performance .NET application developers and performance investigators optimizing existing applications.
Sasha Goldshtein
Dima Zurbalev
Ido Flatow
Chapter 1
Performance Metrics
Before we begin our journey into the world of .NET performance, we must understand the metrics and goals
involved in performance testing and optimization. In Chapter 2, we explore more than a dozen profilers and
monitoring tools; however, to use these tools, you need to know which performance metrics you are interested in.
Different types of applications have a multitude of varying performance goals, driven by business and
operational needs. At times, the application’s architecture dictates the important performance metrics: for
example, knowing that your Web server has to serve millions of concurrent users dictates a multi-server
distributed system with caching and load balancing. At other times, performance measurement results may
warrant changes in the application’s architecture: we have seen countless systems redesigned from the ground
up after stress tests were run—or worse, the system failed in the production environment.
In our experience, knowing the system’s performance goals and the limits of its environment often guides
you more than halfway through the process of improving its performance. Here are some examples we have been
able to diagnose and fix over the last few years:
• We discovered a serious performance problem with a powerful Web server in a hosted data center caused by a shared low-latency 4Mbps link used by the test engineers. Not understanding the critical performance metric, the engineers wasted dozens of days tweaking the performance of the Web server, which was actually functioning perfectly.
• We were able to improve scrolling performance in a rich UI application by tuning the behavior of the CLR garbage collector—an apparently unrelated component. Precisely timing allocations and tweaking the GC flavor removed noticeable UI lags that annoyed users.
• We were able to improve compilation times ten-fold by moving hard disks to SATA ports to work around a bug in the Microsoft SCSI disk driver.
• We reduced the size of messages exchanged by a WCF service by 90%, considerably improving its scalability and CPU utilization, by tuning WCF's serialization mechanism.
• We reduced startup times from 35 seconds to 12 seconds for a large application with 300 assemblies on outdated hardware by compressing the application's code and carefully disentangling some of its dependencies so that they were not required at load time.
These examples serve to illustrate that every kind of system, from low-power touch devices through high-end consumer workstations with powerful graphics, all the way to multi-server data centers, exhibits unique
performance characteristics as countless subtle factors interact. In this chapter, we briefly explore the variety of
performance metrics and goals in typical modern software. In the next chapter, we illustrate how these metrics
can be measured accurately; the remainder of the book shows how they can be improved systematically.
Performance Goals
Performance goals depend on your application’s realm and architecture more than anything else. When you
have finished gathering requirements, you should determine general performance goals. Depending on your
software development process, you might need to adjust these goals as requirements change and new business
and operation needs arise. We review some examples of performance goals and guidelines for several archetypal
applications, but, as with anything performance-related, these guidelines need to be adapted to your software’s
domain.
First, here are some examples of statements that are not good performance goals:
• The application will remain responsive when many users access the Shopping Cart screen simultaneously.
• The application will not use an unreasonable amount of memory as long as the number of users is reasonable.
• A single database server will serve queries quickly even when there are multiple, fully-loaded application servers.
The main problem with these statements is that they are overly general and subjective. If these are your
performance goals, then you are bound to discover they are subject to interpretation and disagreements on their
frame-of-reference. A business analyst may consider 100,000 concurrent users a “reasonable” number, whereas
a technical team member may know the available hardware cannot support this number of users on a single
machine. Conversely, a developer might consider 500 ms response times “responsive,” but a user interface expert
may consider it laggy and unpolished.
A performance goal, then, is expressed in terms of quantifiable performance metrics that can be measured
by some means of performance testing. The performance goal should also contain some information about its
environment—general or specific to that performance goal. Some examples of well-specified performance goals
include:
• The application will serve every page in the "Important" category within less than 300 ms (not including network roundtrip time), as long as not more than 5,000 users access the Shopping Cart screen concurrently.
• The application will use not more than 4 KB of memory for each idle user session.
• The database server's CPU and disk utilization should not exceed 70%, and it should return responses to queries in the "Common" category within less than 75 ms, as long as there are no more than 10 application servers accessing it.
■ Note These examples assume that the "Important" page category and "Common" query category are
well-known terms defined by business analysts or application architects. Guaranteeing performance goals for every
nook and cranny in the application is often unreasonable and is not worth the investment in development, hardware,
and operational costs.
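Goals phrased in quantifiable terms can be encoded directly as automated checks. The sketch below is illustrative only (the ServeImportantPage operation and the 300 ms threshold are hypothetical stand-ins for your own scenario and goal); it times one operation with System.Diagnostics.Stopwatch and fails loudly when the goal is breached:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ResponseTimeCheck
{
    // Illustrative threshold, mirroring the 300 ms example goal above.
    const long MaxResponseMs = 300;

    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        ServeImportantPage();   // hypothetical operation under test
        sw.Stop();

        if (sw.ElapsedMilliseconds > MaxResponseMs)
            throw new Exception(string.Format(
                "Goal violated: {0} ms > {1} ms",
                sw.ElapsedMilliseconds, MaxResponseMs));

        Console.WriteLine("Goal met: {0} ms", sw.ElapsedMilliseconds);
    }

    static void ServeImportantPage()
    {
        // Stand-in for rendering a page in the "Important" category.
        Thread.Sleep(50);
    }
}
```

A check like this is only meaningful when run in the environment the goal names (user counts, hardware), which is exactly why the environment clause belongs in the goal itself.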
We now consider some examples of performance goals for typical applications (see Table 1-1). This list is
by no means exhaustive and is not intended to be used as a checklist or template for your own performance
goals—it is a general frame that establishes differences in performance goals when diverse application types are
concerned.
■ Note Characteristics of the hardware on which the application runs are a crucial part of environment
constraints. For example, the startup time constraint placed on the smart client application in Table 1-1 may require
a solid-state hard drive or a rotating hard drive speed of at least 7200RPM, at least 2GB of system memory, and a
1.2GHz or faster processor with SSE3 instruction support. These environment constraints are not worth repeating for
every performance goal, but they are worth remembering during performance testing.
Table 1-1. Examples of Performance Goals for Typical Applications

System Type | Performance Goal | Environment Constraints
External Web Server | Time from request start to full response generated should not exceed 300ms | Not more than 300 concurrently active requests
External Web Server | Virtual memory usage (including cache) should not exceed 1.3GB | Not more than 300 concurrently active requests; not more than 5,000 connected user sessions
Application Server | CPU utilization should not exceed 75% | Not more than 1,000 concurrently active API requests
Application Server | Hard page fault rate should not exceed 2 hard page faults per second | Not more than 1,000 concurrently active API requests
Smart Client Application | Time from double-click on desktop shortcut to main screen showing list of employees should not exceed 1,500ms | (none)
Smart Client Application | CPU utilization when the application is idle should not exceed 1% | (none)
Web Page | Time for filtering and sorting the grid of incoming emails should not exceed 750ms, including shuffling animation | Not more than 200 incoming emails displayed on a single screen
Web Page | Memory utilization of cached JavaScript objects for the "chat with representative" windows should not exceed 2.5MB | (none)
Monitoring Service | Time from failure event to alert generated and dispatched should not exceed 25ms | (none)
Monitoring Service | Disk I/O operation rate when alerts are not actively generated should be 0 | (none)
When performance goals are well-defined, the performance testing, load testing, and subsequent
optimization process is laid out trivially. Verifying conjectures, such as “with 1,000 concurrently executing API
requests there are less than 2 hard page faults per second on the application server,” may often require access to
load testing tools and a suitable hardware environment. The next chapter discusses measuring the application to
determine whether it meets or exceeds its performance goals once such an environment is established.
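A conjecture like the hard page fault example can also be spot-checked programmatically. The sketch below assumes a Windows machine exposing the standard Memory performance counter category, and uses the "Page Reads/sec" counter as a proxy for hard page faults resolved from disk:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class PageFaultMonitor
{
    static void Main()
    {
        // System.Diagnostics.PerformanceCounter reads the same data
        // Performance Monitor displays.
        using (var counter = new PerformanceCounter("Memory", "Page Reads/sec"))
        {
            counter.NextValue();   // the first sample of a rate counter is always 0
            Thread.Sleep(1000);    // allow one sampling interval to accumulate
            float rate = counter.NextValue();
            Console.WriteLine("Hard page fault rate: {0:F1}/sec (goal: < 2/sec)", rate);
        }
    }
}
```

Under real load-testing conditions you would sample repeatedly over the test window rather than once, but the reading mechanism is the same.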
Composing well-defined performance goals often requires prior familiarity with performance metrics,
which we discuss next.
Performance Metrics
Unlike performance goals, performance metrics are not connected to a specific scenario or environment.
A performance metric is a measurable numeric quantity that reflects the application's behavior. You can measure
a performance metric on any hardware and in any environment, regardless of the number of active users,
requests, or sessions. During the development lifecycle, you choose the metrics to measure and derive from them
specific performance goals.
Some applications have performance metrics specific to their domain. We do not attempt to identify these
metrics here. Instead, we list, in Table 1-2, performance metrics often important to many applications, as well as
the chapter in which optimization of these metrics is discussed. (The CPU utilization and execution time metrics
are so important that they are discussed in every chapter of this book.)
Table 1-2. List of Performance Metrics (Partial)

Performance Metric | Units of Measurement | Specific Chapter(s) in This Book
CPU Utilization | Percent | All Chapters
Physical/Virtual Memory Usage | Bytes, kilobytes, megabytes, gigabytes | Chapter 4 – Garbage Collection; Chapter 5 – Collections and Generics
Cache Misses | Count, rate/second | Chapter 5 – Collections and Generics; Chapter 6 – Concurrency and Parallelism
Page Faults | Count, rate/second | (none)
Database Access Counts/Timing | Count, rate/second, milliseconds | (none)
Allocations | Number of bytes, number of objects, rate/second | Chapter 3 – Type Internals; Chapter 4 – Garbage Collection
Execution Time | Milliseconds | All Chapters
Network Operations | Count, rate/second | Chapter 7 – Networking, I/O, and Serialization; Chapter 11 – Web Applications
Disk Operations | Count, rate/second | Chapter 7 – Networking, I/O, and Serialization
Response Time | Milliseconds | Chapter 11 – Web Applications
Garbage Collections | Count, rate/second, duration (milliseconds), % of total time | Chapter 4 – Garbage Collection
Exceptions Thrown | Count, rate/second | Chapter 10 – Performance Patterns
Startup Time | Milliseconds | Chapter 10 – Performance Patterns
Contentions | Count, rate/second | Chapter 6 – Concurrency and Parallelism
Some metrics are more relevant to certain application types than others. For example, database access times
are not a metric you can measure on a client system. Some common combinations of performance metrics and
application types include:
• For client applications, you might focus on startup time, memory usage, and CPU utilization.
• For server applications hosting the system's algorithms, you usually focus on CPU utilization, cache misses, contentions, allocations, and garbage collections.
• For Web applications, you typically measure memory usage, database access, network and disk operations, and response time.
A final observation about performance metrics is that the level at which they are measured can often be
changed without significantly changing the metric’s meaning. For example, allocations and execution time can
be measured at the system level, at the single process level, or even for individual methods and lines. Execution
time within a specific method can be a more actionable performance metric than overall CPU utilization or
execution time at the process level. Unfortunately, increasing the granularity of measurements often incurs a
performance overhead, as we illustrate in the next chapter by discussing various profiling tools.
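To make the granularity point concrete, here is a minimal sketch (not taken from the book's companion code; the SumSquares method is a hypothetical stand-in for any code under measurement) that measures execution time at the level of a single method with System.Diagnostics.Stopwatch, rather than observing the whole process:

```csharp
using System;
using System.Diagnostics;

class MethodTiming
{
    // Hypothetical workload standing in for the method being investigated.
    static long SumSquares(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; ++i)
            sum += (long)i * i;
        return sum;
    }

    static void Main()
    {
        // Stopwatch uses the high-resolution hardware timer when available,
        // giving far finer granularity than process-level CPU counters.
        Stopwatch sw = Stopwatch.StartNew();
        long result = SumSquares(1000000);
        sw.Stop();

        Console.WriteLine("SumSquares took {0} ms (result = {1})",
                          sw.ElapsedMilliseconds, result);
    }
}
```

The trade-off described above applies here too: wrapping many fine-grained regions in timing code perturbs the very measurements it collects, which is why profilers are usually preferable for per-line granularity.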
PERFORMANCE IN THE SOFTWARE DEVELOPMENT LIFECYCLE
Where do you fit performance in the software development lifecycle? This innocent question carries the
baggage of having to retrofit performance into an existing process. Although it is possible, a healthier
approach is to consider every step of the development lifecycle an opportunity to understand the
application’s performance better: first, the performance goals and important metrics; next, whether the
application meets or exceeds its goals; and finally, whether maintenance, user loads, and requirement
changes introduce any regressions.
1. During the requirements gathering phase, start thinking about the performance goals you would like to set.
2. During the architecture phase, refine the performance metrics important for your application and define concrete performance goals.
3. During the development phase, frequently perform exploratory performance testing on prototype code or partially complete features to verify you are well within the system's performance goals.
4. During the testing phase, perform significant load testing and performance testing to validate completely your system's performance goals.
5. During subsequent development and maintenance, perform additional load testing and performance testing with every release (preferably on a daily or weekly basis) to quickly identify any performance regressions introduced into the system.
Taking the time to develop a suite of automatic load tests and performance tests, set up an isolated lab environment in which to run them, and analyze their results carefully to make sure no regressions are introduced is very time-consuming. Nevertheless, the performance benefits gained from systematically measuring and improving performance, and from making sure regressions do not slowly creep into the system, are worth the initial investment in having a robust performance development process.
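As a sketch of what the per-release regression check in step 5 might look like (the baseline figure, the 25% tolerance, and the ScenarioUnderTest method are all hypothetical, not prescribed by this book), each build's measurement is compared against a recorded baseline:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class RegressionCheck
{
    // Illustrative baseline recorded from an earlier release,
    // and the allowed drift before the check fails.
    const double BaselineMs = 120.0;
    const double Tolerance  = 1.25;   // fail if more than 25% slower

    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        ScenarioUnderTest();          // hypothetical scenario being tracked
        sw.Stop();

        double measuredMs = sw.Elapsed.TotalMilliseconds;
        Console.WriteLine("Measured {0:F1} ms (baseline {1:F1} ms)",
                          measuredMs, BaselineMs);

        if (measuredMs > BaselineMs * Tolerance)
            throw new Exception("Performance regression detected");
    }

    static void ScenarioUnderTest()
    {
        Thread.Sleep(100);
    }
}
```

In practice the baseline would live in source control or a results database and be updated deliberately, so that a regression has to be acknowledged rather than silently absorbed.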
Summary

This chapter served as an introduction to the world of performance metrics and goals. Making sure you know what to measure and what performance criteria are important to you can be even more important than actually measuring performance, which is the subject of the next chapter. Throughout the remainder of the book, we measure performance using a variety […]