DOCUMENT INFORMATION
Basic information
Format
Number of pages
308
File size
7.49 MB
Contents
[...]... INTRODUCTION
1.1 Need for Computing System Reliability Analysis
1.2 Computing System Reliability Concepts
1.3 Approaches to Computing System Modeling
2 BASIC RELIABILITY CONCEPTS AND ANALYSIS
2.1 Reliability Measures
2.2 Common Techniques in Reliability Analysis
2.3 Markov Process Fundamentals
2.4 Nonhomogeneous Poisson Process (NHPP) Models
3 MODELS FOR HARDWARE SYSTEM RELIABILITY
3.1...

... INTEGRATED SYSTEMS
5.1 Single-Processor System
5.2 Models for Modular System
5.3 Models for Clustered System
5.4 A Unified NHPP Markov Model
5.5 Notes and References
6 AVAILABILITY AND RELIABILITY OF DISTRIBUTED COMPUTING SYSTEMS
6.1 Introduction to Distributed Computing
6.2 Distributed Program and System Reliability
6.3 Homogeneously Distributed Software/Hardware Systems...
... Heterogeneous Distributed Systems
6.5 Notes and References
7 RELIABILITY OF GRID COMPUTING SYSTEMS
7.1 Introduction of the Grid Computing System
7.2 Grid Reliability of the Resource Management System
7.3 Grid Reliability of the Network
7.4 Grid Reliability of the Software and Resources
7.5 Notes and References
8 MULTI-STATE SYSTEM RELIABILITY
8.1 Basic...

... to increase the performance of the computing systems and to improve the development process, a thorough analysis of their reliability is needed. Based on the models and analysis, approaches to improve system reliability can be further implemented.

1.2 Computing System Reliability Concepts
In general, the basic reliability concept is defined as the probability that a system will perform its intended function...
... such as “software reliability”, “system reliability”, “service reliability”, “system availability”, etc., for different purposes.

Most computing systems contain software programs to achieve various computing tasks. Software reliability is an important metric to assess the software performance. Similar to the general reliability concept, software reliability is defined...

... with general and specific issues of reliability are available; see, e.g., Barlow & Proschan (1981), Shooman (1990), Hoyland & Rausand (1994), Elsayed (1996), and Blischke & Murthy (2000). Some basic and important reliability measures are introduced in this chapter. Since computing system reliability is related to general system reliability, the focus will be on tools and techniques for system reliability...

... computing system is usually called a distributed computing system. The performance of a distributed computing system is determined not only by the software/hardware reliability but also by the reliability of the networks for communication. Many models and algorithms have been presented for distributed system reliability; see, e.g., Hariri et al. (1985), Kumar et al...

... complexity, the reliability of the grid computing systems begins to be of concern today. Most reliability models for computing systems assume only two possible states of the system. In reality, many computing systems may contain more than two states (Lisnianski & Levitin, 2003), especially real-time systems. For example, if some computing elements in a real-time system fail, the system may still...
... Component System
3.2 Parallel Configurations
3.3 Load-Sharing Configurations
3.4 Standby Configurations
3.5 Notes and References
4 MODELS FOR SOFTWARE RELIABILITY
4.1 Basic Markov Model
4.2 Extended Markov Models
4.3 Modular Software Systems
4.4 Models for Correlated Failures
4.5 Software NHPP Models
4.6 Notes and References
5 MODELS...

... Example 2.3 If a system has a lifetime distribution function [...] and a maintainability function [...], then [...] and [...]. The MTBF is the sum of MTTF and MTTR, and the steady-state availability is [...]

2.2 Common Techniques in Reliability Analysis
There are many techniques in reliability analysis. The most widely used techniques in computing systems are reliability block diagrams, network diagrams, fault tree analysis and Monte Carlo...

Computing System Reliability: Models and Analysis
Min Xie, Yuan-Shun Dai and Kim-Leng Poh

... coverage of tools and techniques for computing system reliability modeling and analysis. Reliability analysis is a useful tool in evaluating the performance of complex systems. Intensive studies... hardware/software reliability is studied. The distributed computing system is a common and widely-used networked system and hence a chapter is devoted to this. The reliability of grid computing systems, ...
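
The relation stated in Example 2.3 (MTBF is the sum of MTTF and MTTR) can be illustrated with a minimal sketch. The specific distributions used in the example did not survive in this preview, so the sketch below assumes an exponential lifetime with failure rate lam and an exponential repair time with repair rate mu; both values, and the function name steady_state_availability, are hypothetical and chosen only for illustration. It uses the standard steady-state availability formula MTTF / (MTTF + MTTR).

```python
def steady_state_availability(mttf: float, mttr: float) -> float:
    """Long-run fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    if mttf <= 0 or mttr < 0:
        raise ValueError("mttf must be positive and mttr non-negative")
    return mttf / (mttf + mttr)


# Assumed (hypothetical) rates, not taken from the source text.
lam = 0.001  # failure rate, failures per hour
mu = 0.1     # repair rate, repairs per hour

# For exponential distributions the means are the reciprocals of the rates.
mttf = 1.0 / lam   # mean time to failure, in hours
mttr = 1.0 / mu    # mean time to repair, in hours
mtbf = mttf + mttr # mean time between failures, as stated in Example 2.3

availability = steady_state_availability(mttf, mttr)

print(f"MTTF = {mttf:.1f} h, MTTR = {mttr:.1f} h, MTBF = {mtbf:.1f} h")
print(f"Steady-state availability = {availability:.6f}")
```

Under these assumed rates the availability equals mu / (lam + mu), so shortening repairs raises availability just as lengthening the lifetime does.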