Designing Enterprise Solutions with Sun™ Cluster 3.0
By Richard Elling, Tim Read
Publisher: Prentice Hall PTR
Pub Date: December 01, 2001
ISBN: 0-13-008458-1
Pages: 302

Designing Enterprise Solutions with Sun Cluster 3.0 is an introduction to architecting highly available systems with Sun servers, storage, and the Sun Cluster 3.0 software. Three recurring themes are used throughout the book: failures, synchronization, and arbitration. These themes occur at all levels of system design. The first chapter deals with understanding these relationships and recognizing failure modes associated with synchronization and arbitration. The second and third chapters review the building blocks and describe the Sun Cluster 3.0 software environment in detail. The remaining chapters discuss management servers and provide hypothetical case studies in which enterprise solutions are designed using Sun technologies. Appendices provide a checklist for designing clustered solutions, additional information on Sun technologies used in many different types of clusters, guidelines for data center design best practices, and a brief description of some failure analysis tools used by Sun systems designers and architects.

Table of Contents:
Copyright
Figures
Tables
Preface
Sun BluePrints Program
Who Should Use This Book
Before You Read This Book
How This Book Is Organized
Ordering Sun Documents
Accessing Sun Documentation Online
Related Books
Typographic Style
Shell Prompts in Command Examples
Acknowledgements
Richard Elling
Tim Read
Chapter 1 Cluster and Complex System Design Issues
Business Reasons for Clustered Systems
Failures in Complex Systems
Data Synchronization
Arbitration Schemes
Data Caches
Timeouts
Failures in Clustered Systems
Summary
Chapter 2 Enterprise Cluster Computing Building Blocks
Data Repositories and Infrastructure Services
Business Logic and Application Service
User Access Services: Web Farms
Compute Clusters
Technologies for Building Distributed Applications
Chapter 3 Sun Cluster 3.0 Architecture
System Architecture
Kernel Infrastructure
System Features
Cluster Failures
Synchronization
Arbitration
Chapter 4 Management Server
Design Goals
Services
Console Services
Sun Ray Server
Sun StorEdge SAN Surfer
Sun Explorer Data Collector
Sun Remote Services
Software Stack
Hardware Components
Network Configuration
Systems Management
Backup, Restore, and Recovery
Summary
Chapter 5 Case Study 1—File Server Cluster
Firm Description
Design Goals
Cluster Software
Recommended Hardware Configuration
Summary
Chapter 6 Case Study 2—Database Cluster
Company Description

RAC Oracle 9i Real Application Cluster
RAM random access memory
Reliability An abstract term defined as the probability that a product or system performs its intended function for a specified time period when operating under normal environmental conditions. Reliability differs from availability in that reliability involves only one event, failure, whereas availability takes into account two events: failure and recovery. A system can be highly available yet experience frequent periods of inoperability as long as the length of each period is short.
RBAC role-based access control
RBD reliability block diagram
RDBMS relational database management system
RM-API Resource Management API
RM replica managers
RMA replica manager agents
RMM replica manager manager
RPN risk priority number
RSC remote service controller
RSM Remote Shared Memory
RTR resource type registration
RTS redundant transfer switch
RTU redundant transfer unit
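
The distinction drawn in the Reliability entry above can be made concrete with a little arithmetic. The following is a minimal sketch, not a method taken from the book: it assumes a constant failure rate (an exponential model), and the names mtbf_hours (mean time between failures), mttr_hours (mean time to recover), and mission_hours are illustrative parameters introduced here.

```python
import math


def reliability(mtbf_hours: float, mission_hours: float) -> float:
    """Probability of running for mission_hours with no failure at all.

    Assumes a constant failure rate (exponential model); only the
    failure event matters here, recovery plays no part.
    """
    return math.exp(-mission_hours / mtbf_hours)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the system is up.

    Takes both events into account: failure (MTBF) and recovery (MTTR).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical system: fails about every 100 hours, recovers in 36 seconds.
    mtbf, mttr = 100.0, 0.01
    print(f"availability:           {availability(mtbf, mttr):.6f}")
    print(f"reliability over 720 h: {reliability(mtbf, 720.0):.6f}")
```

With these assumed figures the system is up more than 99.99 percent of the time, yet it is very unlikely to get through a 30-day month without at least one failure, which is the case the glossary entry describes: highly available, but frequently inoperable for short periods.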

S3L Sun Scalable Scientific Subroutine Library
SAN storage area network
SC system controller
SCI Scalable Coherent Interface
SCN system change number
SCSI Small Computer System Interface
SCSL Sun Community Software License
SGA system global area
SLA service level agreement
SMB Server Message Block. Now called the Common Internet File System (CIFS)
SMON system monitor
SMP symmetric multiprocessor
SPA service point architecture
SPOF single point of failure
SRS Sun Remote Services
SSP system service processor
SVM Solaris Volume Manager. Formerly known as Solstice DiskSuite
SQE software query enable
Split brain Condition in which a cluster forms multiple partitions, with each partition forming without knowledge of the existence of any other partition
SRAM static random access memory
Systems engineering The engineering discipline concerned with the design of the whole as distinct from the design of the parts
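
The Split brain entry above describes partitions that form without knowledge of one another. One common way to arbitrate between such partitions is a majority-vote rule: a partition may continue only if it holds more than half of all configured votes. The sketch below illustrates that general rule under stated assumptions; the function names and the vote counts are hypothetical and are not taken from the Sun Cluster software.

```python
from typing import Dict, List


def has_quorum(partition_votes: int, total_votes: int) -> bool:
    """True if the partition holds a strict majority of all configured votes."""
    return 2 * partition_votes > total_votes


def surviving_partitions(partitions: Dict[str, int], total_votes: int) -> List[str]:
    """Names of the partitions that are allowed to continue running.

    partitions maps a partition name to the number of votes it holds.
    At most one partition can hold a strict majority, so at most one
    name is returned.
    """
    return [name for name, votes in partitions.items()
            if has_quorum(votes, total_votes)]


if __name__ == "__main__":
    # Hypothetical two-node cluster: one vote per node plus one tie-breaker vote.
    total = 3
    # Partition A holds its node vote and the tie-breaker; B holds one node vote.
    print(surviving_partitions({"A": 2, "B": 1}, total))
```

Because a strict majority is required, two partitions can never both qualify, which prevents each side from proceeding as if it were the whole cluster. Without a tie-breaker vote, an even split leaves no partition with a majority.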

TC terminal concentrator
TCP/IP Transmission Control Protocol/Internet Protocol

UDP User Datagram Protocol. Built on top of IP at the transport layer, UDP provides a datagram-based service
UDLM UNIX distributed lock manager
UFS UNIX file system
UPS uninterruptible power supply
UTC universal time coordinated
UTP unshielded twisted pair

Vote A usually formal expression of opinion or will in response to a proposed

Cable unplugged                    Physical    NIC    Yes, unless Software Query Enable (SQE) is enabled
Cable shorted                      Physical    NIC    Yes
Cable wired in reverse polarity    Physical    NIC    Yes
Cable too long                     Physical    NIC

Physical failures are a bounded set. They are often detected by the network interface card (NIC). However, not all physical failures can be detected by a single NIC, nor can all physical