Running head: REDUNDANT ARRAY OF INDEPENDENT DISKS

Redundant Array of Independent Disks

Paul K. Clifton
University of Maryland University College Graduate School

Abstract

Since the inception of redundant arrays of independent disks (RAID) in 1988, the original five Berkeley levels of RAID have been enhanced and expanded to over twelve levels. Storage area networks (SANs) and virtualizations of physical storage bring even more capabilities to RAID technology. First, this paper examines the weaknesses and the strengths of various implementations of RAID. Second, techniques to improve RAID technologies are detailed with their advantages and disadvantages. Third, the motivations for block-level storage virtualization are explored.

Table of Contents

Introduction
Formalizing RAID - the Five Berkeley Levels
RAID Level 0: Disk Striping
Figure 1. RAID Level 0 Striping
RAID Level 1: Disk Mirroring
RAID-2: 2-Bit Interleaving with Hamming Error Correction
RAID Level 3: Parity RAID with Dedicated Parity Drive
RAID Level 4: Data Striping with Dedicated Parity and Independent Access
Figure 2. RAID Levels 2/3/4
RAID Level 5: Data Striping with Distributed Parity
Figure 3. RAID Level 5
RAID Level 6: Data Striping with Two-Dimensional Parity
Nested RAID Levels
RAID Level 01 (0+1): Mirrored RAID Level 0 Stripe Sets
RAID Level 10 (1+0): Striped Mirrors (RAID Level 1)
Figure 4. RAID Level 10
RAID Level 50 (5+0): Striped RAID Level 5
RAID Level 100 (10+0): Striped RAID Level 10
Techniques to Improve RAID Performance
Caching
Queue Management
Storage Virtualization
Figure 5. Virtualized Storage
Figure 6. SAN Virtualization Internal Diagram
Conclusions
References

Introduction

Twenty-eight years after Redundant Arrays of Inexpensive Disks (RAID) were first defined by Patterson, Gibson, and Katz (1988), "the pending I/O crisis" is still the major bottleneck restricting computing performance (p. 109). ("By common consent … the industry has transmuted the RAID acronym to mean Redundant Arrays of Independent Disks" (Massiglia, 1997, p. 18); the RAID Advisory Board follows this usage.) Amdahl's law implies that the performance of a computer is only as fast as its slowest component, and that component continues to be the disk storage subsystem (Amdahl, 1967, p. 483). The process of accepting an I/O request, translating that request into a mechanical disk drive read or write operation, and moving the data between the storage device and the central processing unit (CPU) continues to be constrained by the physical limitations of the disk drive. Data seek and transfer times have barely doubled, while storage capacity and CPU performance have leapt forward. The result is an I/O bottleneck, where data simply is not located and transferred as fast as it is requested.

The basis for RAID was first patented by Ken Ouchi of IBM; U.S. Patent 4,092,732, titled "System for recovering data stored in failed memory unit," was issued in 1978. Patterson, Gibson, and Katz (1988) formalized the acronym RAID and proposed the technology to work around the I/O bottleneck caused by mechanical disk drives and to provide redundancy in the case of drive failures. Patterson, Gibson, and Katz (1988) defined five levels of RAID (later identified as the Berkeley RAID levels) that varied in fault tolerance and in the manner of disk access. Some RAID levels require data to be updated using parallel drive access, while other RAID levels require independent data access. A sixth level of RAID was added to the original five levels by Katz, Gibson, and Patterson (1989) to provide protection from two concurrent drive failures in a RAID array.

Subsequent work to address the I/O bottleneck has continued in four main areas. First, RAID technologies have developed into hybrid designs to address read and write performance and fault tolerance. Second, caching and queue management technologies have been developed to work around the inherent write penalties associated with parity RAID. Third, storage virtualization, at both the file and block level, has been developed to allow higher spindle counts and the use of different drives to improve performance. And fourth, storage area networks (SANs) have been developed to increase availability, improve performance, provide easier management, improve cost effectiveness, and improve data transfers.

Formalizing RAID - the Five Berkeley Levels

Experienced system and application engineers have long known how to work around the I/O crisis by using large numbers of disks in parallel. Literally hundreds of disks are commonly connected to computers, not for the aggregate storage capacity, but to leverage the large number of parallel I/O streams for better performance. Aggregating n drives for parallel access effectively yields the I/O performance of n drives; for example, 10 drives each capable of 400 I/Os per second yield 4,000 I/Os per second in this arrangement. The major issue with drive arrays used for parallel access is the loss of data when a single drive fails. The formula to calculate the impact of multiple drives on the mean time to failure (MTTF) is:

    MTTF of a disk array = MTTF of a single disk / number of disks in the array

(Patterson, Gibson, & Katz, 1988, p. 110). It should be apparent from the MTTF formula that multiple-drive arrays become much less reliable as more array member drives are added.
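These two scaling relationships are simple enough to state as code. The sketch below is illustrative only: the 400 I/Os per second figure is the example from the text, while the 30,000-hour single-disk MTTF is an assumed value chosen purely for demonstration.

    # Illustrative arithmetic for the two effects described above. The
    # per-drive I/O rate comes from the example in the text; the 30,000-hour
    # single-disk MTTF is an assumed figure for demonstration only.

    def aggregate_iops(n_drives, iops_per_drive):
        # Parallel access across members roughly sums their I/O rates.
        return n_drives * iops_per_drive

    def array_mttf(single_disk_mttf_hours, n_drives):
        # MTTF of a disk array = MTTF of a single disk / number of disks
        # (Patterson, Gibson, & Katz, 1988, p. 110).
        return single_disk_mttf_hours / n_drives

    print(aggregate_iops(10, 400))   # 4000 I/Os per second from 10 drives
    print(array_mttf(30000, 10))     # 3000.0 hours: reliability drops tenfold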
Patterson, Gibson, and Katz (1988) sought to address both performance and availability of data in their paper, "A Case for Redundant Arrays of Inexpensive Disks (RAID)," which provided a formal definition of RAID levels 1 through 5.

RAID Level 0: Disk Striping

While not officially part of the original five Berkeley RAID levels, RAID level 0 is commonly used to refer to disk striping without data redundancy, due to its similarity to the data striping used with parity RAID implementations. (Patterson, Gibson, and Katz (1988) noted that striping data was a desirable feature for any RAID level, including RAID level 1.) In the truest sense of the acronym RAID, level 0 is not redundant and therefore is not really RAID (see Figure 1). However, RAID level 0 is recognized by the RAID Advisory Board (RAB), and the sixth edition of The RAID Book by the RAB uses the terms striping and RAID level 0 interchangeably (Massiglia, 1997, p. 15).

[Figure 1. RAID Level 0 Striping. Adapted from "Redundancy is Good!" by Bernhard Kuhn, p. 60; copyright 2000 by Linux Magazine.]

RAID level 0 provides the highest performance of any RAID level, because there is no overhead to compute and write or read parity information. RAID level 0 also provides 100 percent data storage efficiency. RAID level 0 is often used in environments where storage efficiency and I/O performance are the ultimate considerations.

RAID level 0 exhibits several disadvantages. First, if any member of a striped array fails, the entire array is rendered unusable and all data is lost. Data will be unavailable from the array until a new member is installed and a complete restoration is performed. Second, striped arrays do not perform well when used with applications that read or write small amounts of sequentially located data, because the member disks spend most of their I/O time waiting for the disks to rotate to the starting location of the requested data. Third, applications that make synchronous requests for small amounts of nonsequential data can essentially create a deadlock for the disk arms and heads, or create an extremely large I/O request queue.

Performance of all striping variants is directly affected by the capabilities of the disk controller. If the disk controller can scatter read and gather write, then all of the requested data that maps to one disk can be read or written with one array member I/O request. Scatter reading is the ability to retrieve data blocks stored at consecutive disk addresses into nonconsecutive memory addresses (XIOTech Corporation, 1999, p. 5). Gather writing is the ability of the disk subsystem to write blocks of data stored at nonconsecutive memory addresses to consecutively addressed disk blocks (Massiglia, 1997, p. 86).
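The address arithmetic behind striping can be made concrete with a short sketch. This is a minimal model of round-robin striping, not any particular controller's layout; the chunk size and drive count are assumed parameters invented for the example.

    # Minimal sketch of round-robin RAID level 0 address mapping.
    # chunk_size and n_drives are assumed parameters, not values from the paper.

    def raid0_map(logical_block, n_drives, chunk_size):
        # Map a logical block number to (member drive, block on that drive).
        chunk = logical_block // chunk_size    # which chunk holds the block
        offset = logical_block % chunk_size    # position inside that chunk
        drive = chunk % n_drives               # chunks rotate across members
        stripe = chunk // n_drives             # full stripes laid down so far
        return drive, stripe * chunk_size + offset

    # Six sequential logical blocks spread across 3 drives, 2 blocks per chunk:
    for lb in range(6):
        print(lb, raid0_map(lb, n_drives=3, chunk_size=2))

Because consecutive chunks land on different members, a large sequential transfer engages every drive at once, which is the source of the performance noted above.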
RAID Level 1: Disk Mirroring

RAID level 1 provides data redundancy by writing data to a primary disk and also to a secondary (mirrored) disk; identical data is stored on both disks. RAID level 1 (mirrored) arrays can implement either parallel or independent access to each member; in practice, however, most RAID level 1 systems use independent access.

RAID level 1 has two main advantages. First, it provides high performance for read-intensive applications. With independent access, if one drive is busy, data can be accessed from the secondary disk, improving read performance; for read-intensive applications, performance can approach a multiple of that of a single disk. Second, RAID level 1 arrays maintain a complete copy of all information, allowing the system to remain available during a single drive failure. This makes RAID level 1 arrays very suitable for disk subsystems where reliability is paramount and the cost of storage is secondary.

Disadvantages of RAID level 1 include storage requirements and write performance. RAID level 1 imposes a 50 percent storage penalty, essentially requiring double the number of disks to store the data. For applications that create a large number of write requests, RAID level 1 arrays incur a significant write penalty amounting to a 25 percent slowdown, because all write requests must be performed on both the primary and the secondary drives. Doubling the number of writes results in offset write times and random positioning differences on the physical disks, impacted by rotational latency.

RAID-2: 2-Bit Interleaving with Hamming Error Correction

RAID level 2 was defined by Patterson, Gibson, and Katz (1988) as interleaved data and Hamming code information (p. 112). RAID level 2 arrays require disk drives with synchronized spindles (Murphy, 2005, para. 28). The specification for RAID …

RAID Level 50 (5+0): Striped RAID Level 5

RAID level 5+0 is often referred to as RAID level 50. It utilizes a number of RAID level 5 (distributed parity) arrays configured as a stripe set. RAID level 50 is often implemented to obtain better write performance than is possible with plain RAID level 5. RAID level 50 provides better redundancy than RAID level 5, because one drive in each RAID level 5 array can fail without impacting the availability of the RAID level 50 array (Solinap, 2001, para. 26). Storage efficiency of a RAID level 50 array incurs a 10 percent penalty per RAID level 5 set (assuming 10-drive sets), so the RAID level 5 arrays in a stripe set would have a storage efficiency of 90 percent.

RAID Level 100 (10+0): Striped RAID Level 10

RAID level 10+0 is often referred to as RAID level 100. It utilizes a number of RAID level 10 (striped mirror) arrays, also configured as a stripe set. The primary reason for RAID level 100 is to increase the number of drives, to improve I/O performance and to reduce the potential for disks with a high queue length (hot spots). RAID level 100 offers the best performance for writes and is well suited for I/O-intensive applications with both long and short transaction lengths.
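The storage-efficiency arithmetic follows directly from the per-set parity overhead, as the short sketch below illustrates; the 10-drive set size comes from the text, while the set count of three is an assumed example value.

    # Illustrative storage-efficiency arithmetic for nested RAID levels.
    # The set count is an assumed example value, not a figure from the paper.

    def raid5_efficiency(drives_per_set):
        # Each RAID level 5 set gives up one drive's capacity to parity.
        return (drives_per_set - 1) / drives_per_set

    def raid50_efficiency(drives_per_set, sets):
        # The stripe layer adds no parity of its own, so overall
        # efficiency equals the per-set efficiency.
        return (sets * (drives_per_set - 1)) / (sets * drives_per_set)

    print(raid5_efficiency(10))      # 0.9 -> a 10 percent penalty per set
    print(raid50_efficiency(10, 3))  # 0.9 -> unchanged by the stripe layer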
Techniques to Improve RAID Performance

Techniques to improve RAID performance go beyond hybrid designs such as RAID level 10 to address the way data is moved to and from the RAID set member drives. Two such techniques are caching and queue management. These techniques are not exclusive, so a given system may employ any combination of these approaches.

Caching

Caches are a very effective technology for improving the I/O performance of disk access and RAID array I/O. Parity RAID systems incur the largest write penalty, so they benefit the most from caching. Treiber and Menon (1995) reported that write caches could reduce disk utilization for writes by an order of magnitude when compared to basic RAID level 5 systems. Write caches are commonly used to work around the write penalty seen by parity RAID. Write caches come in two general forms: write-behind and write-back.

Write-behind caches temporarily hold data to be written, allowing the application to continue executing. The data in the write-behind cache is actually written to the target disk(s) at some later time, hopefully when the disk(s) would otherwise be sitting idle. Data can also be read from the write-behind cache while it is waiting to be written. Write-behind caches provide the most benefit if the RAID subsystem is heavily loaded. However, write-behind caches introduce a dangerous complexity into the RAID subsystem. Since the data in a write-behind cache is volatile, there is a danger that it may not actually be written to the disk(s), which could leave the RAID member disks in an inconsistent state. The fact that multiple disks must be updated for write requests in parity RAID arrays increases the risk of data loss or corruption.

Write-back caches are non-volatile memory that preserves unwritten data even through unexpected shutdowns. Write-back caches allow the use of alternative partial-stripe updates that can significantly increase the write performance of parity RAID (Massiglia, 1997, p. 187). Without a write-back cache, RAID subsystems must often perform extra writes to other array members when partial stripes are written. The write-back cache allows the RAID system to reduce the reads and writes by a factor of two during updates to data, improving the responsiveness and concurrency of the RAID array.

Unfortunately, many of the benefits of write-back caches do not carry through to heavy I/O load conditions. The cache must eventually be written out, and if the member disks are not idle, the disk queue length increases. The result is that the RAID array must still wait for the disk subsystem under heavy I/O loads (Kim, 2004). Write-back caches can still benefit the performance of RAID arrays even in heavy I/O load conditions. First, the write-back cache is likely to contain frequently updated disk blocks, saving I/O operations. Second, write-back caches can easily be used to hold the suspect parity flags used by the RAID array to guard against write holes; without a write-back cache, parity flags would normally require I/Os to mark each block (Massiglia, 1997, p. 190).
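The write penalty these caches work around comes from the parity update itself. As a generic illustration, not drawn from the paper, a small write to one block of a parity-protected stripe is commonly performed as a read-modify-write: read the old data and old parity, XOR both with the new data, and write the new data and new parity, four member I/Os for one logical write. A minimal sketch:

    # Generic read-modify-write parity update for a small write to one block.
    # This illustrates the well-known parity-RAID small-write penalty; it is
    # not a reproduction of any specific controller's algorithm.

    def xor_blocks(a, b):
        # Byte-wise XOR of two equal-length blocks.
        return bytes(x ^ y for x, y in zip(a, b))

    def small_write_parity(old_data, old_parity, new_data):
        # new parity = old parity XOR old data XOR new data.
        # On the array this costs two reads and two writes per logical write.
        return xor_blocks(xor_blocks(old_parity, old_data), new_data)

    old_data, old_parity, new_data = bytes([0b1100]), bytes([0b1010]), bytes([0b0101])
    print(bin(small_write_parity(old_data, old_parity, new_data)[0]))  # 0b11

A write-back cache lets the controller absorb repeated updates to the same stripe before any member I/O is issued, which is how it halves the reads and writes described above.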
Queue Management

Queue management seeks to improve RAID performance by queuing up large numbers of I/O requests, optimizing them, and then executing the requests in the most efficient manner. Five techniques are commonly employed to optimize I/O queues:

    Elevatoring
    Prefetching
    Concatenation
    Scatter/Gather
    Elimination

Elevatoring seeks to reorder I/O requests within the queue to make certain that those requests are processed as efficiently as possible. Absent elevatoring, the array member drive heads will likely move back and forth over a wide area across the platters to service each request. Elevatoring reorders requests with an emphasis on minimizing seek times, to reduce the mechanical action in the RAID array disk drives (Jin, Cortes, & Buyya, 2002, p. 189).

Prefetching is a technique used with sequential I/O operations. Intelligent caches can read ahead to bring blocks into the I/O system before they are actually requested (Jin, Cortes, & Buyya, 2002, p. 245). Prefetching significantly improves the performance of multiprocessor storage systems.

Concatenation allows several queued I/O requests to be linked together and processed as a single write request; this approach is also known as collective buffering (Jin, Cortes, & Buyya, 2002, p. 271). Concatenation minimizes the number of I/Os required to transfer the data and reduces the command overhead. The concatenated requests must be sequential or they cannot be linked.

Scatter/gather allows sequential data to be gathered from non-contiguous locations or requests. The data is then scattered into sequential locations on the RAID array member disks (XIOTech Corporation, 1999, p. 5). This also minimizes the I/Os issued to the array.

Elimination allows redundant write requests to be dropped before they are processed. For example, a write request for a specific disk location is received; shortly thereafter, another write request is received for the same location. As long as the first write operation has not been performed and no read request has been issued between the two write requests, the first write request is eliminated. Elimination allows the system to reduce the workload by not performing unnecessary I/Os (XIOTech Corporation, 1999, p. 6).
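Two of these optimizations are compact enough to sketch. The fragment below is a simplified illustration: the request dictionaries with "op" and "block" fields and the single-sweep elevator are invented for the example, and production schedulers are considerably more elaborate.

    # Simplified sketches of elevatoring and elimination as described above.
    # The request format and one-pass elevator are assumptions for illustration.

    def elevator(queue, head_pos):
        # One sweep upward from the head position, then back down, so requests
        # are serviced in platter order rather than arrival order.
        ahead = sorted((r for r in queue if r["block"] >= head_pos),
                       key=lambda r: r["block"])
        behind = sorted((r for r in queue if r["block"] < head_pos),
                        key=lambda r: r["block"], reverse=True)
        return ahead + behind

    def eliminate(queue):
        keep = []
        pending = {}  # block -> index in keep of the last unexecuted write
        for req in queue:
            if req["op"] == "write":
                if req["block"] in pending:
                    keep[pending[req["block"]]] = None  # earlier write is redundant
                pending[req["block"]] = len(keep)
            else:
                pending.pop(req["block"], None)  # a read pins earlier writes
            keep.append(req)
        return [r for r in keep if r is not None]

    q = [{"op": "write", "block": 7}, {"op": "write", "block": 3},
         {"op": "write", "block": 7}]
    print([r["block"] for r in elevator(q, head_pos=5)])  # [7, 7, 3]
    print(eliminate(q))  # earlier duplicate write to block 7 is dropped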
Storage Virtualization

Storage virtualization represents the next step in the evolution of RAID storage. Storage virtualization provides the ability to stripe across and utilize all available space in a centralized storage pool, enabling storage to be centrally managed and shared between nodes on a storage area network (SAN) (Flouris & Bilas, 2005, p. 103). Storage virtualization can occur at the filesystem level or at the block level. In its most basic form, virtualization is just an aggregation of available blocks of storage that is then partitioned into virtual disk volumes (Riedel, 2003, p. 39). Block-level virtualization is most commonly used (XioTech Corporation, 2005).

RAID systems are transitioning from internal storage to external storage devices located on a storage area network. Once located on a SAN, external storage can be accessed by many computers through Fibre Channel or iSCSI adapters and switch equipment (see Figure 5). Centralized storage allows the number of drives to be increased to improve performance and availability. Central pools of storage also allow global hot spare member drives instead of a separate hot spare for each server.

[Figure 5. Virtualized Storage.]

A SAN storage device virtualizes its disk members into a central pool of storage by striping across all drives. The layer of indirection provided by virtualization enables many additional functions, including interoperability of devices from different vendors, remote mirroring, read and write caching, and scaling to large numbers of drives (Riedel, 2003, p. 39). Special software on the appliance allows virtual disks to be defined within the pool of storage at any RAID level, facilitating the creation of hybrid RAID levels such as RAID level 10 that provide much better I/O performance.

[Figure 6. SAN Virtualization Internal Diagram. From "Storage Systems: Not Just a Bunch of Disks Anymore" by Eric Riedel, p. 40; copyright 2003 by Queue Magazine.]

The result is that every virtual disk has an extremely high spindle (drive) count, and high spindle counts mean that the virtual disk is capable of very high I/O rates. Virtualized storage also allows for numerous volume management functions, such as expansion while in production, snapshot copies of real-time volume data, and movement of virtual disks from one computer to another while the systems are running (Flouris & Bilas, 2005, p. 103).
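The indirection at the heart of block-level virtualization is essentially a table mapping virtual-volume extents onto (array, physical block) pairs. The toy model below illustrates that idea; the class names, fields, and values are invented for this sketch and do not reflect any vendor's metadata format.

    # Toy model of block-level virtualization: a virtual volume is a table of
    # extents pointing into a shared pool. Structures are illustrative only.

    from dataclasses import dataclass

    @dataclass
    class Extent:
        array_id: int  # which backing RAID array in the pool
        start: int     # first physical block of the extent on that array
        length: int    # number of blocks in the extent

    @dataclass
    class VirtualVolume:
        extents: list  # ordered extent table for the virtual disk

        def resolve(self, vblock):
            # Translate a virtual block number to (array_id, physical block).
            base = 0
            for e in self.extents:
                if vblock < base + e.length:
                    return e.array_id, e.start + (vblock - base)
                base += e.length
            raise ValueError("virtual block beyond end of volume")

    # A volume whose space comes from two arrays in the pool; growing the
    # volume in production is just appending another extent to the table.
    vol = VirtualVolume([Extent(0, 1000, 512), Extent(1, 2048, 512)])
    print(vol.resolve(600))  # (1, 2136)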
Conclusions

RAID data storage has evolved considerably over the last three decades. The advent of RAID brought about many improvements in how data is handled. Striping data across a set of disk drives was a key concept for improving system performance, because more of the disk drives' actuators were available to process read/write requests. Mirroring (RAID level 1) provided fault tolerance by maintaining copies of data on more than one drive. Striping with parity (RAID levels 3, 4, and 5) provided the improved performance of striping with some fault tolerance, albeit with a substantial write penalty. The most advanced storage techniques now utilize hybrid RAID techniques such as RAID level 10 to gain the improved performance of striping with the fault tolerance of mirroring.

Storage area networks (SANs) and storage virtualization represent the future of RAID storage. Removing the storage subsystem from the server allows for centralized storage with extremely high performance, ease of management, storage efficiency, and flexibility. Storage virtualization at the block level facilitates extremely high spindle counts and very high I/O rates. The combination of the SAN and virtualized storage will allow computer systems to continue to grow, and will allow the management of tremendous amounts of storage well into the future.

References

Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the AFIPS 1967 Spring Joint Computer Conference, 30, 483-485.

Baek, S. H., Kim, B. W., Joung, E. J., & Park, C. W. (2001). Reliability and performance of hierarchical RAID with multiple controllers. Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing, 246-254. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=384036&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Clark, T. (1999). Designing Storage Area Networks. Upper Saddle River, NJ: Addison-Wesley.

Denehy, T. E., Bent, J., Popovici, F. I., Arpaci-Dusseau, A. C., & Arpaci-Dusseau, R. H. (2004). Deconstructing storage arrays. Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 59-71. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=1024401&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Flouris, M. D., & Bilas, A. (2005). Violin: A framework for extensible block-level storage. Proceedings of the 22nd IEEE/13th NASA Goddard Conference on Mass Storage Systems, 103-115. Retrieved February 26, 2006, from http://storageconference.org/2005/papers/12_flourism_violin.pdf

Gibson, G. A. (1992). Redundant Disk Arrays: Reliable, Parallel Secondary Storage. Cambridge, MA: The MIT Press.

Gibson, G. A. (1995). Tutorial on storage technology: RAID and beyond. Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, 471. San Jose, CA: ACM Press. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=223884&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Jin, H., Cortes, T., & Buyya, R. (2002). High Performance Mass Storage and Parallel I/O. New York, NY: John Wiley & Sons, Inc.

Katz, R. H., Gibson, G. A., & Patterson, D. A. (1989). Disk system architectures for high performance computing. Proceedings of the IEEE, 77, 1842-1858.

Kim, Y. (2004, January). Best Practice for Optimizing Microsoft SQL Server 2000 on Magnitude 3D (WP070274-0404). Retrieved February 25, 2006, from http://www.xiotech.com/xioapp/resources/pdf/partner/wp070274-0404_Mag3D-SQL2000.pdf

Kuhn, B. (2000, October). Redundancy is good! Linux Magazine, 10, 58-61. Retrieved February 26, 2006, from http://www.linux-magazine.com/issue/01/RAID_Basics.pdf

Lelii, S. R. (2000, December 4). Technology thwarts RAID data loss. eWeek, 17(49), 52. Retrieved February 25, 2006, from Business Source Premier (3846035).

Massiglia, P. (1997). The RAID Book (6th ed.). St. Peter, MN: The RAID Advisory Board.

Mellish, B., Albrecht, B., Fidelis, O. H., & Struzinski, M. (2002). Fault Tolerant Storage: Multipathing and Clustering for Open Systems for the IBM ESS (1st ed.). San Jose, CA: IBM Corporation.
Murphy, I. (2005, September 1). Making sense of RAID. Retrieved February 25, 2006, from Storage Times and PatroMark Publishing Web site: http://www.enterprisetimes.com/st/f_st_09_2005_01.asp

Patterson, D. A., Gibson, G., & Katz, R. H. (1988). A case for redundant arrays of inexpensive disks (RAID). ACM SIGMOD International Conference on Management of Data (pp. 109-116). Chicago, IL: ACM Press. Retrieved February 8, 2006, from Carnegie Mellon University Web site: http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf

Reddy, A. L., & Banerjee, P. (1989, December). An evaluation of multiple-disk I/O systems. IEEE Transactions on Computers, 38(12), 1680-1690. Retrieved January 26, 2006, from http://csdl.computer.org.ezproxy.umuc.edu/dl/trans/tc/1989/12/t1680.pdf

Riedel, E. (2003, June). Storage systems: Not just a bunch of disks anymore. Queue, 1(4), 32-41. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=864059&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Solinap, T. (2001, January 24). RAID: An in-depth guide to RAID technology. Retrieved February 8, 2006, from SL Central Web site: http://www.slcentral.com/articles/01/1/raid

Stodolsky, D., Holland, M., Courtright, W. V., & Gibson, G. A. (1994, August). Parity logging disk arrays. ACM Transactions on Computer Systems, 12(3), 206-235. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=185516&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Treadway, T. (2005, November 7). A tale of multiple RAID-6s. Retrieved February 25, 2006, from http://storageadvisors.adaptec.com/2005/11/07/a-tale-of-multiple-raid-6s/

Treiber, K., & Menon, J. (1995, January). Simulation study of cached RAID5 designs. Proceedings of the International Symposium on High Performance Computer Architecture (pp. 186-197). Raleigh, NC.

Varma, A., & Jacobson, Q. (1995). Destage algorithms for disk arrays with non-volatile caches. Proceedings of the 22nd Annual International Symposium on Computer Architecture, 83-95. Retrieved January 26, 2006, from ACM Press Web site: http://portal.acm.org.ezproxy.umuc.edu/ft_gateway.cfm?id=224042&type=pdf&coll=portal&dl=ACM&CFID=66779999&CFTOKEN=11545385

Western Digital Corporation. (2005, April 12). Answer ID 942: What are the benefits of Serial ATA vs. SCSI for enterprise storage? Retrieved February 25, 2006, from http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=942

XIOTech Corporation. (1999, September). REDI Storage Manager - The Magnitude's Virtualization Cornerstone (070020-000 Rev. A). Eden Prairie, MN: XIOTech Corporation.

XioTech Corporation. (2005, February). Storage Virtualization: How Much is Enough? (WP070337). Retrieved February 25, 2006, from http://www.xiotech.com/XioApp/Resources/PDF/whitepapers/wp070337_Virtualization.pdf

Youssef, R. (1995, August). RAID for mobile computers. Master of science thesis, Carnegie Mellon University. Retrieved January 26, 2006, from http://www.pdl.cs.cmu.edu/PDLFTP/MOBILE/thesis.pdf