Contingency Planning and Disaster Recovery
A Small Business Guide

Donna R. Childs
Stefan Dietrich

John Wiley & Sons, Inc.

This book is printed on acid-free paper.

Copyright © 2002 by John Wiley & Sons. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, 201-748-6011, fax 201-748-6008, e-mail: permcoordinator@wiley.com.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or
other damages.

For general information on our other products and services, or technical support, please contact our Customer Care Department within the United States at 800-762-2974, outside the United States at 317-572-3993 or fax 317-572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data:

Childs, Donna R.
Contingency planning and disaster recovery : a small business guide / by Donna R. Childs, Stefan Dietrich.
p. cm.
Includes bibliographical references and index.
ISBN 0-471-23613-6 (cloth : alk. paper)
Emergency management. Small business—Planning. I. Dietrich, Stefan, 1963— II. Title.
HV551.2.C45 2002
658.4'77—dc21 2002031115

Printed in the United States of America.

ABOUT THE AUTHORS

Donna R. Childs is the founder, president, and chief executive officer of Childs Capital, LLC, a Wall Street firm dedicated to poverty alleviation through economic development. She holds a B.S. from Yale University, an M.A. in International Economics and Finance from Brandeis University, and an M.B.A. from Columbia Business School. Prior to establishing Childs Capital, she had 15 years of experience in finance and risk management. She began her career as a research associate in the finance department at the Harvard Business School, was an investment banker in the financial institutions group of Goldman, Sachs & Company, and, more recently, was a director and member of senior management of the Swiss Reinsurance Group in Zurich, Switzerland. A recognized authority on risk finance, Ms. Childs was the associate editor of Risk Financier and a frequent speaker at reinsurance industry conferences.

Stefan Dietrich majored in Computer Solutions and Aerospace Engineering as an undergraduate, and received a diploma and doctorate, summa cum laude, from the University of Stuttgart in Germany. He was a lead developer of the hypersonic aircraft program of the German National
Aerospace Establishment in Göttingen, Germany, and then served as the lead developer for one of the U.S. National Science Foundation’s “Grand Challenge” supercomputer projects undertaken at Cornell University, then one of the world’s largest and most complex computer systems. As a senior executive at Deutsche Bank, Dr. Dietrich contributed to the disaster recovery and contingency planning for one of the largest trading floors in Europe as a consequence of the bomb attack on Bishopsgate in London. Most recently, he served as the chief operating officer and executive vice president of a technology start-up company in New York City. He currently advises small businesses with respect to their information technology infrastructures and disaster recovery procedures.

CONTENTS

Acknowledgments
Assignment of Authors’ Royalties
Preface
Introduction
Chapter 1: Preparation
Chapter 2: Response
Chapter 3: Recovery
Chapter 4: Sample IT Solutions
Epilogue
Appendix: Basic Safety Practices
Resources
Glossary
Index

GLOSSARY

Hacker: An individual whose primary aim is to penetrate the security defenses of sophisticated computer systems. “Benign hackers” are relatively rare. Most unauthorized access can be traced with intrusion detection systems.

IDS: An intrusion detection system inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system.

Insured: A person or corporation entitled to receive benefits under the terms of an insurance policy. The named insured refers to the policyholder.

Internet: A global network connecting millions of computers, created as ARPANET in 1969 by the U.S. Department of Defense’s Advanced Research Projects Agency (ARPA), to be a “self-healing” communications network in the event of a serious attack on the United States. That makes it valuable for disaster recovery purposes. Even during the September 11, 2001, World Trade Center attack, although countless
communication lines were severed, the Internet in Manhattan was completely functional. The Internet evolved from this original network and is now controlled by regular businesses that provide the fast backbone communication lines. Over 100 countries are directly connected to this network, which is decentralized by design. Each Internet computer, called a host, is independent. Its operators can choose which Internet services to use and which services to make available to the global Internet community. You use an ISP to connect to the Internet.

ISDN: Integrated services digital network, an international communications standard for sending voice, video, and data over digital telephone lines or normal telephone wires. The 64-kbps data line is the base connection for all telephone and data communication lines. When you order ISDN service, you typically get two of these lines, both running over the same telephone wire, allowing you to access the Internet at 128 kbps. Or you can use one line for voice and the other for data. You need to have an ISP that accepts connections via ISDN. ISDN is still a good solution in areas where DSL or cable TV modems are not yet available and high-bandwidth data lines are too costly.

ISP: Internet service provider, a company that provides access to the Internet. For a monthly fee, the ISP provides you with dial-up or a fast modem configuration, software that configures your system, and a username and password for authentication purposes. The ISP also typically provides you with an e-mail address and often access to your personal web service. We strongly recommend not using either of them. If your ISP service fails in a disaster, or you simply want to change your ISP, you are out of luck, because these services are typically only accessible when connected through that particular ISP.

Key person insurance: Key person insurance pays a benefit in the event of death or incapacitation of an owner or “key” employee of a business.

Latency: The time it
takes for a theoretical zero-length data packet to move from source to destination across a network connection. While a packet is being sent, there is “latent” time, where the sending computer waits for a confirmation that the packet has been received. Latency and, for large data packages, network bandwidth are the two factors that determine your connection speed.

LCD: Liquid-crystal display. LCDs are thin and flat displays that are used in a variety of products, from small portable devices and laptop computer screens to desktop displays. LCDs use much less power than tube monitors, and can therefore extend the time a system can run on batteries. An LCD has a lifetime about twice that of a regular monitor and thus fails less frequently.

Leased line: An “always-on” network connection between two points set up by a phone company. It can transfer both voice and data signals. Such connections are used by large companies to connect their worldwide offices, and allow them to make phone calls and exchange data within the company at a fixed price per year depending on the required bandwidth and the work required at both end locations. Today it is often replaced with a VPN connection over the Internet. The Internet, however, does not have a guaranteed bandwidth on which you can depend, whereas leased lines do. If you live in a major metropolitan area, you can often get leased lines between two offices within the city at attractive prices.

Liquidity: Liquidity refers to ready availability of cash and cash equivalents. A company that has ready access to cash, for example, is liquid. A company that has little cash and must sell assets, such as real estate, to generate cash is said to be illiquid. In recovering from a disaster, a liquid business is at an advantage because it does not have to sell illiquid assets to generate cash to cover disaster-related expenses.

Linux: Linux is an operating system developed by Linus Torvalds, whose source code was freely distributed, and many people contributed
to it. While initially an operating system for computer enthusiasts, it is now widely accepted as a cost-effective substitute for other operating systems, especially for server platforms. Today, you can buy IBM servers with Linux preinstalled, unthinkable 10 years ago. Linux runs on a wide variety of processors from different manufacturers, such as Intel and Motorola.

Media: The items that store computer data, either fixed, built-in, or removable. Examples are hard disk, diskette, CD, ZIP, magnetic tapes, and so on. Each of these has its own community of fans, so you will find varied opinions on when to use which media. For data backups you should note that CDs have the longest storage time of up to 100 years for archival CDs. Consumer CDs last about 30 years.

Mirroring: When you mirror data, you write the same data to two or more devices at the same time. Disaster preparedness often calls for this additional resilience. You can do this manually, or you can use RAID systems that automate the mirroring process.

Modem: Stands for modulator-demodulator. It allows the transfer of digital data over regular analog telephone lines. Modems are rather slow, but offer the advantage that phone lines are maintained by the phone companies with high priority in case of disasters, and modems work virtually anywhere in this country and around the world. Transferring data at high modem speeds (e.g., 56 kbps) requires a high-quality connection.

Monoline policy: A monoline insurance policy provides a single line of insurance, such as liability or automobile insurance.

Murphy’s Law: The original Murphy’s Law reads: “If there are two or more ways to do something and one of those ways can result in a catastrophe, then someone will do it.” The term originated with E.A. Murphy, Jr., who was working for the U.S. Air Force in 1949 and made this statement with regard to an experiment that he was working on when many unlikely failures occurred until he finally succeeded. The term “Murphy’s Law” spread
quickly within the aerospace community and is today often used to highlight the possibility that an unlikely event can occur.

Network: A network is two or more connected computers that are able to exchange data. Typically, you will connect today’s computers via 100-Mbps Ethernet cables.

NFIP: National Flood Insurance Program. Insurance coverage for floods provided by the Federal Emergency Management Agency.

Nonowned automobile coverage: An insurance policy that covers the liability of a business for any damage caused when employees of the company use their personal automobiles for business purposes.

OEM: Original equipment manufacturer. Many PC sellers buy the same OEM products, but build them into different housings labeled with their own brand name. Therefore, it does not really matter if you prefer buying PCs from Dell, HP, Gateway, Compaq, or other brands. Inside the box, they are very much the same. Your choice of the PC supplier should depend on your individual needs, such as for technical support or specific usage requirements.

PABX/PBX: Commonly called a phone system, but the abbreviation stands for private automated branch exchange. It is a unit that allows you to make internal phone calls and share your actual phone lines among various employees. If you have one, you need to think about a UPS unit to ensure that some basic functionality will be available if you have an electrical power outage.

Package policy: A package policy combines two or more monoline insurance policies to cover two or more lines of insurance for a single policyholder.

Partition: Before you format your hard drive, you need to decide into how many logical sections you want to divide it. This is important for performance reasons, and you might have different file systems on different partitions. Each partition shows up as a separate logical disk drive.

Peril: A peril is a cause of loss, such as fire or earthquake.

Physical security: You need to have protection measures in place that will safeguard
your assets in case you have to evacuate your building and are not certain when you will be able to return or who will have access to your files. There are various forms of equipment that protect your assets against fires, tampering, theft, or vandalism.

Policyholder: The person or business whose name appears on the insurance policy.

Premium: The payment a company makes to obtain insurance coverage for certain risks.

Property insurance: Property insurance protects the physical assets of a business against the risk of fire, theft, and other perils.

Professional liability insurance: Professional liability insurance provides coverage against claims of malpractice or negligence brought against professionals, such as physicians, engineers, lawyers, or architects, as they render their professional services.

RAID: Redundant array of independent disks. RAID allows you to store your data automatically over various physical hard disks. Logically, however, these disks will only appear as one drive to the user. RAID systems can be used for performance enhancements, but they are typically used to protect data from hard disk failures. If a hard disk fails, you can simply exchange it, and your RAID system will rebuild the data on that hard drive with the information from the other disks. RAID systems have been in use for a long time by data centers, but in the past couple of years, low-cost PC cards have been introduced that allow you to have your own inexpensive RAID system in your PC.

RAM: Random access memory. The memory of the computer that holds temporary data accessible at high speeds. You want to have sufficient RAM for your particular use of your PC for optimal performance. With too little RAM, your computer is really slow because it has to write and read data from the hard disk. When buying a computer, it is generally a good idea to double or triple the RAM that is offered in its original configuration.

Recovery: The process by which utility programs and disk software tools can “undelete” files that
have been deleted accidentally or have been lost due to a hardware or software issue. It is a time-consuming process with an uncertain outcome, and you should not rely on it. Rely on your backups instead.

Resilience: Resilience refers to the ability of a system to withstand adverse conditions and remain stable.

Restore: We have seen many people who made daily backups, but never attempted to retrieve their stored data. It can indeed be a little tricky if you are using complex backup software, or certain tape drives. Therefore, we recommend that small businesses use hard disks as backup media that hold an exact copy of the original disk. Occasionally, a backup can be made from those hard disks in the form of a disk image that does not use a proprietary format. Restoring files should always occur in a temporary space first, until you can check that all data have been correctly restored. Otherwise, you might overwrite important changes that you recently made to your original file system.

Rider: A document referenced in the insurance policy that amends the original policy.

Router: A device that determines if data on the network are intended for an outside network, such as the Internet, and if so passes them on to it. You can use routers for access control, auditing, and keeping statistics on your network traffic. Consumer and small business units most often include firewall functionality.

Server: A computer that shares the information stored on it with other computers in the network. There are many types of servers, such as mail servers, web servers, and so forth. Most of them belong in a professionally managed computer center. In most instances, a small business needs only a file server on which to back up data. The higher price of a server configuration compared with a desktop PC is usually justified. Servers often use dual processors, high-bandwidth bus systems, and fast hard disks.

Shareware: Shareware refers to programs that are free to download from the Internet
and that usually come with an evaluation period of about 30 days. After that time, if you like the program and would like to continue to use it, you can purchase the full version, often directly over the Internet. Choose and try as many programs as you like, and then decide which one works best for you. Shareware programs are usually best when you look for small system utilities, such as file synchronization tools. You may wish to look at www.tucows.com or www.shareware.com.

Special form coverage: An insurance program that provides basic and broad form coverages and covers other losses that are not specifically excluded from the policy.

Software inventory: A detailed list of all software licensed to the organization, cataloging the license numbers, program name, version/release number, cost, locations of installation, and the employees authorized to use this software. The software inventory should be part of a large asset control mechanism. You will need the software inventory for auditing purposes and to claim insurance benefits in the event of disaster.

Stability: A computer that is unreliable because it does not operate in a stable manner is a nightmare for users and system administrators alike. Computers become unstable for a variety of reasons caused by either software or hardware. You would see the system simply crashing, or freezing, or hanging in an infinite loop. Typically, you have to restart the system and you lose all the changes that you made on a document since the time you last saved it. It is a good idea to use only operating systems that are used in large deployments, and install only compatible applications.

Surge suppressor: An electrical device that protects electronic equipment from surges in electricity. It contains a fast-reacting circuit breaker, and usually you must replace the whole unit after a surge event.

System administrator: The individual who manages a computer system to provide services to users on a day-to-day basis. It is not a good idea to use
generic or built-in administrative accounts. You should always use administrative rights that you assign to user IDs to allow auditing of who made which changes.

T-1: A dedicated data line that transfers digital signals at 1.544 Mbps. A T-1 line can support about 50 people browsing the Internet, depending on usage, fewer if you are running a busy web server internally. A T-1 connection is about five times the price of a similar DSL connection. However, the greater stability and reliability justify the extra expense. In rural areas, T-1 lines are assessed surcharges according to the distance to the next data communication center. Often you have the opportunity to share a T-1 line with several businesses around you, called fractional T-1 access, something that might be a cost-effective solution for your small business.

T-3: A T-3 is about 30 times faster than a T-1 line. It supports a data transfer rate of about 44 Mbps and bundles 672 individual data channels at 64 kbps each. Large companies and Internet backbone providers use these lines.

Tape drive: Tape drives are primarily used for backing up data. The drive acts like a tape recorder, reading data from the computer and writing it onto the tape. Since tape drives have to scan through lots of tape just to read small amounts of randomly scattered data, they are slow for retrieving specific data. This is why they are used almost exclusively for data backup. However, reasonably fast tape drive devices are fairly expensive, so a tape drive makes sense only if you are storing hundreds of gigabytes of data.

Umbrella policy: An umbrella policy provides excess liability protection to a business and pays a benefit to the insured only when the limits of the basic, underlying insurance policy are exhausted.

Unauthorized: An insurance company not licensed in a state or jurisdiction is an “unauthorized” or “unlicensed” or “nonadmitted” insurer.

UNIX: The UNIX operating system was created in the 1960s at Bell Laboratories. It became
popular in the 1980s for scientific computing. Since Internet hosting is often done on UNIX machines, the platform gained popularity in the 1990s. There are a variety of UNIX derivative operating systems available. They all are known for their excellent performance and stability.

UPS: Uninterruptible power supply. You should have at least one UPS unit that ensures that a critical piece of hardware has continuous power during a power outage. The UPS unit will initiate an orderly shutdown of the hardware shortly before its battery is depleted.

Version control: Version control has been used by software developers for decades. Now it is also often used in companies for documents written by a group of people. The advantage is that you can always roll back to an earlier document state because version control systems store the changes that you made to a base document. From time to time, you want to rebuild the base document to incorporate all changes to date and to reconcile conflicting changes that might have been made independently by two or more work groups.

Voice mailbox: It works like an answering machine, but the message is digitally recorded by a third party and sent to your e-mail address via the Internet. It is essential that you have at least two such services in place so that you can listen to your messages from any Internet terminal, even if your office has been destroyed in a fire, for example.

VPN: Virtual private network. It is an emulation of a private network over the Internet using sophisticated authentication and encryption methods combined with a “tunneling” network protocol.

Workers’ compensation: Workers’ compensation provides a benefit to workers who have experienced a job-related accident or illness. The insurance pays for medical costs and disability income to the injured workers as well as death benefits to the dependents of a worker whose death was job-related.

ZIP: In the context of this book we use the term “ZIP” to refer to
a product from Iomega. The company makes a removable storage device called a Zip drive. It holds 100- and 250-MB Zip disks, and has a wide distribution. Zip drives are less frequently used for backup, but are often used for transferring large files or to keep data stored at a secure location when not in use.

INDEX

A
Agent, 87
All-risk policies
  definition, 76
  misnomer, 180
American Express, xviii
American Red Cross
  counseling assistance, 128–129
  data on small business recovery, xxiii
  safety resources, training courses, 211–12
Anthrax
  definition, 17
  exposure routes, 17–18
  symptoms of exposure, 18
Aon Corporation, xviii
Apple Computers, 44
Arbitration, 158–159
Assessment process, 11

B
Bank of New York, xxii
Basic contingency, 192
Basic form coverage, 76
Boiler and machinery insurance, 135
Broad form coverage, 76
Broker, 87
Business assets
  definition, inventory, 75
Business interruption insurance
  definition, 85–86
  endorsement, 85
  lessons, 174–180
  securing coverage, 157–163
Business owner’s policy, 74

C
Cables, 100
Catastrophe, 108
Cellular phones, 53–54
Centers for Disease Control (CDC), 19
Cipro®, 19
Compaq, 44
Consultant, 87
Corporate Response Group, xxv
Credit bureaus, 21–22
Critical functionality, 38

D
Dell, 44
Demilitarized zone, 69
Dianne Sawyer, 60
Director’s and officer’s liability (D&O) insurance, 82–83
Disaster
  definition, xxiii
  frequency, major disasters, 228–232
  relief programs, 119–131
  supplies kit, 225
  unemployment assistance, 128
Documents
  classification, 62–63
  version control, 98–99
Dow Jones, xix

E
Earthquakes, 220–222
Economic Injury Disaster Loans, 122–128
Economics of the insurance cycle, 164–172
Emotional responses, 182–190
Employee displacement, 118–119
Environmental hazards
  definition,
  preparation for, 59–64
  queries, 30
  recovery from, 148–149
  response to, 102–103
Equifax, 22
Equipment failures
  data backups to prevent, 39–42
  definition,
  equipment failures, 29
  equipment quality, 44–48
  network reliability, 43–44
  preparation for, 35–38
  recovery from, 144–146
  response to, 99–101
Excess and surplus lines, 89

F
Federal Emergency Management Agency (FEMA)
  declaration of disaster, 118–120
  notification,
Federal National Mortgage Agency (Fannie Mae), 120
File synchronization tool, 35–36
Fires
  definition,
  evacuation procedure, 214–215
  preparation for, 64–68
  queries, 30
  recovery from, 149
  response to, 103
  safety practices, 215–216
Firewall, 69–70
Floods, 216–217
Foreclosure, 120
Foreign Corrupt Practices Act of 1977,
FreeBSD, 35

G
Gateway, 44

H
Hacker attacks
  protection against, 68–70
Hard disk replacement warranty, 42
Heat waves, 218–220
Hewlett-Packard (HP), 44
High-availability (HA) configuration, 37
Hired automobile coverage, 81
Home office
  insurance, 81
  IT setup, 194–199
Housing and Urban Development (HUD), 130
Human errors
  data backups, 33–36
  definition,
  preparation for, 31–32
  queries, 29
  recovery from, 143–144
  response to, 98–99
  user training to reduce, 32–33
Hurricanes, 217–218

I
IBM, 44
Implicated policies, 113–115
Important functionality, 38
Individual and Family Grant Program, 124
Information technology
  assets, 15
  disaster-related losses, xviii–xxi
  inspection of, 109
  sample solutions, 191–208
Info-stress, 19
Institute for the Future, 19
Internet Service Provider (ISP), 49–52
ISDN, 57

J
J.P. Morgan Chase Manhattan, xxii
Johnson & Johnson, xxiii
JustGive, xiii

K
Key person insurance, 83

L
LCD display, 46
Leasehold insurance, 181
Liquidity management, 155–156
Linux, 35, 40
Loss mitigation, 104–106

M
Mail handling procedures, 16–26
  mailing lists, 21
  mail preference services, 20
  postal meters, 22
  x-ray screening and, 25
May Davis Group, xix, xxi
Mean-time-before-failure (MTBF), 45
Military Reservist Economic Injury Disaster Loan, 125
Mold,
Morgan Stanley, xix
Murphy’s Law, 57–58

N
National Association of Women Business Owners (NAWBO), 88–89
National Federation of Independent Business (NFIB), 88–89
National Flood Insurance Program, 89
Negligence, 79
Network administrator, 71
New York City Partnership, xvii
New York Property Insurance Underwriting Association, 90
Non-owned automobile coverage, 80–81
Nonprofit sector, xxiv
Notice, 104–107

O
Occupational Safety and Health Agency (OSHA),
Oppenheimer Funds, xx
Optional functionality, 39
Original equipment manufacturer (OEM), 47

P
Password
  user accounts, 63–64
  security, 71
Pentium chip, 46
Perils
  definition, 75
  endorsements, 76
PowerPC chip, 46
Property-casualty insurance
  definition, 76, 79
  endorsements to, 76
Property restoration, 190

R
Reconstructing insurance policies, 115–118
Redundant array of inexpensive disks (RAID)
  alternative methods, 41
  configuration, 40
  mirroring functionality, 40–41
Remote operations
  stage one, 60–62
  stage two, 67–68
Replacement cost, 78
Resolving insurance disputes, 151–157
Resources, 233–238
Restricted access documents, 63
Risk management, 90–92
Robust contingency, 192

S
Sabotage
  definition,
  queries, 30
  recovery from, 150
  response to, 103
Securing coverage, 110–113
Securities and Exchange Commission (SEC), xxi
Service Corps of Retired Executives (SCORE), 87–88
Small Business Administration
  disaster criteria, 121
  Economic Injury Disaster Loans, 122–128
  employment capacity, xxii
  Individual and Family Grant Program, 124
  lending criteria, 126–128
  Military Reservist Economic Injury Disaster Loan Program, 125
  regional offices, 123–124
Social Security Administration, 114
Space shuttle, 37
Stakeholder communication, 131–134
Standard market, 89
Strand Bookstore, 52
Sun Microsystems, 44
Supply chains, xxii
System partition backup, 41
System security
  document retrieval, 62–64

T
T–1 circuit, 58
Terrorism
  definition,
  queries, 30
  recovery from, 150
  response to, 103
Third-party failures
  definition,
  electrical power, 55–56
  preparation for, 49–52
  queries, 30
  recovery from, 146–148
  response to, 102
  telephone service, 52–54
Thunderstorms, 223–224
Time deposits, 125
Tornadoes, 222–223
Trans Union, 22
Triggering event, 159
Trojan horse, 70
TRW, 22
Tylenol®, xxiii

U
Unemployment assistance, 128
UNIX operating system, 35, 40, 48
Uninterruptible power supply unit (UPS), 56
U.S. Food and Drug Administration (FDA), 18
User training, 32–33

V
Version control system, 98–99
Virtual Private Network (VPN), 62
Voice mail, 54

W
Windows 2000
  suitability, 35
  robustness, 48
  VPN functionality, 62
Windows XP
  suitability, 35
  robustness, 48
  VPN functionality, 62
Workers’ compensation insurance
  definition, 81
  reducing costs, 82
World Trade Center
  attacks on, xv
  economic losses following terrorist attacks, xvii
  small businesses and, 12

... to have contingency and disaster recovery plans in place and to make them available to their regulators. The contingency and disaster recovery plans of mutual fund companies, for example, are... that, to put in place an appropriate contingency and disaster recovery plan. Before we continue, we must be clear about what we mean by disaster. We define a disaster as an event that disrupts business... definition of small business) at Ground Zero and about 34,800 small businesses in Lower Manhattan. This affects all of us: according to the Small Business Administration, small businesses collectively