24 Part I: Oracle Database Security New Features

A plethora of books are available on the market discussing cryptography (my personal favorite is Bruce Schneier’s Applied Cryptography: Protocols, Algorithms, and Source Code in C, John Wiley & Sons). The mathematics involved and the issues and nuances of cryptography are staggering in number and complexity and well beyond the scope of this book. Fortunately, you don’t need to understand all aspects of encryption. This chapter defines only what you need to know to make the critical decisions about how and when to use encryption within the database.

Encryption Choices

Although data can be encrypted in many ways, there are fewer ways to do it effectively. Many people are inclined to write their own encryption, just as Julius Caesar did. However, unless they are geniuses or very lucky, chances are their encryption will be poor. Today, effective encryption implies the use of standard and proven encryption algorithms. The proven part is important because it ensures that the encryption doesn’t have some fatal flaw that would allow an unauthorized person to determine the contents of the sensitive data.

Since you want to use standard encryption algorithms, you have quite a few from which to choose. Before you start picking algorithms to use in the database, you need to understand a little more about how encryption works.

The Algorithm and the Key

To encrypt data, two things are required: an encryption algorithm and an encryption key. The high-level description of encrypting data is quite simple: plaintext data is fed into the encryption algorithm. An encryption key is also provided. The algorithm uses the key and very sophisticated logic to encrypt the data. The process of decryption is analogous. It also requires a key and an algorithm.

Figure 2-1 illustrates how basic symmetric key encryption works. A plaintext message, “Applied Oracle Security,” is encrypted using an algorithm and a key.
To recover the original message, the same key and algorithm must be used.

The overall strength of the encryption is not determined by just the choice of algorithm or the key size. The strength is determined by the combination of the two. A common misconception is that an algorithm using a larger key must be stronger than another algorithm that uses a smaller key. Some algorithms demand larger keys to make them as strong as other algorithms that use smaller key sizes. Within a single algorithm, however, a larger key generally does make the encryption stronger.

By studying Figure 2-1, you may see a challenge to effective encryption. If Caesar is sending General Suchnsuch an encrypted message, the general needs to know both the algorithm and the key that was used to encrypt the message. Studies of cryptography have shown that with today’s algorithms, the only thing that needs to remain a secret is the key. Public knowledge of the algorithm doesn’t aid the attacker in recovering the sensitive data. Obscuring the algorithm may seem like good security, but it’s only a nuisance to a determined attacker.

NOTE
This point can’t be overemphasized: Many cycles are wasted on “protecting the algorithm” in the real world. If knowledge of the algorithm is sufficient to break your code, your algorithm isn’t an encryption scheme at all.

Chapter 2: Transparent Data Encryption

Symmetric Key Encryption

Two categories of encryption are used today: symmetric key encryption and asymmetric key encryption, also called public key encryption (PKE). The algorithms for symmetric key encryption use the same key for both the encryption and decryption process; they are symmetric! A message encrypted with one key can be decrypted only with that exact same key. Symmetric key algorithms are very secure and very efficient at encrypting and decrypting data. Some popular examples are RC4, RC5, DES, triple-DES (3DES), and the new Advanced Encryption Standard (AES).
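To make the idea of one key working in both directions concrete, here is a minimal Python sketch built on the Caesar shift mentioned earlier. It is a toy under stated assumptions, not something to deploy: the function names (caesar, encrypt, decrypt) are our own, and the cipher has only 26 possible keys, so anyone who knows the algorithm can simply try them all.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text, key):
    """Shift each letter by `key` positions; other characters pass through unchanged."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def encrypt(plaintext, key):
    return caesar(plaintext, key)


def decrypt(ciphertext, key):
    # Symmetric: decryption applies the same key in the opposite direction.
    return caesar(ciphertext, -key)


message = "APPLIED ORACLE SECURITY"
key = 3

ciphertext = encrypt(message, key)
recovered = decrypt(ciphertext, key)   # same key: the original message comes back
wrong_key = decrypt(ciphertext, 4)     # different key: gibberish

# The algorithm is public, so the only protection is the size of the key space.
# Here an attacker needs at most 26 guesses.
brute_force = [decrypt(ciphertext, guess) for guess in range(26)]

print(ciphertext)
print(recovered)
print(wrong_key)
```

With only 26 keys, trying every one is trivial, which is the fate of most home-grown schemes. A 128-bit key has 2**128 possibilities, and that is why the key, rather than the algorithm, is the secret worth protecting.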
Because of their strength and efficiency, these are the algorithms used for “bulk encryption”: they encrypt large amounts of data.

When two people want to use symmetric key encryption, they need to have either a preestablished key or a secure way to transport the key. When two parties already know each other, it’s possible that they will both already know the encryption key. For two parties that have never met and that want to share data securely, the problem of getting the key between the two parties becomes the major challenge. You can’t leave the key in plaintext because an attacker could see it. If you encrypt the key, you have to do so with another key, which only moves the problem elsewhere. This motivated the development of the second variation of encryption.

Public Key Encryption

With PKE, two keys act in a complementary manner. The PKE algorithms are mathematical inverses: Whatever one key does, the other key undoes. Furthermore, knowing the algorithm and having one of the keys doesn’t give the attacker an advantage in determining the other key or in recovering the encrypted data. With PKE, the two keys are usually called the private key and the public key.

FIGURE 2-1 Symmetric key encryption requires the use of the same key for both the encryption and decryption processes.

NOTE
Data encrypted with the private key can be decrypted only with the public key, and vice versa.

Private and public are used to describe the keys because it is typical for the public key to be accessible to many people. The private key remains a secret known only to the owner. As long as the private key remains private, this option works beautifully.

PKE, therefore, solves the key distribution problem. For two parties to communicate, they need access only to each other’s public keys. Figure 2-2 illustrates how PKE can be used to send a secret message between two parties.
To ensure that the recipient (the server in the figure) is the only one that receives the message, the message is encrypted with the recipient’s public key. As such, only the recipient, the General, will be able to decrypt the message because the only key that can be used is his private key (which only he has). Trying to decrypt the message with an incorrect key yields gibberish. An interloper will be unsuccessful in decrypting the message because he or she will not have the private key. Note that the public key can’t be used to decrypt a message that was itself encrypted with the public key.

PKE provides another complementary capability. The private key can be used as an authentication method from the sender. As Figure 2-3 illustrates, a sender can encrypt a message with his or her private key. The recipient can use the sender’s public key to decrypt the message. If the message decrypts, the sender’s identity is authenticated because only the sender has access to the private key, and so only the sender could have encrypted the message. Since our general was able to decrypt Caesar’s message using Caesar’s public key, he is assured that the message was sent by Caesar (provided Caesar is keeping his private key private!).

FIGURE 2-2 PKE uses two complementary keys to pass sensitive data securely.

Symmetric Key and Public Key

Unfortunately, the public key algorithms require larger keys to achieve the same strength as their symmetric key counterparts. Consequently, the public key algorithms perform more slowly and are more computationally expensive.

Today, public key and symmetric key encryption are used together as a part of the standard SSL network protocol. SSL is the de facto standard encryption mechanism for data on the Internet. Due to its superior performance characteristics, symmetric key encryption is used within SSL for bulk data encryption.
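The complementary behavior of the two keys can be sketched with a toy RSA-style key pair in Python. This is only an illustration under loud assumptions: the primes are tiny textbook values, the helper name apply_key is our own, and real implementations add padding and sign a hash of the message rather than the message itself.

```python
# Toy RSA-style key pair built from tiny textbook primes.
# Real keys are thousands of bits long; these values are for illustration only.
P, Q = 61, 53
N = P * Q                      # modulus shared by both keys
PHI = (P - 1) * (Q - 1)
PUBLIC_EXPONENT = 17           # published to everyone
PRIVATE_EXPONENT = 2753        # known only to the key's owner

# The exponents are inverses modulo PHI; this is what makes the keys complementary.
assert (PUBLIC_EXPONENT * PRIVATE_EXPONENT) % PHI == 1


def apply_key(value, exponent):
    """Transform a number smaller than N with one half of the key pair."""
    return pow(value, exponent, N)


message = 65  # a small number standing in for the secret message

# Confidentiality: anyone can encrypt with the recipient's public key,
# but only the holder of the private key can undo it.
ciphertext = apply_key(message, PUBLIC_EXPONENT)
decrypted = apply_key(ciphertext, PRIVATE_EXPONENT)
public_key_again = apply_key(ciphertext, PUBLIC_EXPONENT)  # does not recover the message

# Authentication: the sender transforms the message with the private key,
# and anyone holding the public key can check that it decrypts correctly.
signed = apply_key(message, PRIVATE_EXPONENT)
verified = apply_key(signed, PUBLIC_EXPONENT)

print(decrypted == message, public_key_again == message, verified == message)
```

Whatever one exponent does, the other undoes, in either order. Applying the public key twice gets an interloper nowhere.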
To transport the symmetric keys securely between the two parties, PKE is used to encrypt the symmetric keys. In Figures 2-2 and 2-3, the secret message is actually the symmetric encryption key.

Public key technology gets more than its fair share of the attention considering that, proportionately, it encrypts far less data than symmetric key encryption. This is because administrators and users have to interface directly with the public key technology, whereas the symmetric key algorithms are neatly hidden from view.

FIGURE 2-3 PKE can be used to authenticate parties to one another.

Understanding and acknowledging the use of public key and symmetric key encryption is important to the Oracle Database because the database supports only symmetric key algorithms. The performance and efficiency of symmetric key algorithms make them a natural choice for the database. Unfortunately, this leaves open the issue of key management, which is addressed later in this chapter.

Encrypting Data Stored in the Database

Understanding that the primary goal of encryption is to protect data in an unprotected medium, you might be wondering if it makes sense to encrypt data in the database at all. As you probably expected, this book emphasizes making the database a more secure medium. So, if it is very secure, why encrypt?

It turns out there are valid reasons for wanting to encrypt data stored in the database. First, you might be forced to comply with a regulation (legal, industrial, or organizational directive) that states that certain classes of data must be stored using encryption. This is the case with PCI for credit card data, and many companies have developed internal rules for what data must be encrypted when stored. In addition, the privacy laws of several states, such as California’s SB 1386, remove the requirement for notification of victims of data privacy breaches if the data in question was encrypted.
So in some cases, we are told we must encrypt our data, and in others it may be in our best interest to do so in order to protect corporate reputation, brand value, and customer relationships.

A second valid requirement for encryption is assurance that data can be protected throughout its life cycle. Think about data storage as a life cycle: data is created, stored, modified, moved, backed up, and deleted over time. At some points in the life cycle, data can be found outside of the protected medium of the database, as data is moved from the database to tape or another system for backup or disaster recovery purposes. Employing encryption in the database provides a level of security that can be maintained throughout the entire storage life cycle.

Again, we must think of encryption as one “layer” in a defense-in-depth strategy that has multiple layers of defense. By storing sensitive data in the clear, we have missed a critical part of a comprehensive security strategy.

For example, historically, policy might dictate that database administrators (DBAs) must not have access to sensitive data. This is often a requirement when trying to protect privacy-related data and is a valid security concern. Before version 10.2, encryption was the only way to keep data protected from DBAs. Their privileges allowed them to select data from any table, and their activity could be controlled only with the use of auditing (a compensating, rather than proactive, control). Many organizations used DBMS_CRYPTO to selectively encrypt and store highly sensitive data in the database, protecting the contents of the encrypted data from the DBA. This was a difficult and costly method of protecting data, and the requirement can now be addressed with Database Vault, which separates administrative functions from data access by means of realms, as you will learn in Chapter 4.

In summary, the two driving requirements for encrypting data stored in the database are mandates and protection throughout the data life cycle.
Next, you will see the technical vulnerability that’s at the root of both requirements.

Where the Data “Sleeps”

To understand why encrypting data stored in the database is important, consider this example. In a table named customer, you store sensitive data: a customer’s name and address (both personally identifiable information, or PII), and their credit card’s primary account number (protected by the PCI-DSS). In this example, encryption is not used; instead, you will be able to see the data that needs to be protected. To make the example easier to visualize if you try this at home, create a small (300K or so) working tablespace for the customer table that has one datafile (named customer_info.dbf in the example that follows).

Data housed within an Oracle database is stored using a proprietary format that makes efficient, high-performance data access possible. Datafiles contain header information, extent details and offset information, and the actual columns of data as they were inserted. It should be noted that while VARCHAR and VARCHAR2 values are human-readable within a datafile, some other TDE-supported datatypes are stored using nonreadable, but reversible, methods such as HEX. Until the release of TDE, the database offered no mechanism to protect the contents of these data structures without your having to write and maintain code, often wreaking havoc on existing applications.

Protecting the Data

Another significant challenge arises for backup files. Backups of large, mission-critical databases are often performed using backup software that writes data from many systems over the network to tapes. The centralization of the backup function can increase reliability and reduce the cost of performing backups with economies of scale.
As a result, you may find that your database’s backup copies of datafiles are on tapes being handled by operators who may not fully understand (or worse yet, do understand) the value of data such as credit card numbers. As the number of individuals with access to these backup media increases, and as we move these tapes offsite for greater disaster recoverability, our ability to control access to this media is reduced.

Simple Example of the Technical Requirement for Encryption

Let’s illustrate the point by looking at an example. We’ll base this on the customer table described earlier. You can see the elements that are in need of attention:

    sh@AOS> desc customer
     Name                     Null?    Type
     CUST_ID                  NOT NULL NUMBER(6)
     CUST_FIRSTNAME                    VARCHAR2(20)
     CUST_LASTNAME                     VARCHAR2(20)
     CUST_ADDRESS                      VARCHAR2(40)
     CUST_CITY                         VARCHAR2(15)
     CUST_STATE                        VARCHAR2(2)
     CUST_ZIP                          VARCHAR2(10)
     CUST_CREDIT_CARD_NO               VARCHAR2(19)
     CUST_CREDIT_CARD_EXP              VARCHAR2(4)
    sh@AOS>

A look at the table shows the values of the data:

      1* select * from customer where rownum < 2
    sh@AOS> /

    CUST_ID CUST_FIRSTNAME CUST_LASTNAME CUST_ADDRESS      CUST_CITY CU CUST_ZIP CUST_CREDIT_CARD_NO CUST
       1001 David          Knox          202 Peachtree Rd. Reston    VA 20190    5466-1112-2233-9342 1008

As is shown here, the credit card number and expiration date are clearly visible within the context of the database (as they should be through SQL), so we know exactly what we are looking for in the datafile. Look closely at the datafile (customer_info.dbf in the example code) and you can pick out the good bits of data using commonly available editors or with operating system tools such as grep or strings.

NOTE
While this looks all too easy to accomplish, remember that such access to a datafile requires access to the operating system and read permissions on the datafile itself.

Figure 2-4 shows the datafile (customer_info.dbf) opened in vi, an editor common on nearly every *nix variant.
You can pick out the David Knox record easily; see that 202 Peachtree Rd., Reston, VA 20190 is his address; and even jot down his credit card number and expiration (for later use, of course).

Viewing the Data

You can accomplish this in a variety of ways in addition to using vi. On Windows machines, many text editors or third-party binary editors will allow you to view the datafile. Similarly, on Linux or UNIX machines, the strings command lets you strip away nonreadable data and control characters, allowing you to focus on the meaningful data.

Using strings, you can extract all the readable ASCII characters and see the data. Unlike opening the datafile in an editor (some editors will not open files that are locked), this technique can also be used against open datafiles without disrupting database operations, making it a handy tool for hackers as well.

    [oracle@aosdb aos] (aos-db)$ strings customer_info.dbf | more

Now that the sample data has been distilled down to readable text, your results will look similar to this:

    }|{z
    CUSTOMER_INFO
    @ s
    MGMT_SYS_JOB
    metrics
    . . .
    1209,
    David
    Knox
    202 Peachtree Rd.
    Reston
    20190
    5466-1112-2233-9342
    1008<
    Richard
    . .
    More

Using the strings command, you won’t see control characters or many “structural” components of the datafile, only the data itself. If a backup tape with company data, such as credit card numbers, were to fall into the hands of someone who shouldn’t have it, a simple shell script that parsed through datafiles looking for strings of cleartext would expose credit card data (primary account numbers, expiration dates, addresses, and so on). In addition, with the help of regular expressions and utilities included with most operating systems, it would be trivial to script the process of looking for patterns that match credit card numbers, Social Security numbers or other national identifiers, or other interesting data, and to run that script at a regular interval.
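To show how little effort such a scan takes, here is a minimal Python sketch of the idea. Everything in it is an assumption made for illustration: the byte layout is an invented stand-in rather than Oracle’s actual block format, the card pattern covers only the dash-separated form seen in the sample row, and the function names are hypothetical.

```python
import re

# Runs of four or more printable ASCII bytes, roughly what `strings` reports by default.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Hypothetical pattern: 16 digits in dash-separated groups of four, as in the sample row.
CARD_NUMBER = re.compile(rb"\d{4}-\d{4}-\d{4}-\d{4}")


def printable_strings(data):
    """Return the readable text fragments found in raw bytes."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]


def find_card_numbers(data):
    """Return anything in the raw bytes that looks like a card number."""
    return [match.decode("ascii") for match in CARD_NUMBER.findall(data)]


def scan_file(path):
    """Scan a file on disk; read-only, so it needs nothing but read permission."""
    with open(path, "rb") as handle:
        return find_card_numbers(handle.read())


# Invented stand-in for a piece of a datafile: each column value is preceded
# by a non-printable length byte and surrounded by binary filler.
fields = [b"David", b"Knox", b"202 Peachtree Rd.", b"Reston", b"VA",
          b"20190", b"5466-1112-2233-9342", b"1008"]
block = b"\x00\x01\x02" + b"".join(bytes([len(f)]) + f for f in fields) + b"\xff\x00"

print(printable_strings(block))
print(find_card_numbers(block))
```

Note that the two-character state code drops out because it is shorter than the minimum run length, just as it is missing from the strings listing above, while the card number survives intact.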
It is also important to note that the structure of the datafile is not operating system-specific with respect to viewing the underlying data. Windows datafiles are susceptible to the same sort of file-level reads. With access to the datafiles on disk or on tape, data can be viewed on Windows systems with something as simple as Windows Notepad, as shown in Figure 2-5.

Data stored in a tablespace’s underlying datafile is readable and potentially exploitable. This is a security risk that must be managed like any other risk, based on the value of the data, the likelihood of someone gaining unauthorized access to the data, and the amount of risk that you and your organization are willing to accept. While your first layer of defense is ensuring that the database’s datafiles are well protected and that backup media is maintained using established policies, it becomes apparent that additional controls should be applied to any and all high-value data.

Applied Example

Imagine that your example datafile exists on a physical disk drive in a production system, perhaps in a disk array. Disk arrays provide a great deal of value to organizations by making certain that data is highly available. To do this, disk arrays generally employ RAID (Redundant Array of Inexpensive Disks) technology, which allows a single datafile to exist on several physical devices at one time for redundancy. If one copy of the data were lost or corrupted, the overall data availability would be maintained using the duplicate copies.

FIGURE 2-4 Data in the datafile is human readable.

FIGURE 2-5 Viewing a datafile with Windows Notepad.

With many such data storage devices, it is conceivable that a drive could be removed, making data on the drive vulnerable without the database ever being shut down. Another possibility with many of these storage arrays is the hardware sensing that a drive is close to failure (showing errors, and so on).
Often, technicians will replace these failing drives with new ones, making it important to know the disposition of the replaced drives since they contain real, potentially sensitive data. Likewise, those who can copy the datafiles to tape or other media could simply walk out of a data center with usable data.

As good custodians of data, we create backup copies of this data in the event of disaster or loss of the data center, often shipping copies of data to remote locations. This means that local and offsite backup media are also potential targets for would-be data thieves. Disaster recovery locations are required by many customers, yet these locations often maintain more copies of your sensitive data that must also be protected by some combination of technology, policy, and procedure.

As has been determined, a database’s datafile may sometimes be outside of your security controls, and it may occasionally exist in the unprotected medium so elegantly secured by encryption.

Encrypting in the Database

You know that valid reasons exist for requiring data stored in a database to be encrypted. Now let’s look at three approaches to remedy cleartext data being stored in the database’s datafiles on a file system, and in subsequent copies used for redundancy or moved to backup media. The choices customers make are influenced by the complexity, cost, performance, openness, and portability of the solutions. You might choose to use an encrypted file system, build a custom encryption strategy, or make use of a feature that is built into the database.

When using an encrypted file system, everything that is written to disk undergoes encryption to protect it from unauthorized viewing. This approach deals with the problem of cleartext appearing on disk and backup media by taking a blanket approach: everything gets encrypted.
While this approach does work in many situations, encrypted file systems are generally considered to be expensive, proprietary implementations, and you rely completely on the operating system of the host machine to make access control decisions. In fact, the PCI-DSS (as we will discuss later in this chapter) calls out disk encryption specifically, requiring that logical access “be managed independently of native operating system access control mechanisms,” which effectively takes most file-system encryption out of the possible solution set.

As another potential solution to this vulnerability, you might choose to encrypt your data programmatically before inserting it into the database (perhaps using DBMS_CRYPTO or by writing your own scheme). When reading the data from the database, you then must decrypt the data to make it available to other applications and users. Programmatic encryption can provide selective encryption at relatively low to moderate cost (for development and testing), but it requires specialized skills and good design. In Effective Oracle Database 10g Security By Design, Knox offers some great examples of using DBMS_CRYPTO. With some development effort and use of function-based views, you can make DBMS_CRYPTO fairly transparent to developers, making this an attractive solution in the short term. Look at the longer-term impacts of such a solution, however: character set conversions, potential forced use of RAW datatypes, long-term code maintenance, and key storage make programmatic encryption potentially complex and expensive.

The Transparent Data Encryption Solution

With the shortcomings of other solutions, you need a straightforward, no-code solution to database encryption to protect data throughout the full data life cycle. If you are planning to store the data in an Oracle 10g Release 2 or later database, you might consider the Advanced Security option, which includes TDE.
TDE provides declarative encryption within the DDL with basic key management. TDE is an implementation of standards-based encryption algorithms built into the database engine, where data stored in a database is encrypted upon write (inserts) and decrypted upon read (select). The cryptographic key that makes this possible, the Master Key, is stored in an Oracle Wallet, which can be opened or closed, making it possible to control the decryption of data by essentially flipping a switch. The keys for all tables containing encrypted columns are encrypted using the database Master Key and stored in the data dictionary.

TDE provides a no-code solution to this “cleartext on disk” vulnerability by allowing data architects or DBAs to choose individual columns (introduced in 10g R2) or entire tablespaces (introduced in 11g) and specify that the data elements be encrypted before they are stored in datafiles. Since 10g TDE applies only to columns and 11g broadens support to entire tablespaces, column-level and tablespace-level TDE will be used generically to refer to each feature, with the understanding that tablespace-level TDE was not available in 10g. This proves to be a straightforward, elegant solution, because TDE manages the keys and the implementation of the encryption rather than putting this task on the developer, as other programmatic encryption strategies do.

The key management problem is important, because when you want to turn plaintext into ciphertext, you need to use a key. If you then want to decrypt your ciphertext, you must have the key available. How and where this key is stored is part of the challenge in developing a programmatic approach or using the DBMS_CRYPTO solution, as are key rotation, key backup, and recoverability. Key storage is particularly challenging, because if the key were to be stored in the database, it might be vulnerable on the file system or backups just like the data you are protecting.
In an attempt to remedy this, you could encrypt the key, but then you are left with the same question: Where do I securely store this key? Transparent data encryption provides the answer to this question by using the Oracle Wallet to store the encryption key.

TDE as Part of the Advanced Security Option

TDE is available with the Enterprise Edition of Oracle Database as the Advanced Security option, or as it’s known in 11g, Oracle Database 11g Advanced Security. The Advanced Security option provides two other security features in addition to TDE: network encryption and strong authentication.

The network encryption provided by the Advanced Security option provides assurance that data is not read or altered between an Oracle Net client (application, Java Database Connectivity client, and so on) and the database listener of the server, which often makes sense when dealing with any sort of protected data. Network encryption can be configured by adding a couple of lines to the sqlnet.ora file of the client and server, effectively encrypting the entire communication channel between the two.

The third feature provided by Advanced Security, strong authentication, enables the use of smart cards, RADIUS, and certificate-based authentication of the client to the server. Using the strong authentication feature can limit the ability to connect to the database to a particular set of machines that have certificate-based authentication configured.