Password recovery for encrypted ZIP archives using GPUs

Pham Hong Phong, Hanoi University of Technology, phongph@it-hut.edu.vn
Phan Duc Dung, Hanoi University of Technology, ducdung872001@gmail.com
Duong Nhat Tan, Hanoi University of Technology, dn.tan7388@gmail.com
Nguyen Huu Duc, Hanoi University of Technology, ducnhfit@mail.hut.edu.vn
Nguyen Thanh Thuy, Hanoi University of Technology, thuynt@it-hut.edu.vn

ABSTRACT
Protecting data with passwords in documents such as DOC and PDF files or RAR and ZIP archives has been shown to be weak against dictionary attacks. The time needed to recover the password of such a document depends mainly on two factors: the size of the password search space and the computing power of the underlying system. In this paper, we present an approach that uses modern multi-core graphics processing units (GPUs) as computing devices for finding lost passwords of ZIP archives. Combining the very high computing power of GPUs with state-of-the-art password structure analysis methods gives a feasible solution for recovering ZIP file passwords. We first apply password generation rules [9] to generate a reasonable password space, and then use GPUs to exhaustively verify every password in that space. The experimental results show that the password verification speed increases by a factor of about 48 to 170 (depending on the number of GPUs) compared to sequential execution on an Intel Core Quad Q8400 2.66 GHz. These results demonstrate the potential applicability of GPUs in this field of cryptanalysis.

Categories and Subject Descriptors
E.3 [Data]: Data encryption - code breaking

General Terms
Security, Performance

Keywords
GPU, ZIP, password recovery

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SoICT ’10, August 27-28, 2010, Hanoi, Vietnam.
Copyright 2010 ACM 978-1-4503-0105-3/10/08 $10.00.

1. INTRODUCTION
While cryptographic research aims to protect against information leakage in data storage and data communication, cryptanalysis, in contrast, tries to discover information protected by encryption. These two research branches are considered two sides of the same information security issue: progress in one promotes progress in the other. Historically, an advance of one branch over the other has sometimes had great consequences, even deciding the fate of a nation. For example, the successful cryptanalysis of the Zimmermann Telegram in the First World War drew the United States into the war, and the successful cryptanalysis of German ciphers helped shorten the Second World War by a few months [5]. From the same point of view, in this paper we study a cryptanalysis method for password-protected ZIP archives.

Originally, compression methods such as PKZip, Deflate and LZMA were used to reduce data size, making data storage and data communication more efficient. Because information security often accompanies data storage and information exchange, popular compression tools such as WinZip, which implement efficient compression algorithms, often integrate data encryption functions that typically use common symmetric encryption systems such as DES or AES. For convenience, the encryption key is generated from the sender’s password by a hash function. This key is used to encrypt the document. The password is then transferred to the recipient via a secure channel and used to generate the same key for decrypting the document. The process of encrypting and decrypting a protected ZIP file is shown in Figure 1.

Figure 1: ZIP archive encryption and decryption processes

In this paper, we try to recover the content of an encrypted ZIP file
without knowing its protection password. For weak encryption systems such as RC4 or DES, cryptanalysis can be conducted by an exhaustive attack on the whole key space in an acceptable time; the results of [4] and [6] have demonstrated this. For strong encryption systems such as AES (which is commonly used in WinZip 9.0 or higher), such an exhaustive attack is almost impossible.

AES [1], the Advanced Encryption Standard, is a block cipher. It works on 128-bit data blocks (4x4 bytes) with a key length of 128, 192 or 256 bits. AES can easily be implemented at high speed in software or hardware and does not require much memory. AES is a strong encryption system: for AES-128 and AES-256, with key sizes of 128 and 256 bits, there are up to 2^128 and 2^256 keys to test in the key space. Courtois and Pieprzyk’s XSL algorithm [2] reduces the key space from 2^128 keys to 2^100 keys, but even so, trying all the possibilities is still not acceptable for common computing systems.

Our approach in this paper does not directly attack the AES key space. Instead, we exploit the fact that the encryption key of a protected ZIP file is generated from a user password by a hash function that is published in the ZIP file specification [8]. Although password spaces are also large, attacking a password space is much more feasible than attacking the AES key space, since dictionary attack methods can be applied. The major obstacle of this cryptanalysis method is that the computational complexity of the hash function is quite high. It is made even more difficult by the password salting techniques implemented in new versions of ZIP tools, which prevent us from using pre-computed attack methods.

To overcome these difficulties, we first employ the recent password structure analysis method of Weir, Aggarwal, Medeiros, and Glodek [9] to reduce the size of the password search space. Then we use the extremely high computing power of modern multi-core GPUs for
implementing the complex hash function, to concurrently verify passwords from the password search space and to generate AES encryption keys for all possibly-correct passwords. Finally, we apply plaintext recognition techniques to find the one correct answer in the set of possibly-correct passwords.

Our experimental program, written in CUDA and running on a PC with an Intel Quad Core 2.66 GHz and two NVIDIA GeForce GTX 295 graphics cards, achieves a password checking speed of about 5,011 passwords per second, 170 times faster than the CPU-based sequential program on the same system. With a good password structure and a large dictionary, this result shows that the proposed algorithm would allow us to recover a ZIP file password in a reasonable time compared to the CPU-based version.

In the rest of this paper, we briefly introduce GPU and CUDA technologies for general-purpose applications, describe the details of the proposed algorithm and show the experimental results.

2. CUDA AND GPGPU
In recent years, the computing power of graphics processors (GPUs) has increased significantly compared to that of CPUs. By June 2008, NVIDIA’s GT200 GPU generation had reached 933 GFLOPS, more than 10 times the performance of a contemporary dual-core Intel Xeon 3.2 GHz processor. Figure 2 shows the massive increase in computing power of NVIDIA graphics processors compared to Intel processors.

Figure 2: NVIDIA GPU-Intel CPU performance comparison

This superiority in performance does not imply superiority in technology. GPUs and CPUs are developed in two different directions: while CPU technology speeds up a single task, GPU technology tries to increase the number of tasks that can be performed in parallel. Thus, while the core count of common CPUs remains small, the number of cores in a single GPU has reached 240 and is expected to increase to 500 cores in 2010.

As a penalty for this computing power, GPUs lose flexibility in their processing cores. Currently, all
processing cores on one single GPU can only execute a single piece of code at a time, so a GPU is only suitable for data-parallel problems, in which the same program code is executed in parallel on several different data sets. Fortunately, most problems that require large computing power can be converted to a form of data parallelism.

Besides improving GPU computing power, GPU manufacturers are also interested in providing better application development environments so that common developers can easily program GPUs. NVIDIA CUDA [7] is a good example of such an effort. With CUDA, programmers can exploit GPU computing power not only for graphics processing applications but also for general-purpose applications. This technology is one of the important factors behind the recent GPGPU (General-Purpose computation on Graphics Processing Units) era. The following are some key features of the programming language supported by CUDA (called the CUDA language):

• The CUDA language is an extension of the C language, so it is familiar to most developers.
• CUDA code is divided into two parts: one executed on the CPU and the other executed on the GPU. The part executed on the GPU, also known as a parallel kernel, can be executed in parallel on thousands of execution threads when called. Each thread has a unique identifier used to determine its task.
• CUDA allows programmers to define an arbitrary number of parallel threads, but to avoid dependence on the hardware, threads are divided into blocks whose size does not exceed 768 threads (GT200 generation). This allows a programmer to design a parallel program effectively without caring about the hardware capability.
• Memory is hierarchically organized for effective usage:
  – Main memory: the memory area for CPU code. Only this code can access and modify information here.
  – Global memory: the memory area that all GPU threads can access. Programmers can move data from main memory to global memory by using functions from a CUDA basic library. This memory is often used to store inputs and outputs for parallel threads on GPUs.
  – Shared memory: the memory area that only threads in one block can access. This memory is integrated on-chip, so accessing data in it is much faster than in global memory. It is often used to store temporary data shared among threads in a block in order to speed up memory access.
  – Local memory: the memory area allocated to the local variables of each thread. A GPU thread cannot access the local memory of other threads.

With the ability to perform data parallelism on such a large number of threads, the GPU is an appropriate choice for the problem of ZIP file cryptanalysis, where each thread can take one password from the password search space to check. The next section of the paper explains the details of our GPU-based password recovery algorithm using CUDA for protected ZIP files.

3. RECOVERING THE ZIP FILE PASSWORD ON GRAPHICS PROCESSORS
As introduced in Section 1, our approach to ZIP file cryptanalysis is to attack the password space instead of directly attacking the AES key space. The whole password recovery algorithm is divided into three main steps:

1. Apply the password structure analysis algorithm from [9] to find an appropriate password structure, and then use the password structure to reduce the size of the password search space.
2. Exploit the computing power of GPUs to accelerate the preliminary password checking process. The result of this process is a set of possibly-correct passwords (called candidate passwords) whose size is much smaller than the size of the password search space.
3. Apply a plaintext recognition technique to verify each candidate password and find the one correct password of the ZIP file.

In this paper, we concentrate on step one and step two of the algorithm.

3.1 Strategy
According to the specification of the ZIP file [8], the process of checking one password consists of the following steps:

1. Generate an AES key from a given password by using the hash function described in the specification,

       PBKDF2(pw, salt, dkLen)

   where pw is the given password to check, salt is a random value stored in the compressed file, and dkLen is the AES key size. The function PBKDF2 has a large computational complexity. In fact, this function performs the HMAC-SHA1 algorithm 1,000 times, thus preventing attacks from common computing systems. The random value salt is used to prevent pre-computed attacks.
2. Decrypt and decompress the encrypted ZIP file using the AES key obtained in the previous step. This algorithm also generates a checking value which is compared to a MAC (message authentication code) value stored in the archive to decide whether the password is correct or not.

With such a specification, if we fully carry out these steps for each password, the time needed to check the entire password space is very large, because decrypting and decompressing the entire encrypted ZIP file is extremely expensive. Instead, in step two we can decrypt and decompress only a part of the ZIP file, and then use a plaintext recognition technique to quickly check the validity of the generated AES key.

The password recovery time for a protected ZIP file depends tightly on the size of the given password space. An exhaustive approach is to enumerate all possibly-correct passwords whose lengths do not exceed a specific number. For example, the set of passwords composed of the characters {a-z, A-Z, 0-9} with a maximum length of 6 contains 57,731,386,987 passwords in total. Due to the high computational complexity of the PBKDF2 function, checking such a password search space would require huge computational resources.

Instead of the exhaustive approach, we employ a new result of password structure analysis from [9] to reduce the size of the password search space. This research uses statistical results from psychology about users’ ability to remember passwords to construct a password structure which represents a much smaller
password space containing the passwords with the highest occurrence probabilities.

Another strategy we have considered in this paper is to take advantage of the user-friendly features of compression tools. For example, WinZip makes it possible to detect incorrect passwords rapidly by storing a two-byte password verification value (PVV) in the header of the ZIP file. This value is compared to a part of the output of the function PBKDF2 to quickly reject most incorrect passwords. If a password is accepted, it is not yet guaranteed to be correct. However, the number of passwords that can be accepted is significantly smaller than the size of the initial password space. We call them candidate passwords.

Since the execution of the hash function PBKDF2 accounts for most of the workload of the checking process, our strategy is to implement this hash function on GPUs in order to check the passwords in the given password space effectively in parallel. In the next two sub-sections, we explain in detail the password structure analysis algorithm that reduces the password search space, and the implementation of PBKDF2 on GPUs.

3.2 Reducing password search space using the password structure analysis technique
There are two factors in evaluating the quality of a password space: the number of passwords and the probability of success in finding the correct password. Let us analyse two approaches to forming a password space:

• Full space. This space contains all passwords. It always meets the second condition, since the correct password can always be found by an exhaustive search algorithm. However, the full space is normally so large that an exhaustive search becomes impractical.
• Partial space. Instead of exhaustively searching the full space, we can choose a subset of the space for the searching algorithm. Since the occurrence probability of the correct password in the space is an important factor for the success of the password searching process, we should consider a
good subset so that the occurrence probability of the correct password in the subset is as high as possible.

In the second approach, external knowledge about password structure is normally used to determine the subset. Using a smaller set of characters for deriving the correct password and limiting the length of the correct password are two naive techniques for choosing a partial space. This is, however, not very practical: the resulting space is normally still too large, or it fails to contain a long correct password. Instead, we can construct the password search space in order of the occurrence probabilities of passwords.

The order in which passwords are generated is very important. A good order helps to find the correct password quickly, so that the searching process can end immediately without checking the rest of the password space. There are many criteria that change the order of checking passwords. For example, the correct password would contain no more than 10% uppercase characters, or no more than 20% special and digit characters, in the total number of characters. We call all such criteria password generation rules. Clearly, passwords generated by rules are of higher quality than those generated by the naive techniques.

In this paper, we adopt the approach that uses password generation rules based on descending occurrence probabilities of password structures. This technique was originally proposed by Weir, Aggarwal, Medeiros, and Glodek in their work [9]. The naive techniques treat the occurrence probabilities of all user passwords as similar. According to statistics on the occurrence probabilities of actual passwords, this assumption does not hold in practice. For example, the password “password12” has a higher occurrence probability than the password “P@$$W0rd!12”.

Assume that a password is a combination of alphabetic, numeric and special characters. We denote alphabetic characters as L, numeric characters as D, and special characters as S. Then the password “$password12” can
be structurally denoted as SLD. This structure is called the simple structure. If we add the number of characters to a simple structure, we obtain a base structure, e.g. S1 L8 D2. One important type of structure derived from the base structures is the pre-terminal structure, which is generated from a base structure by filling in specific values for its D and S parts. For example, one pre-terminal of S1 L8 D2 is $ L8 12. We calculate the probability of a pre-terminal as the product of the probability of the base structure, the occurrence probabilities of the special characters and the occurrence probabilities of the numeric characters, all of which can be pre-computed from a dictionary.

The algorithm published in [9] can be briefly described as follows:

• A set of password generation rules is given in the form of a context-free grammar G = (V, Z, S, P), where V and Z are finite sets of variables and terminals, S is the start variable, and P is a finite set of productions of the form α → β, where α is a single variable and β is a sequence of variables and terminals. Table 1 gives an example of a grammar together with the probabilities of its rules.
• Pre-terminal structures are generated in order of decreasing probability by the following tree-building steps:
  – Put S as the root of the tree.
  – The children of the root are the pre-terminals with the highest probabilities, derived from the base structures that are immediately obtained from S. Note that a pre-terminal is generated from a base structure by substituting all occurrences of S and D with the corresponding special characters and numbers as shown in the grammar.
  – The tree grows toward its leaves by replacing, in a pre-terminal, a special character or number of higher probability with one of lower probability.
  – Figure 3 shows the tree generated in this way to obtain pre-terminals in order of decreasing probability.
• A password is generated from a pre-terminal structure by substituting the L meta character in the structure with a meaningful word from the given dictionary.

Table 1: An example of a password grammar

    Production        Probability
    S  → D1 L S D1    0.75
    S  → S L D1 S     0.25
    D1 →              0.60
    D1 →              0.20
    D1 →              0.20
    S1 → !            0.65
    S1 → %            0.30
    S1 → #            0.05
    S2 → $$           0.70
    S2 → ∗∗           0.30

The set of passwords generated by this approach is reasonably small in comparison to those generated by the naive techniques. It is taken as the input password space for the password verification process on GPUs described in the next sub-section.

3.3 Verifying candidate passwords on GPUs
Assume that the input password space includes n passwords. In theory, the n passwords can be checked at the same time (to confirm whether or not each one is a candidate password) by calling p = n corresponding GPU threads. However, the number of threads p is limited by hardware resources, and usually p is much smaller than n. Therefore, to check n passwords we need ceil(n / p) sequential calls, each of which feeds a batch of p passwords to p threads that check them in parallel; thus, the password search space
should be divided into the corresponding batches. Figure 4 describes this way of checking passwords.

Figure 3: The corresponding generated tree

Figure 4: Checking passwords in parallel

The algorithm code is divided into two parts: sequential execution on the CPU and parallel execution on the GPUs. The first part generates pre-terminal structures as shown in the previous section. This part is executed sequentially on the CPU. The computation cost mainly depends on the second part. For each of the obtained pre-terminal structures, we transplant words from the given dictionary into it to generate a set of passwords, and then use the function PBKDF2 to check whether or not each password in the set is a candidate password. Because the number of words in the dictionary is very large, the number of passwords generated from a single pre-terminal is large as well. Thus, we can take advantage of GPU computing power for this checking task.

We denote the set of words in the dictionary as W. Among them, the set of k-length words is denoted as Wk with k >= 1, and |Wk| is the number of words in Wk. The pseudocode of the algorithm is as follows:

    for each pre-terminal S {
        k = llength(S);     /* length of the letter area L in S */
        m = |Wk|;           /* number of k-length dictionary words */
        l = ceil(m / p);    /* l - the number of batches */
        for j = 0 to l - 1 {
            base = j * p;
            for id = 0 to p - 1 in parallel {
                if (base + id < m) {    /* the last batch may be partial */
                    guess_password = transplant(S, Wk(base + id));
                    TestPVV = PBKDF2(guess_password, salt, dkLen);
                    if (TestPVV == PVV)
                        markCandidatePassword(guess_password);
                }
            }
        }
    }

In the above code, the function llength(S) returns the length of the consecutive letter area denoted by the meta symbol L in the pre-terminal structure (for convenience of presentation, we assume that there is only one meta symbol L in the pre-terminal structure). This meta symbol is substituted with a k-length word from the dictionary to form a test password. Since each GPU can run at most p threads at the same time, the set of words Wk is divided into l batches, each of which is processed in parallel. Thus, for the batch j,
words from j ∗ p to (j + 1) ∗ p − 1 in Wk are merged with the structure S to form p passwords to be tested by the PBKDF2 function, and the candidate passwords are finally marked by markCandidatePassword.

4. EXPERIMENTAL RESULTS
The algorithms described in the previous two sections have been implemented and tested on a system consisting of:

• CPU: Intel Core Quad Q8400 2.66 GHz
• RAM: 8 GB
• GPU: two dual-GPU graphics cards NVIDIA GeForce GTX 295 (a total of 4 GPUs)
• OS: CentOS 5.3

The inputs of the program include two dictionaries:

• A specialized dictionary, which is a set of actual passwords. This dictionary serves the calculation of the occurrence probabilities of the password structures, special characters, and numeric characters. However, for reasons of information security, obtaining real passwords is not easy. In our experiment, we created some forms of base structures to generate passwords and then manually set their probabilities, as well as those of digit and special characters.
• The dictionary dic-0294, which contains 869,229 words for the substitution of the meta symbol L in the resulting pre-terminal structures. This dictionary is considered the largest one currently available.

Table 2: Performance comparison of generating candidate passwords using the exhaustive search algorithm

    Maximum password length   Number of passwords   CPU     1 GPU   2 GPUs   4 GPUs
    1                         62                    4s      22s     22s      22s
    2                         3,906                 8m      22s     22s      22s
    3                         242,234               2.25h   180s    92s      48s
    4                         15,018,570            5.25d   168m    85m      43m

Table 3: Comparison of the password checking speeds on different environments

    Implementation               Time
    CPU-based algorithm          2.9h
    Single GPU-based algorithm   4m
    2GPU-based algorithm         2m16s
    4GPU-based algorithm         1m15s

In [3] we applied the exhaustive approach to demonstrate the superior computing power of GPGPU technology in the problem of recovering ZIP file passwords. That experiment uses the set of upper-case, lower-case and numeric characters {a-z, A-Z, 0-9} and the optimal number of threads that can run in parallel, p = 32,768. The result showed that the speed of
generating AES keys on GPUs is 48 to 170 times that of the sequential CPU-based program (approximately 48 times on a single GPU, and 170 times on four GPUs). Table 2 demonstrates this result. The disadvantage of this approach is that the password search space grows exponentially as the maximum password length increases. Therefore, for a ZIP file password such as “6class$$4”, recovery is almost impossible with this approach.

Our experiment in this paper generates the password space based on the occurrence probabilities of pre-terminal structures. In this way we can overcome the problem while still taking advantage of the great computing power of GPUs. The experiment assumes that the correct password of the given ZIP file can be derived from one of the pre-terminal structures generated by the rules in Table 1. The correct password of the given ZIP file is “6class$$4”, which is generated from the pre-terminal structure “6L5 $$4”, the fifth one on the priority tree. Table 3 compares the performance of the different implementations of the checking algorithm.

5. CONCLUSIONS
We have introduced the two main steps of our approach to the problem of recovering protected ZIP file passwords: reducing the password search space using the password structure analysis technique, and verifying candidate passwords using the high computing performance of GPUs. The experimental results show that this approach recovers passwords significantly faster than the CPU-based algorithm. With a very large initial password space, the number of candidate passwords is still not small, even though the two-byte PVV reduces it by a factor of 65,536. Thus, our next step toward completely solving the problem of password recovery for protected ZIP files is the implementation of decryption, decompression and plaintext recognition algorithms on GPUs. In addition, we will also consider implementing the proposed algorithms on
a GPU cluster to exploit the power of such computing systems. Finally, the solution proposed in this paper can be customized for cryptanalysis problems on other kinds of protected files such as DOC and PDF.

REFERENCES
[1] FIPS PUB 197. Advanced Encryption Standard (AES), 2001.
[2] N. T. Courtois and J. Pieprzyk. Cryptanalysis of block ciphers with overdefined systems of equations, 2002. Preprint available at http://eprint.iacr.org/2002/044/
[3] P. Dung, D. Tan, P. Phong, N. Duc, and N. Thuy. Applying CUDA computing technology in the problem of recovering ZIP file password. In FAIR09: Proceedings of the 4th National Symposium of Fundamental and Applied Information Technology Research, 2009.
[4] Electronic Frontier Foundation. Cracking DES: Secrets of Encryption Research, Wiretap Politics and Chip Design. O’Reilly & Associates, Inc., 1998.
[5] D. Kahn. The Codebreakers - The Story of Secret Writing. 1967.
[6] A. Klein. Attacks on the RC4 stream cipher. Des. Codes Cryptography, 48(3):269–286, 2008.
[7] NVIDIA. http://www.nvidia.com/object/cuda_home_new.html
[8] PKWARE. ZIP file format specification, 2007.
[9] M. Weir, S. Aggarwal, B. de Medeiros, and B. Glodek. Password cracking using probabilistic context-free grammars. In SP09: Proceedings of the 2009 30th IEEE Symposium on Security and Privacy, pages 391–405, Washington, DC, USA, 2009. IEEE Computer Society.
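
As an illustration of the candidate check from Sections 3.1 and 3.3, the following Python sketch runs the same batch loop on the CPU. It is not the CUDA implementation evaluated in the paper. It assumes the WinZip AES layout in which PBKDF2-HMAC-SHA1 yields an encryption key, an authentication key of the same size, and then the two-byte PVV; that layout is an assumption here and should be checked against the specification. The names derive_keys, transplant and find_candidates, and the single "L" placeholder in the pre-terminal string, are hypothetical.

```python
import hashlib
import math

ITERATIONS = 1000  # HMAC-SHA1 iteration count stated in Section 3.1


def derive_keys(password: bytes, salt: bytes, key_bytes: int = 16):
    """Return (aes_key, auth_key, pvv) for one password guess.

    Assumed layout: PBKDF2 output = AES key | authentication key | 2-byte PVV.
    """
    dk = hashlib.pbkdf2_hmac("sha1", password, salt, ITERATIONS, 2 * key_bytes + 2)
    return dk[:key_bytes], dk[key_bytes:2 * key_bytes], dk[-2:]


def transplant(pre_terminal: str, word: str) -> str:
    """Fill the single letter placeholder 'L' of a pre-terminal with a word."""
    return pre_terminal.replace("L", word, 1)


def find_candidates(pre_terminal, words, salt, pvv, p=4):
    """Check the k-length words in batches of p, as in the pseudocode of 3.3.

    On a GPU each batch would be handled by p threads in parallel; here the
    inner loop simply runs sequentially.
    """
    candidates = []
    batches = math.ceil(len(words) / p)
    for j in range(batches):
        base = j * p
        for word in words[base:base + p]:  # slicing handles a partial last batch
            guess = transplant(pre_terminal, word)
            if derive_keys(guess.encode(), salt)[2] == pvv:
                candidates.append(guess)  # PVV match: candidate, not yet proven
    return candidates


if __name__ == "__main__":
    salt = bytes(range(8))  # illustrative salt, not taken from a real archive
    stored_pvv = derive_keys(b"6class$$4", salt)[2]
    five_letter_words = ["apple", "class", "grape", "melon", "peach"]
    print(find_candidates("6L$$4", five_letter_words, salt, stored_pvv, p=2))
```

A two-byte PVV lets a wrong password through with probability 1/65,536, so the returned list can contain false positives that only the later decryption and plaintext recognition step can remove.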

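
The size of the exhaustive search space discussed in Sections 3.1 and 4 can be checked with a short calculation. The Python sketch below sums 62^k over the allowed lengths for the alphabet {a-z, A-Z, 0-9} and estimates how many wrong passwords are expected to pass a two-byte PVV check. The function names are hypothetical. The figure 57,731,386,987 quoted in Section 3.1 matches the sum over lengths 0 to 6, which suggests that the empty password was counted; this reading is an inference, not something the text states.

```python
def space_size(alphabet_size: int, max_len: int, include_empty: bool = False) -> int:
    """Number of strings over the alphabet with length up to max_len."""
    start = 0 if include_empty else 1
    return sum(alphabet_size ** k for k in range(start, max_len + 1))


def expected_false_candidates(n_passwords: int, pvv_bits: int = 16) -> float:
    """Expected number of wrong passwords that still match a random PVV."""
    return n_passwords / 2 ** pvv_bits


if __name__ == "__main__":
    for max_len in range(1, 5):
        n = space_size(62, max_len)
        print(max_len, n, round(expected_false_candidates(n), 2))
    print(space_size(62, 6, include_empty=True))
```

The counts for lengths 1 to 4 agree with the "Number of passwords" column of Table 2, and the division by 65,536 corresponds to the reduction factor mentioned in the conclusions.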