Chapter: Hash
Data Structures and Algorithms

Luong The Nhan, Tran Giang Son
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM

Outcomes

• L.O.5.1 - Depict the following concepts: hashing table, key, collision, and collision resolution
• L.O.5.2 - Describe hashing functions using pseudocode and give examples to show their algorithms
• L.O.5.3 - Describe collision resolution methods using pseudocode and give examples to show their algorithms
• L.O.5.4 - Implement hashing tables using C/C++
• L.O.5.5 - Analyze the complexity and develop experiments (programs) to evaluate methods supplied for hashing tables
• L.O.1.2 - Analyze algorithms and use Big-O notation to characterize the computational complexity of algorithms composed by using the following control structures: sequence, branching, and iteration (not recursion)

Contents

Basic concepts
Hash functions
  Direct Hashing
  Modulo division
  Digit extraction
  Mid-square
  Folding
  Rotation
  Pseudo-random
Collision resolution
  Open addressing
  Linked list resolution
  Bucket hashing

Basic concepts
• Sequential search: O(n)
• Binary search: O(log n)
→ Both require several key comparisons before the target is found

Search complexity (number of key comparisons):

Size        Binary   Sequential (Average)   Sequential (Worst Case)
16          4        8                      16
50          6        25                     50
256         8        128                    256
1,000       10       500                    1,000
10,000      14       5,000                  10,000
100,000     17       50,000                 100,000
1,000,000   20       500,000                1,000,000

Is there a search algorithm whose complexity is O(1)?
YES

Figure: Each key has only one address

Open Addressing

There are different methods:
• Linear probing
• Quadratic probing
• Double hashing
• Key offset

Linear Probing

• When a home address is occupied, go to the next address (the current address + 1):
  hp(k, i) = (h(k) + i) mod m
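The linear probing rule hp(k, i) = (h(k) + i) mod m can be sketched as a small table with insert and search. This is a minimal illustration, not the course's reference implementation: the class name `LinearProbingTable`, the use of modulo division for h(k), and the omission of deletion are all assumptions made for the example.

```python
class LinearProbingTable:
    """Open-addressing table that probes with hp(k, i) = (h(k) + i) mod m."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # None marks an empty address

    def home(self, key):
        # Assumption: modulo division is used as the hash function h(k)
        return key % self.m

    def insert(self, key):
        # Try the home address first (i = 0), then the following addresses
        for i in range(self.m):
            addr = (self.home(key) + i) % self.m
            if self.slots[addr] is None or self.slots[addr] == key:
                self.slots[addr] = key
                return addr
        raise OverflowError("hash table is full")

    def search(self, key):
        # Follow the same probe sequence that insert used
        for i in range(self.m):
            addr = (self.home(key) + i) % self.m
            if self.slots[addr] is None:
                return None  # an empty address ends the search: key is absent
            if self.slots[addr] == key:
                return addr
        return None


table = LinearProbingTable(7)
# 10, 17 and 24 are synonyms: all three have home address 3 when m = 7
addresses = [table.insert(k) for k in (10, 17, 24, 6)]
print(addresses)         # [3, 4, 5, 6]
print(table.search(24))  # 5
print(table.search(31))  # None
```

The three synonyms end up in adjacent addresses 3, 4 and 5, which is the clustering effect that linear probing is known for.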
Linear Probing

• Advantages:
  • quite simple to implement
  • data tend to remain near their home address (significant for disk addresses)
• Disadvantages:
  • produces primary clustering

Quadratic Probing

• The address increment is the collision probe number squared:
  hp(k, i) = (h(k) + i^2) mod m
• Advantages:
  • works much better than linear probing
• Disadvantages:
  • time required to square numbers
  • produces secondary clustering:
    h(k1) = h(k2) → hp(k1, i) = hp(k2, i)

Double Hashing

• Using two hash functions:
  hp(k, i) = (h1(k) + i · h2(k)) mod m

Key Offset

• The new address is a function of the collision address and the key:
  offset = ⌊key / listSize⌋
  newAddress = (collisionAddress + offset) mod listSize
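To compare the quadratic probing, double hashing, and key offset formulas above, the sketch below lists the first few probe addresses each one generates for two synonym keys. The function names, the choice h(k) = k mod m, and the second hash function h2(k) = 1 + (k mod (m − 1)) are assumptions for illustration only; the slides do not specify them.

```python
def quadratic(k, i, m):
    # hp(k, i) = (h(k) + i^2) mod m, with h(k) = k mod m (assumed)
    return (k % m + i * i) % m


def double_hashing(k, i, m):
    # hp(k, i) = (h1(k) + i * h2(k)) mod m
    h1 = k % m
    h2 = 1 + k % (m - 1)  # assumed second hash; chosen so it is never 0
    return (h1 + i * h2) % m


def key_offset_sequence(k, m, count):
    # offset = floor(key / listSize); each new address adds the offset
    addr = k % m  # home address (assumed modulo division)
    offset = k // m
    sequence = [addr]
    for _ in range(count - 1):
        addr = (addr + offset) % m
        sequence.append(addr)
    return sequence


m = 7
# 10 and 17 are synonyms: both have home address 3
quad_10 = [quadratic(10, i, m) for i in range(4)]
quad_17 = [quadratic(17, i, m) for i in range(4)]
dbl_10 = [double_hashing(10, i, m) for i in range(4)]
dbl_17 = [double_hashing(17, i, m) for i in range(4)]
off_10 = key_offset_sequence(10, m, 4)
off_17 = key_offset_sequence(17, m, 4)

print(quad_10, quad_17)  # [3, 4, 0, 5] [3, 4, 0, 5]
print(dbl_10, dbl_17)    # [3, 1, 6, 4] [3, 2, 1, 0]
print(off_10, off_17)    # [3, 4, 5, 6] [3, 5, 0, 2]
```

The two quadratic sequences are identical, which is secondary clustering in action. Double hashing and key offset separate the synonyms after the first collision because the step depends on the key itself. Note that under key offset a key smaller than m has offset 0, so a practical version would need to handle that case.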
Key Offset, written as a probe function:
  hp(k, i) = (hp(k, i − 1) + ⌊k/m⌋) mod m

Open addressing

Hash and probe function:
  hp : U × {0, 1, 2, ..., m − 1} → {0, 1, 2, ..., m − 1}
  where U is the set of keys, the first set {0, 1, 2, ..., m − 1} holds the probe numbers, and the second holds the addresses

{hp(k, 0), hp(k, 1), ..., hp(k, m − 1)} is a permutation of {0, 1, ..., m − 1}

Linked List Resolution

• Major disadvantage of Open Addressing: each collision resolution increases the probability of future collisions
  → use linked lists to store synonyms

Bucket hashing

• Hashing data to buckets that can hold multiple pieces of data
• Each bucket has an address and collisions are postponed until the bucket is full
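Linked list resolution and bucket hashing, as described above, can each be sketched in a few lines. Everything named here is hypothetical: `ChainedTable` uses Python lists in place of linked lists, `BucketTable` moves to the next bucket when the home bucket is full (the slides do not say how a full bucket is handled), and both assume distinct keys and h(k) = k mod m.

```python
class ChainedTable:
    """Linked list resolution: synonyms share one chain per home address."""

    def __init__(self, m):
        self.m = m
        # A Python list stands in for each linked list of synonyms
        self.chains = [[] for _ in range(m)]

    def insert(self, key):
        chain = self.chains[key % self.m]
        if key not in chain:
            chain.append(key)

    def search(self, key):
        return key in self.chains[key % self.m]


class BucketTable:
    """Bucket hashing: each address holds up to bucket_size keys."""

    def __init__(self, m, bucket_size):
        self.m = m
        self.bucket_size = bucket_size
        self.buckets = [[] for _ in range(m)]

    def insert(self, key):
        # Assumption: when the home bucket is full, try the next bucket
        for i in range(self.m):
            addr = (key % self.m + i) % self.m
            if len(self.buckets[addr]) < self.bucket_size:
                self.buckets[addr].append(key)
                return addr
        raise OverflowError("all buckets are full")


chained = ChainedTable(7)
for k in (10, 17, 24):  # three synonyms with home address 3
    chained.insert(k)
print(chained.chains[3])  # [10, 17, 24]

bucketed = BucketTable(7, bucket_size=2)
placed = [bucketed.insert(k) for k in (10, 17, 24)]
print(placed)  # [3, 3, 4]
```

With chaining, the three synonyms never disturb other addresses. With buckets of size 2, the first two synonyms fit in bucket 3 and only the third one triggers a collision.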
Rotation

... before hashing

original key   rotated key
600101         160010
600102         260010
600103         360010
600104         460010
600105         560010

Digit extraction

Address = selected digits from Key

Example:
379452 → 394
121267 → 112
378845 → 388
160252 → 102
045128 → 051

Mid-square

... the key

Example:
379452: 379 * 379 = 143641 → 364
121267: 121 * 121 = 014641 → 464
045128: 045 * 045 = 002025 → 202
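The rotation, digit extraction, and mid-square examples above can be reproduced with the sketch below. The rules are inferred from the example values rather than stated in the surviving text, so treat them as assumptions: rotation moves the last digit to the front, digit extraction takes the first, third and fourth digits, and mid-square squares the first three digits and keeps digits 3 to 5 of the six-digit square. Keys are passed as strings so that leading zeros survive.

```python
def rotate(key):
    # Move the last digit to the front: "600101" -> "160010"
    return key[-1] + key[:-1]


def digit_extraction(key, positions=(0, 2, 3)):
    # Assumed selection: first, third and fourth digits (0-based 0, 2, 3)
    return "".join(key[p] for p in positions)


def mid_square(key):
    # Assumed variant: square the first three digits of the key,
    # pad the square to six digits, and keep its middle three digits
    part = int(key[:3])
    square = str(part * part).zfill(6)
    return square[2:5]


print(rotate("600101"))            # 160010
print(digit_extraction("379452"))  # 394
print(digit_extraction("045128"))  # 051
print(mid_square("379452"))        # 364
print(mid_square("045128"))        # 202
```

Returning the address as a digit string keeps results such as "051" intact; converting to an integer index is left to the caller.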