Data Structure and Algorithms [CO2003]
Chapter - Hash

Lecturer: Duc Dung Nguyen, PhD
Contact: nddung@hcmut.edu.vn
Faculty of Computer Science and Engineering
Hochiminh city University of Technology

Contents
• Basic concepts
• Hash functions
• Collision resolution

Outcomes
• L.O.5.1 - Depict the following concepts: hashing table, key, collision, and collision resolution
• L.O.5.2 - Describe hashing functions using pseudocode and give examples to show their algorithms
• L.O.5.3 - Describe collision resolution methods using pseudocode and give examples to show their algorithms
• L.O.5.4 - Implement hashing tables using C/C++
• L.O.5.5 - Analyze the complexity and develop experiment (program) to evaluate methods supplied for hashing tables
• L.O.1.2 - Analyze algorithms and use Big-O notation to characterize the computational complexity of algorithms composed by using the following control structures: sequence, branching, and iteration (not recursion)

Basic concepts
• Sequential search: $O(n)$
• Binary search: $O(\log_2 n)$
→ Both require several key comparisons before the target is found

Search complexity:

Size         Binary   Sequential (Average)   Sequential (Worst Case)
16           4        8                      16
50           6        25                     50
256          8        128                    256
1,000        10       500                    1,000
10,000       14       5,000                  10,000
100,000      17       50,000                 100,000
1,000,000    20       500,000                1,000,000

Is there a search algorithm whose complexity is $O(1)$?
YES

Figure 1: Each key has only one address

Open Addressing
There are different methods:
• Linear probing
• Quadratic probing
• Double hashing
• Key offset

Linear Probing
• When a home address is occupied, go to the next address (the current address + 1):
  $hp(k, i) = (h(k) + i) \bmod m$
• Advantages:
  • quite simple to implement
  • data tend to remain near their home address (significant for disk addresses)
• Disadvantages:
  • produces primary clustering

Quadratic Probing
• The address increment is the collision probe number squared:
  $hp(k, i) = (h(k) + i^2) \bmod m$
• Advantages:
  • works much better than linear probing
• Disadvantages:
  • time required to square numbers
  • produces secondary clustering: $h(k_1) = h(k_2) \rightarrow hp(k_1, i) = hp(k_2, i)$
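The linear and quadratic probe formulas above can be tried out with a small open-addressing table. The following is a minimal sketch, not the course's reference implementation: the class name `OpenAddressTable`, the sentinel `kEmpty`, the restriction to non-negative integer keys, and the home-address hash $h(k) = k \bmod m$ are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: keys are non-negative, so -1 can mark a free slot.
const long kEmpty = -1;

// Probe functions from the slides: h is the home address h(k), i the probe number.
std::size_t linearProbe(std::size_t h, std::size_t i, std::size_t m) {
    return (h + i) % m;
}

std::size_t quadraticProbe(std::size_t h, std::size_t i, std::size_t m) {
    return (h + i * i) % m;
}

typedef std::size_t (*ProbeFn)(std::size_t, std::size_t, std::size_t);

// Hypothetical minimal table: insert and find only, no deletion or resizing.
class OpenAddressTable {
public:
    OpenAddressTable(std::size_t m, ProbeFn probe) : slots_(m, kEmpty), probe_(probe) {
        assert(m > 0);
    }

    // Returns the address holding key, or -1 if m probes found no usable slot.
    // With quadratic probing this can happen even when free slots remain,
    // because the probe sequence need not visit every address.
    long insert(long key) {
        assert(key >= 0);
        const std::size_t m = slots_.size();
        const std::size_t h = static_cast<std::size_t>(key) % m;  // assumed h(k) = k mod m
        for (std::size_t i = 0; i < m; ++i) {
            const std::size_t a = probe_(h, i, m);
            if (slots_[a] == kEmpty || slots_[a] == key) {
                slots_[a] = key;
                return static_cast<long>(a);
            }
        }
        return -1;
    }

    // Follows the same probe sequence; reaching an empty slot means the key is absent.
    long find(long key) const {
        assert(key >= 0);
        const std::size_t m = slots_.size();
        const std::size_t h = static_cast<std::size_t>(key) % m;
        for (std::size_t i = 0; i < m; ++i) {
            const std::size_t a = probe_(h, i, m);
            if (slots_[a] == kEmpty) return -1;
            if (slots_[a] == key) return static_cast<long>(a);
        }
        return -1;
    }

private:
    std::vector<long> slots_;
    ProbeFn probe_;
};
```

With m = 11, the keys 22, 33 and 44 all share home address 0. Under `linearProbe` they end up at addresses 0, 1 and 2 (a small primary cluster), while under `quadraticProbe` they end up at 0, 1 and 4.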
Double Hashing
• Using two hash functions:
  $hp(k, i) = (h_1(k) + i \, h_2(k)) \bmod m$

Key Offset
• The new address is a function of the collision address and the key:
  $offset = \lfloor key / listSize \rfloor$
  $newAddress = (collisionAddress + offset) \bmod listSize$
  $hp(k, i) = (hp(k, i - 1) + \lfloor k / m \rfloor) \bmod m$

Open addressing
Hash and probe function:
  $hp : U \times \{0, 1, 2, \ldots, m - 1\} \rightarrow \{0, 1, 2, \ldots, m - 1\}$
where $U$ is the set of keys, the second argument is the probe number, and the result is an address.
  $\{hp(k, 0), hp(k, 1), \ldots, hp(k, m - 1)\}$ is a permutation of $\{0, 1, \ldots, m - 1\}$

Linked List Resolution
• Major disadvantage of Open Addressing: each collision resolution increases the probability for future collisions
→ use linked lists to store synonyms

Bucket hashing
• Hashing data to buckets that can hold multiple pieces of data
• Each bucket has an address and collisions are postponed until the bucket is full
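Linked list resolution, described above, keeps every synonym in a list attached to its home address, so a collision never occupies another key's slot. The following is a minimal sketch under stated assumptions: the class name `ChainedTable`, the use of `std::forward_list` as the linked list, unsigned integer keys, and the hash $h(k) = k \bmod m$ are illustrative choices, not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Hypothetical minimal chained table for unsigned integer keys.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t m) : chains_(m) { assert(m > 0); }

    // Adds key to the front of its synonym list; returns false if already stored.
    bool insert(unsigned long key) {
        if (contains(key)) return false;
        chains_[home(key)].push_front(key);
        return true;
    }

    // Searches only the list at the key's home address.
    bool contains(unsigned long key) const {
        for (unsigned long stored : chains_[home(key)]) {
            if (stored == key) return true;
        }
        return false;
    }

    // Number of synonyms currently stored at one address.
    std::size_t chainLength(std::size_t address) const {
        assert(address < chains_.size());
        return static_cast<std::size_t>(
            std::distance(chains_[address].begin(), chains_[address].end()));
    }

private:
    // Assumed hash function: key mod table size.
    std::size_t home(unsigned long key) const { return key % chains_.size(); }

    std::vector<std::forward_list<unsigned long> > chains_;
};
```

For example, with m = 7 the keys 10, 17 and 24 all hash to address 3 and share one list of length 3, while key 5 sits alone at address 5. Search cost then depends on the length of a single list rather than on how full the neighbouring addresses are.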
Direct Hashing
• The address is the key itself: $hash(Key) = Key$
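Direct hashing as defined above needs no computation and produces no collisions, but it only works when every key is a valid address. The following is a minimal sketch under that assumption: the names `directHash` and `DirectTable`, and the choice of string values, are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Direct hashing: the key is used as the address without any transformation.
std::size_t directHash(std::size_t key) { return key; }

// Hypothetical direct-address table; assumes every key lies in [0, size).
class DirectTable {
public:
    explicit DirectTable(std::size_t size) : used_(size, false), values_(size) {}

    // Stores value at the address equal to the key, overwriting any old value.
    void put(std::size_t key, const std::string& value) {
        assert(key < values_.size());
        const std::size_t a = directHash(key);
        used_[a] = true;
        values_[a] = value;
    }

    // True if a value has been stored for this key.
    bool has(std::size_t key) const {
        return key < used_.size() && used_[directHash(key)];
    }

    // Precondition: has(key) is true.
    const std::string& get(std::size_t key) const {
        assert(has(key));
        return values_[directHash(key)];
    }

private:
    std::vector<bool> used_;
    std::vector<std::string> values_;
};
```

Every lookup touches exactly one slot, which is the $O(1)$ behaviour the chapter is after. The cost is one slot per possible key, so the approach only suits small, dense key ranges.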