|
1 | 1 | # HashMap Design |
2 | 2 |
|
3 | | -Constraints and assumptions |
4 | | -For simplicity, are the keys integers only? |
5 | | -Yes |
6 | | -For collision resolution, can we use chaining? |
7 | | -Yes |
8 | | -Do we have to worry about load factors? |
9 | | -No |
10 | | -Can we assume inputs are valid or do we have to validate them? |
11 | | -Assume they're valid |
12 | | -Can we assume this fits memory? |
13 | | -Yes |
| 3 | +Design a HashMap without using any built-in hash table libraries. |
| 4 | + |
| 5 | +Implement the HashMap class: |
| 6 | + |
| 7 | +- `HashMap()` initializes the object with an empty map. |
| 8 | +- `void put(int key, int value)` inserts a (key, value) pair into the HashMap. If the key already exists in the map, |
| 9 | + update the corresponding value. |
| 10 | +- `int get(int key)` returns the value to which the specified key is mapped, or -1 if this map contains no mapping for |
| 11 | + the key. |
| 12 | +- `void remove(int key)` removes the key and its corresponding value if the map contains the mapping for the key. |
| 13 | + |
| 14 | +## Example |
| 15 | + |
| 16 | +Example 1: |
| 17 | + |
| 18 | +```text |
| 19 | +Input |
| 20 | +["MyHashMap", "put", "put", "get", "get", "put", "get", "remove", "get"] |
| 21 | +[[], [1, 1], [2, 2], [1], [3], [2, 1], [2], [2], [2]] |
| 22 | +Output |
| 23 | +[null, null, null, 1, -1, null, 1, null, -1] |
| 24 | + |
| 25 | +Explanation |
| 26 | +MyHashMap myHashMap = new MyHashMap(); |
| 27 | +myHashMap.put(1, 1); // The map is now [[1,1]] |
| 28 | +myHashMap.put(2, 2); // The map is now [[1,1], [2,2]] |
| 29 | +myHashMap.get(1); // return 1, The map is now [[1,1], [2,2]] |
| 30 | +myHashMap.get(3); // return -1 (i.e., not found), The map is now [[1,1], [2,2]] |
| 31 | +myHashMap.put(2, 1); // The map is now [[1,1], [2,1]] (i.e., update the existing value) |
| 32 | +myHashMap.get(2); // return 1, The map is now [[1,1], [2,1]] |
| 33 | +myHashMap.remove(2); // remove the mapping for 2, The map is now [[1,1]] |
| 34 | +myHashMap.get(2); // return -1 (i.e., not found), The map is now [[1,1]] |
| 35 | +``` |
| 36 | + |
| 37 | +## Constraints |
| 38 | + |
| 39 | +- 0 <= key, value <= 10^6 |
| 40 | +- At most 10^4 calls will be made to put, get, and remove. |
| 41 | + |
| 42 | +## Topics |
| 43 | + |
| 44 | +- Array |
| 45 | +- Hash Table |
| 46 | +- Linked List |
| 47 | +- Design |
| 48 | +- Hash Function |
| 49 | + |
| 50 | +## Solution |
| 51 | + |
| 52 | +A hash map is a fundamental data structure found in various programming languages. Its key feature is facilitating fast |
| 53 | +access to a value associated with a given key. Designing an efficient hash map involves addressing two main challenges: |
| 54 | + |
| 55 | +1. **Hash function design**: The hash function serves to map a key to a location in the storage space. A good hash |
| 56 | + function ensures that keys are evenly distributed across the storage space, preventing the clustering of keys in |
| 57 | + certain locations. This even distribution helps maintain efficient access to stored values. |
| 58 | + |
| 59 | +2. **Collision handling**: Despite efforts to evenly distribute keys, collisions—where two distinct keys map to the same |
| 60 | + storage location—are inevitable due to the finite nature of the storage space compared to the potentially infinite |
| 61 | + key space. Effective collision-handling strategies are crucial to ensure data integrity and efficient retrieval. To |
| 62 | + deal with collisions, we can use methods like chaining, where we link multiple values together at that location, or |
| 63 | + open addressing, where we find another empty location for the key. |
| 64 | + |
| 65 | +### Step-by-step solution construction |
| 66 | + |
| 67 | +The first step is to design a hash function using the modulo operator, particularly suitable for integer-type keys. |
| 68 | +The modulo operator, denoted by %, is a mathematical operation that returns the remainder of dividing one number by |
| 69 | +another. When selecting a modulo base, it’s advisable to choose a prime number. A prime base rarely shares a common |
| 70 | +factor with regular patterns in the keys (for example, keys that are all multiples of some step), so hash codes spread |
| 71 | +more evenly and collisions (where two different keys hash to the same value) become less likely. |
| 72 | + |
| 73 | +Here’s the implementation of a hash function using a prime number, 2069, as the modulo base. This particular prime number |
| 74 | +is likely chosen because it is relatively large, offering a wide range of possible hash codes and reducing the chance |
| 75 | +of collisions. |
| 76 | + |
| 77 | +```python |
| 78 | +def calculate_hash(key): |
| 79 | +    key_base = 2069 |
| 80 | +    return key % key_base |
| 81 | + |
| 82 | +def main(): |
| 83 | +    # Example usage: |
| 84 | +    keys = [1, 2068, 2070] |
| 85 | +    i = 0 |
| 86 | +    for key in keys: |
| 87 | +        i += 1 |
| 88 | +        hashed_value = calculate_hash(key) |
| 89 | +        print(i, ".\tKey:", key) |
| 90 | +        print("\tHashed value:", hashed_value) |
| 91 | + |
| 92 | +main() |
| 93 | +``` |
| 94 | + |
| 95 | +```text |
| 96 | +1 . Key: 1 |
| 97 | + Hashed value: 1 |
| 98 | + |
| 99 | +2 . Key: 2068 |
| 100 | + Hashed value: 2068 |
| 101 | + |
| 102 | +3 . Key: 2070 |
| 103 | + Hashed value: 1 |
| 104 | +``` |
| 105 | + |
| 106 | +In the output above, keys 1 and 2070 both yield the hash value 1 when taken modulo the base 2069, so these two |
| 107 | +distinct keys collide. |
| 108 | + |
| 109 | +Now, let’s look at a visual representation of hash collision: |
| 110 | + |
| 111 | + |
| 112 | + |
| 113 | +In the scenario illustrated in the diagram above, two distinct keys are assigned to the same address, which results in |
| 114 | +a collision. Therefore, the second step is to handle collisions by using a storage space where each element is indexed |
| 115 | +by the output of the hash function. Each element is a container, called a bucket, designed to store all key-value |
| 116 | +pairs that are assigned the same hash value by the hash function. |
| 117 | + |
| 118 | +Let’s look at the diagram below to visualize collision handling through the use of buckets: |
| 119 | + |
| 120 | + |
| 121 | + |
| 122 | +Now, let’s design a Bucket for collision handling that supports three primary operations: Get, Update, and Remove. |
| 123 | +These operations manage the key-value pairs within each bucket, accommodating cases where multiple keys hash to |
| 124 | +the same index. |
| 125 | + |
| 126 | +- **Get(key)**: Searches the bucket for a key-value pair where the key matches the provided argument. If such a pair is |
| 127 | + found, the method returns the corresponding value. If the key does not exist within the bucket, the method returns |
| 128 | + −1. This functionality is crucial for retrieval operations in a hash table, allowing for efficient access to stored |
| 129 | + data based on keys. |
| 130 | +- **Update(key, value)**: Looks for the specified key in the bucket. If the key is found, the method updates the existing |
| 131 | + key-value pair with the new value provided. If the key is not found, the method adds a new key-value pair to the bucket. |
| 132 | + This dual functionality ensures that the bucket can dynamically adjust to changes in data, either by updating existing |
| 133 | + entries or adding new ones to accommodate new keys. |
| 134 | +- **Remove(key)**: Searches the bucket for a key-value pair matching the specified key. If such a pair is found, the |
| 135 | + method removes it from the bucket, effectively handling the deletion of entries. |
| 136 | + |
| 137 | +Collision handling occurs implicitly within the Update function of the Bucket. It effectively handles collisions by |
| 138 | +allowing multiple key-value pairs with the same hash value (i.e., the same bucket index) to coexist within the bucket. |
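The three bucket operations described above can be sketched as follows. This is a minimal illustration that assumes chaining with a plain Python list of `(key, value)` tuples; the class name `Bucket`, the attribute `pairs`, and the lowercase method names are choices made for this sketch, and a linked list would serve equally well.

```python
class Bucket:
    """Holds every (key, value) pair whose key hashes to the same index."""

    def __init__(self):
        # Assumption for this sketch: pairs live in a plain list (chaining).
        self.pairs = []

    def get(self, key):
        # Linear scan of the bucket; -1 signals "not found".
        for stored_key, stored_value in self.pairs:
            if stored_key == key:
                return stored_value
        return -1

    def update(self, key, value):
        # Overwrite the value if the key is already present...
        for index, (stored_key, _) in enumerate(self.pairs):
            if stored_key == key:
                self.pairs[index] = (key, value)
                return
        # ...otherwise append, which lets colliding keys coexist.
        self.pairs.append((key, value))

    def remove(self, key):
        # Delete the matching pair if it exists; do nothing otherwise.
        for index, (stored_key, _) in enumerate(self.pairs):
            if stored_key == key:
                del self.pairs[index]
                return
```

Keys 1 and 2070 collide under modulo 2069, yet calling `update` for both on the same bucket keeps them as two separate pairs, which is the implicit collision handling mentioned above.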
| 139 | + |
| 140 | +Moving forward, the third step involves designing a hash map by utilizing the hash function and the Bucket designed earlier. |
| 141 | + |
| 142 | +The core operation of a hash map is locating stored values by key, so each hash map method (Get, Put, and Remove) |
| 143 | +begins by locating the entry for the given key. This process involves two steps: |
| 144 | + |
| 145 | +1. Applying the hash function to generate a hash key for a given key value, determining the address in the main storage |
| 146 | + and finding the corresponding bucket. |
| 147 | +2. Iterating through the bucket to check if the desired key-value pair exists. |
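The two steps above can be combined into a hash map along these lines. This is a sketch under stated assumptions rather than a definitive implementation: the class name `MyHashMap` is taken from the example at the top of this document, while the `key_space` parameter, the `_bucket_for` helper, and the compact list-based `Bucket` are names introduced here for illustration.

```python
class Bucket:
    """Compact chained bucket: a list of (key, value) tuples."""

    def __init__(self):
        self.pairs = []

    def get(self, key):
        for stored_key, stored_value in self.pairs:
            if stored_key == key:
                return stored_value
        return -1

    def update(self, key, value):
        for index, (stored_key, _) in enumerate(self.pairs):
            if stored_key == key:
                self.pairs[index] = (key, value)
                return
        self.pairs.append((key, value))

    def remove(self, key):
        for index, (stored_key, _) in enumerate(self.pairs):
            if stored_key == key:
                del self.pairs[index]
                return


class MyHashMap:
    def __init__(self, key_space=2069):
        # One empty bucket per possible hash value.
        self.key_space = key_space
        self.buckets = [Bucket() for _ in range(key_space)]

    def _bucket_for(self, key):
        # Step 1: hash the key to find the bucket's address.
        return self.buckets[key % self.key_space]

    def put(self, key, value):
        # Step 2 happens inside the bucket: scan, then update or append.
        self._bucket_for(key).update(key, value)

    def get(self, key):
        return self._bucket_for(key).get(key)

    def remove(self, key):
        self._bucket_for(key).remove(key)
```

Every public method delegates to the bucket chosen by `_bucket_for`, so the hash function lives in one place and can be swapped without touching `put`, `get`, or `remove`.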
| 148 | + |
| 149 | + |
| 150 | + |
| 151 | + |
| 152 | + |
| 153 | + |
| 154 | + |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | + |
| 159 | + |
| 160 | + |
| 161 | +### Solution Summary |
| 162 | + |
| 163 | +1. Choose a prime number for the key space size (preferably a large one). |
| 164 | +2. Create an array and initialize it with empty buckets equal to the key space size. |
| 165 | +3. Generate a hash key by taking the modulus of the input key with the key space size. |
| 166 | +4. Implement the following functions: |
| 167 | + - Put(key, value): Inserts the pair into the bucket at the computed hash key index, or updates the value if the key exists |
| 168 | + - Get(key): Searches for the key in the bucket and returns the associated value, or -1 if the key is absent |
| 169 | + - Remove(key): Deletes the pair with the specified key from its bucket, if present |
| 170 | + |
| 171 | +### Time Complexity |
| 172 | + |
| 173 | +Each method of the hash map has a time complexity of O(N/K), where N represents the total number of possible keys, and |
| 174 | +K represents the key space size, which in our case is 2069. |
| 175 | + |
| 176 | +In an ideal scenario with evenly distributed keys, the average size of each bucket can be considered as N/K. However, in |
| 177 | +the worst-case scenario, we may need to iterate through an entire bucket to find the desired value, resulting in a time |
| 178 | +complexity of O(N) for each method. |
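To make the gap between the even and worst cases concrete, the sketch below counts how many keys land in the fullest bucket for two key sets. The helper `largest_bucket` and the two key sets are hypothetical, chosen only for illustration; both sets stay within the stated limit of 10^6 for keys.

```python
from collections import Counter

KEY_SPACE = 2069  # the modulo base used in this walkthrough


def largest_bucket(keys, key_space=KEY_SPACE):
    """Return the size of the fullest bucket after hashing every key."""
    counts = Counter(key % key_space for key in keys)
    return max(counts.values())


# 10,000 consecutive keys spread almost evenly across the 2069 buckets.
spread_keys = range(10_000)

# Every multiple of 2069 up to 10^6 hashes to 0, so all share one bucket.
clustered_keys = range(0, 10**6 + 1, KEY_SPACE)

print(largest_bucket(spread_keys))     # 5
print(largest_bucket(clustered_keys))  # 484
```

With consecutive keys, no lookup scans more than 5 pairs, while the clustered set forces a scan of up to all 484 stored pairs, which is the linear worst case described above.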
| 179 | + |
| 180 | +### Space Complexity |
| 181 | + |
| 182 | +The space complexity is O(K+M), where K denotes the key space size, and M represents the number of unique keys that have |
| 183 | +been inserted into the hashmap. |