left-icon

Data Structures Succinctly® Part 2
by Robert Horvick

Previous
Chapter

of
A
A
A

CHAPTER 2

Hash Table

Hash Table


Hash Table Overview

Hash tables are a collection type that store key–value pairs in a manner that provides fast insertion, lookup, and removal operations. Hash tables are commonly used in, but certainly not limited to, the implementation of associative arrays and data caches. For example, a website might keep track of active sessions in a hash table using the following pattern:

HashTable<string, SessionState> _sessionStateCache;

...

public SessionState LoadSession(string sessionId)

{

    SessionState state;

    if (!_sessionStateCache.TryGetValue(sessionId, out state))

    {

        state = new SessionState(sessionId);

        _sessionStateCache[sessionId] = state;

    }

    return state;

}

In this example, a hash table is being used to store session state using the session ID as the key and the session state as the value. When the session state is sought in the hash table, if it is not found, a new session state object is added and, in either case, the state that matches the session ID is returned.

Using a hash table in this manner allows fast insertion and retrieval (on average) of the session state, regardless of how many active sessions are occurring concurrently.

Hashing Basics

Overview

The Key and Value

To understand how a hash table works, let’s look at a conceptual overview of adding an item to a hash table and then finding that item.

The object we’ll be storing (shown in JSON format) represents an employee at a company.

{

    "Name":"Robert Horvick",

    "Hire Date":"11/2/2010",

    "Department":"Engineering"

}

Recall that to store an item in a hash table, we need to have both a key and a value. Our object is the value, so now we need to pick a key. Ideally we would pick something that can uniquely represent the object being stored; however, in this case we will use the employee name (Robert Horvick) to demonstrate that the key can be any data type. In practice, the Employee class would contain a unique ID that distinguishes between multiple employees who share the same name, and that ID would be the key we use.

The Backing Array

For fast access to items, hash tables are backed by an array (O(1) random access) rather than a list (O(n) random access). At any given moment, the array has two properties that are interesting:

  1. Capacity
  2. Fill Factor

Capacity is the number of items the array could possibly hold. For example, the following empty array has a capacity of 10:

An array with a capacity of 10

Figure 12: An array with a capacity of 10

Fill factor is the percentage of array items that are filled (in use). For example, the following array has a fill factor of 0.40 (40%):

An array with a capacity of 10 and fill factor of 0.40 (40%).

Figure 13: An array with a capacity of 10 and fill factor of 0.40 (40%).

Notice that the array is filled in an apparently random manner. While the array contains four items, the items are not stored in indexes 0–3, but rather 1, 2, 4, and 6. This is because the index at which an item is stored is determined by a hash function which takes the key component—Robert Horvick, in our example—and returns an integer hash code. This hash code will then be fit into the array’s size using the modulo operation. For example:

int index = hash(“Robert Horvick”) % Capacity;

Hashing Algorithms

The previous code sample makes a call to a function named hash, which accepts a string and returns an integer. This integer is the hash code of the provided string.

Before we go further, let’s take a moment to consider just how important hash codes are. The .NET framework requires that all classes derive from the base type System.Object. This type provides the base implementation of several methods, one of which has the following signature:

int GetHashCode()

Putting this method on the common base type ensures that every type will be able to produce a hash code and therefore be capable of being stored in a collection type that requires a hash code.

The question, then, is what should the hash code for any given object instance be? How does a System.String with a value like "Robert Horvick" produce an integer value suitable for being used as a hash code?

The function needs to have two properties. First, the hash algorithm must be stable. This means that given the same input, the same hash value will always be returned. Second, the hash algorithm must be uniform. This means that hash function maps input values to output values in a manner that is evenly (uniformly) distributed through the entire output range.

Here is a (bad) example:

public int LengthHashCode(string input)

{

    return input.Length;

}

This hash code method returns the length of the string as the hash code. This method is stable. The string "Robert Horvick" will always return the same hash code (14). But this method does not have uniform distribution. What would happen if we had one million unique strings, each of which was 50 characters long? Each of them would have a hash code of 50. This is not a uniform distribution, and therefore not a suitable hash algorithm for strings.

Here’s a slightly better (bad) example:

private int AdditiveHash(string input)

{

    int currentHashValue = 0;

    foreach (char c in input)

    {

        unchecked

        {

            currentHashValue += (int)c;

        }

    }

    return currentHashValue;

}

This hash function has only slightly better uniformity than the length-based hash. While an additive hash does allow same-length strings produce different hashes, it also means that “Robert Horvick” and “Horvick Robert” will both produce the same hash value.

Now that we know what a poor hashing algorithm looks like, let’s take a look at a significantly better string hashing algorithm. This algorithm was first reported by Dan Bernstein (http://www.cse.yorku.ca/~oz/hash.html) and uses an algorithm that, for each character in the value to hash (c), sets the current hash value to hash = (hash * 33) + c.

// Hashing function first reported by Dan Bernstein.

// http://www.cse.yorku.ca/~oz/hash.html

private static int Djb2(string input)

{

    int hash = 5381;

    foreach (int c in input.ToCharArray())

    {

        unchecked

        {

            /* hash * 33 + c */

            hash = ((hash << 5) + hash) + c;

        }

    }

    return hash;

}

Just for fun, let’s look at one more hash algorithm. This hash algorithm, known as a folding hash, does not process the string character by character, but rather in 4-byte blocks. Let’s take a look at how the ASCII string “Robert Horvick” would be hashed. First, the string is broken up into 4-byte blocks. Since we are using ASCII encoding, each character is one block, and so the segments are:

[Robe]

[rt H]

[orvi]

[ck]

Each of those characters is represented by a 1-byte numeric ASCII code. Those bytes are:

[0x52 0x6F 0x62 0x65]

[0x72 0x74 0x20 0x48]

[0x6F 0x72 0x76 0x69]

[0x63 0x6B]

These bytes are then stuffed into 32-bit values (the bytes are reversed here due to how they are loaded into the resulting integer. See the GetNextBytes method in the sample code.)

0x65626F52

0x48207472

0x6976726F

0x00006B63

The values are summed, allowing overflow to occur, and we are given the final hash value: 0x16F9C196.

// Treats each four characters as an integer, so

// "aaaabbbb" hashes differently than "bbbbaaaa".

private static int FoldingHash(string input)

{

    int hashValue = 0;

    int startIndex = 0;

    int currentFourBytes;

    do

    {

        currentFourBytes = GetNextBytes(startIndex, input);

        unchecked

        {

            hashValue += currentFourBytes;

        }

        startIndex += 4;

    } while (currentFourBytes != 0);

    return hashValue;

}

// Gets the next four bytes of the string converted to an

// integer. If there are not enough characters, 0 is used.

private static int GetNextBytes(int startIndex, string str)

{

    int currentFourBytes = 0;

    currentFourBytes += GetByte(str, startIndex);

    currentFourBytes += GetByte(str, startIndex + 1) << 8;

    currentFourBytes += GetByte(str, startIndex + 2) << 16;

    currentFourBytes += GetByte(str, startIndex + 3) << 24;

    return currentFourBytes;

}

private static int GetByte(string str, int index)

{

    if (index < str.Length)

    {

        return (int)str[index];

    }

    return 0;

}

The last two hashing functions are conceptually simple and also simple to implement. But how good are they? I created a simple test that generated one million unique values by converting GUIDs to strings. I then hashed those one million unique strings and recorded the number of hash collisions, which occur when two distinct values have the same hash value. The results were:

DJB2 unique values: 99.88282%

Folding unique values: 97.75495%

As you can see, both hash algorithms distributed the hash values relatively evenly with DJB2 having slightly better distribution than the folding hash.

Handling Collisions

As we saw in the previous section, a good hashing algorithm is one that will distribute the hashed values evenly over the possible range of hash values, but we also saw that even a good algorithm will likely produce a collision. Further, we know that the hash value will eventually be fit into the backing array size using the modulo operator, so even a perfect hashing algorithm may eventually have collisions when the hash value is fit into the backing array size.

Next Open Slot

The next open slot method walks forward in the backing array searching for the next open slot and places the item in that location. For example, in the following figure, the values V1 and V2 have the same hash value. Since V1 is already in the hash table, V2 moves forward to the next open slot in the hash table.

Collisions of the hash values for V1 and V2

Figure 14: Collisions of the hash values for V1 and V2

During the look-up process, if the value V2 is sought, the index for V1 will be found. The value of V1 and V2 will be compared and they will not match. Since next-slot collision handling is being used, the hash table needs to check the next index to determine if a collision was moved forward. In the next slot, the value V2 is found and compared to the sought value, V2. Since they are the same, the appropriate backing array index has been found.

We can see that this method has simple insertion and search rules, but unfortunately has complex removal logic.

Consider what would happen if V1 were removed: The third index, which V1 was in, is now empty. If a search for V2 were performed, the expected index would be empty so it would be assumed V2 is not in the hash table, even though it is. This means that during the removal process, all values adjacent to the item being removed need to be checked to see if they need to be moved.

One trade-off of this collision handling algorithm is that removals are complex, but the entire hash table is stored in a single, contiguous backing array. This might make it attractive on systems where memory resources are limited, or where data locality in memory is extremely important.

Linked List Chains

Another method of handling collisions is to have each index in the hash table backing array be a linked list of nodes. When a collision occurs, the new value is added to the linked list. For example:

V2 is added to the linked list after V1

Figure 15: V2 is added to the linked list after V1

With a language that supports union types (e.g., C++), the array index will typically contain a value that can be either the single value when there have not been any collisions, or a linked list. The sample code in the next section will always create a linked list, but will only do so when an item is added at the index.

HashTableNodePair Class

The HashTableNodePair class is the key–value pair stored within the hash table array. Unlike the .NET Framework KeyValuePair class, the HashTableNodePair does not allow the key member to be assigned after construction, because that value is used to determine the index in the hash table where the key–value pair is stored.

/// <summary>

/// A node in the hash table array.

/// </summary>

/// <typeparam name="TKey">The type of the key of the key/value pair.</typeparam>

/// <typeparam name="TValue">The type of the value of the key/value pair.</typeparam>

public class HashTableNodePair<TKey, TValue>

{

    /// <summary>

    /// Constructs a key/value pair for storage in the hash table.

    /// </summary>

    /// <param name="key">The key of the key/value pair.</param>

    /// <param name="value">The value of the key/value pair.</param>

    public HashTableNodePair(TKey key, TValue value)

    {

        Key = key;

        Value = value;

    }

    /// <summary>

    /// The key. The key cannot be changed because it would affect the

    /// indexing in the hash table.

    /// </summary>

    public TKey Key { get; private set; }

    /// <summary>

    /// The value.

    /// </summary>

    public TValue Value { get; set; }

}

HashTableArrayNode Class

The HashTableArrayNode class represents a single node within the hash table. It performs a lazy initialization of the linked list used for handling collisions. It provides methods for adding, removing, updating, and retrieving the key–value pairs stored in the node. Additionally, it provides enumeration of the keys and values in order to support the hash table’s enumeration requirements.

internal class HashTableArrayNode<TKey, TValue>

{

    // This list contains the actual data in the hash table. It chains together

    // data collisions.

    LinkedList<HashTableNodePair<TKey, TValue>> _items;

    public void Add(TKey key, TValue value);

    public void Update(TKey key, TValue value);

    public bool TryGetValue(TKey key, out TValue value);

    public bool Remove(TKey key);

    public void Clear();

    public IEnumerable<TValue> Values { get; }

    public IEnumerable<TKey> Keys { get; }

    public IEnumerable<HashTableNodePair<TKey, TValue>> Items { get; }

}

Add

Behavior

Adds the key–value pair to the node, lazily initializing the linked list when adding the first value. If the key being added already exists, an exception is thrown.

Performance

O(1)

/// <summary>

/// Adds the key/value pair to the node. If the key already exists in the

/// list, an ArgumentException will be thrown.

/// </summary>

/// <param name="key">The key of the item being added.</param>

/// <param name="value">The value of the item being added.</param>

public void Add(TKey key, TValue value)

{

    // Lazy init the linked list.

    if (_items == null)

    {

        _items = new LinkedList<HashTableNodePair<TKey, TValue>>();

    }

    else

    {

        // Multiple items might collide and exist in this list, but each

        // key should only be in the list once.

        foreach (HashTableNodePair<TKey, TValue> pair in _items)

        {

            if (pair.Key.Equals(key))

            {

                throw new ArgumentException("The collection already contains the key");

            }

        }

    }

    // If we made it this far, add the item.

    _items.AddFirst(new HashTableNodePair<TKey, TValue>(key, value));

}

Update

Behavior

Finds the key–value pair with the matching key and updates the associated value. If the key is not found, an exception is thrown.

Performance

O(n), where n is the number of values in the linked list. In general this will be an O(1) algorithm because there will not be a collision.

/// <summary>

/// Updates the value of the existing key/value pair in the list.

/// If the key does not exist in the list, an ArgumentException

/// will be thrown.

/// </summary>

/// <param name="key">The key of the item being updated.</param>

/// <param name="value">The updated value.</param>

public void Update(TKey key, TValue value)

{

    bool updated = false;

    if (_items != null)

    {

        // Check each item in the list for the specified key.

        foreach (HashTableNodePair<TKey, TValue> pair in _items)

        {

            if (pair.Key.Equals(key))

            {

                // Update the value.

                pair.Value = value;

                updated = true;

                break;

            }

        }

    }

    if (!updated)

    {

        throw new ArgumentException("The collection does not contain the key.");

    }

}

TryGetValue

Behavior

Sets the out parameter value to the value associated with the provided key and returns true if the key is found. Otherwise it returns false.

Performance

O(n), where n is the number of values in the linked list. In general, this will be an O(1) algorithm because there will not be a collision.

/// <summary>

/// Finds and returns the value for the specified key.

/// </summary>

/// <param name="key">The key whose value is sought.</param>

/// <param name="value">The value associated with the specified key.</param>

/// <returns>True if the value was found, false otherwise.</returns>

public bool TryGetValue(TKey key, out TValue value)

{

    value = default(TValue);

    bool found = false;

    if (_items != null)

    {

        foreach (HashTableNodePair<TKey, TValue> pair in _items)

        {

            if (pair.Key.Equals(key))

            {

                value = pair.Value;

                found = true;

                break;

            }

        }

    }

    return found;

}

Remove

Behavior

Finds the key–value pair with the matching key and removes the key–value pair from the linked list. If the pair is removed, the value true is returned. Otherwise it returns false.

Performance

O(n), where n is the number of values in the linked list. In general, this will be an O(1) algorithm because there will not be a collision.

  

/// <summary>

/// Removes the item from the list whose key matches

/// the specified key.

/// </summary>

/// <param name="key">The key of the item to remove.</param>

/// <returns>True if the item is removed; false otherwise.</returns>

public bool Remove(TKey key)

{

    bool removed = false;

    if (_items != null)

    {

        LinkedListNode<HashTableNodePair<TKey, TValue>> current = _items.First;

        while (current != null)

        {

            if (current.Value.Key.Equals(key))

            {

                _items.Remove(current);

                removed = true;

                break;

            }

            current = current.Next;

        }

    }

    return removed;

}

Clear

Behavior

Removes all the items from the linked list.

Note: This implementation simply clears the linked list; however, it would also be possible to assign the _items reference to null and let the garbage collector reclaim the memory. The next call to Add would allocate a new linked list.

Performance

O(1)

/// <summary>

/// Removes all the items from the list.

/// </summary>

public void Clear()

{

    if (_items != null)

    {

        _items.Clear();

    }

}

Enumeration

Keys

Behavior

Returns an enumerator that enumerates the keys in the linked list.

Performance

O(1)

/// <summary>

/// Returns an enumerator for all of the keys in the list.

/// </summary>

public IEnumerable<TKey> Keys

{

    get

    {

        if (_items != null)

        {

            foreach (HashTableNodePair<TKey, TValue> node in _items)

            {

                yield return node.Key;

            }

        }

    }

}

Values

Behavior

Returns an enumerator that enumerates the values in the linked list.

Performance

O(1)

/// <summary>

/// Returns an enumerator for all of the values in the list.

/// </summary>

public IEnumerable<TValue> Values

{

    get

    {

        if (_items != null)

        {

            foreach (HashTableNodePair<TKey, TValue> node in _items)

            {

                yield return node.Value;

            }

        }

    }

}

Items

Behavior

Returns an enumerator that enumerates the key–value pairs in the linked list.

Performance

O(1)

/// <summary>

/// Returns an enumerator for all the key/value pairs in the list.

/// </summary>

public IEnumerable<HashTableNodePair<TKey, TValue>> Items

{

    get

    {

        if (_items != null)

        {

            foreach (HashTableNodePair<TKey, TValue> node in _items)

            {

                yield return node;

            }

        }

    }

}

HashTableArray Class

The HashTableArray class is the backing array of the HashTable class. It handles the jobs of finding the appropriate backing array index and deferring to the HashTableArrayNode class.

class HashTableArray<TKey, TValue>

{

    HashTableArrayNode<TKey, TValue>[] _array;

    /// <summary>

    /// Constructs a new hash table array with the specified capacity.

    /// </summary>

    /// <param name="capacity">The capacity of the array.</param>

    public HashTableArray(int capacity)

    {

        _array = new HashTableArrayNode<TKey, TValue>[capacity];

    }

    public void Add(TKey key, TValue value);

    public void Update(TKey key, TValue value);

    public bool Remove(TKey key);

    public bool TryGetValue(TKey key, out TValue value);

    public int Capacity { get; }

    public void Clear();

    public IEnumerable<TValue> Values  { get; }

    public IEnumerable<TKey> Keys { get; }

    public IEnumerable<HashTableNodePair<TKey, TValue>> Items { get; }

    // Maps a key to the array index based on the hash code.

    private int GetIndex(TKey key);

}

Add

Behavior

Adds the key–value pair to the node array. If the key already exists in the node array, an exception will be thrown.

Performance

O(1)

The main purpose of this method is to lazily allocate the HashTableArrayNode instance so that only hash table entries that actually hold a value allocate an instance.

/// <summary>

/// Adds the key/value pair to the node. If the key already exists in the

/// node array, an ArgumentException will be thrown.

/// </summary>

/// <param name="key">The key of the item being added.</param>

/// <param name="value">The value of the item being added.</param>

public void Add(TKey key, TValue value)

{

    int index = GetIndex(key);

    HashTableArrayNode<TKey, TValue> nodes = _array[index];

    if (nodes == null)

    {

        nodes = new HashTableArrayNode<TKey, TValue>();

        _array[index] = nodes;

    }

    nodes.Add(key, value);

}

Update

Behavior

Updates the value of the key–value pair whose key matches the provided key. If the key does not exist, an exception is thrown.

Performance

O(n), where n is the number of items stored in the HashTableNodeArray instance. This will typically be an O(1) operation.

/// <summary>

/// Updates the value of the existing key/value pair in the node array.

/// If the key does not exist in the array, an ArgumentException

/// will be thrown.

/// </summary>

/// <param name="key">The key of the item being updated.</param>

/// <param name="value">The updated value.</param>

public void Update(TKey key, TValue value)

{

    HashTableArrayNode<TKey, TValue> nodes = _array[GetIndex(key)];

    if (nodes == null)

    {

        throw new ArgumentException("The key does not exist in the hash table", "key");

    }

    nodes.Update(key, value);

}

TryGetValue

Behavior

Finds the value associated with the provided key and sets the out parameter to that value (else the default value for the contained type). Returns true if the value is found. Otherwise it returns false.

Performance

O(n), where n is the number of items stored in the HashTableNodeArray instance. This will typically be an O(1) operation.

/// <summary>

/// Finds and returns the value for the specified key.

/// </summary>

/// <param name="key">The key whose value is sought.</param>

/// <param name="value">The value associated with the specified key.</param>

/// <returns>True if the value is found; false otherwise.</returns>

public bool TryGetValue(TKey key, out TValue value)

{

    HashTableArrayNode<TKey, TValue> nodes = _array[GetIndex(key)];

    if (nodes != null)

    {

        return nodes.TryGetValue(key, out value);

    }

    value = default(TValue);

    return false;

}

Remove

Behavior

Removes the key–value pair whose key matches the provided key. Returns true if the key is found and removed. Otherwise it returns false.

Performance

O(n), where n is the number of items stored in the HashTableNodeArray instance. This will typically be an O(1) operation.

/// <summary>

/// Removes the item from the node array whose key matches

/// the specified key.

/// </summary>

/// <param name="key">The key of the item to remove.</param>

/// <returns>True if the item was removed; false otherwise.</returns>

public bool Remove(TKey key)

{

    HashTableArrayNode<TKey, TValue> nodes = _array[GetIndex(key)];

    if (nodes != null)

    {

        return nodes.Remove(key);

    }

    return false;

}

GetIndex

Behavior

Returns the index in the backing array the key hashes to.

Performance

O(1)

// Maps a key to the array index based on the hash code.

private int GetIndex(TKey key)

{

    return Math.Abs(key.GetHashCode() % Capacity);

}

Clear

Behavior

Removes all items from the hash table array.

Performance

O(n), where n is the number of nodes in the table that contain data.

/// <summary>

/// Removes every item from the hash table array.

/// </summary>

public void Clear()

{

    foreach (HashTableArrayNode<TKey, TValue> node in _array.Where(node => node != null))

    {

        node.Clear();

    }

}

Capacity

Behavior

Returns the capacity of the hash table array. Note: it is important to remember that the capacity of the hash table array is not the same as the hash table’s item count.

Performance

O(1)

/// <summary>

/// The capacity of the hash table array.

/// </summary>

public int Capacity

{

    get

    {

        return _array.Length;

    }

}

Enumeration

Keys

Behavior

Returns an enumerator of all the keys contained in the hash table array.

Performance

O(n), where n is the total number of items contained in the hash table array and all of its contained nodes.

/// <summary>

/// Returns an enumerator for all of the keys in the node array.

/// </summary>

public IEnumerable<TKey> Keys

{

    get

    {

        foreach (HashTableArrayNode<TKey, TValue> node in

                _array.Where(node => node != null))

        {

            foreach (TKey key in node.Keys)

            {

                yield return key;

            }

        }

    }

}

Values

Behavior

Returns an enumerator of all the values contained in the hash table array.

Performance

O(n), where n is the total number of items contained in the hash table array and all of its contained nodes.

/// <summary>

/// Returns an enumerator for all of the values in the node array.

/// </summary>

public IEnumerable<TValue> Values

{

    get

    {

        foreach (HashTableArrayNode<TKey, TValue> node in

                _array.Where(node => node != null))

        {

            foreach (TValue value in node.Values)

            {

                yield return value;

            }

        }

    }

}

Items

Behavior

Returns an enumerator of all the key–value pairs contained in the hash table array.

Performance

O(n), where n is the total number of items contained in the hash table array and all of its contained nodes.

/// <summary>

/// Returns an enumerator for all of the Items in the node array.

/// </summary>

public IEnumerable<HashTableNodePair<TKey, TValue>> Items

{

    get

    {

        foreach (HashTableArrayNode<TKey, TValue> node in

                _array.Where(node => node != null))

        {

            foreach (HashTableNodePair<TKey, TValue> pair in node.Items)

            {

                yield return pair;

            }

        }

    }

}

HashTable Class

public class HashTable<TKey, TValue>

{

    // If the array exceeds this fill percentage, it will grow.

    const double _fillFactor = 0.75;

    // The maximum number of items to store before growing.

    // This is just a cached value of the fill factor calculation.

    int _maxItemsAtCurrentSize;

    // The number of items in the hash table.

    int _count;

    // The array where the items are stored.

    HashTableArray<TKey, TValue> _array;

    /// <summary>

    /// Constructs a hash table with the default capacity.

    /// </summary>

    public HashTable()

        : this(1000)

    {

    }

    /// <summary>

    /// Constructs a hash table with the specified capacity.

    /// </summary>

    public HashTable(int initialCapacity)

    {

        if (initialCapacity < 1)

        {

            throw new ArgumentOutOfRangeException("initialCapacity");

        }

        _array = new HashTableArray<TKey, TValue>(initialCapacity);

        // When the count exceeds this value, the next Add will cause the

        // array to grow.

        _maxItemsAtCurrentSize = (int)(initialCapacity * _fillFactor) + 1;

    }

    public void Add(TKey key, TValue value);

    public bool Remove(TKey key);

    public TValue this[TKey key] { get; set; }

    public bool TryGetValue(TKey key, out TValue value);

    public bool ContainsKey(TKey key);

    public bool ContainsValue(TValue value);

    public IEnumerable<TKey> Keys { get; }

    public IEnumerable<TValue> Values { get; }

    public void Clear();

    public int Count { get; }

}

Add

Behavior

Adds a key–value pair to the hash table, throwing an exception if the key already exists in the table.

Performance

O(1) on average. O(n + 1) when array growth occurs.

This method provides a level of abstraction over the HashTableArray class to deal with growing the backing array when the add operation exceeds the maximum capacity. When growing the backing array, it uses the fill factor to determine the maximum item count given the current backing array capacity.

/// <summary>

/// Adds the key/value pair to the hash table. If the key already exists in the

/// hash table, an ArgumentException will be thrown.

/// </summary>

/// <param name="key">The key of the item being added.</param>

/// <param name="value">The value of the item being added.</param>

public void Add(TKey key, TValue value)

{

    // If we are at capacity, the array needs to grow.

    if (_count >= _maxItemsAtCurrentSize)

    {

        // Allocate a larger array

        HashTableArray<TKey, TValue> largerArray = new HashTableArray<TKey, TValue>(_array.Capacity * 2);

        // and re-add each item to the new array.

        foreach (HashTableNodePair<TKey, TValue> node in _array.Items)

        {

            largerArray.Add(node.Key, node.Value);

        }

        // The larger array is now the hash table storage.

        _array = largerArray;

        // Update the new max items cached value.

        _maxItemsAtCurrentSize = (int)(_array.Capacity * _fillFactor) + 1;

    }

    _array.Add(key, value);

    _count++;

}

Indexing

Behavior

Retrieves the value with the provided key. If the key does not exist in the hash table, an exception is thrown.

Performance

O(1) on average; O(n) in the worst case.

/// <summary>

/// Gets and sets the value with the specified key. ArgumentException is

/// thrown if the key does not already exist in the hash table.

/// </summary>

/// <param name="key">The key of the value to retrieve.</param>

/// <returns>The value associated with the specified key.</returns>

public TValue this[TKey key]

{

    get

    {

        TValue value;

        if (!_array.TryGetValue(key, out value))

        {

            throw new ArgumentException("key");

        }

        return value;

    }

    set

    {

        _array.Update(key, value);

    }

}

TryGetValue

Behavior

Finds the value associated with the provided key and sets the out parameter to that value (else the default value for the contained type). Returns true if the value is found. Otherwise it returns false.

Performance

O(1) on average; O(n) in the worst case.

/// <summary>

/// Finds and returns the value for the specified key.

/// </summary>

/// <param name="key">The key whose value is sought.</param>

/// <param name="value">The value associated with the specified key.</param>

/// <returns>True if the value is found; false otherwise.</returns>

public bool TryGetValue(TKey key, out TValue value)

{

    return _array.TryGetValue(key, out value);

}

Remove

Behavior

Removes the key–value pair whose key matches the provided key. Returns true if the key is found and removed. Otherwise it returns false.

Performance

O(1) on average; O(n) in the worst case.

/// <summary>

/// Removes the item from the hash table whose key matches

/// the specified key.

/// </summary>

/// <param name="key">The key of the item to remove.</param>

/// <returns>True if the item is removed; false otherwise.</returns>

public bool Remove(TKey key)

{

    bool removed = _array.Remove(key);

    if (removed)

    {

        _count--;

    }

    return removed;

}

ContainsKey

Behavior

Returns true if the specified key exists in the hash table. Otherwise it returns false.

Performance

O(1) on average; O(n) in the worst case.

/// <summary>

/// Returns a Boolean indicating whether the hash table contains the specified key.

/// </summary>

/// <param name="key">The key whose existence is being tested.</param>

/// <returns>True if the value exists in the hash table; false otherwise.</returns>

public bool ContainsKey(TKey key)

{

    TValue value;

    return _array.TryGetValue(key, out value);

}

ContainsValue

Behavior

Returns true if the hash table contains a value matching the provided value.

Performance

O(n)

Note: It is important to remember that while a hash table does not contain conflicting keys, it could contain multiple instances of the same value.

/// <summary>

/// Returns a Boolean indicating whether the hash table contains the specified value.

/// </summary>

/// <param name="value">The value whose existence is being tested.</param>

/// <returns>True if the value exists in the hash table; false otherwise.</returns>

public bool ContainsValue(TValue value)

{

    foreach (TValue foundValue in _array.Values)

    {

        if (value.Equals(foundValue))

        {

            return true;

        }

    }

    return false;

}

Clear

Behavior

Removes all items from the hash table.

Performance

O(1)

/// <summary>

/// Removes all items from the hash table.

/// </summary>

public void Clear()

{

    _array.Clear();

    _count = 0;

}

Count

Behavior

Returns the number of items contained in the hash table.

Performance

O(1)

Note: The capacity of the backing array and the number of items stored in the hash table are not the same (and with a fill factor less than 1 will never be the same).

/// <summary>

/// The number of items currently in the hash table.

/// </summary>

public int Count

{

    get

    {

        return _count;

    }

}

Enumeration

Keys

Behavior

Returns an enumerator of all the keys contained in the hash table.

Performance

O(n)

/// <summary>

/// Returns an enumerator for all of the keys in the hash table.

/// </summary>

public IEnumerable<TKey> Keys

{

    get

    {

        foreach (TKey key in _array.Keys)

        {

            yield return key;

        }

    }

}

Values

Behavior

Returns an enumerator of all the values contained in the hash table.

Performance

O(n)

/// <summary>

/// Returns an enumerator for all of the values in the hash table.

/// </summary>

public IEnumerable<TValue> Values

{

    get

    {

        foreach (TValue value in _array.Values)

        {

            yield return value;

        }

    }

}

Scroll To Top
Disclaimer
DISCLAIMER: Web reader is currently in beta. Please report any issues through our support system. PDF and Kindle format files are also available for download.

Previous

Next



You are one step away from downloading ebooks from the Succinctly® series premier collection!
A confirmation has been sent to your email address. Please check and confirm your email subscription to complete the download.