Week 14: Open-Addressed Hashmaps
Today we’ll work with open-addressed hashmaps—doing this should help give you some intuition for the chained hashmaps on your homework. We’ll use OAHashMap.java as a base. You may also need Entry.java.
We’ve given you a start, and we’d like you to implement three methods: containsKey(K key)
, remove(K key)
, and iterator()
.
You do not need to support null
keys (unlike your homework).
A guided tour
The OAHashMap class implements most of an open-addressed hashtable. Let’s walk through the code.
First, a pretty standard preamble.
package cs284;
import java.util.Iterator;
import java.util.Objects;
Next, we declare the class:
public class OAHashMap<K, V> implements Iterable<Entry<K, V>> {
Note that we take parameters, and say that we will be Iterable
. That’s for the last TODO.
In open addressing, the table underlying our map is just an array of entries. We define that type as an inner class:
class OAEntry {
public K key;
public V val;
public boolean deleted;
public OAEntry(K key, V val) {
this.key = key;
this.val = val;
this.deleted = false;
}
}
Note that OAEntry
does not implement Entry<K,V>
. (A different implementation might have it do so, but we don’t.)
Next, we define the members. In addition to the backing array, we store some metadata about the elements in the array. We also define a constant that specifies our initial size.
private static final int INITIAL_SIZE = 16;
private Object[] table;
private int size;
private int deleted;
private int rehashes;
Notice that we define the table as an Object[]
rather than an
OAEntry[]
. This is because of limits in Java’s type system: you
can’t allocate a generic array (or an array of things that have
generic members). We’ll have to use casts later whenever we read from
the array.
Next up is the default constructor, which initializes the array; by default, each entry in a new array will be null
.
public OAHashMap() {
table = new Object[INITIAL_SIZE];
}
Next, we define some accessors for the metadata:
public int size() {
return size;
}
public int deleted() {
return deleted;
}
public int capacity() {
return table.length;
}
public int rehashes() {
return rehashes;
}
Okay: so much for the boring stuff. Next up is the method for looking up values in the table, along with two helpers.
private int hash(K key) {
int hash = Objects.hashCode(key);
return Math.abs(hash) % table.length;
}
private int probe(K key) {
int hash = hash(key);
int index = hash;
do {
@SuppressWarnings("unchecked")
OAEntry e = (OAEntry) table[index];
if (e == null || e.key.equals(key)) {
return index;
}
index = (index + 1) % table.length;
} while (index != hash);
throw new IllegalStateException("wrapped around table");
}
public V get(K key) {
@SuppressWarnings("unchecked")
OAEntry e = (OAEntry) table[probe(key)];
if (e == null || e.deleted) {
return null;
}
assert e.key.equals(key);
return e.val;
}
First, let’s look at get(K key)
. It uses the probe(K key)
private
helper method to index into the table, casting it to an OAEntry
. If the entry is null
or marked as deleted (recall that ||
is short-circuiting!), it returns null
. If the entry isn’t null and isn’t deleted, we return the corresponding value… after assert
ing that the key should match.
Aside: assert expr;
is a statement that by default
does… nothing. When running with assertions enabled, Java will
ensure that expr
evaluated to true
without raising an exception;
if it doesn’t, it raises an AssertionError
. Using assert
s to check
your assumptions is a good way to help yourself debug.
So the get(K key)
method is pretty much a wrapper around probe(K key)
, which finds the index in the table where key
would be if it
were present. The probe
method hashes the key (using the hash(K key)
helper) and then walks until it finds a null
or matching
entry
, cycling around the array as it goes. Note that falling out of
the loop leads to an exception!
Note that we’ll return the index of a deleted entry with the same key if it’s there.
If you’re not sure how it works, write a main
method that inserts a few elements, and step through in the debugger.
Implementing containsKey(K key)
Your first task is to implement the containsKey(K key)
method. It
should return true
if there’s an entry for key
in the map and
false
otherwise. You should be able to follow the logic of get(K key)
closely.
The tour continues
Next, we’ll look at put(K key, V val)
and its helper.
private boolean rehashing = false;
@SuppressWarnings("unchecked")
private void maybeRehash() {
int newSize = size + deleted + 1; // only called on put/remove
if (newSize < 0.7 * capacity()) {
return;
}
assert !rehashing;
rehashing = true;
Object[] oldTable = table;
table = new Object[table.length * 2];
size = 0;
deleted = 0;
for (var o : oldTable) {
OAEntry e = (OAEntry) o;
if (e != null && !e.deleted) {
put(e.key, e.val);
}
}
rehashes += 1;
rehashing = false;
}
public V put(K key, V val) {
maybeRehash();
int index = probe(key);
@SuppressWarnings("unchecked")
OAEntry old = (OAEntry) table[index];
V oldValue = old == null ? null : old.val;
if (old != null) {
assert old.key.equals(key);
if (old.deleted) {
deleted -= 1;
size += 1;
old.deleted = false;
}
old.key = key;
old.val = val;
} else {
table[index] = new OAEntry(key, val);
size += 1;
}
return oldValue;
}
The put(K key, V val)
method begins by calling maybeRehash()
, a helper function that checks to see if adding an element will put the load factor above 0.7… and if so, it rehashes the whole array.
There are a few noteworthy things here:
- The
@SuppressWarnings("unchecked")
lines are to ignore warnings from the casts involving generic types. maybeRehash()
usesput(K key, V val)
to rehash by just putting a new table in place.- A boolean
rehashing
keeps track of whether we’re in the middle of rehashing, to preventmaybeRehash
from calling itself and looping forever.
The put
method works by probing for the index and then examining the
entry already there. It needs to properly handle null
entries (for
when the key is new to the map) and non-null entries (for when the map
has an entry for the key). When the map already has an entry (old != null)
, it must check to see if the element was deleted, in which case
it corrects metadata, too. Otherwise it just creates a new
entry. Finally, it returns the oldValue
(which may be null
).
Implementing remove(K key)
You should implement remove(K key)
; you can follow put(K key, V val)
as a hint for structure. Be careful to update the metadata
correctly!
Implementing iterator()
Finally, you need to implement iterator()
. That means defining
hasNext()
and next()
methods in the anonymous inner class.
Here’s a way to do it:
- Add an
int
field to the inner class calledindex
. Start it at - Define a private helper method called
advance()
; it shouldn’t return anything.- When called,
advance()
checks to see whetherindex
points to a non-null, non-deleted entry. - If the
index
th entry is valid, it returns. - If not, it increments
index
and continues. - Be careful to stop when
index
meets or exceeds the size of your table!
- When called,
- Define
hasNext()
to calladvance()
and then returntrue
whenindex
is a valid index in the table. - Define
next()
to calladvance()
, save theindex
th element, incrementindex
, and return that saved element.
Advice
Don’t try to just do everything at once! Write tests as you go to make sure everything is working.
The debugger is your friend! Set breakpoints and watch how things work on particular inputs.