CMUQ 15-121 Sets & Maps



1. Introduction

In this set of notes we’ll talk about two new abstract data types as well as the Java implementations of them: Sets and Maps (also called dictionaries). You’ve seen both of these before, as they exist in Python.

2. Set

A set is a collection of elements typically used in situations where you want to quickly determine if something is present in the set.

2.1. Main Properties

A set has two, main properties:

  1. It is non-sequenced. This means that unlike an array, the elements in the set do not have indexes. You can’t access the n-th item in a set, because the set items are not sequenced.
  2. No duplicates are allowed. An item can only appear once in the set.

2.2. Sets in Java

https://docs.oracle.com/javase/8/docs/api/java/util/Set.html

Set in Java is an interface. That interface is also a Collection, so it has the following:

There are two main classes available that implement Set: TreeSet and HashSet. The names imply how they are built.

2.2.1. HashSet

https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html

A HashSet implements a Set using a hash table. Items are hashed into buckets, and if needed the table is periodically enlarged to maintain performance. On average, operations on a HashSet are all \(O(1)\). For the most part, if you want a set, this is the implementation you should use.

2.2.2. TreeSet

https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html

A TreeSet implements a Set using a tree data structure. This means that add, contains, and remove are all \(O(\textrm{log } N)\). Why would you ever use this instead of a HashSet? With a TreeSet, the items are stored in order. That means that you can iterate through them (using the Iterator) and they will be in order. (If you don’t need order, then just use a HashSet.)

3. Maps

A map, also called a dictionary, is used to map keys to values.

3.1. Main Properties

A map has a number of important properties:

  1. It is a collection of key-value pairs. (Each key is associated with one value.)
  2. Keys must be unique. (You can’t insert two things with the same key.)

At first glance, these properties are limiting: What if I want to associate more than one thing with a key? For example, imagine I’m building a music manager and want to use a map to link the playlist name with the songs in the list. The key is the playlist name, but it can only map to one thing. So how do I associate it with more than one song? Easy, I put those songs in a list (or a set), and associate the list with the key.

3.2. Maps in Java

https://docs.oracle.com/javase/8/docs/api/java/util/Map.html

Map is a Java interface. It has the following methods:

Much like sets, there are two main classes that implement Map: HashMap and TreeMap.

3.2.1. HashMap

https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html

A HashMap implements a Map using a hash table. The key is hashed and the key value pair is stored in the appropriate bucket and if needed the table is periodically enlarged to maintain performance.

The main operations average \(O(1)\) performance due to the hashtable. If you need a map, you probably want to use a HashMap.

3.2.2. TreeMap

https://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html

You are actually already intimately familiar with a TreeMap: You’ve built one. They KeyValueBinarySearchTree you constructed for homework 6 is very similar to a TreeMap. There are a few differences, however:

The main operations are \(O(\textrm{log } N)\) due to the tree. Why would you use a TreeMap instead of HashMap? Much like sets, it is about order. You can iterate through the keys of a TreeMap in order. The keys of a HashMap are not usefully ordered.

4. Examples

We’ll do some examples using these in class. Check out the demo code repository to find them. (After class, of course.)