# Union Find

Tweet

**Disjoint Set** or **Union Find** is a data structures that tracks a set of keys/elements which are partitioned into number of disjoint sets. It is useful in number of applications such as speeding up **Kruskal's Minimum Spanning Tree**, **maximal k space clustering** etc. It basically supports two operations:

**Find(x)**: finds the subset in which element**x**belongs.**Union(x, y)**: Merges the subsets in which**x**and**y**belongs to a single subset.

## Data Structure

The data structure accomplishes the operations by representing as tree and thus maintaining the two invariants.

**Parent pointers**: The pointer to the parent of each element**x****Rank**: The rank of**x**is the maximum distance among all paths from x to the leaves

For the detail see here.

A quick implementation of Union Find, optimized with lazy unions, path compression and Union By Rank is given here.

## Runtime

The runtime of Union Find is very hard to analyze. There is a theorem for this by Ullman-HopCroft:

Starting from an empty data structure, Union By Rank with path compression performs any intermixed sequence of $m \geq n$ FIND and $n – 1$ UNION operations in $\mathcal{O}(m \log^*{n} )$ time.

where $\log^*{n}$ is number of times we need to apply $\log$ before the final result reaches 1.

Discuss this post via e-mail