Prefix Tree
Robin Kim
—August 14, 2014
By reading this post, you will be able to answer the following questions:
- What is a prefix tree? (Tree that groups together elements with common beginnings.)
- How do I navigate through one? (One piece of the data at a time!)
- What can I use a prefix tree for? (Autocompletion: search engine, T9 texting, spellcheck, etc.)
- Can I see one in action? (Yes!)
Before continuing, you may want to be comfortable with the following topics:
- What is a tree (data structure)?
- What are its basic characteristics?
- How does navigation through a tree and its child nodes work?
Slide 1: Introduction
I will be using a 5-minute presentation I created for my Hack Reactor cohort to act as a visual aid for this post. You can view the slideshow here, and a demo of my implementation of a prefix tree is at the end of the post.
Slide 2: Characteristics
A prefix tree is a data structure that is often used to store words. It groups elements with similar beginnings together so they share ancestor nodes. A prefix tree has quick lookup times for words, so it's a great choice for storing dictionary words.
A suffix tree groups elements with similar endings together.
Slide 3: Example A
If you want to find the word tap
, then:
- Start at the root node, which doesn't hold any data itself. (It has references to other child nodes instead.)
- Ask the root node: "Hey, do you know where I can find
t
?" - The root node replies: "Yes! It's right here!"
- Once you navigate to the
t
node, ask that node: "Where can I finda
?" - Once we reach
a
, ask forp
. - After arriving at the
p
node, we can ask it: "Are we at the end of a word (EOW)?" - We get a response: "Yes!"
- Congratulations, we've successfully found
tap
in our prefix tree.
Slide 4: Example B
Here, we look for tea
:
- Start at the root node.
- Navigate to
t
and ask: "Where can I finde
?" - Arrive at
te
! (This prefix tree is drawn so that it keeps track of all of the preceding letters.) - Ask for
a
, and arrive attea
. - The final
tea
node informs us that it is the end of a word. - Success!
Slide 5: Possible Uses
A prefix tree can be used to store autocompletion results for:
- Search engines (which also rank the relevance of the autocompletion suggestions)
- T9 texting
- Spellcheck
Slide 6: Example (Autocompletion)
In this example, we will use our given prefix tree to look for autocompletion suggestions for the string t
. Put in another way, what words start with the letter t
?
- We start at the root node and ask: "Where can I find the letter
t
?" - The root node points us to the
t
node, at which point we must begin our search for autocompletion suggestions. - We can do this by (recursively) navigating to ALL child nodes of
t
and creating a list of those which "end of word" attribute is set to true. to
,tea
,ted
, andten
are valid words, which are added to our results, whilete
is not, hence ommitted.
Slide 7: My Implementation
The listed property and methods summarize my implementation of prefix tree. I've preloaded my prefix tree with almost 5,000 common English words.
Slide 8: Time for a Demo!
Go to https://therobinkim.github.io/prefix-tree/ for a working demo.
Looking for my code? It's right here!