Close
Trie internal structure.

Trie for efficient information retrieval and its applications

Introduction

Trie is an n-ary tree data structure, which is also known as a prefix tree. This name came into existence from the term ‘Information retrieval‘, indeed this data structure tries to achieve faster lookup for a given word from a set of words dictionary. However it is fairly different from the binary search tree, wherein the internal nodes store the keys. But in Trie internal nods only store the prefix of and help guide the search. Now let’s go into more details.

Trie Internal structure

Each Trie node contains a set of symbols pointing to the next node in the tree, that represent the language. For instance ASCII Or UniCode. Also each node saves the boolean that represents the end of the word.

The below example shows Trie with English alphabets as symbols, then in that case node structure would look like below.

private class TrieNode {         
   List<TrieNode> list;         
   boolean endOfWord;         
   String word;      
} 
Note: The size of list is 26 in above English alphabet symbols case

Below depicts an example, how a Trie would look like given below set of strings inserted. Now it seems clear why it is called prefix tree, For example below common prefix for data, date is dat which is stored once for both words. In a way trie is also helping in saving some space by saving common prefix only once.

Trie operations

Now lets take look at what operations does Trie supports with there implementations.

Insert

Algorithm Insert (word)

  • Assign Tnode = TrieRoot
  • For each character symbol from word string
  • TNode.EndOfWord = true
  • TNode.Data = InsertKey
  • Get the index of the symbols from the set of symbols
  • If TNode.list[index] == Null
    • TNode.list[index] = new Trie node
  • TNode = TNode.list[index]

Implementation

public void trieInsert(String in) { 
int index;
        TrieNode temp = mRoot;
        if (in == null) {
            return;
        }
        for(char character : in.toCharArray()) {
            index = mSymbols.get(character);
               if (temp.list.get(index) == null) {
                   temp.list.add(index, new TrieNode());
                   temp = temp.list.get(index);
                } else {
                   temp = temp.list.get(index);
                }
        }
        temp.word = in;
        temp.endOfWord = true;
}

Delete

Algorithm delete (word)

  • Assign Tnode = TrieRoot
  • For each character symbol from word string
    • get the index of the symbol
    • Tnode = TNode.list[index]
  • If Tnode != Null
    • Tnode.EndOfWord = false
    • Tnode.data = Null

Implementation

public void trieDelete(String in) {
        TrieNode temp = mRoot;
        for(char character : in.toCharArray()) {
            int index = mSymbols.get(character);
            temp = temp.list.get(index);
        }
        if(temp != null) {
            temp.word = null;
            temp.endOfWord = false;
        }
    }

Search

Algorithm search (word)

  • Tnode = TrieRoot
  • For each character symbol from the word
    • get the index of the symbol
    • Tnode = Tnode.list[index]
  • If Tnode != Null & Tnode.EndOfWord == true
    • Print search word found
  • Else
    • Print search word not found

Implementation

     public void trieSearch(String in) {
        TrieNode temp = mRoot;
        for(char character : in.toCharArray()) {
            int index = mSymbols.get(character);
            temp = temp.list.get(index);
            if(null == temp) break;
        }
        System.out.println();
        if (null == temp) {
            System.out.println("Key not found");
        } else {
            searchTrieSubtree(temp);
        }
    }
   public void searchTrieSubtree(TrieNode node) {
        if (null == node) return;
        if (node.endOfWord) {
            System.out.print(node.word + " ");
        }
        for (TrieNode n : node.list) {
            searchTrieSubtree(n);
        }
    }

Complexity

Insert – O(dn)

Search – O(dn)

Delete – O(dn)

Space optimized variants

We shall discuss the variants below in the upcoming blogs. For now just listening down the variants below.

  • Bitwise Trie
  • Radix tree
  • Patrica Trees

Applications

  • Auto complete
  • Spell checking
  • Text searching (suffix tree is used special type of Trie)
  • Internet Routing

Conclusion

In this blog, Trie data structure naive implementation is discussed as an introduction to the topic. I shall discuss the memory optimized implementation in the next blog. So stay tuned!

More blogs

https://www.workingexample.tech/interview-questions-merge-k-sorted-lists-efficiently/

https://www.workingexample.tech/interview-questions-find-the-immediate-lesser-key-than-the-given-key-in-the-binary-search-tree/

https://www.workingexample.tech/interview-questions-implement-linux-change-directory-command/