Algorithms: the trie tree

Date: 2019-11-07

Tags: algorithms, trie tree


Introduction

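I have recently been working on ASTR, and algorithms are task 1 of that project, so I have started working through LeetCode problems again. Compared with other ACM problem sets, LeetCode's difficulty is beginner to intermediate; it leans toward what is useful on the job and plays down the more advanced mathematics.

Here is my view on grinding LeetCode. Many people think algorithms are of little use in day-to-day work, along the lines of "build rockets in the interview, tighten screws on the job". But are they really of little use? I do not think so, provided you can see the mathematical beauty in them. First, a good algorithm can greatly improve how efficiently your program runs, especially in extreme cases. Second, learning these algorithms sharpens our logical thinking. Third, it does help a lot in interviews.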

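My approach to LeetCode is:

Pick a specific problem (usually giving priority to those with the most net upvotes)
Solve it and study the principle it abstracts
Apply that principle to the whole class of similar problems

Note: to be emphatic, grinding problems is not the goal, and being able to solve one particular problem is certainly not the end result. The real goal is to learn the principle behind each problem, to understand and solve every problem of that class, and to apply it in your own work.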

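For example:

Pick the problem LeetCode 5: Longest Palindromic Substring
From this problem, think of two classic related problems: the longest common substring problem and the longest common subsequence problem. Then connect to the usual approach to the longest common substring problem, the generalized suffix tree, and narrow that down to the trie tree.
Search for related problems and solve them.

This is the first post in the series on trie-related algorithms.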

Trie tree

First of all, the word trie comes from retrieval and is usually pronounced /ˈtraɪ/ (as "try").

In computer science, a trie, also called digital tree, radix tree or prefix tree, is a kind of search tree, an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings.

A trie tree is also known as a dictionary tree or prefix tree. From the name we can infer that it is used to look up strings.

Let's start with an example:

Given the set of strings cat, cash, app, apple, aply, ok, we build a trie as shown in the figure:

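From it we can read off the characteristics of a trie:

A trie uses edges to represent letters
Words with the same prefix share the prefix nodes. It follows that when words contain only lowercase letters, each node has at most 26 children
The root of the whole tree is empty
The end of each word is marked with a special character (such as the $ in the figure above); in code, a separate bool field can indicate whether a node is the end of a word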
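To make these characteristics concrete, here is a minimal sketch that builds the trie for the example word set as nested plain objects and prints it as an indented outline. The helper names buildTrie and render are illustrative assumptions, not part of any library; '$' is used as the end-of-word marker, as in the description above.

```javascript
// Build a trie as nested plain objects; the key '$' marks the end of a word.
function buildTrie(words) {
  const root = {}; // the root itself holds no letter
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!(ch in node)) node[ch] = {}; // new edge labelled ch
      node = node[ch];                  // shared prefixes reuse existing nodes
    }
    node['$'] = true;
  }
  return root;
}

// Render the trie as indented lines, one edge (or end marker) per line.
function render(node, depth = 0, lines = []) {
  for (const key of Object.keys(node).sort()) {
    lines.push('  '.repeat(depth) + key);
    if (key !== '$') render(node[key], depth + 1, lines);
  }
  return lines;
}

const exampleTrie = buildTrie(['cat', 'cash', 'app', 'apple', 'aply', 'ok']);
console.log(render(exampleTrie).join('\n'));
```

In the printed outline, cat and cash share the path c, a, while app, apple and aply share a, p, which is the prefix sharing described above.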

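Basic operations

The two simplest operations are insert and search.

insert: insert a new word

As the figure shows, scan the new word from left to right. If the letter does not yet appear under the current node (the root of the current subtree), insert it; otherwise follow the trie downward and look at the word's next letter.

Question 1: where does the letter go? There are two numbering schemes. The first numbers nodes in the order they are inserted, so the same letter may receive different numbers:

The second scheme: since each node has at most 26 children, we can number them 0-25 in alphabetical order, so the same letter always receives the same number.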
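The two numbering schemes can be seen side by side in an array-simulated trie. The following is a rough sketch under the assumption that words contain only lowercase a-z; the class name ArrayTrie and its fields next and isEnd are illustrative names. Node ids are handed out in insertion order (first scheme), while the child slot inside each node is the letter's alphabetical index (second scheme).

```javascript
// Array-simulated trie: nodes are rows in a table instead of separate objects.
class ArrayTrie {
  constructor() {
    this.next = [new Array(26).fill(0)]; // next[node][slot] = child id, 0 = absent
    this.isEnd = [false];                // isEnd[node] = a word ends at this node
  }

  // Second scheme: a letter's slot is its alphabetical index, 'a' -> 0 ... 'z' -> 25.
  static slot(ch) {
    return ch.charCodeAt(0) - 'a'.charCodeAt(0);
  }

  insert(word) {
    let node = 0; // node 0 is the empty root
    for (const ch of word) {
      const s = ArrayTrie.slot(ch);
      if (this.next[node][s] === 0) {
        // First scheme: a new node's id is simply the next unused number.
        this.next.push(new Array(26).fill(0));
        this.isEnd.push(false);
        this.next[node][s] = this.next.length - 1;
      }
      node = this.next[node][s];
    }
    this.isEnd[node] = true;
  }

  search(word) {
    let node = 0;
    for (const ch of word) {
      node = this.next[node][ArrayTrie.slot(ch)];
      if (!node) return false; // 0 (or undefined) means no such child
    }
    return this.isEnd[node];
  }
}

const arrayTrie = new ArrayTrie();
for (const w of ['cat', 'cash', 'app']) arrayTrie.insert(w);
console.log(arrayTrie.next[0][ArrayTrie.slot('a')]); // 6: 'a' directly under the root
console.log(arrayTrie.next[1][ArrayTrie.slot('a')]); // 2: 'a' under 'c'
```

The letter a always occupies slot 0, yet the node it leads to has id 6 under the root and id 2 under c, which is what "the same letter may receive different numbers" means.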

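Generally, there are two ways to implement this data structure:

Simulating it with arrays
As a class

Although the second way usually wastes more space, I prefer it. For instance, it makes problems like the following easier to handle:

Checking whether a word exists in the trie. We only need to add a property to the node to mark it.
Counting how many times a prefix occurs. Again, we only need to add a property to the node.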
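As a rough illustration of the "just add a property" idea, the sketch below keeps two extra fields on each node: isEnd for word membership and passCount for the number of inserted words that pass through the node. The names CountingNode, CountingTrie, walk, contains and countPrefix are assumptions made for this example. Inserting the same word twice counts it twice, and the empty prefix is not counted.

```javascript
class CountingNode {
  constructor() {
    this.children = new Map(); // letter -> CountingNode
    this.isEnd = false;        // true if a word ends at this node
    this.passCount = 0;        // number of inserted words passing through this node
  }
}

class CountingTrie {
  constructor() {
    this.root = new CountingNode();
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) node.children.set(ch, new CountingNode());
      node = node.children.get(ch);
      node.passCount++; // one more word has this prefix
    }
    node.isEnd = true;
  }

  // Follow s from the root; return the node reached, or null if the path breaks.
  walk(s) {
    let node = this.root;
    for (const ch of s) {
      node = node.children.get(ch);
      if (node === undefined) return null;
    }
    return node;
  }

  contains(word) {
    const node = this.walk(word);
    return node !== null && node.isEnd;
  }

  countPrefix(prefix) {
    const node = this.walk(prefix);
    return node === null ? 0 : node.passCount;
  }
}

const counting = new CountingTrie();
for (const w of ['cat', 'cash', 'app', 'apple', 'aply', 'ok']) counting.insert(w);
console.log(counting.contains('app'));   // true
console.log(counting.contains('ap'));    // false: a prefix, but not a stored word
console.log(counting.countPrefix('ap')); // 3: app, apple, aply
```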

So let's look at the actual code (JavaScript version):

var TrieNode = function() {
    this.isEnd = false;         // true if a word ends at this node
    this.links = new Array(26); // one slot per lowercase letter a-z
}
TrieNode.prototype.containsKey = function(ch) { // does this node have a child for ch?
    return this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)] !== undefined;
}
TrieNode.prototype.get = function(ch) { // get the child node for ch
    return this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)];
}
TrieNode.prototype.put = function(ch, node) { // store node as the child for ch
    this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)] = node;
}
TrieNode.prototype.setEnd = function() { // mark this node as the end of a word
    this.isEnd = true;
}

/** * Initialize your data structure here. */
var Trie = function() {
   this.root = new TrieNode();
};

/** * Inserts a word into the trie. * @param {string} word * @return {void} */
Trie.prototype.insert = function(word) {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
        let currentChar = word[i];
        if(!node.containsKey(currentChar)) {
            node.put(currentChar, new TrieNode());
        }
        node = node.get(currentChar);
    }
    node.setEnd();
};

/** * Returns if the word is in the trie. * @param {string} word * @return {boolean} */
Trie.prototype.search = function(word) {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
        let currentChar = word[i];
        if(node.containsKey(currentChar)) {
            node = node.get(currentChar);
        } else {
            return false;
        }
    }
    return node.isEnd; // isEnd is a boolean property, not a method
};

/** * Returns if there is any word in the trie that starts with the given prefix. * @param {string} prefix * @return {boolean} */
Trie.prototype.startsWith = function(prefix) {
    let node = this.root;
    for (let i = 0; i < prefix.length; i++) {
        let currentChar = prefix[i];
        if(node.containsKey(currentChar)) {
            node = node.get(currentChar);
        } else {
            return false;
        }
    }
    return true;    
};

/** Usage example. Words are lowercased because each node only has slots for a-z. */
  var obj = new Trie();
  let words = ["Trie","insert","search","search","startsWith","insert","search"];
  for (let i = 0; i < words.length; i++ ) {
    obj.insert(words[i].toLowerCase());
  }

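Clearly, the version above is engineering-oriented and fairly complete. The code can be simplified further by modelling the nodes with plain objects: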

var Trie = function() {
   this.root = {};
};

/** * Inserts a word into the trie. * @param {string} word * @return {void} */
Trie.prototype.insert = function(word) {
    let node = this.root;
    for (let ch of word) {
      if (!(ch in node)) node[ch] = {}
      node = node[ch]
    }
    node['$'] = true // marks the end of a word
};

/** * Returns if the word is in the trie. * @param {string} word * @return {boolean} */
Trie.prototype.search = function(word) {
    let node = this.root;
    for(let ch of word) {
      if(ch in node) node = node[ch]
      else return false
    }
    return node['$'] === true;
};

/** * Returns if there is any word in the trie that starts with the given prefix. * @param {string} prefix * @return {boolean} */
Trie.prototype.startsWith = function(prefix) {
  let node = this.root;
    for(let ch of prefix) {
      if(ch in node) node = node[ch]
      else return false
    }
    return true; 
};

Let's look at how the two versions perform at run time:

Practical applications

Next, let's look at where tries are used in real development:
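One way to compare the two versions yourself is a small timing harness like the sketch below. It restates both approaches compactly under the hypothetical names ArrayNodeTrie and ObjectTrie, feeds them the same deterministic pseudo-random lowercase words, and prints elapsed milliseconds. The word count, word length and generator constants are arbitrary assumptions, and the timings depend on the machine and JavaScript engine, so no figures are quoted here.

```javascript
const A = 'a'.charCodeAt(0);

// Compact restatement of the 26-slot array version.
class ArrayNodeTrie {
  constructor() { this.root = { isEnd: false, links: new Array(26) }; }
  insert(word) {
    let node = this.root;
    for (const ch of word) {
      const i = ch.charCodeAt(0) - A;
      if (node.links[i] === undefined) node.links[i] = { isEnd: false, links: new Array(26) };
      node = node.links[i];
    }
    node.isEnd = true;
  }
  search(word) {
    let node = this.root;
    for (const ch of word) {
      node = node.links[ch.charCodeAt(0) - A];
      if (node === undefined) return false;
    }
    return node.isEnd;
  }
}

// Compact restatement of the plain-object version.
class ObjectTrie {
  constructor() { this.root = {}; }
  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!(ch in node)) node[ch] = {};
      node = node[ch];
    }
    node['$'] = true;
  }
  search(word) {
    let node = this.root;
    for (const ch of word) {
      if (!(ch in node)) return false;
      node = node[ch];
    }
    return node['$'] === true;
  }
}

// Deterministic pseudo-random lowercase words, so both tries see identical input.
function makeWords(count, length) {
  let seed = 12345;
  const result = [];
  for (let i = 0; i < count; i++) {
    let w = '';
    for (let j = 0; j < length; j++) {
      seed = (seed * 48271) % 2147483647; // simple linear congruential step
      w += String.fromCharCode(A + (seed % 26));
    }
    result.push(w);
  }
  return result;
}

// Insert every word, then search for every word; report time and hit count.
function benchmark(TrieClass, words) {
  const start = Date.now();
  const trie = new TrieClass();
  for (const w of words) trie.insert(w);
  let hits = 0;
  for (const w of words) if (trie.search(w)) hits++;
  return { ms: Date.now() - start, hits };
}

const benchWords = makeWords(20000, 8);
for (const TrieClass of [ArrayNodeTrie, ObjectTrie]) {
  const { ms, hits } = benchmark(TrieClass, benchWords);
  console.log(`${TrieClass.name}: ${ms} ms, ${hits} hits`);
}
```

Both versions should report the same number of hits; only the timings differ.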

autocomplete
spell checker
IP routing(longest prefix matching)

T9 predictive text
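Taking autocomplete as an example, here is a minimal sketch on top of the plain-object trie, with '$' as the end-of-word marker. The function names insertWord and autocomplete and the limit parameter are assumptions made for illustration. The idea is to walk down to the node for the prefix and then collect every word below it with a depth-first traversal.

```javascript
// Insert a word into a plain-object trie; '$' marks the end of a word.
function insertWord(root, word) {
  let node = root;
  for (const ch of word) {
    if (!(ch in node)) node[ch] = {};
    node = node[ch];
  }
  node['$'] = true;
}

// Return up to `limit` stored words that start with `prefix`.
function autocomplete(root, prefix, limit = 10) {
  let node = root;
  for (const ch of prefix) {
    if (!(ch in node)) return []; // no stored word has this prefix
    node = node[ch];
  }
  const results = [];
  // Depth-first walk below the prefix node, visiting letters in sorted order.
  (function collect(current, path) {
    if (results.length >= limit) return;
    if (current['$'] === true) results.push(path);
    for (const ch of Object.keys(current).sort()) {
      if (ch !== '$') collect(current[ch], path + ch);
    }
  })(node, prefix);
  return results;
}

const dictionary = {};
for (const w of ['cat', 'cash', 'app', 'apple', 'aply', 'ok']) insertWord(dictionary, w);
console.log(autocomplete(dictionary, 'ap')); // [ 'aply', 'app', 'apple' ]
console.log(autocomplete(dictionary, 'x'));  // []
```

The lookup cost depends on the length of the prefix plus the size of the subtree below it, not on the total number of stored words.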
