How to Implement a DOM-Based Template Engine

Date: 2019-11-08

Cover image: Vincent Guth

Note: all the code in this article can be found in my project colon. This article is also published on my Zhihu column.

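You may already have experienced the convenience that Vue brings, and part of the reason is surely its DOM-based template rendering engine with a concise syntax. This article explains how to implement a DOM-based template engine (much like Vue's).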

Preface

Before we start, let's look at the final result:

const compiled = Compile(`<h1>Hey ?, {{ greeting }}</h1>`, {
    greeting: `Hello World`,
});
compiled.view // => `<h1>Hey ?, Hello World</h1>`

Compile

Implementing a template engine really means implementing a compiler, like this:

// signature: Compile(template: String|Node, data: Object)
const compiled = Compile(template, data);
compiled.view // => compiled template

First, let's look at how Compile is implemented internally:

// compile.js
// `domify` (string -> DOM node) and `walk` (tree traversal) are helpers that are not shown here
/**
 * template compiler
 *
 * @param {String|Node} template
 * @param {Object} data
 */
function Compile(template, data) {
    if (!(this instanceof Compile)) return new Compile(template, data);

    this.options = {};
    this.data = data;

    if (template instanceof Node) {
        this.options.template = template;
    } else if (typeof template === 'string') {
        this.options.template = domify(template);
    } else {
        console.error(`"template" only accepts a DOM node or a string template`);
        return;
    }

    template = this.options.template;

    walk(template, (node, next) => {
        if (node.nodeType === 1) {
            // compile element node
            // (the node compilers live on the static `Compile.compile` object, not on the instance)
            Compile.compile.elementNodes.call(this, node);
            return next();
        } else if (node.nodeType === 3) {
            // compile text node
            Compile.compile.textNodes.call(this, node);
        }
        next();
    });

    this.view = template;
    template = null;
}

Compile.compile = {};
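
The constructor relies on a walk helper whose source is not reproduced in this article. Below is a minimal sketch of what such a helper might look like. It is an assumption based only on how the constructor calls it (callback(node, next), where next() continues into the node's children) and is not the project's actual implementation. It only needs a childNodes collection, so it can be tried on plain objects as well as on DOM nodes.

```javascript
// Hypothetical sketch of a `walk(node, callback)` helper (not the project's source).
// The callback receives the current node and a `next` function; calling `next()`
// continues the depth-first traversal into the node's children.
function walk(node, callback) {
    const next = () => {
        // Copy the children into an array so that nodes replaced while a child
        // is being compiled do not disturb the iteration.
        Array.from(node.childNodes || []).forEach(child => walk(child, callback));
    };
    callback(node, next);
}

// Usage with plain objects standing in for DOM nodes:
const tree = {
    nodeType: 1, nodeName: 'H1',
    childNodes: [
        { nodeType: 3, nodeName: '#text', childNodes: [] },
        { nodeType: 1, nodeName: 'SPAN', childNodes: [
            { nodeType: 3, nodeName: '#text', childNodes: [] },
        ] },
    ],
};

const visited = [];
walk(tree, (node, next) => {
    visited.push(`${node.nodeName}:${node.nodeType}`);
    next();
});
console.log(visited.join(' ')); // H1:1 #text:3 SPAN:1 #text:3
```

If the callback never calls next(), the subtree below that node is skipped, which is one way a directive could opt out of compiling its children.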

walk

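As the code above shows, the Compile constructor mainly does one thing: it walks the template and performs a different compile step depending on the node type. How the template is traversed is not covered here; if it is unclear, read the source of the walk function. Instead, let's focus on how the different node types are compiled, taking element nodes (node.nodeType === 1) as an example: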

/**
 * compile element node
 *
 * @param {Node} node
 */
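Compile.compile.elementNodes = function (node) {
    const bindSymbol = `:`;
    // copy the live attribute list, because attributes are removed while iterating
    const attributes = [].slice.call(node.attributes);

    attributes.forEach(attribute => {
        const attrName = attribute.name;
        const attrValue = attribute.value.trim();

        if (attrName.indexOf(bindSymbol) === 0 && attrValue !== '') {
            // `:text="expr"` -> directive `text`
            const directiveName = attrName.slice(bindSymbol.length);

            // remove the attribute before binding, so that a directive which clones the node
            // (such as `:each`) does not compile the same directive again
            node.removeAttribute(attrName);
            this.bindDirective({
                node,
                expression: attrValue,
                name: directiveName,
            });
        } else {
            // plain attribute, possibly containing {{ }} interpolation
            this.bindAttribute(node, attribute);
        }
    });
};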

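Oh, I forgot to mention: I borrowed Vue's directive syntax here. In an attribute whose name starts with a colon : (any other symbol you like would work too), you can write a JavaScript expression directly. A few special directives, such as :text and :show, are also provided to perform different operations on the element.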

In fact, this function does only two things:

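Iterate over all attributes of the node and handle each one according to its kind; the criterion is whether the attribute name starts with a colon : and the attribute value is not empty;
Bind the corresponding directive to update the attribute.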
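
The decision described above can be isolated into a small pure function. The following is only an illustrative sketch: classifyAttribute and the kind field are hypothetical names introduced here, not part of the project, and the function works on plain strings so it can run without a DOM.

```javascript
// Sketch of the decision made for each attribute (names here are illustrative).
const bindSymbol = ':';

function classifyAttribute(name, value) {
    const trimmed = value.trim();
    if (name.indexOf(bindSymbol) === 0 && trimmed !== '') {
        // `:text="greeting"` -> directive `text` with expression `greeting`
        return { kind: 'directive', name: name.slice(bindSymbol.length), expression: trimmed };
    }
    // Anything else is a plain attribute; it may still contain {{ }} interpolation.
    return { kind: 'attribute', name, value };
}

console.log(classifyAttribute(':show', ' visible '));
// { kind: 'directive', name: 'show', expression: 'visible' }
console.log(classifyAttribute('data-index', '{{ index }}'));
// { kind: 'attribute', name: 'data-index', value: '{{ index }}' }
console.log(classifyAttribute(':text', '   ').kind);
// attribute
```

Note that a colon-prefixed attribute with an empty value falls through to the plain-attribute branch, exactly as in the condition used by elementNodes.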

Directive

Next, let's look at how Directive is implemented internally:

import directives from './directives';
import { generate } from './compile/generate';

export default function Directive(options = {}) {
    Object.assign(this, options);
    Object.assign(this, directives[this.name]);
    this.beforeUpdate && this.beforeUpdate();
    // `Compile` stores the data on `this.data`, so read it from there
    this.update && this.update(generate(this.expression)(this.compile.data));
}

Directive does three things:

Register the directive (Object.assign(this, directives[this.name]));
Compute the actual value of the directive expression (generate(this.expression)(this.compile.data));
Write the computed value to the DOM (this.update()).
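
To make the three steps concrete without a browser, here is a self-contained sketch. It makes two assumptions that are not in the article: evaluate is a simplified stand-in for generate (the expression has to reference data explicitly), and a plain object stands in for the DOM node.

```javascript
// Self-contained sketch of the Directive life cycle. `evaluate` is a simplified
// stand-in for `generate`: the expression must reference `data` explicitly.
const directives = {
    text: {
        update(value) {
            this.node.textContent = value;
        },
    },
};

function evaluate(expression, data) {
    return new Function('data', `return ${expression};`)(data);
}

function Directive(options = {}) {
    Object.assign(this, options);                // node, name, expression, compile
    Object.assign(this, directives[this.name]);  // 1. register the directive's hooks
    if (this.beforeUpdate) this.beforeUpdate();
    const value = evaluate(this.expression, this.compile.data); // 2. compute the value
    if (this.update) this.update(value);         // 3. write it to the node
}

// A plain object stands in for a DOM node here.
const node = { textContent: '' };
new Directive({
    node,
    name: 'text',
    expression: `'Hey, ' + data.greeting`,
    compile: { data: { greeting: 'Hello World' } },
});
console.log(node.textContent); // Hey, Hello World
```

Because the hooks are copied onto the instance with Object.assign, this inside update refers to the directive instance, which is why the updater can reach this.node.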

Before introducing the directives themselves, let's look at how Directive is used:

Compile.prototype.bindDirective = function (options) {
    new Directive({
        ...options,
        compile: this,
    });
};

Compile.prototype.bindAttribute = function (node, attribute) {
    // only attributes that contain {{ }} interpolation need a directive
    if (!hasInterpolation(attribute.value) || attribute.value.trim() === '') return false;

    this.bindDirective({
        node,
        name: 'attribute',
        expression: parse(attribute.value),
        attrName: attribute.name,
    });
};

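bindDirective is a very thin wrapper around Directive. It accepts three required properties:

node: the node currently being compiled, used in the directive's update method to update that node;
name: the name of the directive being bound, used to pick which directive updater updates the view;
expression: the JavaScript expression produced by parse.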
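
bindAttribute depends on a hasInterpolation helper and on an attribute directive, neither of which is shown in this article. The sketch below guesses at what they might look like; both are assumptions made for illustration, and a minimal fake node replaces the DOM element.

```javascript
// Hypothetical helpers; neither is shown in the article.
const interpolationRE = /\{\{((?:.|\n)+?)\}\}/; // no `g` flag, so `test` is stateless

function hasInterpolation(text) {
    return interpolationRE.test(text);
}

// Possible updater for the `attribute` directive bound by `bindAttribute`.
const attributeDirective = {
    update(value) {
        this.node.setAttribute(this.attrName, value);
    },
};

// Usage with a minimal fake node:
const attrs = {};
const fakeNode = { setAttribute(name, value) { attrs[name] = String(value); } };

console.log(hasInterpolation('item-{{ index }}')); // true
console.log(hasInterpolation('item'));             // false

attributeDirective.update.call({ node: fakeNode, attrName: 'data-index' }, 2);
console.log(attrs['data-index']); // 2
```

The extra attrName property passed to bindDirective is what lets a single attribute updater serve every interpolated attribute.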

updater

Inside Directive we register the different directives with Object.assign(this, directives[this.name]);, so the value of the directives variable might look like this:

// directives
export default {
    // directive `:show`
    show: {
        beforeUpdate() {},
        update(show) {
            this.node.style.display = show ? `block` : `none`;
        },
    },
    // directive `:text`
    text: {
        beforeUpdate() {},
        update(value) {
            // ...
        },
    },
};

So if a directive is named show, then Object.assign(this, directives[this.name]); is equivalent to:

Object.assign(this, {
    beforeUpdate() {},
    update(show) {
        this.node.style.display = show ? `block` : `none`;
    },
});

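This means that for the show directive, the updater changes the display value of the element's style to implement the feature. So you will find that once the overall compiler structure is in place, extending it only requires writing a directive updater. Here is the text directive as another example: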

// directives
export default {
    // directive `:show`
    // show: { ... },
    // directive `:text`
    text: {
        update(value) {
            this.node.textContent = value;
        },
    },
};

As you can see, writing a directive is really simple. We can then use our text directive like this:

const compiled = Compile(`<h1 :text="'Hey ?, ' + greeting"></h1>`, {
    greeting: `Hello World`,
});
compiled.view // => `<h1>Hey ?, Hello World</h1>`

generate

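At this point there is still one very important thing we haven't covered: how the actual data gets rendered into the template, for example how <h1>Hey ?, {{ greeting }}</h1> is rendered as <h1>Hey ?, Hello World</h1>. The real value of an expression is computed in the following steps:

Parse the text Hey ?, {{ greeting }} into a JavaScript expression such as 'Hey ?, ' + greeting;
Extract the variables it depends on and look up their values in data;
Use new Function() to create an anonymous function that returns the expression;
Finally, call this anonymous function to get the computed value and push it to the view through the directive's update method.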

parse text

// reference: https://github.com/vuejs/vue/blob/dev/src/compiler/parser/text-parser.js#L15-L41
const tagRE = /\{\{((?:.|\n)+?)\}\}/g;
function parse(text) {
    // plain text without interpolation becomes a single string literal
    if (!tagRE.test(text)) return JSON.stringify(text);

    const tokens = [];
    // `test` above advanced the global regex, so reset it before scanning
    let lastIndex = tagRE.lastIndex = 0;
    let index, matched;

    while ((matched = tagRE.exec(text))) {
        index = matched.index;
        // literal text between two interpolations
        if (index > lastIndex) {
            tokens.push(JSON.stringify(text.slice(lastIndex, index)));
        }
        // parenthesize the expression so its operators cannot bind to neighbouring tokens
        tokens.push(`(${matched[1].trim()})`);
        lastIndex = index + matched[0].length;
    }

    if (lastIndex < text.length) tokens.push(JSON.stringify(text.slice(lastIndex)));

    return tokens.join('+');
}

I borrowed this function directly from Vue's implementation. It parses a string containing double curly braces into a standard JavaScript expression, for example:

parse(`Hi {{ user.name }}, {{ colon }} is awesome.`);
// => "Hi "+(user.name)+", "+(colon)+" is awesome."

extract dependency

We use the following function to extract the variables that may appear in an expression:

const dependencyRE = /"[^"]*"|'[^']*'|\.\w*[a-zA-Z$_]\w*|\w*[a-zA-Z$_]\w*:|(\w*[a-zA-Z$_]\w*)/g;
const globals = [
    'true', 'false', 'undefined', 'null', 'NaN', 'isNaN', 'typeof', 'in',
    'decodeURI', 'decodeURIComponent', 'encodeURI', 'encodeURIComponent', 'unescape',
    'escape', 'eval', 'isFinite', 'Number', 'String', 'parseFloat', 'parseInt',
];

function extractDependencies(expression) {
    const dependencies = [];

    expression.replace(dependencyRE, (match, dependency) => {
        if (
            dependency !== undefined &&
            dependencies.indexOf(dependency) === -1 &&
            globals.indexOf(dependency) === -1
        ) {
            dependencies.push(dependency);
        }
    });

    return dependencies;
}

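After the regular expression dependencyRE matches the candidate variable dependencies, some further checks are needed, for example whether the name is a global. The result looks like this: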

extractDependencies(`typeof String(name) === 'string'  && 'Hello ' + world + '! ' + hello.split('').join('') + '.'`);
// => ["name", "world", "hello"]

This is exactly what we need: typeof, String, split and join are not variables that come from data, so they should not be extracted.

generate

export function generate(expression) {
    const dependencies = extractDependencies(expression);
    let dependenciesCode = '';

    // declare one local variable per dependency, read from the `data` argument
    dependencies.forEach(dependency => {
        dependenciesCode += `var ${dependency} = data["${dependency}"]; `;
    });

    return new Function(`data`, `${dependenciesCode}return ${expression};`);
}

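The purpose of extracting the variables is to generate the corresponding variable-assignment statements as a string inside generate, so that the expression can use them. For example: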

new Function(`data`, `
    var name = data["name"];
    var world = data["world"];
    var hello = data["hello"];
    return typeof String(name) === 'string'  && 'Hello ' + world + '! ' + hello.split('').join('') + '.';
`);

// will generate:

function anonymous(data) {
    var name = data["name"];
    var world = data["world"];
    var hello = data["hello"];
    return typeof String(name) === 'string'  && 'Hello ' + world + '! ' + hello.split('').join('') + '.';
}

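With that, we only need to pass the corresponding data when calling this anonymous function to get the result we want. Looking back at the Directive code now, it should be clear at a glance: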

export default class Directive {
    constructor(options = {}) {
        // ...
        this.beforeUpdate && this.beforeUpdate();
        this.update && this.update(generate(this.expression)(this.compile.data));
    }
}

generate(this.expression)(this.compile.data) is the value we need: the expression evaluated against this.compile.data.
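
The three steps can be chained into one runnable sketch. The version below is a condensed variant rather than the article's exact code: render is a hypothetical convenience wrapper, the globals list is abbreviated, and each interpolated expression is wrapped in parentheses so that operator precedence cannot leak across tokens.

```javascript
const tagRE = /\{\{((?:.|\n)+?)\}\}/g;
const dependencyRE = /"[^"]*"|'[^']*'|\.\w*[a-zA-Z$_]\w*|\w*[a-zA-Z$_]\w*:|(\w*[a-zA-Z$_]\w*)/g;
// Abbreviated list of names that must not be read from `data`.
const globals = ['true', 'false', 'undefined', 'null', 'NaN', 'typeof', 'in', 'Number', 'String'];

// 1. Turn text with {{ }} into a JavaScript expression string.
function parse(text) {
    const tokens = [];
    let lastIndex = 0;
    text.replace(tagRE, (match, expression, index) => {
        if (index > lastIndex) tokens.push(JSON.stringify(text.slice(lastIndex, index)));
        tokens.push(`(${expression.trim()})`);
        lastIndex = index + match.length;
    });
    if (lastIndex < text.length) tokens.push(JSON.stringify(text.slice(lastIndex)));
    return tokens.join('+');
}

// 2. Collect the identifiers that have to be looked up in `data`.
function extractDependencies(expression) {
    const dependencies = [];
    expression.replace(dependencyRE, (match, dependency) => {
        if (dependency !== undefined && !dependencies.includes(dependency) && !globals.includes(dependency)) {
            dependencies.push(dependency);
        }
    });
    return dependencies;
}

// 3. Build a function that evaluates the expression against `data`.
function generate(expression) {
    const declarations = extractDependencies(expression)
        .map(dependency => `var ${dependency} = data["${dependency}"]; `)
        .join('');
    return new Function('data', `${declarations}return ${expression};`);
}

// Hypothetical wrapper that chains the three steps.
function render(text, data) {
    return generate(parse(text))(data);
}

console.log(render('Hi {{ user.name }}, you have {{ count + 1 }} items.', {
    user: { name: 'Ann' },
    count: 2,
}));
// Hi Ann, you have 3 items.
```

Without the parentheses, count + 1 would be concatenated piece by piece and the output would read "21 items" instead of "3 items".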

compile text node

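So far we have only covered how to compile element nodes (node.nodeType === 1). How are text nodes compiled? Once you understand everything above, compiling text nodes could not be simpler: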

/**
 * compile text node
 *
 * @param {Node} node
 */
Compile.compile.textNodes = function (node) {
    if (node.textContent.trim() === '') return false;

    this.bindDirective({
        node,
        name: 'text',
        expression: parse(node.textContent),
    });
};

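By binding the text directive and passing in the parsed JavaScript expression, Directive computes the actual value of the expression and calls the update function of text to update the view, which completes the rendering.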

:each directive

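So far the template engine only implements fairly basic features; the most common and important one, list rendering, is still missing. So now we will implement an :each directive to render a list. Note that it cannot follow the approach of the previous two directives; we need to think from a different angle. List rendering is effectively a "sub-template" whose variables live in the "local scope" formed by the data that the :each directive receives. That may sound abstract, so let's go straight to the code: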

// :each updater
import Compile from 'path/to/compile.js';
export default {
    beforeUpdate() {
        this.placeholder = document.createComment(`:each`);
        this.node.parentNode.replaceChild(this.placeholder, this.node);
    },
    update(data) {
        // `data` is the evaluated :each expression; only arrays are supported
        if (!Array.isArray(data)) return;

        const fragment = document.createDocumentFragment();

        data.map((item, index) => {
            const compiled = Compile(this.node.cloneNode(true), { item, index, });
            fragment.appendChild(compiled.view);
        });

        this.placeholder.parentNode.replaceChild(fragment, this.placeholder);
    },
};

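Before update runs, we first take the node carrying :each out of the DOM. Note that we cannot simply remove it: a comment node must be inserted at its position as a placeholder, so that after the list data has been rendered we can find the original position and insert the result there.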

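How is this so-called "sub-template" compiled? First, we iterate over the Array data received by the :each directive (only this type is supported for now; you could add support for Object as well, the principle is the same). Second, for each item of the list we compile the template once and append the rendered result to a newly created document fragment. Once the whole list has been compiled, the comment placeholder is replaced with the document fragment, which completes the list rendering.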
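
The "sub-template with its own local scope" idea can be tried without a DOM. The sketch below swaps in a string-based stand-in for the compiler: renderTemplate and renderEach are hypothetical names, and strings joined with newlines play the role of the cloned nodes and the document fragment.

```javascript
// Tiny stand-in for the compiler: evaluates each {{ expr }} using only the given scope.
function renderTemplate(template, scope) {
    const names = Object.keys(scope);
    const values = names.map(name => scope[name]);
    return template.replace(/\{\{((?:.|\n)+?)\}\}/g, (match, expression) => {
        // Each scope key becomes a parameter of the generated function.
        const evaluate = new Function(...names, `return (${expression});`);
        return evaluate(...values);
    });
}

// Compile the same sub-template once per item, each time with a fresh local scope.
function renderEach(template, list) {
    if (!Array.isArray(list)) return '';
    return list
        .map((item, index) => renderTemplate(template, { item, index }))
        .join('\n');
}

const output = renderEach('<li data-index="{{ index }}">{{ item.content }}</li>', [
    { content: 'Hello World.' },
    { content: 'Just Awesome.' },
]);
console.log(output);
// <li data-index="0">Hello World.</li>
// <li data-index="1">Just Awesome.</li>
```

The key point carried over from the DOM version is that every item is compiled with only { item, index } in scope, so names from the outer data are not visible inside the list template.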

Now we can use the :each directive like this:

Compile(`<li :each="comments" data-index="{{ index }}">{{ item.content }}</li>`, {
    comments: [{
        content: `Hello World.`,
    }, {
        content: `Just Awesome.`,
    }, {
        content: `WOW, Just WOW!`,
    }],
});

It renders as:

<li data-index="0">Hello World.</li>
<li data-index="1">Just Awesome.</li>
<li data-index="2">WOW, Just WOW!</li>

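If you look closely, you will notice that the item and index variables used in the template are simply the two keys of the data object passed to Compile(template, data) inside the :each update function. So customizing these two variable names is also very simple: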

// :each updater
import Compile from 'path/to/compile.js';
export default {
    beforeUpdate() {
        this.placeholder = document.createComment(`:each`);
        this.node.parentNode.replaceChild(this.placeholder, this.node);

        // parse alias
        this.itemName = `item`;
        this.indexName = `index`;
        this.dataName = this.expression;

        if (this.expression.indexOf(' in ') != -1) {
            const bracketRE = /\(((?:.|\n)+?)\)/g;
            const [item, data] = this.expression.split(' in ');
            let matched = null;

            if (matched = bracketRE.exec(item)) {
                const [item, index] = matched[1].split(',');
                index ? this.indexName = index.trim() : '';
                this.itemName = item.trim();
            } else {
                this.itemName = item.trim();
            }

            this.dataName = data.trim();
        }

        this.expression = this.dataName;
    },
    update(data) {
        // `data` is the evaluated :each expression; only arrays are supported
        if (!Array.isArray(data)) return;

        const fragment = document.createDocumentFragment();

        data.map((item, index) => {
            const compiled = Compile(this.node.cloneNode(true), {
                [this.itemName]: item,
                [this.indexName]: index,
            });
            fragment.appendChild(compiled.view);
        });

        this.placeholder.parentNode.replaceChild(fragment, this.placeholder);
    },
};

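Now we can customize the item and index variables of the :each directive with (aliasItem, aliasIndex) in items. The principle is to parse the expression of the :each directive in beforeUpdate and extract the relevant variable names. The example above can then be written like this: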
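
The alias parsing done in beforeUpdate can also be written as a standalone function, which makes it easy to check the supported forms. This is a sketch with a hypothetical name (parseEachExpression); it mirrors the logic above but returns a plain object instead of setting properties on the directive.

```javascript
// Hypothetical standalone version of the alias parsing done in `beforeUpdate`.
function parseEachExpression(expression) {
    const result = { itemName: 'item', indexName: 'index', dataName: expression.trim() };
    const separator = expression.indexOf(' in ');
    if (separator === -1) return result; // plain form: :each="comments"

    const alias = expression.slice(0, separator).trim();
    result.dataName = expression.slice(separator + ' in '.length).trim();

    // "(comment, i)" -> "comment, i"; a bare "comment" is used as-is
    const bracketed = /^\(((?:.|\n)+?)\)$/.exec(alias);
    const [itemName, indexName] = (bracketed ? bracketed[1] : alias).split(',');
    result.itemName = itemName.trim();
    if (indexName && indexName.trim() !== '') result.indexName = indexName.trim();
    return result;
}

console.log(parseEachExpression('(comment, i) in comments'));
// { itemName: 'comment', indexName: 'i', dataName: 'comments' }
console.log(parseEachExpression('comment in comments'));
// { itemName: 'comment', indexName: 'index', dataName: 'comments' }
console.log(parseEachExpression('comments'));
// { itemName: 'item', indexName: 'index', dataName: 'comments' }
```

After parsing, dataName is what gets evaluated against the outer data, while itemName and indexName become the keys of the per-item scope.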

Compile(`<li :each="(comment, index) in comments" data-index="{{ index }}">{{ comment.content }}</li>`, {
    comments: [{
        content: `Hello World.`,
    }, {
        content: `Just Awesome.`,
    }, {
        content: `WOW, Just WOW!`,
    }],
});

Conclusion

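At this point a fairly simple template engine is complete. Of course there is still plenty of room for improvement, for example adding directives such as :class, :style, :if or :src, or any other you can think of. Adding them is straightforward.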
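
As an illustration of how little such an extension needs, here is a sketch of two additional updaters in the same shape as show and text. They are hypothetical and not taken from the project; in particular, this :class simply overwrites className, which is a simplification. A fake node is used so the example runs without a DOM.

```javascript
// Hypothetical updaters for `:class` and `:src`, in the same shape as `show` and `text`.
const extraDirectives = {
    class: {
        update(value) {
            // accept either a string or an array of class names (falsy entries are dropped)
            this.node.className = Array.isArray(value)
                ? value.filter(Boolean).join(' ')
                : String(value);
        },
    },
    src: {
        update(value) {
            this.node.setAttribute('src', value);
        },
    },
};

// Usage with a minimal fake node:
const image = {
    className: '',
    attributes: {},
    setAttribute(name, value) { this.attributes[name] = String(value); },
};

extraDirectives.class.update.call({ node: image }, ['avatar', false, 'round']);
extraDirectives.src.update.call({ node: image }, '/img/me.png');

console.log(image.className);       // avatar round
console.log(image.attributes.src);  // /img/me.png
```

Once such objects are merged into the directives map, attributes like :class="[...]" or :src="url" are picked up by the existing elementNodes logic with no other changes.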

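Looking back over the whole article, the core is simply this: walk the node tree of the template, parse the string value of each node into the corresponding expression, evaluate it to an actual value through the new Function() constructor, and finally update the view through the directive's update function.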

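If it is still unclear how to write these directives, you can refer to the source code of my project colon (some of the code may differ in small ways that do not affect understanding and can be ignored). Feel free to raise any questions in an issue.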

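One current limitation is that a DOM-based template engine only works in the browser. I am working on a version that is also compatible with Node. The idea is to parse the string template into an AST, apply the data updates to the AST, and finally turn the AST back into a string template. Once it is done and I have time, I will write about the Node implementation.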

Finally, if anything above is wrong or there is a better way to implement it, corrections and discussion are welcome.
