Rollup Is Simpler Than You Think: Tree Shaking

Date: 2020-04-07

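Hi everyone, I'm 小雨小雨, and I share interesting, practical technical articles.
Posts are a mix of translations and original writing. If you have questions, feel free to comment or message me at any time; I hope we can improve together.
Sharing takes effort, so I'd appreciate your support and a follow.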

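Plan

The rollup series will be released one chapter at a time, so that each part is shorter, more focused, and easier to follow.

The current plan is the following chapters: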

rollup.rollup
rollup.generate + rollup.write
rollup.watch
tree shaking <==== this article
plugins

TL;DR

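es node: the classes for each kind of syntax block, such as if blocks, arrow function blocks, function call blocks, and so on.

In the rollup() phase, the source is parsed into an AST, and every node on the AST is visited to decide whether it should be included; nodes that should be are marked. Chunks are then generated and finally exported.
In the generate() or write() phase, code is collected according to the marks made during rollup(), and only the code that is actually used is emitted. That is the basic principle of tree shaking.

In one sentence: based on the definition of side effects, rollup sets the included flag on each es node; each kind of es node has its own render function that writes into a magic string instance; magic string collects the result, which is finally written out.

This article does not analyze each es node's render method or include logic in detail, but questions and criticism are welcome~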

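Notes

!!!Version => The rollup version I read is 1.32.0

!!!Tip => Items marked TODO are implementation details that will be analyzed as needed.

!!!Note => Each subheading is part of the internal implementation of its parent heading (function)

!!!Emphasis => In rollup, the id of a module (file) is its file path, so something like resolveId means resolving a file path. We can return whatever file id we want (that is, a path, relative or absolute) to have rollup load it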
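
To make the "id is just a path we can choose" point concrete, here is a minimal sketch of a plugin-shaped object. The `MiniPlugin` interface is a simplified stand-in I made up for this example (rollup's real plugin type allows more return shapes), and `virtual:greeting` is a hypothetical id; only the `resolveId` and `load` hook names come from rollup itself.

```typescript
// Simplified plugin shape for illustration only; not rollup's full Plugin type.
interface MiniPlugin {
    name: string;
    resolveId(source: string, importer: string | undefined): string | null;
    load(id: string): string | null;
}

// Hypothetical virtual module id, chosen for this example only.
const VIRTUAL_ID = 'virtual:greeting';

const virtualGreeting: MiniPlugin = {
    name: 'virtual-greeting',
    resolveId(source) {
        // Returning a string claims the import: that string becomes the module id.
        // Returning null leaves the decision to the next plugin or the default resolver.
        return source === VIRTUAL_ID ? VIRTUAL_ID : null;
    },
    load(id) {
        // The id returned above comes back here, so the plugin can supply the code.
        return id === VIRTUAL_ID ? 'export default "hello";' : null;
    }
};
```

The id does not have to point at a real file, as long as some `load` hook knows how to produce code for it.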

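rollup is a core that only does the most basic things, such as providing the default module (file) loading mechanism and bundling into different output formats. Plugins supply things like file path resolution and content handling (processing ts, sass, and so on). It is a pluggable design, similar to webpack.
A pluggable design is very flexible and can keep evolving over the long term, which is also the core of any medium-to-large framework. Many hands make light work~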

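Main shared modules and what they mean

Graph: the globally unique graph, holding the entries, the dependency relationships, operations, caches, and so on. It is the core of rollup
PathTracker: reference (call) tracker
PluginDriver: plugin driver; calls plugins and provides the plugin context
FileEmitter: asset handler
GlobalScope: the global scope; there are local scopes as well
ModuleLoader: module loader
NodeBase: the base class for the AST node classes (ArrayExpression, AwaitExpression, etc.)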

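Walkthrough

This time I won't go through the whole flow. Let's analyze the simplest possible example instead, which makes things easier to understand.

Say we have this simple piece of code: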

function test() {
    var name = 'test';
    console.log(123);
}
const name = '测试测试';
function fn() {
    console.log(name);
}

fn();

As you can see, the bundled result should not contain the test function, like this:

'use strict';

const name = '测试测试';
function fn() {
    console.log(name);
}

fn();

So how does rollup process this code?

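Module parsing

We have to go back to the rollup() flow. For this example we can drop everything related to import, export, re-export, and so on; there is no need to care about them for now, and they will fall into place once the basic flow is clear.
As for plugins, I won't use any; only rollup's built-in default plugin is used.

For this example, the file path is resolved first to get the file's real path: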

function createResolveId(preserveSymlinks: boolean) {
	return function(source: string, importer: string) {
		if (importer !== undefined && !isAbsolute(source) && source[0] !== '.') return null;

		// ends up calling path.resolve to turn valid path segments into an absolute path
		return addJsExtensionIfNecessary(
			resolve(importer ? dirname(importer) : resolve(), source),
			preserveSymlinks
		);
	};
}

Then the rollup module is created, caches are set up, and so on:

const module: Module = new Module(
    this.graph,
    id,
    moduleSideEffects,
    syntheticNamedExports,
    isEntry
);

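The file contents are then fetched through the built-in load hook; of course, we can customize this behavior too:

// the second argument holds the arguments passed to the load hook; apply is used internally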
return Promise.resolve(this.pluginDriver.hookFirst('load', [id]))

Next, transform (which you can think of as webpack's loaders, handling different file types) produces a normalized structure:

const source = { ast: undefined,
    code:
     'function test() {\n    var name = \'test\';\n    console.log(123);\n}\nconst name = \'测试测试\';\nfunction fn() {\n    console.log(name);\n}\n\nfn();\n',
    customTransformCache: false,
    moduleSideEffects: null,
    originalCode:
     'function test() {\n    var name = \'test\';\n    console.log(123);\n}\nconst name = \'测试测试\';\nfunction fn() {\n    console.log(name);\n}\n\nfn();\n',
    originalSourcemap: null,
    sourcemapChain: [],
    syntheticNamedExports: null,
    transformDependencies: [] 
}

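Then comes the key step: parsing source and setting the results on the current module:

// generate the ESTree AST
this.esTreeAst = ast || tryParse(this, this.graph.acornParser, this.graph.acornOptions);

// use magic string, a very handy library for manipulating strings
this.magicString = new MagicString(code, options);

// build an AST context that wraps some methods (dynamic import, export, and so on); what the analysis finds is later filled into the current module, and bind fixes this to the current module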
this.astContext = {
    addDynamicImport: this.addDynamicImport.bind(this), // dynamic import
    addExport: this.addExport.bind(this), // export
    addImport: this.addImport.bind(this), // import
    addImportMeta: this.addImportMeta.bind(this), // import.meta
    annotations: (this.graph.treeshakingOptions && this.graph.treeshakingOptions.annotations)!,
    code, // Only needed for debugging
    deoptimizationTracker: this.graph.deoptimizationTracker,
    error: this.error.bind(this),
    fileName, // Needed for warnings
    getExports: this.getExports.bind(this),
    getModuleExecIndex: () => this.execIndex,
    getModuleName: this.basename.bind(this),
    getReexports: this.getReexports.bind(this),
    importDescriptions: this.importDescriptions,
    includeDynamicImport: this.includeDynamicImport.bind(this),
    includeVariable: this.includeVariable.bind(this),
    isCrossChunkImport: importDescription =>
        (importDescription.module as Module).chunk !== this.chunk,
    magicString: this.magicString,
    module: this,
    moduleContext: this.context,
    nodeConstructors,
    preserveModules: this.graph.preserveModules,
    propertyReadSideEffects: (!this.graph.treeshakingOptions ||
        this.graph.treeshakingOptions.propertyReadSideEffects)!,
    traceExport: this.getVariableForExportName.bind(this),
    traceVariable: this.traceVariable.bind(this),
    treeshake: !!this.graph.treeshakingOptions,
    tryCatchDeoptimization: (!this.graph.treeshakingOptions ||
        this.graph.treeshakingOptions.tryCatchDeoptimization)!,
    unknownGlobalSideEffects: (!this.graph.treeshakingOptions ||
        this.graph.treeshakingOptions.unknownGlobalSideEffects)!,
    usesTopLevelAwait: false,
    warn: this.warn.bind(this),
    warnDeprecation: this.graph.warnDeprecation.bind(this.graph)
};

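// instantiate Program and store the result on the current module's ast property for later use
// !!! Note that the instance has an included property that decides whether it ends up in the final output file, i.e. tree shaking. It defaults to false. !!!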
this.ast = new Program(
    this.esTreeAst,
    { type: 'Module', context: this.astContext }, // astContext holds the current module and its related info; bind ties the methods to the current context
    this.scope
);
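// Program attaches instances of the various estree node types to itself for later traversal
// the different node types all extend the same NodeBase parent class, e.g. array expressions (the ArrayExpression class); see the [nodes directory](https://github.com/FoxDaxian/rollup-analysis/tree/master/src/ast/nodes)
function parseNode(esTreeNode: GenericEsTreeNode) {
    // simply iterate, new up the node type, and attach it to the instance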
    for (const key of Object.keys(esTreeNode)) {
        // That way, we can override this function to add custom initialisation and then call super.parseNode
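        // this points to the Program instance created with new
        // if the instance already has this property, skip it
        if (this.hasOwnProperty(key)) continue;
        // each property on the AST node
        const value = esTreeNode[key];
        // not an object, or null, or the key is annotations: copy the value as-is
        // (annotations are plain data, not child nodes)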
        if (typeof value !== 'object' || value === null || key === 'annotations') {
            (this as GenericEsTreeNode)[key] = value;
        } else if (Array.isArray(value)) {
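            // if it is an array, create an array and fill it by iterating
            (this as GenericEsTreeNode)[key] = [];
            // this.context.nodeConstructors does different things for different syntax tree types, such as attaching dependencies
            for (const child of value) {
                // loop and new up nodes of each type; they all extend NodeBase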
                (this as GenericEsTreeNode)[key].push(
                    child === null
                        ? null
                        : new (this.context.nodeConstructors[child.type] ||
                                this.context.nodeConstructors.UnknownNode)(child, this, this.scope) // handle each AST node type
                );
            }
        } else {
            // none of the above: new it up directly
            (this as GenericEsTreeNode)[key] = new (this.context.nodeConstructors[value.type] ||
                this.context.nodeConstructors.UnknownNode)(value, this, this.scope);
        }
    }
}
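
The constructor-lookup idea in parseNode can be sketched in isolation. This is a toy model under my own assumptions, not rollup's code: `MiniNode`, `createNode`, `isRawNode`, and the `props` bag are made up for the example, and the registry only knows two node types, with everything else falling back to an unknown-node class.

```typescript
// Plain-object shape of an ESTree-style node as a parser would produce it.
interface RawNode { type: string; [key: string]: unknown }

function isRawNode(value: unknown): value is RawNode {
    return typeof value === 'object' && value !== null && typeof (value as RawNode).type === 'string';
}

class MiniNode {
    included = false; // every node starts out excluded
    readonly type: string;
    readonly props: Record<string, unknown> = {};

    constructor(raw: RawNode, readonly parent: MiniNode | null) {
        this.type = raw.type;
        for (const key of Object.keys(raw)) {
            if (key === 'type') continue;
            const value = raw[key];
            if (Array.isArray(value)) {
                // arrays: wrap each child node, keep null holes and primitives as they are
                this.props[key] = value.map(child => (isRawNode(child) ? createNode(child, this) : child));
            } else if (isRawNode(value)) {
                this.props[key] = createNode(value, this); // single child node
            } else {
                this.props[key] = value; // primitive or null: copy as-is
            }
        }
    }
}
class MiniIdentifier extends MiniNode {}
class MiniLiteral extends MiniNode {}
class MiniUnknownNode extends MiniNode {}

// Registry keyed by node type; anything missing falls back to MiniUnknownNode.
const nodeConstructors: Record<string, typeof MiniNode> = {
    Identifier: MiniIdentifier,
    Literal: MiniLiteral
};

function createNode(raw: RawNode, parent: MiniNode | null): MiniNode {
    const Ctor = nodeConstructors[raw.type] || MiniUnknownNode;
    return new Ctor(raw, parent);
}

const tree = createNode(
    {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'fn' },
        arguments: [{ type: 'Literal', value: 1 }, null]
    },
    null
);
```

Here `tree` is a MiniUnknownNode (CallExpression is not registered), its callee is a MiniIdentifier whose parent is `tree`, and every node starts with `included` set to false.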

Dependency modules are handled next; we'll skip that~

return this.fetchAllDependencies(module).then();

So far we have turned the file into a module and parsed out the es tree nodes, along with all the syntax-tree node types they contain.

Tracking context with PathTracker

for (const module of this.modules) {
    // each node type has its own implementation; not all of them have one
    module.bindReferences();
}

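For example, for an array expression, how each element will be used later cannot be tracked, so every element is deoptimized with UNKNOWN_PATH by default

// ArrayExpression class, which extends NodeBase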
bind() {
    super.bind();
    for (const element of this.elements) {
        if (element !== null) element.deoptimizePath(UNKNOWN_PATH);
    }
}

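If there is an enclosing function, the path gets one level deeper; in the end the code is wrapped according to these nesting levels.

Marking whether a module can be shaken

The core here is including modules and es tree nodes according to the isExecuted state. Before that, note that includeMarked is called only after the entries have been obtained.
That is, every entry module (user-defined, dynamically imported, dependencies of the entry file, dependencies of those dependencies...) gets module.isExecuted set to true.
Only then is the includeMarked method below called; by that point module.isExecuted is already true, so the include method can be called.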

function includeMarked(modules: Module[]) {
    // if treeshaking options are set
    if (this.treeshakingOptions) {
        // first tree shaking pass
        let treeshakingPass = 1;
        do {
            timeStart(`treeshaking pass ${treeshakingPass}`, 3);
            this.needsTreeshakingPass = false;
            for (const module of modules) {
                // mark the AST nodes as included
                if (module.isExecuted) module.include();
            }
            timeEnd(`treeshaking pass ${treeshakingPass++}`, 3);
        } while (this.needsTreeshakingPass);
    } else {
        // Necessary to properly replace namespace imports
        for (const module of modules) module.includeAllInBundle();
    }
}

// the implementation behind module.include() above.
include(context: InclusionContext, includeChildrenRecursively: IncludeChildren) {
    // set included to true on the current program block, then walk every es node in it and set include according to various conditions
    this.included = true;
    for (const node of this.body) {
        if (includeChildrenRecursively || node.shouldBeIncluded(context)) {
            node.include(context, includeChildrenRecursively);
        }
    }
}
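
Why the do/while loop needs more than one pass can be seen on a toy model of the example program. This is a sketch under heavy simplification, not rollup's algorithm: statements are flat records, `Stmt` and `includeMarkedSketch` are names made up here, and "referencing a name not yet needed" stands in for whatever sets needsTreeshakingPass in the real code.

```typescript
interface Stmt {
    code: string;
    declares: string[];   // names this statement defines
    references: string[]; // names this statement reads
    hasEffects: boolean;  // true if merely running it is observable (e.g. a call)
    included: boolean;
}

// Repeat passes until a pass discovers nothing new (a fixed point).
function includeMarkedSketch(stmts: Stmt[]): number {
    const needed: string[] = [];
    let passes = 0;
    let needsAnotherPass: boolean;
    do {
        passes++;
        needsAnotherPass = false;
        for (const stmt of stmts) {
            if (stmt.included) continue;
            const isNeeded = stmt.hasEffects || stmt.declares.some(n => needed.indexOf(n) !== -1);
            if (!isNeeded) continue;
            stmt.included = true;
            for (const ref of stmt.references) {
                if (needed.indexOf(ref) === -1) {
                    needed.push(ref);
                    needsAnotherPass = true; // earlier statements may now be needed
                }
            }
        }
    } while (needsAnotherPass);
    return passes;
}

const program: Stmt[] = [
    { code: 'function test() {...}', declares: ['test'], references: [], hasEffects: false, included: false },
    { code: "const name = '...';", declares: ['name'], references: [], hasEffects: false, included: false },
    { code: 'function fn() {...}', declares: ['fn'], references: ['name'], hasEffects: false, included: false },
    { code: 'fn();', declares: [], references: ['fn'], hasEffects: true, included: false }
];
const passCount = includeMarkedSketch(program);
const kept = program.filter(s => s.included).map(s => s.code);
```

Pass 1 only picks up `fn();`, pass 2 picks up `fn` because the call needs it, and pass 3 picks up `name` because `fn` reads it; `test` is never reached, so it stays excluded.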

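module.include is where es tree nodes come in. Since NodeBase starts with included set to false, there is a second condition: whether the current node has side effects.
Whether a node has side effects is implemented by each node subclass that extends NodeBase. As far as I can tell, side effects follow their own conventions: modifying a global variable counts as a side effect, for example, while some things certainly have none, such as export statements, for which rollup hardcodes false.
Each type of es node in rollup has its own hasEffects implementation, which is worth reading through. For a brief introduction to side effects, see the article linked from the original post.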
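
The "each subclass answers hasEffects for itself" pattern looks roughly like this. It is a sketch with made-up classes (`SketchNode` and its three subclasses); the real shouldBeIncluded also takes a context argument, and the real hasEffects rules are far more detailed. The `markedPure` flag is my own stand-in for a call that has been annotated as pure.

```typescript
// Simplified stand-in for NodeBase: included defaults to false,
// and each subclass decides for itself whether it has side effects.
abstract class SketchNode {
    included = false;
    abstract hasEffects(): boolean;
    shouldBeIncluded(): boolean {
        return this.included || this.hasEffects();
    }
}

// Declaring a function does nothing observable until it is called.
class FunctionDeclarationSketch extends SketchNode {
    hasEffects(): boolean { return false; }
}

// Writing to a global is observable from outside the module.
class GlobalAssignmentSketch extends SketchNode {
    hasEffects(): boolean { return true; }
}

// A call is assumed to have effects unless it is explicitly marked pure.
class CallSketch extends SketchNode {
    constructor(private readonly markedPure: boolean) { super(); }
    hasEffects(): boolean { return !this.markedPure; }
}

const unusedFn = new FunctionDeclarationSketch();
const globalWrite = new GlobalAssignmentSketch();
const plainCall = new CallSketch(false);
const pureCall = new CallSketch(true);
const decisions = [unusedFn, globalWrite, plainCall, pureCall].map(n => n.shouldBeIncluded());
```

A node with no effects can still end up in the bundle later: once something else sets its `included` flag (because it is referenced), shouldBeIncluded returns true regardless of hasEffects.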

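Generating chunks

Next, chunks are generated from the modules. There are differences depending on configuration options (more chunks, fewer chunks), which I won't go into here; if you're interested, see the first article in this series or read the annotated source directly.

Generating code (strings) from chunks

Calling the rollup method returns an object that includes the write method, which generates the code and writes it out (some warnings etc. removed):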

return {
    write: ((rawOutputOptions: GenericConfigObject) => {
        const { outputOptions, outputPluginDriver } = getOutputOptionsAndPluginDriver(
            rawOutputOptions
        );
        // this is the key part
        return generate(outputOptions, true, outputPluginDriver).then(async bundle => {
            await Promise.all(
                Object.keys(bundle).map(chunkId =>
                    writeOutputFile(result, bundle[chunkId], outputOptions, outputPluginDriver) // => the write operation
                )
            );
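            // writeBundle hook, run after all files have been written
            await outputPluginDriver.hookParallel('writeBundle', [bundle]);
            // as far as I can tell, this is kept for later caching to speed up builds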
            return createOutput(bundle);
        });
    }) as any
}

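generate is relatively simple: it is mostly calls to hooks and methods, for example:

The preRender method renders es nodes to strings by calling each es node's own render method; see the code for details. There are many rules, which I won't go into here, and I haven't read them closely either~~

So which nodes need to be rendered, and which don't?

Right, this is where the included flag set earlier comes in, with a simple check, for example: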

// in the label node class
function render(code: MagicString, options: RenderOptions) {
    // here it is~
    if (this.label.included) {
        this.label.render(code, options);
    } else {
        code.remove(
            this.start,
            findFirstOccurrenceOutsideComment(code.original, ':', this.label.end) + 1
        );
    }
    this.body.render(code, options);
}
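
The render step boils down to "remove the ranges of nodes that were not included, addressed by their positions in the original source". Below is a minimal sketch of that idea using a tiny stand-in class; `MiniMagicString` is made up for this example and only mimics the remove/toString part of what magic string offers, with none of its source map support.

```typescript
// Tiny stand-in for the part of magic string used here:
// removals are addressed by offsets into the ORIGINAL string.
class MiniMagicString {
    private readonly removed: boolean[] = [];

    constructor(readonly original: string) {
        for (let i = 0; i < original.length; i++) this.removed.push(false);
    }

    // Drop the characters in [start, end).
    remove(start: number, end: number): this {
        for (let i = start; i < end; i++) this.removed[i] = true;
        return this;
    }

    toString(): string {
        let out = '';
        for (let i = 0; i < this.original.length; i++) {
            if (!this.removed[i]) out += this.original.charAt(i);
        }
        return out;
    }
}

// One entry per top-level statement, with the flag the marking phase produced.
const lines = ['var a = 1;\n', 'var b = 2;\n', 'var c = 3;\n', 'use(b);\n'];
const includedFlags = [false, true, false, true];

const ms = new MiniMagicString(lines.join(''));
let offset = 0;
lines.forEach((line, i) => {
    // offsets stay valid after earlier removals because they refer to the original
    if (!includedFlags[i]) ms.remove(offset, offset + line.length);
    offset += line.length;
});
const output = ms.toString();
```

Because every removal uses original offsets, each node can render itself using its own start and end without caring what other nodes have already removed.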

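The result is then added to the chunks, so the chunks contain not only the AST but also the generated executable code.

Next, a wrapper is chosen according to the format field and applied to the code string, which is then passed to the renderChunk method. That method mainly exists to call the three hooks renderChunk, transformChunk, and transformBundle, which process the result further. Since the version I analyzed is not the latest, this differs from the current 2.x; see the changelog for the changes.

Oh, and there's sourceMap as well; that capability is provided by magic string, and you can look it up yourself.

That gives us the final result we want: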

chunks.map(chunk => {
    // use the id to fetch the properties set earlier on outputBundleWithPlaceholders
    const outputChunk = outputBundleWithPlaceholders[chunk.id!] as OutputChunk;
    return chunk
        .render(outputOptions, addons, outputChunk, outputPluginDriver)
        .then(rendered => {
            // reference type: the entry on outputBundleWithPlaceholders changes too, so outputBundle changes as well and is returned at the end
            // code and map are attached to outputBundle here; outputBundle is returned directly afterwards
            outputChunk.code = rendered.code;
            outputChunk.map = rendered.map;

            // call the generate hook
            return outputPluginDriver.hookParallel('ongenerate', [
                { bundle: outputChunk, ...outputOptions },
                outputChunk
            ]);
        });
})

The function above works on reference types, so the result can be returned directly at the end. I won't elaborate.

Writing files

Not much to say here; have a look at the code below. The writeFile method relies on Node's fs module.

function writeOutputFile(
	build: RollupBuild,
	outputFile: OutputAsset | OutputChunk,
	outputOptions: OutputOptions,
	outputPluginDriver: PluginDriver
): Promise<void> {
	const fileName = resolve(outputOptions.dir || dirname(outputOptions.file!), outputFile.fileName);
	let writeSourceMapPromise: Promise<void>;
	let source: string | Buffer;
	if (outputFile.type === 'asset') {
		source = outputFile.source;
	} else {
		source = outputFile.code;
		if (outputOptions.sourcemap && outputFile.map) {
			let url: string;
			if (outputOptions.sourcemap === 'inline') {
				url = outputFile.map.toUrl();
			} else {
				url = `${basename(outputFile.fileName)}.map`;
				writeSourceMapPromise = writeFile(`${fileName}.map`, outputFile.map.toString());
			}
			if (outputOptions.sourcemap !== 'hidden') {
				source += `//# ${SOURCEMAPPING_URL}=${url}\n`;
			}
		}
	}

	return writeFile(fileName, source)
		.then(() => writeSourceMapPromise)
		.then(
			(): any =>
				outputFile.type === 'chunk' &&
				outputPluginDriver.hookSeq('onwrite', [
					{
						bundle: build,
						...outputOptions
					},
					outputFile
				])
		)
		.then(() => {});
}
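
The sourcemap branch above decides which comment, if any, gets appended to the generated code. That decision can be isolated into a pure function, sketched below. `sourceMapComment` and `baseName` are helper names made up for this example, `baseName` only handles `/` separators, and the inline URL is passed in rather than produced by a real map object.

```typescript
type SourcemapOption = boolean | 'inline' | 'hidden';

const SOURCEMAPPING_URL = 'sourceMappingURL';

// Last path segment, handling '/' separators only (enough for this sketch).
function baseName(path: string): string {
    return path.slice(path.lastIndexOf('/') + 1);
}

// Returns the comment to append to the chunk's code, or '' when none is wanted.
function sourceMapComment(fileName: string, sourcemap: SourcemapOption, inlineUrl: string): string {
    // no sourcemap at all, or 'hidden': a map may be written, but the code does not point to it
    if (!sourcemap || sourcemap === 'hidden') return '';
    // 'inline' embeds the map as a URL; otherwise point at the sibling .map file
    const url = sourcemap === 'inline' ? inlineUrl : `${baseName(fileName)}.map`;
    return `//# ${SOURCEMAPPING_URL}=${url}\n`;
}

const externalComment = sourceMapComment('dist/bundle.js', true, '');
const inlineComment = sourceMapComment('dist/bundle.js', 'inline', 'data:application/json;base64,e30=');
const hiddenComment = sourceMapComment('dist/bundle.js', 'hidden', '');
```

With `true` the comment points at `bundle.js.map`, with `'inline'` it carries the data URL itself, and with `'hidden'` nothing is appended even though the .map file is still written in the code above.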

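Miscellaneous

For AST nodes such as import, export, and re-export, rollup walks the generated AST, reads the value (that is, the module name), and gathers the useful information for later. After that the flow is similar to the one above.

I recommend parsing a piece of code with ast explorer and looking at the structure; things are much easier to understand once you've seen it.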

function addImport(node: ImportDeclaration) {
    // e.g. the path module is imported
    // source: {
    // 	type: 'Literal',
    // 	start: 'xx',
    // 	end: 'xx',
    // 	value: 'path',
    // 	raw: '"path"'
    // }
    const source = node.source.value;
    this.sources.add(source);
    for (const specifier of node.specifiers) {
        const localName = specifier.local.name;

        // imported more than once
        if (this.importDescriptions[localName]) {
            return this.error(
                {
                    code: 'DUPLICATE_IMPORT',
                    message: `Duplicated import '${localName}'`
                },
                specifier.start
            );
        }

        const isDefault = specifier.type === NodeType.ImportDefaultSpecifier;
        const isNamespace = specifier.type === NodeType.ImportNamespaceSpecifier;

        const name = isDefault
            ? 'default'
            : isNamespace
            ? '*'
            : (specifier as ImportSpecifier).imported.name;

        // description of the imported binding
        this.importDescriptions[localName] = {
            module: null as any, // filled in later
            name,
            source,
            start: specifier.start
        };
    }
}
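
The mapping from specifier kind to imported name ('default', '*', or the named export) is easy to try out on plain objects. The sketch below uses a flattened `Specifier` shape of my own (real ESTree specifiers nest the names under `local.name` and `imported.name`), and `describeImports` is a made-up name, not a rollup function.

```typescript
// Flattened specifier shapes for illustration only.
type Specifier =
    | { type: 'ImportDefaultSpecifier'; local: string }
    | { type: 'ImportNamespaceSpecifier'; local: string }
    | { type: 'ImportSpecifier'; local: string; imported: string };

interface ImportDescription { name: string; source: string }

function describeImports(source: string, specifiers: Specifier[]): Record<string, ImportDescription> {
    const descriptions: Record<string, ImportDescription> = {};
    for (const spec of specifiers) {
        // the same local name may only be bound once per module
        if (Object.prototype.hasOwnProperty.call(descriptions, spec.local)) {
            throw new Error(`Duplicated import '${spec.local}'`);
        }
        const name =
            spec.type === 'ImportDefaultSpecifier'
                ? 'default'
                : spec.type === 'ImportNamespaceSpecifier'
                ? '*'
                : spec.imported;
        descriptions[spec.local] = { name, source };
    }
    return descriptions;
}

// Models: import def, { join as j } from 'path';
const described = describeImports('path', [
    { type: 'ImportDefaultSpecifier', local: 'def' },
    { type: 'ImportSpecifier', local: 'j', imported: 'join' }
]);

// Models: import * as ns from 'path';
const namespaced = describeImports('path', [{ type: 'ImportNamespaceSpecifier', local: 'ns' }]);
```

The result is keyed by the local name, which is what the rest of the module sees; the `name` field records what to look up on the source module later.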

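Summary

I don't feel I wrote this one well. Reading it, you may think it's just a process of marking and collecting, but the internal details are very complex, to the point that you need a deep understanding of how side effects are defined and what they affect. I may write that up separately some day.

The rollup series is nearing its end. I've mostly been entertaining myself, but it's been fun.

Learning makes me happy, haha~~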