Reposted from: 众成翻译
Translator: LexHuang
Link: http://www.zcfy.cc/article/2959
Original: http://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inline-caches.html
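I have a soft spot for virtual machines that are implemented in the language (or a subset of the language) they are built to run. If I were in academia or had more free time, I would certainly write a JavaScript VM in JavaScript. It would not even be a unique project, because people at the Université de Montréal have already got there to some extent with Tachyon, but I have ideas of my own that I would like to pursue.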
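I also have another dream, closely related to metacircular virtual machines: I want to help JavaScript developers understand how JS engines work. I think understanding the tools you use is of the utmost importance in our profession. The more people stop seeing a JS VM as a mysterious black box that turns JavaScript source into zeros and ones, the better.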
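I should say that I am not alone in trying to explain how VMs work internally and to help people write faster code. Many people around the world are trying to do the same. But I think one problem keeps developers from absorbing this knowledge effectively: we convey it in the wrong form. I am guilty of this myself:
- Sometimes I wrap what I know about V8 into hard-to-digest lists of "do this, not that" recommendations. The problem is that such advice explains nothing, and it can easily become outdated without anyone noticing.
- Sometimes we explain how VMs work at the wrong level of abstraction. I like the idea that a slide full of assembly code might encourage someone to learn assembly and reread the slide later, but I fear that such slides are often simply ignored and forgotten as having no practical use.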
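I have been thinking about these problems for a long time, and I decided that explaining a JavaScript VM in JavaScript is worth a try. My WebRebels 2012 talk "V8 Inside Out" pursued exactly this idea [video][slides], and in this post I would like to revisit what I talked about in Oslo, this time without any audible interference. (I like to think that my writing is a bit more serious than my speaking ☺.)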
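Imagine you want to implement, in JavaScript, a VM for a language that looks a lot like JavaScript syntactically but has a much simpler object model: instead of JavaScript objects it has tables that map keys of any type to values. For simplicity, think of Lua, which is both very similar to JavaScript and very different as a language. My favorite "make an array of points and then compute their vector sum" example looks roughly like this: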

    function MakePoint(x, y)
      local point = {}
      point.x = x
      point.y = y
      return point
    end

    function MakeArrayOfPoints(N)
      local array = {}
      local m = -1
      for i = 0, N do
        m = m * -1
        array[i] = MakePoint(m * i, m * -i)
      end
      array.n = N
      return array
    end

    function SumArrayOfPoints(array)
      local sum = MakePoint(0, 0)
      for i = 0, array.n do
        sum.x = sum.x + array[i].x
        sum.y = sum.y + array[i].y
      end
      return sum
    end

    function CheckResult(sum)
      local x = sum.x
      local y = sum.y
      if x ~= 50000 or y ~= -50000 then
        error("failed: x = " .. x .. ", y = " .. y)
      end
    end

    local N = 100000
    local array = MakeArrayOfPoints(N)
    local start_ms = os.clock() * 1000;
    for i = 0, 5 do
      local sum = SumArrayOfPoints(array)
      CheckResult(sum)
    end
    local end_ms = os.clock() * 1000;
    print(end_ms - start_ms)

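Note that I have a habit of checking at least some of the final results that my microbenchmarks compute. It spares me embarrassment when someone discovers that my revolutionary jsperf test case is nothing but my own bug.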
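
The habit of checking results can be sketched in plain JavaScript as follows. This is only an illustration: `runChecked` and `alternatingSum` are hypothetical names, and `alternatingSum` computes just the x component of the vector sum (0 - 1 + 2 - 3 + ... + N), which is where the expected value 50000 for N = 100000 comes from.

```javascript
// Minimal sketch of a self-checking microbenchmark (all names hypothetical).
// Runs `work` `runs` times and throws as soon as a result is wrong, so a
// buggy benchmark cannot silently report a "fast" time.
function runChecked(name, work, expected, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) {
    var actual = work();
    if (actual !== expected) {
      throw new Error(name + " failed: expected " + expected + ", got " + actual);
    }
  }
  return Date.now() - start; // elapsed milliseconds
}

// The x component of the vector sum: 0 - 1 + 2 - 3 + ... + N.
function alternatingSum(N) {
  var sum = 0;
  var m = -1;
  for (var i = 0; i <= N; i++) {
    m = m * -1;
    sum += m * i;
  }
  return sum;
}

var elapsedMs = runChecked("alternatingSum", function () {
  return alternatingSum(100000);
}, 50000, 6);
console.log("elapsed ms:", elapsedMs);
```

The timing itself says little at this scale; the point is that the measurement is discarded when the computed value is wrong.
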
If you feed the Lua program above to a Lua interpreter, you get something like this:

    ∮ lua points.lua
    150.2

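Good, but it does not help us understand how VMs work. So let's think about what it would look like if we had a quasi-Lua VM written in JavaScript. "Quasi" because I do not want to implement full Lua syntax; I prefer to focus only on the objects-are-tables aspect. A naive compiler could translate our code into the following JavaScript: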

    function MakePoint(x, y) {
      var point = new Table();
      STORE(point, 'x', x);
      STORE(point, 'y', y);
      return point;
    }

    function MakeArrayOfPoints(N) {
      var array = new Table();
      var m = -1;
      for (var i = 0; i <= N; i++) {
        m = m * -1;
        STORE(array, i, MakePoint(m * i, m * -i));
      }
      STORE(array, 'n', N);
      return array;
    }

    function SumArrayOfPoints(array) {
      var sum = MakePoint(0, 0);
      for (var i = 0; i <= LOAD(array, 'n'); i++) {
        STORE(sum, 'x', LOAD(sum, 'x') + LOAD(LOAD(array, i), 'x'));
        STORE(sum, 'y', LOAD(sum, 'y') + LOAD(LOAD(array, i), 'y'));
      }
      return sum;
    }

    function CheckResult(sum) {
      var x = LOAD(sum, 'x');
      var y = LOAD(sum, 'y');
      if (x !== 50000 || y !== -50000) {
        throw new Error("failed: x = " + x + ", y = " + y);
      }
    }

    var N = 100000;
    var array = MakeArrayOfPoints(N);
    var start = LOAD(os, 'clock')() * 1000;
    for (var i = 0; i <= 5; i++) {
      var sum = SumArrayOfPoints(array);
      CheckResult(sum);
    }
    var end = LOAD(os, 'clock')() * 1000;
    print(end - start);

But if you try to run the translated code with d8 (V8's standalone shell), it politely refuses:

    ∮ d8 points.js
    points.js:9: ReferenceError: Table is not defined
      var array = new Table();
                      ^
    ReferenceError: Table is not defined
        at MakeArrayOfPoints (points.js:9:19)
        at points.js:37:13

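The reason is simple: we are still missing the runtime system code that implements the object model and the semantics of loads and stores. It may seem obvious, but I want to stress it: a VM that looks like a single black box from the outside is, on the inside, a set of boxes working together to deliver the best possible performance: compilers, runtime routines, the object model, the garbage collector, and so on. Fortunately our language and our example are very simple, so our runtime system takes only a few dozen lines: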

    function Table() {
      // Map from ES Harmony is a simple dictionary-style collection.
      this.map = new Map;
    }

    Table.prototype = {
      load: function (key) { return this.map.get(key); },
      store: function (key, value) { this.map.set(key, value); }
    };

    function CHECK_TABLE(t) {
      if (!(t instanceof Table)) {
        throw new Error("table expected");
      }
    }

    function LOAD(t, k) {
      CHECK_TABLE(t);
      return t.load(k);
    }

    function STORE(t, k, v) {
      CHECK_TABLE(t);
      t.store(k, v);
    }

    var os = new Table();

    STORE(os, 'clock', function () {
      return Date.now() / 1000;
    });

Note that I use the ES6 (Harmony) `Map` instead of a plain JavaScript object, because a table can potentially use any value as a key, not just strings.
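
A short illustration of why a plain object is not enough here: ordinary property keys are coerced to strings (symbols aside), while a `Map` keeps each key as it is. The variable names are only for this example.

```javascript
// Plain objects coerce property keys to strings; Map keeps keys as they are.
var obj = {};
obj[1] = "number one";
obj["1"] = "string one";      // overwrites: both keys end up as "1"

var map = new Map();
map.set(1, "number one");
map.set("1", "string one");   // a separate entry
var tableKey = {};            // any value, even another object, can be a key
map.set(tableKey, "object key");

console.log(Object.keys(obj).length);
console.log(map.size);
console.log(map.get(tableKey));
```

With a plain object the two writes collapse into a single property, whereas the `Map` ends up holding three distinct entries, which matches what a Lua-like table needs.
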

    ∮ d8 --harmony quasi-lua-runtime.js points.js
    737

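Now our translated code runs, but it is disappointingly slow, because every load and store has to cross all those layers of abstraction before it reaches the value. Let's try to reduce this overhead with the most fundamental optimization that virtually all JavaScript VMs apply: inline caching. Even JS VMs written in Java end up using it, because invokedynamic is essentially a structured inline cache exposed at the bytecode level. Inline caching (usually abbreviated as IC in the V8 sources) is actually a very old technique, developed roughly 30 years ago for Smalltalk VMs.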
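The idea behind inline caching is very simple: create a fast path that bypasses the runtime system when loading an object's property. We make certain assumptions about the incoming object and its properties, verify them with a cheap check, and if they hold we reuse what was cached last time. In a language full of dynamic typing, late binding and other quirks such as eval, it is very hard to make meaningful assumptions about an object up front, so instead we let our loads and stores observe and learn: once they have seen some object, they adapt so that later loads from similarly structured objects are faster. In a sense, we cache knowledge about the layout of previously seen objects inside the load/store itself, hence the name inline caching. Inline caches can be applied to virtually any operation with dynamic behavior, as long as you can find a meaningful fast path: arithmetic operators, calls to free functions, method calls, and so on. Some inline caches can also cache more than one fast path, which makes them polymorphic.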
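
To make the "guess, check cheaply, fall back" pattern concrete, here is a minimal sketch of a polymorphic load site. It is not the implementation developed in this post. It assumes a hypothetical object representation in which every object carries a `layout` (property name to slot index) and a `slots` array, and all names (`makeObject`, `genericLoad`, `makeLoadSite`) are made up for the illustration.

```javascript
// Hypothetical representation: `layout` maps property names to slot indexes,
// `slots` holds the values. Objects built the same way share one layout.
var LAYOUT_XY = { x: 0, y: 1 };
var LAYOUT_YX = { y: 0, x: 1 };

function makeObject(layout, slots) {
  return { layout: layout, slots: slots };
}

// Slow path: a generic lookup that works for any layout.
function genericLoad(obj, name) {
  var index = obj.layout[name];
  return index === undefined ? undefined : obj.slots[index];
}

// A load site for one fixed property name. It remembers up to `limit`
// (layout, index) pairs it has seen: a polymorphic inline cache.
function makeLoadSite(name, limit) {
  var cache = []; // entries: { layout: ..., index: ... }
  var stats = { hits: 0, misses: 0 };
  function load(obj) {
    for (var i = 0; i < cache.length; i++) {
      if (cache[i].layout === obj.layout) { // cheap identity check
        stats.hits++;
        return obj.slots[cache[i].index];   // fast path
      }
    }
    stats.misses++;
    var index = obj.layout[name];
    if (index !== undefined && cache.length < limit) {
      cache.push({ layout: obj.layout, index: index }); // learn this layout
    }
    return genericLoad(obj, name);
  }
  load.stats = stats;
  return load;
}

var loadX = makeLoadSite("x", 4);
var a = makeObject(LAYOUT_XY, [1, 2]);
var b = makeObject(LAYOUT_YX, [20, 10]);
var results = [loadX(a), loadX(b), loadX(a), loadX(b)];
console.log(results, loadX.stats);
```

The first visit of each layout misses and teaches the site; later visits take the fast path. With `limit` set to 1 the same site behaves like a monomorphic cache that keeps missing on the second layout.
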
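When we start thinking about how to apply inline caches to the translated quasi-Lua code, it quickly becomes obvious that we need to change our object model. There is no way to do a fast load from a `Map`: we always have to go through its `get` method. [If we could peek at the raw hash table behind the `Map`, we could make inline caches work without a new object layout by caching the bucket index.]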
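For efficiency, tables that are used as structured data should look more like C structs: a sequence of named fields at fixed offsets. The same goes for tables used as arrays: we want numeric properties to be stored in an array-like fashion. But clearly not every table fits such a representation: some really are used as tables, and either contain keys that are neither strings nor numbers, or contain too many string-named properties that come and go as the table is mutated. Unfortunately we cannot perform any expensive type inference. Instead we have to discover the structure behind each table while the program runs, as it creates and mutates them. Fortunately there is a well-known technique for exactly that ☺: _hidden classes_.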
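The idea behind hidden classes boils down to two points:
- The runtime system associates a hidden class with every object, just as a Java VM associates an instance of `java.lang.Class` with every object.
- If the layout of an object changes, the runtime system finds or creates a hidden class that matches the new layout and attaches it to the object.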
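Hidden classes have a very important property: they let the VM check an assumption about an object's layout with a simple comparison against a cached hidden class. This is exactly what our inline caches need. Let's implement a simple hidden class system for our quasi-Lua runtime. Every hidden class is essentially a collection of property descriptors, where each descriptor is either a real property or a transition that points from a class without some property to a class that has it: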

    function Transition(klass) {
      this.klass = klass;
    }

    function Property(index) {
      this.index = index;
    }

    function Klass(kind) {
      // Classes are "fast" if they are C-struct like and "slow" if they are Map-like.
      this.kind = kind;
      this.descriptors = new Map;
      this.keys = [];
    }

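Transitions exist so that objects created in the same way can share hidden classes: if two objects share a hidden class and you add the same property to both of them, you do not want to end up with two different hidden classes.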

    Klass.prototype = {
      // Create hidden class with a new property that does not exist on
      // the current hidden class.
      addProperty: function (key) {
        var klass = this.clone();
        klass.append(key);
        // Connect hidden classes with transition to enable sharing:
        //           this == add property key ==> klass
        this.descriptors.set(key, new Transition(klass));
        return klass;
      },

      hasProperty: function (key) {
        return this.descriptors.has(key);
      },

      getDescriptor: function (key) {
        return this.descriptors.get(key);
      },

      getIndex: function (key) {
        return this.getDescriptor(key).index;
      },

      // Create clone of this hidden class that has same properties
      // at same offsets (but does not have any transitions).
      clone: function () {
        var klass = new Klass(this.kind);
        klass.keys = this.keys.slice(0);
        for (var i = 0; i < this.keys.length; i++) {
          var key = this.keys[i];
          klass.descriptors.set(key, this.descriptors.get(key));
        }
        return klass;
      },

      // Add real property to descriptors.
      append: function (key) {
        this.keys.push(key);
        this.descriptors.set(key, new Property(this.keys.length - 1));
      }
    };

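
The sharing that transitions buy can be seen in a condensed, self-contained sketch. It is a stand-in for the hidden classes above rather than the same code: `Shape`, `makeRecord` and `addField` are hypothetical names, and unlike `Klass.addProperty`, the `add` method here looks for an existing transition itself (in this post's runtime, that lookup happens in the table code when a property is written).

```javascript
// Condensed stand-in for hidden classes (hypothetical names throughout).
// A shape knows the slot index of each property and remembers which shape
// an "add property" step leads to.
function Shape(offsets) {
  this.offsets = offsets;        // Map: property name -> slot index
  this.transitions = new Map();  // Map: property name -> next Shape
}

// Follow an existing transition for `key`, or create one.
// Assumes `key` is not already a property of this shape.
Shape.prototype.add = function (key) {
  var next = this.transitions.get(key);
  if (next === undefined) {
    var offsets = new Map(this.offsets);
    offsets.set(key, offsets.size); // new property takes the next free slot
    next = new Shape(offsets);
    this.transitions.set(key, next);
  }
  return next;
};

var EMPTY = new Shape(new Map());

function makeRecord() {
  return { shape: EMPTY, slots: [] };
}

function addField(record, key, value) {
  record.shape = record.shape.add(key);
  record.slots[record.shape.offsets.get(key)] = value;
}

var p1 = makeRecord(); addField(p1, "x", 1); addField(p1, "y", 2);
var p2 = makeRecord(); addField(p2, "x", 3); addField(p2, "y", 4);
var p3 = makeRecord(); addField(p3, "y", 2); addField(p3, "x", 1);

console.log(p1.shape === p2.shape); // same construction order
console.log(p1.shape === p3.shape); // different construction order
```

`p1` and `p2` were built by the same sequence of additions and end up with the very same shape object, so one identity comparison is enough to know where `x` lives in both. `p3` holds the same properties but was built in a different order, so it gets a different shape.
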
Now we can make our tables flexible and let them adapt to the way they are constructed:

    var ROOT_KLASS = new Klass("fast");

    function Table() {
      // All tables start from the fast empty root hidden class and form
      // a single tree. In V8 hidden classes actually form a forest -
      // there are multiple root classes, e.g. one for each constructor.
      // This is partially due to the fact that hidden classes in V8
      // encapsulate constructor specific information, e.g. prototype
      // pointer is actually stored in the hidden class and not in the
      // object itself so classes with different prototypes must have
      // different hidden classes even if they have the same structure.
      // However having multiple root classes also allows to evolve these
      // trees separately capturing class specific evolution independently.
      this.klass = ROOT_KLASS;
      this.properties = [];  // Array of named properties: 'x','y',...
      this.elements = [];    // Array of indexed properties: 0, 1, ...
      // We will actually cheat a little bit and allow any int32 to go here,
      // we will also allow V8 to select appropriate representation for
      // the array's backing store. There are too many details to cover in
      // a single blog post :-)
    }

    Table.prototype = {
      load: function (key) {
        if (this.klass.kind === "slow") {
          // Slow class => properties are represented as Map.
          return this.properties.get(key);
        }

        // This is fast table with indexed and named properties only.
        if (typeof key === "number" && (key | 0) === key) {  // Indexed property.
          return this.elements[key];
        } else if (typeof key === "string") {  // Named property.
          var idx = this.findPropertyForRead(key);
          return (idx >= 0) ? this.properties[idx] : void 0;
        }

        // There can be only string&number keys on fast table.
        return void 0;
      },

      store: function (key, value) {
        if (this.klass.kind === "slow") {
          // Slow class => properties are represented as Map.
          this.properties.set(key, value);
          return;
        }

        // This is fast table with indexed and named properties only.
        if (typeof key === "number" && (key | 0) === key) {  // Indexed property.
          this.elements[key] = value;
          return;
        } else if (typeof key === "string") {  // Named property.
          var index = this.findPropertyForWrite(key);
          if (index >= 0) {
            this.properties[index] = value;
            return;
          }
        }

        this.convertToSlow();
        this.store(key, value);
      },

      // Find property or add one if possible, returns property index
      // or -1 if we have too many properties and should switch to slow.
      findPropertyForWrite: function (key) {
        if (!this.klass.hasProperty(key)) {  // Try adding property if it does not exist.
          // Too many properties! Achtung! Fast case kaput.
          if (this.klass.keys.length > 20) return -1;

          // Switch class to the one that has this property.
          this.klass = this.klass.addProperty(key);
          return this.klass.getIndex(key);
        }

        var desc = this.klass.getDescriptor(key);
        if (desc instanceof Transition) {
          // Property does not exist yet but we have a transition to the class that has it.
          this.klass = desc.klass;
          return this.klass.getIndex(key);
        }

        // Get index of existing property.
        return desc.index;
      },

      // Find property index if property exists, return -1 otherwise.
      findPropertyForRead: function (key) {
        if (!this.klass.hasProperty(key)) return -1;
        var desc = this.klass.getDescriptor(key);
        if (!(desc instanceof Property)) return -1;  // Here we are not interested in transitions.
        return desc.index;
      },

      // Copy all properties into the Map and switch to slow class.
      convertToSlow: function () {
        var map = new Map;
        for (var i = 0; i < this.klass.keys.length; i++) {
          var key = this.klass.keys[i];
          var val = this.properties[i];
          map.set(key, val);
        }

        Object.keys(this.elements).forEach(function (key) {
          var val = this.elements[key];
          map.set(key | 0, val);  // Funky JS, force key back to int32.
        }, this);

        this.properties = map;
        this.elements = null;
        this.klass = new Klass("slow");
      }
    };

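[I am not going to explain the code above line by line, because it is commented JavaScript, not C++ or assembly... That is the whole point of using JavaScript. You can, however, ask about anything unclear in the comments or by email.]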
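
One detail of the table code that is easy to skim over is the test `typeof key === "number" && (key | 0) === key`, which decides whether a key counts as an indexed property. The bitwise OR converts its operand to a 32-bit integer, so the comparison only succeeds for numbers that survive that round trip. The sketch below extracts the test into a helper (the name `isInt32Key` is hypothetical) and tries it on a few values.

```javascript
// The predicate used by the fast paths, extracted into a helper.
// (The helper name `isInt32Key` is hypothetical.)
function isInt32Key(key) {
  // `key | 0` truncates to a signed 32-bit integer; the comparison holds
  // only if the key already was such an integer.
  return typeof key === "number" && (key | 0) === key;
}

var samples = [0, 7, -1, 1.5, 2147483647, 2147483648, NaN, "7"];
samples.forEach(function (key) {
  console.log(String(key), isInt32Key(key));
});
```

Fractions, values outside the int32 range, `NaN` and numeric strings are all rejected, while negative integers are accepted, which is the "any int32" cheat mentioned in the comments of `Table`.
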
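Now that our runtime system has hidden classes, which let us quickly check an object's layout and quickly load properties by index, all that is left is to implement the inline caches themselves. This requires some new functionality in both the compiler and the runtime system (remember what I said about cooperation between the different parts of a VM?).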
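One way to implement an inline cache is to split it into two parts: a modifiable call site in the generated code, and a set of stubs (small pieces of generated native code) that the call site can call. It is essential that a stub can find the call site it was called from (possibly with help from the runtime system): a stub contains only a fast path compiled under certain assumptions, and if those assumptions do not hold for an object the stub sees, it can initiate modification (patching) of the call site that invoked it, so that the site adapts to the new circumstances. Our pure JavaScript ICs will also consist of two parts:
- a global variable per IC, used to emulate a modifiable call instruction;
- closures, used in place of stubs.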
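In native code, V8 finds the IC site to patch by inspecting the return address on the stack. We cannot do that in pure JavaScript (`arguments.caller` is not fine-grained enough), so we simply pass the IC's id to the IC stub explicitly. The IC-ified code looks like this: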

    // Initially all ICs are in uninitialized state.
    // They are not hitting the cache and always missing into runtime system.
    var STORE$0 = NAMED_STORE_MISS;
    var STORE$1 = NAMED_STORE_MISS;
    var KEYED_STORE$2 = KEYED_STORE_MISS;
    var STORE$3 = NAMED_STORE_MISS;
    var LOAD$4 = NAMED_LOAD_MISS;
    var STORE$5 = NAMED_STORE_MISS;
    var LOAD$6 = NAMED_LOAD_MISS;
    var LOAD$7 = NAMED_LOAD_MISS;
    var KEYED_LOAD$8 = KEYED_LOAD_MISS;
    var STORE$9 = NAMED_STORE_MISS;
    var LOAD$10 = NAMED_LOAD_MISS;
    var LOAD$11 = NAMED_LOAD_MISS;
    var KEYED_LOAD$12 = KEYED_LOAD_MISS;
    var LOAD$13 = NAMED_LOAD_MISS;
    var LOAD$14 = NAMED_LOAD_MISS;

    function MakePoint(x, y) {
      var point = new Table();
      STORE$0(point, 'x', x, 0);  // The last number is IC's id: STORE$0 ⇒ id is 0
      STORE$1(point, 'y', y, 1);
      return point;
    }

    function MakeArrayOfPoints(N) {
      var array = new Table();
      var m = -1;
      for (var i = 0; i <= N; i++) {
        m = m * -1;
        // Now we are also distinguishing between expressions x[p] and x.p.
        // The first one is called keyed load/store and the second one is called
        // named load/store.
        // The main difference is that named load/stores use a fixed known
        // constant string key and thus can be specialized for a fixed property
        // offset.
        KEYED_STORE$2(array, i, MakePoint(m * i, m * -i), 2);
      }
      STORE$3(array, 'n', N, 3);
      return array;
    }

    function SumArrayOfPoints(array) {
      var sum = MakePoint(0, 0);
      for (var i = 0; i <= LOAD$4(array, 'n', 4); i++) {
        STORE$5(sum, 'x', LOAD$6(sum, 'x', 6) + LOAD$7(KEYED_LOAD$8(array, i, 8), 'x', 7), 5);
        STORE$9(sum, 'y', LOAD$10(sum, 'y', 10) + LOAD$11(KEYED_LOAD$12(array, i, 12), 'y', 11), 9);
      }
      return sum;
    }

    // Named CheckResult to match the driver loop of the non-IC version.
    function CheckResult(sum) {
      var x = LOAD$13(sum, 'x', 13);
      var y = LOAD$14(sum, 'y', 14);
      if (x !== 50000 || y !== -50000) throw new Error("failed x: " + x + ", y:" + y);
    }

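The changes above are again self-explanatory: every property load/store site gets its own IC with its own id. One small step remains: implementing the MISS stubs and the stub "compiler" that produces specialized stubs: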

    function NAMED_LOAD_MISS(t, k, ic) {
      var v = LOAD(t, k);
      if (t.klass.kind === "fast") {
        // Create a load stub that is specialized for a fixed class and key k and
        // loads property from a fixed offset.
        var stub = CompileNamedLoadFastProperty(t.klass, k);
        PatchIC("LOAD", ic, stub);
      }
      return v;
    }

    function NAMED_STORE_MISS(t, k, v, ic) {
      var klass_before = t.klass;
      STORE(t, k, v);
      var klass_after = t.klass;
      if (klass_before.kind === "fast" &&
          klass_after.kind === "fast") {
        // Create a store stub that is specialized for a fixed transition between classes
        // and a fixed key k that stores property into a fixed offset and replaces
        // object's hidden class if necessary.
        var stub = CompileNamedStoreFastProperty(klass_before, klass_after, k);
        PatchIC("STORE", ic, stub);
      }
    }

    function KEYED_LOAD_MISS(t, k, ic) {
      var v = LOAD(t, k);
      if (t.klass.kind === "fast" && (typeof k === 'number' && (k | 0) === k)) {
        // Create a stub for the fast load from the elements array.
        // Does not actually depend on the class but could if we had more complicated
        // storage system.
        var stub = CompileKeyedLoadFastElement();
        PatchIC("KEYED_LOAD", ic, stub);
      }
      return v;
    }

    function KEYED_STORE_MISS(t, k, v, ic) {
      STORE(t, k, v);
      if (t.klass.kind === "fast" && (typeof k === 'number' && (k | 0) === k)) {
        // Create a stub for the fast store into the elements array.
        // Does not actually depend on the class but could if we had more complicated
        // storage system.
        var stub = CompileKeyedStoreFastElement();
        PatchIC("KEYED_STORE", ic, stub);
      }
    }

    function PatchIC(kind, id, stub) {
      this[kind + "$" + id] = stub;  // non-strict JS funkiness: this is global object.
    }

    function CompileNamedLoadFastProperty(klass, key) {
      // Key is known to be constant (named load). Specialize index.
      var index = klass.getIndex(key);

      function KeyedLoadFastProperty(t, k, ic) {
        if (t.klass !== klass) {
          // Expected klass does not match. Can't use cached index.
          // Fall through to the runtime system.
          return NAMED_LOAD_MISS(t, k, ic);
        }
        return t.properties[index];  // Veni. Vidi. Vici.
      }

      return KeyedLoadFastProperty;
    }

    function CompileNamedStoreFastProperty(klass_before, klass_after, key) {
      // Key is known to be constant (named store). Specialize index.
      var index = klass_after.getIndex(key);

      if (klass_before !== klass_after) {
        // Transition happens during the store.
        // Compile stub that updates hidden class.
        return function (t, k, v, ic) {
          if (t.klass !== klass_before) {
            // Expected klass does not match. Can't use cached index.
            // Fall through to the runtime system.
            return NAMED_STORE_MISS(t, k, v, ic);
          }
          t.properties[index] = v;  // Fast store.
          t.klass = klass_after;    // T-t-t-transition!
        }
      } else {
        // Write to an existing property. No transition.
        return function (t, k, v, ic) {
          if (t.klass !== klass_before) {
            // Expected klass does not match. Can't use cached index.
            // Fall through to the runtime system.
            return NAMED_STORE_MISS(t, k, v, ic);
          }
          t.properties[index] = v;  // Fast store.
        }
      }
    }

    function CompileKeyedLoadFastElement() {
      function KeyedLoadFastElement(t, k, ic) {
        if (t.klass.kind !== "fast" || !(typeof k === 'number' && (k | 0) === k)) {
          // If table is slow or key is not a number we can't use fast-path.
          // Fall through to the runtime system, it can handle everything.
          return KEYED_LOAD_MISS(t, k, ic);
        }
        return t.elements[k];
      }

      return KeyedLoadFastElement;
    }

    function CompileKeyedStoreFastElement() {
      function KeyedStoreFastElement(t, k, v, ic) {
        if (t.klass.kind !== "fast" || !(typeof k === 'number' && (k | 0) === k)) {
          // If table is slow or key is not a number we can't use fast-path.
          // Fall through to the runtime system, it can handle everything.
          return KEYED_STORE_MISS(t, k, v, ic);
        }
        t.elements[k] = v;
      }

      return KeyedStoreFastElement;
    }

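That is a lot of code (and comments), but with all the explanations above it should be easy to follow: the ICs observe, and the stub compiler/factory produces adapted, specialized stubs. [The attentive reader may notice that I could have initialized all keyed ICs with their fast stubs from the very start, and that a keyed IC stays in the fast state once it enters it.]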
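
A note on `PatchIC`: it relies on non-strict `this` being the global object and on top-level `var` declarations becoming properties of that object. That holds for plain scripts run by d8, but not in strict mode, in ES modules, or inside a Node.js CommonJS module. A portable variant of the same patching idea keeps every IC in an array slot indexed by its id. The sketch below is an assumption-laden illustration, not the post's runtime: `ICS`, `loadMiss` and the `keys`/`values` object representation are hypothetical, and the shared `keys` array stands in for a hidden class.

```javascript
// Portable variant of IC patching: one array slot per IC instead of one
// global variable per IC. All names here are hypothetical.
var ICS = [];

// Miss handler: does the generic lookup, then patches slot `id` with a
// stub specialized for the `keys` array it has just seen.
function loadMiss(obj, key, id) {
  var expectedKeys = obj.keys;             // stand-in for a hidden class
  var index = expectedKeys.indexOf(key);
  if (index < 0) return undefined;
  ICS[id] = function (o, k, icId) {
    if (o.keys !== expectedKeys) {
      return loadMiss(o, k, icId);         // assumption failed: re-learn
    }
    return o.values[index];                // fast path: fixed offset
  };
  return obj.values[index];
}

var POINT_KEYS = ["x", "y"];
ICS[0] = loadMiss;                         // uninitialized IC

var p = { keys: POINT_KEYS, values: [3, 4] };
var first = ICS[0](p, "x", 0);             // miss: patches slot 0
var patched = ICS[0] !== loadMiss;
var second = ICS[0](p, "x", 0);            // served by the patched stub
console.log(first, patched, second);
```

A call site then reads `ICS[0](p, "x", 0)` instead of `LOAD$0(p, 'x', 0)`; the extra array load is the price of not depending on the global object.
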
If we put all of this together and rerun our "benchmark", we get very pleasing results:

    ∮ d8 --harmony quasi-lua-runtime-ic.js points-ic.js
    117

That is a speedup of about 6x over our first naive attempt!
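Hopefully you are reading this part because you have read everything above... I tried to look at some of the ideas that power today's JavaScript engines from a different angle, that of a JavaScript developer. The more code I wrote, the more it felt like the story of the blind men and the elephant. Just to give you the feeling of staring into the abyss: V8 has 10 kinds of descriptors and 5 kinds of elements (plus 9 kinds of external elements); ic.cc, which contains most of the IC state selection logic, is more than 2500 lines long; ICs in V8 have more than two states (uninitialized, premonomorphic, monomorphic, polymorphic and generic, not to mention the special states for keyed load/store ICs or the completely different hierarchy of states for arithmetic ICs); the hand-written ia32-specific IC stubs take more than 5000 lines of code; and so on. These numbers only grow as time passes and V8 learns to recognize and adapt to more and more object layouts. And I have not even touched the object model itself (objects.cc, 13k lines of code), the garbage collector, or the optimizing compiler.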
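That said, I am sure the fundamentals will not change in the foreseeable future, and if they do, it will be with a bang loud enough that you will notice. That is why I think this exercise of trying to understand the fundamentals by writing them in JavaScript is very, very, very important.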
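I hope that tomorrow, or maybe a few weeks from now, you will stop and shout "Eureka!" and tell your coworkers why conditionally adding a property to an object in one place can affect the performance of a distant hot loop that touches those objects. _Because hidden classes, you know, they change!_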