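Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.
Writing by hand takes effort, so please respect the author's work. Thank you.
This article gives a brief analysis of how Scala implements closures, from the perspective of the class files that the Scala compiler generates.
Let's start with a simple example: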
class ClosureDemo {
  def func() = {
    var i = 2
    val inc: () => Unit = () => i = i + 1
    val add: Int => Int = (ii: Int) => ii + i
    (inc, add)
  }
}
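In this code, inc and add both reference the variable i defined in func. Because functions are first-class values in Scala, inc and add become closures that capture the outer variable i.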
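To see the sharing from the caller's side, here is a small usage sketch. The ClosureUsage object, its observe helper, and its main method are added for illustration only; ClosureDemo is repeated so that the snippet compiles on its own.

```scala
object ClosureUsage {
  class ClosureDemo {
    def func() = {
      var i = 2
      val inc: () => Unit = () => i = i + 1
      val add: Int => Int = (ii: Int) => ii + i
      (inc, add)
    }
  }

  // Hypothetical helper: returns add(10) before and after one call to inc().
  def observe(): (Int, Int) = {
    val (inc, add) = new ClosureDemo().func()
    val before = add(10) // i is still 2
    inc()                // i becomes 3 in the state shared by both closures
    val after = add(10)
    (before, after)
  }

  def main(args: Array[String]): Unit = println(observe()) // prints (12,13)
}
```

Both closures outlive the call to func, yet they still agree on the current value of i.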
Compiling the code above produces three class files:
ClosureDemo.class
ClosureDemo$$anonfun$1.class
ClosureDemo$$anonfun$2.class
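These are the ClosureDemo class itself and its two closures. The compiler used here (Scala 2.11 or earlier, judging from the generated names) emits one class file per closure; Scala 2.12 and later compile most closures through invokedynamic instead and no longer emit these $anonfun classes. With deeply nested closures the generated class names can become very long, which can cause path-too-long errors on Windows.
In the ClosureCleaner class in the Spark source code, we can find the following code, which checks whether a class is a closure: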
// Check whether a class represents a Scala closure
private def isClosure(cls: Class[_]): Boolean = {
  cls.getName.contains("$anonfun$")
}
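The check is purely name-based. Below is a minimal sketch of the same test applied to plain strings, so that it does not depend on which compiler version produced the classes. The object and method names are made up for this illustration and are not part of Spark.

```scala
object ClosureNameCheck {
  // Same substring test as the isClosure snippet, applied to a class name string.
  def looksLikeClosure(className: String): Boolean =
    className.contains("$anonfun$")

  def main(args: Array[String]): Unit = {
    println(looksLikeClosure("ClosureDemo$$anonfun$1")) // true
    println(looksLikeClosure("ClosureDemo"))            // false
  }
}
```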
First, let's use javap to look at the contents of ClosureDemo.class:
{
public scala.Tuple2<scala.Function0<scala.runtime.BoxedUnit>, scala.Function1<java.lang.Object, java.lang.Object>> func();
descriptor: ()Lscala/Tuple2;
flags: ACC_PUBLIC
Code:
stack=4, locals=4, args_size=1
0: iconst_2
1: invokestatic #16 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
4: astore_1
5: new #18 // class ClosureDemo$$anonfun$1
8: dup
9: aload_0
10: aload_1
11: invokespecial #22 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
14: astore_2
15: new #24 // class ClosureDemo$$anonfun$2
18: dup
19: aload_0
20: aload_1
21: invokespecial #25 // Method ClosureDemo$$anonfun$2."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
24: astore_3
25: new #27 // class scala/Tuple2
28: dup
29: aload_2
30: aload_3
31: invokespecial #30 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
34: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 35 0 this LClosureDemo;
5 29 1 i Lscala/runtime/IntRef;
15 19 2 inc Lscala/Function0;
25 9 3 add Lscala/Function1;
LineNumberTable:
line 3: 0
line 4: 5
line 5: 15
line 6: 25
Signature: #46 // ()Lscala/Tuple2<Lscala/Function0<Lscala/runtime/BoxedUnit;>;Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;>;
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #41 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo;
LineNumberTable:
line 8: 0
}
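Since the class has no fields, we focus on its methods. The output above shows two of them:
1. func(): the func method we defined
2. ClosureDemo(): the class constructor
Let's look closely at the implementation of func. It first pushes the int constant 2 onto the operand stack and then calls the static method create(int): scala.runtime.IntRef of the scala.runtime.IntRef class to wrap that 2 in an IntRef. Here is the implementation of IntRef: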
package scala.runtime;

public class IntRef implements java.io.Serializable {
    private static final long serialVersionUID = 1488197132022872888L;

    public int elem;
    public IntRef(int elem) { this.elem = elem; }
    public String toString() { return java.lang.Integer.toString(elem); }

    public static IntRef create(int e) { return new IntRef(e); }
    public static IntRef zero() { return new IntRef(0); }
}
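The code is trivial: it just wraps the int variable in an IntRef object, which moves the variable from the stack to the heap. After that come the constructions of the two closure classes. One detail deserves attention: both constructors are called with this and the IntRef that was just created.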
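The effect of the wrapper can be tried directly. This is a minimal sketch, assuming a scala-library version that ships scala.runtime.IntRef.create (the method the bytecode above calls). IntRef is a compiler-internal helper, so it is used here only for illustration; IntRefSharing and demo are made-up names.

```scala
import scala.runtime.IntRef

object IntRefSharing {
  def demo(): Int = {
    val ref = IntRef.create(2)  // what the compiler emits for `var i = 2`
    val alias = ref             // each closure receives this same reference
    alias.elem = alias.elem + 1 // a write through one reference...
    ref.elem                    // ...is visible through the other
  }

  def main(args: Array[String]): Unit = println(demo()) // prints 3
}
```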
Now let's step into the closure classes. Below are the fields and methods of ClosureDemo$$anonfun$1.class, which is the bytecode generated for inc:
{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final scala.runtime.IntRef i$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final void apply();
descriptor: ()V
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #23 // Method apply$mcV$sp:()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public void apply$mcV$sp();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=3, locals=1, args_size=1
0: aload_0
1: getfield #27 // Field i$1:Lscala/runtime/IntRef;
4: aload_0
5: getfield #27 // Field i$1:Lscala/runtime/IntRef;
8: getfield #33 // Field scala/runtime/IntRef.elem:I
11: iconst_1
12: iadd
13: putfield #33 // Field scala/runtime/IntRef.elem:I
16: return
LocalVariableTable:
Start Length Slot Name Signature
0 17 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public final java.lang.Object apply();
descriptor: ()Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #36 // Method apply:()V
4: getstatic #42 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
7: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 8 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef);
descriptor: (LClosureDemo;Lscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=3, args_size=3
0: aload_0
1: aload_2
2: putfield #27 // Field i$1:Lscala/runtime/IntRef;
5: aload_0
6: invokespecial #46 // Method scala/runtime/AbstractFunction0$mcV$sp."<init>":()V
9: return
LocalVariableTable:
Start Length Slot Name Signature
0 10 0 this LClosureDemo$$anonfun$1;
0 10 1 $outer LClosureDemo;
0 10 2 i$1 Lscala/runtime/IntRef;
LineNumberTable:
line 4: 0
}
From the code above we can see that it has two fields and four methods:
public static final long serialVersionUID=0L;
private final scala.runtime.IntRef i$1;
public final void apply()
public void apply$mcV$sp()
public final java.lang.Object apply()
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef)
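Let's start with the constructor. When analyzing ClosureDemo.class we saw that both closures are constructed with the outer class's this reference and the IntRef; this is the constructor being called. It is simple: it stores the second argument, the IntRef that wraps the number 2, in the field i$1, and then calls the constructor of its superclass scala/runtime/AbstractFunction0$mcV$sp. The class name is interesting. Judging from a few experiments, it follows these rules:
1. In the first part, scala/runtime/AbstractFunction0, the digit is the number of parameters the function takes: 0 means no parameters, and AbstractFunctionX means X parameters. Each AbstractFunctionX class implements the corresponding FunctionX trait.
2. In the second part, $mcV$sp, the V means that the function returns void (Unit in Scala). The letters are JVM type descriptors, with the return type first and the parameter types after it. An example from the Scala source: boolean apply$mcZIJ$sp(int v1, long v2);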
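The naming rule in item 2 can be sketched as a small decoder. This is only an illustration of the observed pattern, not code from the Scala compiler: SpecializedNameSketch and decode are hypothetical names, and only the primitive descriptor letters are handled.

```scala
object SpecializedNameSketch {
  // JVM descriptor letters for void and the primitives, mapped to Scala type names.
  private val typeNames: Map[Char, String] = Map(
    'V' -> "Unit", 'Z' -> "Boolean", 'B' -> "Byte", 'S' -> "Short", 'C' -> "Char",
    'I' -> "Int", 'J' -> "Long", 'F' -> "Float", 'D' -> "Double")

  // Matches names of the form <prefix>$mc<letters>$sp.
  private val Specialized = """.*\$mc([VZBSCIJFD]+)\$sp""".r

  // Returns (return type, parameter types), or None if the name has no such suffix.
  def decode(methodName: String): Option[(String, List[String])] = methodName match {
    case Specialized(codes) =>
      Some((typeNames(codes.head), codes.tail.toList.map(c => typeNames(c))))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    println(decode("apply$mcZIJ$sp")) // Some((Boolean,List(Int, Long)))
    println(decode("apply$mcV$sp"))   // Some((Unit,List()))
    println(decode("apply"))          // None
  }
}
```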
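Now for the remaining methods in the class file. There are two apply methods. One of them, flagged ACC_BRIDGE and ACC_SYNTHETIC, is a bridge method: after type erasure Function0.apply returns Object, so the compiler generates this method to call the typed apply and return BoxedUnit.UNIT. The other simply calls apply$mcV$sp. So we focus on the implementation of apply$mcV$sp, which is also very simple:
1. Load the field i$1 onto the stack
2. Read the elem field of the IntRef, that is, the value it wraps
3. Add 1 to it and write it back to the IntRef
Because the IntRef is an object on the heap, everything else that holds a reference to the same IntRef sees the number increased by 1 (ignoring multithreading).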
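The code in ClosureDemo$$anonfun$2.class is much the same as in ClosureDemo$$anonfun$1.class, except that it returns the sum of the value in the IntRef and the Int argument. Because ClosureDemo$$anonfun$1 and ClosureDemo$$anonfun$2 are constructed with the same IntRef, the corresponding inc and add operate on the same number when called from outside, just as if they were operating on the variable i inside func. This is how inc and add become closures over the outer variable i.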
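Putting the pieces together, the two generated classes behave roughly like the hand-written Scala below. This is a sketch, not the compiler's actual output: IntBox stands in for scala.runtime.IntRef, IncClosure and AddClosure stand in for ClosureDemo$$anonfun$1 and ClosureDemo$$anonfun$2, and the unused outer-instance argument is left out.

```scala
object DesugaredClosures {
  // Stand-in for scala.runtime.IntRef: a heap cell holding a mutable int.
  final class IntBox(var elem: Int)

  // Plays the role of ClosureDemo$$anonfun$1 (inc).
  final class IncClosure(i: IntBox) extends (() => Unit) {
    def apply(): Unit = i.elem = i.elem + 1
  }

  // Plays the role of ClosureDemo$$anonfun$2 (add).
  final class AddClosure(i: IntBox) extends (Int => Int) {
    def apply(ii: Int): Int = ii + i.elem
  }

  // Plays the role of ClosureDemo.func: one box, shared by both closures.
  def func(): (() => Unit, Int => Int) = {
    val i = new IntBox(2)
    (new IncClosure(i), new AddClosure(i))
  }

  def main(args: Array[String]): Unit = {
    val (inc, add) = func()
    inc()
    println(add(1)) // prints 4
  }
}
```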
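You may have noticed that the constructors of both closures receive the enclosing class instance, yet in this example it is never used, and it has an odd name: $outer. Let's modify the example slightly: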
class ClosureDemo {
  def func() = {
    def i = 2
    val j = 3
    var k = 4
    val add: Int => Int = (ii: Int) => ii + i + j + k
    k = k + 1
    add
  }
}
Compilation now produces two files:
ClosureDemo.class
ClosureDemo$$anonfun$1.class
As before, let's look at ClosureDemo.class first:
{
public scala.Function1<java.lang.Object, java.lang.Object> func();
descriptor: ()Lscala/Function1;
flags: ACC_PUBLIC
Code:
stack=5, locals=4, args_size=1
0: iconst_3
1: istore_1
2: iconst_4
3: invokestatic #16 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
6: astore_2
7: new #18 // class ClosureDemo$$anonfun$1
10: dup
11: aload_0
12: iload_1
13: aload_2
14: invokespecial #22 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;ILscala/runtime/IntRef;)V
17: astore_3
18: aload_2
19: aload_2
20: getfield #26 // Field scala/runtime/IntRef.elem:I
23: iconst_1
24: iadd
25: putfield #26 // Field scala/runtime/IntRef.elem:I
28: aload_3
29: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 30 0 this LClosureDemo;
2 27 1 j I
7 22 2 k Lscala/runtime/IntRef;
18 11 3 add Lscala/Function1;
LineNumberTable:
line 4: 0
line 5: 2
line 6: 7
line 7: 18
line 8: 28
Signature: #43 // ()Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;
public final int ClosureDemo$$i$1();
descriptor: ()I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: iconst_2
1: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 2 0 this LClosureDemo;
LineNumberTable:
line 3: 0
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #38 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo;
LineNumberTable:
line 10: 0
}
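Because we defined the local method i inside func, the compiler generated a method named ClosureDemo$$i$1 on the class. Let's first look at how the two variables val j and var k are handled:
1. Because j is declared with val, it is passed directly to the constructor of ClosureDemo$$anonfun$1 as an int.
2. Because k is declared with var, it is wrapped in an IntRef, which is passed to the constructor of ClosureDemo$$anonfun$1. Note that the later increment of k also goes through this IntRef wrapper.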
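The difference becomes visible when the closure is called after func has returned. Here is a small sketch using the modified ClosureDemo; the wrapper object CaptureByRefDemo is added only so that the snippet runs on its own.

```scala
object CaptureByRefDemo {
  class ClosureDemo {
    def func() = {
      def i = 2
      val j = 3
      var k = 4
      val add: Int => Int = (ii: Int) => ii + i + j + k
      k = k + 1 // runs after add was created, but add still sees the new value
      add
    }
  }

  def main(args: Array[String]): Unit = {
    val add = new ClosureDemo().func()
    println(add(0)) // prints 10, i.e. 0 + 2 + 3 + 5
  }
}
```

If k had been copied by value like j, the result would have been 9.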
Next, let's look at ClosureDemo$$anonfun$1.class:
{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final ClosureDemo $outer;
descriptor: LClosureDemo;
flags: ACC_PRIVATE, ACC_FINAL, ACC_SYNTHETIC
private final int j$1;
descriptor: I
flags: ACC_PRIVATE, ACC_FINAL
private final scala.runtime.IntRef k$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final int apply(int);
descriptor: (I)I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: iload_1
2: invokevirtual #27 // Method apply$mcII$sp:(I)I
5: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 6 0 this LClosureDemo$$anonfun$1;
0 6 1 ii I
LineNumberTable:
line 6: 0
public int apply$mcII$sp(int);
descriptor: (I)I
flags: ACC_PUBLIC
Code:
stack=2, locals=2, args_size=2
0: iload_1
1: aload_0
2: getfield #32 // Field $outer:LClosureDemo;
5: invokevirtual #36 // Method ClosureDemo.ClosureDemo$$i$1:()I
8: iadd
9: aload_0
10: getfield #38 // Field j$1:I
13: iadd
14: aload_0
15: getfield #40 // Field k$1:Lscala/runtime/IntRef;
18: getfield #45 // Field scala/runtime/IntRef.elem:I
21: iadd
22: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 this LClosureDemo$$anonfun$1;
0 23 1 ii I
LineNumberTable:
line 6: 0
public final java.lang.Object apply(java.lang.Object);
descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: aload_1
2: invokestatic #52 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
5: invokevirtual #54 // Method apply:(I)I
8: invokestatic #58 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
11: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 12 0 this LClosureDemo$$anonfun$1;
0 12 1 v1 Ljava/lang/Object;
LineNumberTable:
line 6: 0
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef);
descriptor: (LClosureDemo;ILscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=4, args_size=4
0: aload_1
1: ifnonnull 6
4: aconst_null
5: athrow
6: aload_0
7: aload_1
8: putfield #32 // Field $outer:LClosureDemo;
11: aload_0
12: iload_2
13: putfield #38 // Field j$1:I
16: aload_0
17: aload_3
18: putfield #40 // Field k$1:Lscala/runtime/IntRef;
21: aload_0
22: invokespecial #65 // Method scala/runtime/AbstractFunction1$mcII$sp."<init>":()V
25: return
LocalVariableTable:
Start Length Slot Name Signature
0 26 0 this LClosureDemo$$anonfun$1;
0 26 1 $outer LClosureDemo;
0 26 2 j$1 I
0 26 3 k$1 Lscala/runtime/IntRef;
LineNumberTable:
line 6: 0
StackMapTable: number_of_entries = 1
frame_type = 6 /* same */
}
From the code above we can see that it has four fields and four methods:
public static final long serialVersionUID=0L;
private final ClosureDemo $outer
private final int j$1;
private final scala.runtime.IntRef k$1
public final int apply(int)
public int apply$mcII$sp(int)
public final java.lang.Object apply(java.lang.Object)
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef)
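Again we start with the constructor. It first checks whether the first argument is null; if so it throws a NullPointerException, otherwise it stores the argument in the $outer field. It then stores j: Int and k: IntRef in the fields j$1 and k$1.
Because apply merely calls apply$mcII$sp(int), we go on to analyze apply$mcII$sp(int). It first calls the ClosureDemo$$i$1 method of ClosureDemo to get the value of i, then reads the int field j$1, then reads the elem value of the IntRef field k$1, adds them all to the argument ii, and returns the sum.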
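This example shows that:
1. When a closure calls a method of the enclosing scope, the enclosing class instance is stored in the closure's $outer field, and the method is called on $outer with invokevirtual.
2. When a closure uses an outer val, the value is simply copied into a field with a matching name and read directly from there.
3. When a closure uses an outer var, the variable is wrapped in a Ref object that is stored in a field: the matching primitive Ref class (such as IntRef) for a value type (AnyVal), or an ObjectRef for a reference type (AnyRef). The original value is read through the elem field.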
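The three rules can be sketched as hand-written Scala that mirrors the generated class. This is an approximation under stated assumptions, not the compiler's output: IntBox stands in for scala.runtime.IntRef, liftedI for ClosureDemo$$i$1, and AddClosure for ClosureDemo$$anonfun$1.

```scala
object OuterCaptureSketch {
  final class IntBox(var elem: Int) // stand-in for scala.runtime.IntRef

  class ClosureDemo {
    // Stand-in for the lifted local method ClosureDemo$$i$1.
    def liftedI: Int = 2

    def func(): Int => Int = {
      val j = 3
      val k = new IntBox(4)                // rule 3: the var lives in a box
      val add = new AddClosure(this, j, k) // rule 1: the enclosing instance is passed in
      k.elem = k.elem + 1
      add
    }
  }

  // Stand-in for ClosureDemo$$anonfun$1.
  final class AddClosure(outer: ClosureDemo, j: Int, k: IntBox) extends (Int => Int) {
    if (outer == null) throw new NullPointerException // mirrors the null check in <init>
    // rule 2: j is a plain copied value; outer and k are followed at call time
    def apply(ii: Int): Int = ii + outer.liftedI + j + k.elem
  }

  def main(args: Array[String]): Unit =
    println(new ClosureDemo().func()(1)) // prints 11
}
```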
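This post only covered closures with one level of nesting. Scala also allows closures nested many levels deep. The only difference is that each closure, when needed, stores its nearest enclosing object in its $outer field. If you are interested, construct an example yourself and look at its class files.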
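As a possible starting point for that experiment, here is a sketch of a two-level closure. The names Nested, base, and NestedClosureDemo are made up for this example. Compile it and run javap -v on the output; which classes are emitted and how they are named depends on the compiler version.

```scala
object NestedClosureDemo {
  class Nested {
    def base = 10

    def func(): Int => (Int => Int) = {
      var k = 1
      (a: Int) => (b: Int) => {
        k = k + 1        // var captured by the inner closure
        a + b + k + base // base needs the enclosing Nested instance
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val inner = new Nested().func()(1)
    println(inner(2)) // prints 15
    println(inner(2)) // prints 16
  }
}
```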