[smali] String/StringBuilder string concatenation

Related demo source code

Based on: macOS 10.13 / AS 3.3.2 / Android build-tools 28.0.0 / JDK 1.8

1. Background

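I have been reading smali for the past couple of days and happened to notice that the String concatenation in a log statement had been rewritten to use StringBuilder. The code is as follows: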

// MainActivity.java
public class MainActivity extends AppCompatActivity implements View.OnClickListener {
    private static final String TAG = "MainActivity";
    private void methodBoolean(boolean showLog) {
        Log.d(TAG, "methodBoolean: " + showLog);
    }
}
# Corresponding smali code
.method private methodBoolean(Z)V
 .locals 3
 .param p1, "showLog"    # Z

 .line 51
    const-string v0, "MainActivity" # value of the TAG constant
    new-instance v1, Ljava/lang/StringBuilder; # create a StringBuilder
    invoke-direct {v1}, Ljava/lang/StringBuilder;-><init>()V

    # string literal that forms the first part of the Log msg argument
    const-string v2, "methodBoolean: "

    # concatenate, then store the resulting String in register v1
    invoke-virtual {v1, v2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    invoke-virtual {v1, p1}, Ljava/lang/StringBuilder;->append(Z)Ljava/lang/StringBuilder;
    invoke-virtual {v1}, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;
    move-result-object v1

    # call the Log method to print the log
    invoke-static {v0, v1}, Landroid/util/Log;->d(Ljava/lang/String;Ljava/lang/String;)I
 .line 52
    return-void
.end method

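This reminded me of the deeply ingrained saying that "StringBuilder performs better than String when concatenating large numbers of strings". I became curious whether that is really true and whether it holds in every scenario, so I decided to look into it. For simplicity, the source code is written in Java rather than Kotlin.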
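As a reading aid, the smali listing in this section can be mirrored by hand in Java. This is only a sketch: the class name `LogConcatSketch` and both method names are hypothetical, and the `Log.d` call is left out so the snippet runs on a plain JVM.

```java
public class LogConcatSketch {
    // Hand-written equivalent of the instructions the compiler emitted
    // for "methodBoolean: " + showLog in the smali listing.
    static String viaBuilder(boolean showLog) {
        StringBuilder sb = new StringBuilder(); // new-instance + invoke-direct <init>
        sb.append("methodBoolean: ");           // append(Ljava/lang/String;)
        sb.append(showLog);                     // append(Z), the boolean overload
        return sb.toString();                   // toString() + move-result-object
    }

    // The original source form, using the + operator.
    static String viaPlus(boolean showLog) {
        return "methodBoolean: " + showLog;
    }

    public static void main(String[] args) {
        System.out.println(viaBuilder(true));
        System.out.println(viaPlus(true).equals(viaBuilder(true)));
    }
}
```

Both methods build the same text; the only difference is whether the programmer or the compiler writes the StringBuilder calls.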

2. Test

Since concatenation is rewritten to StringBuilder under the hood, is there still an efficiency gap? Let's test it:

public class MainActivity extends AppCompatActivity implements View.OnClickListener {
    /**
     * String loop concatenation test
     *
     * @param loop number of loop iterations
     * @param base string to append
     * @return elapsed time in ms
     */
    private long methodForStr(int loop, String base) {
        long startTs = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < loop; i++) {
            result += base;
        }
        return System.currentTimeMillis() - startTs;
    }

    /** StringBuilder loop concatenation test */
    @Keep
    private long methodForSb(int loop, String base) {
        long startTs = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < loop; i++) {
            sb.append(base);
        }
        String result = sb.toString();
        return System.currentTimeMillis() - startTs;
    }
}

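On a Samsung S8+, concatenating the string "smali" 5000 times in a loop took roughly 460 ms with String versus 1 ms with StringBuilder, a clear gap in efficiency.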
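The same comparison can be tried on a desktop JVM. The following is a rough sketch, not the demo's code: the class `ConcatBenchSketch` and its method names are assumptions, the warm-up loop is a crude stand-in for a proper benchmark harness, and absolute timings will differ from the device numbers above.

```java
public class ConcatBenchSketch {
    // Loop concatenation with the += operator on String.
    static String concatWithString(int loop, String base) {
        String result = "";
        for (int i = 0; i < loop; i++) {
            result += base;
        }
        return result;
    }

    // Loop concatenation with a single StringBuilder.
    static String concatWithBuilder(int loop, String base) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < loop; i++) {
            sb.append(base);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int loop = 5000;
        String base = "smali";

        // Crude warm-up so the JIT has seen both methods before timing.
        for (int i = 0; i < 5; i++) {
            concatWithString(loop, base);
            concatWithBuilder(loop, base);
        }

        long t0 = System.nanoTime();
        String a = concatWithString(loop, base);
        long t1 = System.nanoTime();
        String b = concatWithBuilder(loop, base);
        long t2 = System.nanoTime();

        System.out.println("String +=     : " + (t1 - t0) / 1_000 + " us");
        System.out.println("StringBuilder : " + (t2 - t1) / 1_000 + " us");
        System.out.println("same result   : " + a.equals(b));
    }
}
```

Returning the built string, rather than only the elapsed time, keeps the result in use so the work is less likely to be optimized away.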

3. Analysis of the smali loop concatenation code

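Since String concatenation is converted to StringBuilder, the gap should in theory be small, yet in practice it is large. I guessed this had something to do with the for loop, so let's look at the smali code of the methodForStr(int loop, String base) method: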

.method private methodForStr(ILjava/lang/String;)J
 .locals 5
 .param p1, "loop"    # I, the parameter loop
 .param p2, "base"    # Ljava/lang/String;

 .line 73
    invoke-static {}, Ljava/lang/System;->currentTimeMillis()J # get the timestamp at loop start

    move-result-wide v0

 .line 74
 .local v0, "startTs":J # v0 holds the local variable startTs, of type long
    const-string v2, ""

 .line 75
 .local v2, "result":Ljava/lang/String; # v2 holds the local variable result
    const/4 v3, 0x0 # initialize the for-loop variable i

 .local v3, "i":I
    :goto_0  # start of the for loop
    if-ge v3, p1, :cond_0  # if i >= loop, jump to label cond_0 and exit the loop; otherwise continue with the code below

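    # The for-loop body follows:
    # 1. create a StringBuilder object
    # 2. append result and base, then get the concatenated result via toString()
    # 3. assign the result back to the variable result
    # 4. move on to the next iteration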
 .line 76
    new-instance v4, Ljava/lang/StringBuilder;
    invoke-direct {v4}, Ljava/lang/StringBuilder;-><init>()V

    invoke-virtual {v4, v2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    invoke-virtual {v4, p2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    invoke-virtual {v4}, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;

    move-result-object v2

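    # increment the loop variable i by 1, then start the next iteration
 .line 75
    add-int/lit8 v3, v3, 0x1 # add 0x1 to the value in the second register (v3) and store it in the first register (v3), i.e. i++

    goto :goto_0 # jump to label goto_0, i.e. re-evaluate the loop condition and run the loop body

 .line 78
 .end local v3    # "i":I
    :cond_0 # define label cond_0

    # after the loop ends, get the current timestamp and compute the elapsed time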
    invoke-static {}, Ljava/lang/System;->currentTimeMillis()J
    move-result-wide v3
    sub-long/2addr v3, v0

    return-wide v3
.end method

From the smali code above, we can work backwards to source code equivalent to:

private long methodForStr(int loop, String base) {
    long startTs = System.currentTimeMillis();
    String result = "";
    for (int i = 0; i < loop; i++) {
        // In every iteration the String concatenation is rewritten to a new StringBuilder
        // Does this count as a negative optimization?
        StringBuilder sb = new StringBuilder();
        sb.append(result);
        sb.append(base);
        result = sb.toString();
    }
    return System.currentTimeMillis() - startTs;
}

4. Source code analysis

4.1 String.java

/*
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 * Because String objects are immutable they can be shared.
 */
public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    // A String is really a char array too; because it is declared private final it cannot be changed (other measures also help guarantee immutability)
    private final char value[];
}

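The class comment describes String as immutable. Every literal is an object; modifying a string does not change the original memory in place, but points the variable at a new object: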

String str = "a"; // String object "a"
str = "a" + "a"; // String object "aa"

Each time the + operator is evaluated at run time, a new String object is produced:

[Figure: String append]
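The rebinding behaviour described here can be observed with two references to the same String. This is a minimal sketch; the class `ImmutableStringSketch` and the method `rebindDemo` are hypothetical names used only for illustration.

```java
public class ImmutableStringSketch {
    // Returns {alias, original} after appending to `original`.
    static String[] rebindDemo() {
        String original = "a";
        String alias = original; // second reference to the same String object
        original += "a";         // builds a new String and rebinds `original`
        return new String[] { alias, original };
    }

    public static void main(String[] args) {
        String[] r = rebindDemo();
        System.out.println(r[0]);         // a  (the first object was never modified)
        System.out.println(r[1]);         // aa (a different, newly created object)
        System.out.println(r[0] == r[1]); // false
    }
}
```

The alias still sees "a" because += never touches the existing object; it only changes which object the variable refers to.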

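// Combined with the smali analysis in section 3, we can see that:
// each iteration of the for loop creates a `StringBuilder` object and a `String` object holding the concatenated result;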
private long methodForStr(int loop, String base) {
    long startTs = System.currentTimeMillis();
    String result = "";
    for (int i = 0; i < loop; i++) {
        result += base;
    }
    return System.currentTimeMillis() - startTs;
}

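Frequently creating objects in the loop body also leaves a large number of discarded objects behind, which triggers GC; the frequent stop-the-world pauses naturally lengthen the concatenation time as well, as shown in the figure below: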

[Figure: GC during String concatenation]
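Besides the garbage, the copying itself grows quadratically. The sketch below is a simplified cost model, not a measurement: it counts only the characters appended into the temporary StringBuilders of the loop reverse-derived in section 3, and ignores the extra copy made by toString() and any internal array expansion. The class and method names are hypothetical.

```java
public class CopyCostSketch {
    // Characters appended into temporary builders by `result += base` in a loop.
    static long charsCopiedByStringLoop(int loop, int baseLen) {
        long total = 0;
        long resultLen = 0;
        for (int i = 0; i < loop; i++) {
            total += resultLen + baseLen; // sb.append(result); sb.append(base);
            resultLen += baseLen;         // result = sb.toString();
        }
        return total;
    }

    // Characters appended when one StringBuilder is reused for the whole loop.
    static long charsAppendedByBuilder(int loop, int baseLen) {
        return (long) loop * baseLen;
    }

    public static void main(String[] args) {
        int loop = 5000;
        int baseLen = "smali".length();
        System.out.println("String loop   : " + charsCopiedByStringLoop(loop, baseLen));
        System.out.println("StringBuilder : " + charsAppendedByBuilder(loop, baseLen));
    }
}
```

Under this model the String loop copies baseLen * loop * (loop + 1) / 2 characters, which for 5000 iterations of a 5-character string is 62,512,500, against 25,000 for the single builder.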

4.2 StringBuilder.java

/**
 * A mutable sequence of characters. This class provides an API compatible
 * with {@code StringBuffer}, but with no guarantee of synchronization.
 */
public final class StringBuilder extends AbstractStringBuilder implements java.io.Serializable, CharSequence{}

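// The StringBuilder class comment says it is a mutable sequence of characters; the core logic actually lives in AbstractStringBuilder
// Let's see how stringBuilder.append("str") is implemented
abstract class AbstractStringBuilder implements Appendable, CharSequence {
    char[] value; // stores the character sequence of the string
    int count; // number of characters stored so far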

    AbstractStringBuilder() {
    }

    // Supplying a reasonable initial capacity helps reduce the number of expansions and improves efficiency
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }

    @Override
    public AbstractStringBuilder append(CharSequence s) {
        if (s == null)
            return appendNull();
        if (s instanceof String)
            return this.append((String)s);
        if (s instanceof AbstractStringBuilder)
            return this.append((AbstractStringBuilder)s);

        return this.append(s, 0, s.length());
    }

    public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len); // make sure the value array has room for all characters of str
        str.getChars(0, len, value, count); // copy all characters of str to the end of the value array
        count += len;
        return this;
    }

    // If the current value array is too small, expand automatically: create a new array and copy the existing data
    private void ensureCapacityInternal(int minimumCapacity) {
        if (minimumCapacity - value.length > 0) {
            value = Arrays.copyOf(value,
                    newCapacity(minimumCapacity));
        }
    }
}

// String.java
public final class String {
    // Copy the characters in the given range of this string into array dst, starting at index dstBegin
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        // some checks omitted
        getCharsNoCheck(srcBegin, srcEnd, dst, dstBegin);
    }

    @FastNative
    native void getCharsNoCheck(int start, int end, char[] buffer, int index);
}

The source above shows that each time StringBuilder appends a string, it operates on the same char[] array (when no expansion is needed) and creates no new objects.

[Figure: StringBuilder array operations]
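How rarely the backing array is replaced can be checked through the public `StringBuilder.capacity()` method. The following sketch counts capacity changes while appending; the class `BuilderGrowthSketch` and the method `countExpansions` are hypothetical, and the exact count depends on the growth policy of the runtime in use.

```java
public class BuilderGrowthSketch {
    // Counts how many times the builder's capacity changes during the appends.
    static int countExpansions(StringBuilder sb, int loop, String base) {
        int expansions = 0;
        int lastCapacity = sb.capacity();
        for (int i = 0; i < loop; i++) {
            sb.append(base);
            if (sb.capacity() != lastCapacity) { // backing array was replaced
                expansions++;
                lastCapacity = sb.capacity();
            }
        }
        return expansions;
    }

    public static void main(String[] args) {
        int loop = 5000;
        String base = "smali";
        // Default initial capacity: the array grows as needed.
        System.out.println("default capacity : "
                + countExpansions(new StringBuilder(), loop, base));
        // Pre-sized to the final length: no growth should be needed.
        System.out.println("pre-sized        : "
                + countExpansions(new StringBuilder(loop * base.length()), loop, base));
    }
}
```

With the default constructor only a handful of the 5000 appends allocate a new array, and sizing the builder up front removes even those, which matches the comment on the capacity constructor in the source above.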

5. Should StringBuilder be the first choice for every string concatenation scenario?

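Not necessarily. For example, when the parts are compile-time constants, plain String is fine; if you use StringBuilder anyway, AS suggests changing it to String, since the builder would only be wasted work.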

For concatenation outside of loops, it makes little difference whether the source uses String or StringBuilder, since both end up as StringBuilder in the bytecode.

[Figure: IDE suggestion to replace StringBuilder with String]

// Compile-time constant test
    private String methodFixStr() {
        return "a" + "a" + "a" + "a" + "a" + "a";
    }

    private String methodFixSb() {
        StringBuilder sb = new StringBuilder();
        sb.append("a");
        sb.append("a");
        sb.append("a");
        sb.append("a");
        sb.append("a");
        return sb.toString();
    }

Corresponding smali code:

.method private methodFixStr()Ljava/lang/String;
 .locals 1

 .line 100
    const-string v0, "aaaaaa" # the compiler folded this directly into the final result

    return-object v0
.end method

# The StringBuilder version is not optimized; it still concatenates step by step
# This is probably why the IDE suggests using String
.method private methodFixSb()Ljava/lang/String;
 .locals 2

 .line 108
    new-instance v0, Ljava/lang/StringBuilder;
    invoke-direct {v0}, Ljava/lang/StringBuilder;-><init>()V

 .line 109
 .local v0, "sb":Ljava/lang/StringBuilder;
    const-string v1, "a"

    invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

 .line 110
    const-string v1, "a"
    invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

 .line 111
    const-string v1, "a"
    invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

 .line 112
    const-string v1, "a"
    invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

 .line 113
    const-string v1, "a"
    invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

 .line 114
    invoke-virtual {v0}, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;

    move-result-object v1
    return-object v1
.end method
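The folding seen in methodFixStr can also be observed from plain Java without reading smali, because compile-time constant strings are interned and so share one object. The sketch below relies on that language rule; the class `ConstantFoldingSketch` and its method names are hypothetical, and the == comparisons are deliberate identity checks, not something to copy into normal code.

```java
public class ConstantFoldingSketch {
    // Literal + literal is a compile-time constant, so it is the same
    // interned object as the literal "aaa".
    static boolean literalsFolded() {
        String folded = "a" + "a" + "a";
        return folded == "aaa";
    }

    // A final local initialized with a literal is also a constant.
    static boolean finalLocalFolded() {
        final String part = "a";
        return (part + "a") == "aa";
    }

    // A method parameter is not a constant: the concatenation happens at
    // run time and yields a new object with equal content.
    static boolean runtimeConcatIsNewObject(String part) {
        String joined = part + "a";
        return joined != "aa" && joined.equals("aa");
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded());
        System.out.println(finalLocalFolded());
        System.out.println(runtimeConcatIsNewObject("a"));
    }
}
```

Only the third case does any work at run time, which is the situation where the choice between + and an explicit StringBuilder can matter.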