Solving QA's Refactor-Comparison Testing Problem with a Two-Dimensional Concurrent Puppeteer Queue

Date: 2019-11-24

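So, over the weekend I was at home browsing Juejin with my sleeping baby in my arms when I came across an article introducing puppeteer. It made me think of the Client-side refactoring we have been working on lately, and I felt I could finally do something about it...

This large-scale refactoring replaces the hard-coded authorization of the old version with resource-based permissions across the board, and the resources stored in the database were all compiled by people reading and judging the old code. (Thanks to our earlier Hive architecture, each refactored module has its own owner, who is the natural person to collect its resources.)

But the human eye is never fully reliable. QA has to visually compare resources between the old and new UIs for 30+ clients, about 8 roles each on average, across the 20 features refactored in the first phase. A rough time estimate, assuming 5 minutes of comparison per feature: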

30 (clients) * 8 (roles) * 2 (old and new accounts) * 20 (features) * 5 (minutes) = 48000 (minutes)

That is 800 hours, or 33 days for one person without eating, drinking, or sleeping. Pretty staggering.
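The estimate can be reproduced with a few lines of TypeScript. This is only a restatement of the figures quoted above; the variable names are made up for illustration.

```typescript
// Rough manual-testing estimate, using the figures quoted in the text.
const clients = 30;
const rolesPerClient = 8;
const accountsPerRole = 2; // one old-version and one new-version account
const features = 20;
const minutesPerFeature = 5;

const totalMinutes =
  clients * rolesPerClient * accountsPerRole * features * minutesPerFeature;
const totalHours = totalMinutes / 60;
const totalDays = totalHours / 24; // working around the clock

console.log(totalMinutes);          // 48000
console.log(totalHours);            // 800
console.log(Math.floor(totalDays)); // 33
```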

Then my mind started to wander: surely I could use puppeteer to do something for our QA colleagues.

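What is Puppeteer?

Puppeteer is a Node library that provides a high-level API to control Chromium or Chrome over the DevTools Protocol. Puppeteer runs headless by default, but it can be configured to run in "headful" mode.

It is built by Google specifically for Chrome, and it is implemented in Node.js. Exciting just to think about, isn't it?

What can Puppeteer do?

Most things that you can do manually in the browser can be done with Puppeteer! Here are a few examples:

Generate PDFs of pages.

Crawl a SPA (single-page application) and generate pre-rendered content (i.e. "SSR", server-side rendering).

Automate form submission, UI testing, keyboard input, and so on.

Create an up-to-date automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.

Capture a timeline trace of your site to help diagnose performance issues.

Test browser extensions.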

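My idea

Building on Hive's sub-project management, wrap up an automated testing framework that works somewhat like a distributed system.

The primary goal is to solve the resource comparison problem;
The framework automatically logs in to the old and new accounts for each sub-project and navigates to the sub-project's page;
Control is then handed out to each sub-project's owner, who implements their own resource collection and screenshots;
The framework manages the resource data for each feature-client-role combination;
It generates test reports and screenshots at the location corresponding to each feature-client-role combination. The report in particular contains the old-versus-new resource comparison, marks mismatches in red, and calculates a match percentage;
It provides some APIs for sub-projects to use, for example:
- Screenshots: puppeteer's native screenshot API requires you to specify the output location by hand. Mine has the location preset and is greatly simplified: api.screenshot(name) is enough to write a screenshot file named name_new or name_old to the right place;
- Adding resources: when control is handed to a sub-project, it can use puppeteer to grab the relevant resource elements on the page and call the add-resource API to register them, which ultimately feeds the old-versus-new comparison report;
- Some other simplified APIs.
Like a distributed system, the framework calls each sub-project's implementation in turn and finally generates a detailed report and a summary report.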
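As a rough illustration of the simplified screenshot API described above, the sketch below derives the output path from the feature, client, role, and version, so that a sub-project only has to pass a name. The directory layout, the .png extension, the ScreenshotContext type, and the screenshotPath helper are all hypothetical; the article only states that api.screenshot(name) produces name_new or name_old at a preset location.

```typescript
// Hypothetical context the framework would fill in before handing control to a sub-project.
interface ScreenshotContext {
  outputRoot: string;     // assumed root directory for reports
  subProjectName: string; // feature
  clientCode: string;     // client
  clientDevRole: string;  // role
  newVersion: boolean;    // true for the refactored UI, false for the old one
}

// Build <root>/<feature>/<client>/<role>/<name>_new.png (or <name>_old.png).
function screenshotPath(ctx: ScreenshotContext, name: string): string {
  const suffix = ctx.newVersion ? "new" : "old";
  return [
    ctx.outputRoot,
    ctx.subProjectName,
    ctx.clientCode,
    ctx.clientDevRole,
    `${name}_${suffix}.png`,
  ].join("/");
}
```

With puppeteer, the resulting string could then be passed as the path option of page.screenshot.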

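Puppeteer installation guide

async/await, which Puppeteer code relies on heavily, requires Node v7.6.0 or later; installing the latest stable version is recommended.

Installing Puppeteer automatically downloads a recent Chromium (~71 MB on Mac, ~90 MB on Linux, ~110 MB on Windows), a step that is very unfriendly to networks in mainland China.

You can run the following command before installing to skip the automatic Chromium download: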

npm config set puppeteer_skip_chromium_download true

Install puppeteer together with its TypeScript typings.

npm install puppeteer @types/puppeteer --save-dev

Usage

const browser = await puppeteer.launch({
    executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
    headless: false, // headless mode; false shows the browser window
    timeout: 0, // launch timeout in ms; 0 disables it
    // devtools: true, // open the DevTools panel automatically
    defaultViewport: defaultViewport, // default viewport size
    args: ['--start-maximized'], // extra browser arguments [start maximized]
  });

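If you skipped the Chromium download, you need to add the executablePath option and point it at your local Chrome executable.

The headless option defaults to true and controls whether headless mode is used (running the tests without showing a browser UI).

Let's get to know the Puppeteer API

With the browser instance launched above, create a new tab: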

const page = await browser.newPage();

Most of the time we work with the page object, for example:

Navigate to a page

await page.goto(`${context}/Account/Login`, {
    waitUntil: "networkidle0"
  });

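The second argument is optional. Taking waitUntil as an example, here it means: await until there are no more network requests before running the code that follows.

It accepts four values:

load - when the page's load event fires
domcontentloaded - when the page's DOMContentLoaded event fires
networkidle0 - when there have been no network connections for at least 500 ms
networkidle2 - when there have been no more than 2 network connections for at least 500 ms

On the login page we need to work with the form and type in some content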

await page.type('#userName', account.username);
  await page.type('#password', account.password);
  await page.click('button[type=submit]');

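The type method types text: the first argument is an element selector (an input element) and the second is the text to enter.

The click method clicks the element matched by the selector (besides click, there are also hover, focus, key presses, and so on).

puppeteer has a few characteristics:

1. All operations are asynchronous and need to be called with async/await. That is because it is built on the Chrome DevTools Protocol, and all communication with Chrome happens through asynchronous messages.

2. A large part of the API relies on selectors, so you need to know CSS selectors;

3. Some capabilities are not obvious from the documentation, for example how to open an incognito session or how to clear cookies.

Once the form is filled in and submitted, we want to jump to the sub-project page, but first we need to wait for the login to complete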

await page.waitForNavigation({
    waitUntil: "domcontentloaded"
  });

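Here we meet a second waitUntil value, domcontentloaded, which corresponds to the document's DOMContentLoaded event. Anyone familiar with front-end development will know how it differs from the load event, so I won't go into detail; it fires considerably earlier than load. One caveat: awaiting click first and only then calling waitForNavigation can race with a fast navigation, and the Puppeteer documentation recommends starting both together with Promise.all.

Getting page elements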

const input = await page.$('input.form-input');
  const buttons = await page.$$('button');

page.$ can be thought of as the familiar document.querySelector, and page.$$ as document.querySelectorAll.

// Get viewport information
  const dimensions = await page.evaluate(() => {
      return {
          width: document.documentElement.clientWidth,
          height: document.documentElement.clientHeight,
          deviceScaleFactor: window.devicePixelRatio
      };
  });
  const value = await page.$eval('input[name=search]', input => input.value);

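page.evaluate runs a script in the browser context; additional arguments, including element handles, can be passed after the function. page.$eval runs a function against a single selected DOM element.

Another interesting feature is page.exposeFunction, which exposes a function to the page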

const puppeteer = require('puppeteer');
  const crypto = require('crypto');
  
  puppeteer.launch().then(async browser => {
    const page = await browser.newPage();
    page.on('console', msg => console.log(msg.text())); // text is a method on ConsoleMessage
    await page.exposeFunction('md5', text =>
      crypto.createHash('md5').update(text).digest('hex')
    );
    await page.evaluate(async () => {
      // use window.md5 to compute hashes
      const myString = 'PUPPETEER';
      const myHash = await window.md5(myString);
      console.log(`md5 of ${myString} is ${myHash}`);
    });
    await browser.close();
  });


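The example above shows how to expose an md5 method on the window object.

Node.js has a wealth of packages that make complex functionality easy. An MD5 hash function, for instance, is awkward to implement in plain browser JavaScript but takes only a few lines in Node.js.

Designing our framework

Here are the steps

Prepare a list of accounts (covering both the old and new versions) and load it;
Use lerna ls to get the list of all sub-projects, then look for each one's xxxx/autotest/index.ts implementation file;
Loop over the accounts: log in with each one, then loop over all the sub-projects, navigate to each in turn, and hand control to the sub-project together with the prepared page and api objects.
- The sub-project collects resources and takes screenshots
When that is done, sort all the collected resources by feature-client-role location and generate old-versus-new comparison reports
Finally, generate the summary report

The steps really are that simple.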
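The sub-project discovery step could look roughly like the sketch below. It assumes the output of lerna ls --json (an array of objects that include name and location) has already been captured as a string. findAutotestEntries and the injected exists callback are hypothetical names, not part of the article's framework.

```typescript
// One entry in the output of `lerna ls --json` (only the fields used here).
interface LernaPackage {
  name: string;
  location: string;
}

interface AutotestEntry {
  name: string;
  entry: string; // assumed path: <package location>/autotest/index.ts
}

// Keep only the sub-projects that actually ship an autotest/index.ts file.
// `exists` is injected so the function works with fs.existsSync or with a fake.
function findAutotestEntries(
  lernaJson: string,
  exists: (file: string) => boolean
): AutotestEntry[] {
  const packages: LernaPackage[] = JSON.parse(lernaJson);
  return packages
    .map(pkg => ({ name: pkg.name, entry: `${pkg.location}/autotest/index.ts` }))
    .filter(item => exists(item.entry));
}
```

In a Node script, exists could simply be fs.existsSync.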

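First, screenshots of the final reports

Detailed report

Summary report

Old and new resources are compared side by side. Every difference is marked in red, and each side's resource count and the match percentage are shown, much like the test report Jest generates for unit tests.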
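The comparison behind such a report can be sketched as a set difference. The article does not say how its match percentage is defined, so the formula below (matched resources divided by the union of both sides) is an assumption, as are the ResourceDiff type and the compareResources name.

```typescript
// Result of comparing the resources collected from the old and new versions.
interface ResourceDiff {
  matched: string[];
  onlyInOld: string[]; // would be marked red on the old side
  onlyInNew: string[]; // would be marked red on the new side
  oldTotal: number;
  newTotal: number;
  matchPercent: number; // assumed definition: matched / union, rounded
}

function compareResources(oldResources: string[], newResources: string[]): ResourceDiff {
  const oldSet = new Set(oldResources);
  const newSet = new Set(newResources);
  const matched = Array.from(oldSet).filter(r => newSet.has(r));
  const onlyInOld = Array.from(oldSet).filter(r => !newSet.has(r));
  const onlyInNew = Array.from(newSet).filter(r => !oldSet.has(r));
  const unionSize = matched.length + onlyInOld.length + onlyInNew.length;
  // Two empty sides are treated as a full match.
  const matchPercent = unionSize === 0 ? 100 : Math.round((matched.length / unionSize) * 100);
  return {
    matched,
    onlyInOld,
    onlyInNew,
    oldTotal: oldSet.size,
    newTotal: newSet.size,
    matchPercent,
  };
}

const diff = compareResources(["view", "edit", "delete"], ["view", "edit", "export"]);
console.log(diff.onlyInOld, diff.onlyInNew, diff.matchPercent); // [ 'delete' ] [ 'export' ] 50
```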

Now the main implementation code

const doAutoTest = async (page: puppeteer.Page, resource: IAllProjectResource, account: IAccount, projectAutoTestList: IAutoTest[], newVersion: boolean) => {
    await login(page, account);
    const clientCode = await getCookie(page, 'ClientCode');
    const clientDevRole = await getCookie(page, 'ClientCurrentRole');
    for(const autotest of projectAutoTestList) {
      await page.goto(`${context}${autotest.url}`, {
        waitUntil: "domcontentloaded"
      });
      const api: IpuppeteerApi = {
        logger: logger,
        screenshot: screenshot({ page, newVersion, clientCode, clientDevRole, subProjectName: autotest.name }),
        addResource: addResource({ newVersion, clientCode, clientDevRole, subProjectName: autotest.name, projectResource: resource, username: account.username }),
        isExist: isExist({ page })
      };
      !newVersion ? await autotest.oldVersionTest(page, api) : await autotest.newVersionTest(page, api);
    }
  };

  (async () => {
    const browser = await launch();
    const projectAutoTestList = getSubProjectAutotests();
    const accounts = getAccounts();
    const resource = getResource();
    const page = await createPage();
    for(const accountGroup of accounts) {
      await doAutoTest(page, resource, accountGroup.oldVersionAccount, projectAutoTestList, false);
      await doAutoTest(page, resource, accountGroup.newVersionAccount, projectAutoTestList, true);
    }
    
    exportHtml(resource);
    await browser.close();
  })();

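Is that the end of it?

No, this isn't a clickbait title. As a tech geek who can't stop tinkering, I felt we could go a step further and push this to the limit!

Right now our automated test logs in one account at a time and tests one feature at a time. There is nothing wrong with that flow, and it is how a human would test, except that the machine needs no rest and has faster "hands"...

But,

Is the CPU maxed out?

Is the memory maxed out?

Is the machine delivering the value it should?

Not at all.

Yes, as you may have guessed: multitasking plus concurrency

At first I assumed that the pages we open in Chrome share one session, and a quick look through the official documentation turned up no API for creating a new session...

Even so, we could still log in with a single account and open several tabs to test several features at once!

That does work, but I wasn't ready to give up. After some googling I actually found a hidden API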

const { browserContextId } = await browser._connection.send('Target.createBrowserContext');
  _page = await browser._createPageInContext(browserContextId);
  _page.browserContextId = browserContextId;

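Sending the Target.createBrowserContext command (let's call it a command for now) creates a new browser context, which in browser terms is an incognito session. Calling browser._createPageInContext(browserContextId) then returns a page object inside that new incognito window! Both calls are underscore-prefixed internals; Puppeteer also offers a public equivalent, browser.createIncognitoBrowserContext() (renamed browser.createBrowserContext() in recent releases).

With this API I can create any number of session-isolated page objects!

And with it I can implement n*m two-dimensional concurrency (several accounts logged in concurrently, each running several feature tests concurrently)

First, the code needs a little reworking

Most of the code does not need to change; just treat it as tasks. What we need to add is the task scheduling logic.

Implement a TaskQueue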

// task.ts
export default class Task<T = any> {

  executor?: () => Promise<T>;

  constructor(executor?: () => Promise<T>) {
    this.executor = executor;
  }

  // Run the executor if one was supplied; subclasses may override this.
  async execute() {
    return this.executor && await this.executor();
  }
}

// taskQueue.ts
import Task from "./task";

export default class TaskQueue<T extends Task> {

  concurrence: number;
  queue: T[] = [];

  constructor(concurrence: number = 1) {
    this.concurrence = concurrence;
  }

  // Accept a single task or an array of tasks.
  addTask(task: T | T[]) {
    // Check the argument itself; the original tested a literal [] and was always true.
    if (Array.isArray(task)) {
      this.queue = [...this.queue, ...task as T[]];
    } else {
      this.queue.push(task as T);
    }
    return this;
  }

  // Take up to `concurrence` tasks, run them together, then recurse for the rest.
  async run() {
    const todos = this.queue.splice(0, this.concurrence);
    if (todos.length === 0) return;
    await Promise.all(todos.map(task => task.execute()));
    return await this.run();
  }
}

Next, implement an AccountTask class

import Task from "./task";
import puppeteer from 'puppeteer';
import ProjectTask from "./projectTask";
import TaskQueue from "./taskQueue";
import { IAccount, IAutoTest, getCookie, IpuppeteerApi, screenshot, addResource, isExist } from "../utils/api";
import { max_page_number, context } from "../utils/constant";
import logger from "../utils/logger";
import login from "../login";
import { createPage, pool } from "../browser";
import { getResource } from "../config/config";

const resource = getResource();

export default class AccountTask extends Task {
  
  account: IAccount;
  page: puppeteer.Page;
  projects: IAutoTest[];
  taskQueue: TaskQueue<ProjectTask>;
  newVersion: boolean;

  constructor(account: IAccount, projects: IAutoTest[], newVersion: boolean) {
    super();
    this.account = account;
    this.projects = projects;
    this.taskQueue = new TaskQueue(max_page_number);
    this.newVersion = newVersion;
    this.initTaskQueue();
  }

  initTaskQueue() {
    this.taskQueue.addTask(this.projects.map((autotest, index) => new ProjectTask(async () => {
      const page = await createPage(true, this.page);
      const clientCode = await getCookie(this.page, 'ClientCode');
      const clientDevRole = await getCookie(this.page, 'ClientCurrentRole');
      await page.goto(`${context}${autotest.url}`, {
        waitUntil: "domcontentloaded"
      });
      const api: IpuppeteerApi = {
        logger: logger,
        screenshot: screenshot({ page, newVersion: this.newVersion, clientCode, clientDevRole, subProjectName: autotest.name }),
        addResource: addResource({ newVersion: this.newVersion, clientCode, clientDevRole, subProjectName: autotest.name, projectResource: resource, username: this.account.username }),
        isExist: isExist({ page })
      };
      !this.newVersion ? await autotest.oldVersionTest(page, api) : await autotest.newVersionTest(page, api);
      await page.close();
    })));
  }
  
  async execute() {
    this.page = await createPage(true);
    await login(this.page, this.account);
    await this.taskQueue.run();
    await this.page.close();
  }

}

Then rework the main function

if (argv.mode === 'crazy') {
    const accountTaskQueue = new TaskQueue(max_isolation_number);
    for(const accountGroup of accounts) {
      accountTaskQueue.addTask([
        new AccountTask(accountGroup.oldVersionAccount, projectAutoTestList, false),
        new AccountTask(accountGroup.newVersionAccount, projectAutoTestList, true)
      ]);
    }
    await accountTaskQueue.run();
  } else {
    const page = await createPage();
    for(const accountGroup of accounts) {
      logger.info(`start old version test`);
      await doAutoTest(page, resource, accountGroup.oldVersionAccount, projectAutoTestList, false);
      logger.info(`start new version test`);
      await doAutoTest(page, resource, accountGroup.newVersionAccount, projectAutoTestList, true);
    }
  }

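Two constants are configured here to control the concurrency limits

export const max_isolation_number = 5; // max concurrent isolated sessions (accounts)

export const max_page_number = 5; // max concurrent sub-projects per account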

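If the mode is "crazy" (the adrenaline-shot mode), an accountTaskQueue is created and loaded with account login tasks. Each AccountTask initializes another queue internally, taskQueue, to handle its sub-project tasks.

I call it the adrenaline shot because it really is a bit scary to watch it run.

Then just start the account queue.

As expected, when accountTaskQueue starts, 5 incognito sessions launch at once, and the taskQueue inside each AccountTask in turn runs 5 sub-project tasks concurrently.

It is very fast: a full test run finishes at least 10 times faster than before!

On closer thought, though, there is still a small problem. Take a look at this queue control code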

async run() {
    const todos = this.queue.splice(0, this.concurrence);
    if (todos.length === 0) return;
    await Promise.all(todos.map(task => task.execute()));
    return await this.run();
  }

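The biggest problem with this code is await Promise.all. Each round takes up to the maximum concurrency (5 tasks) and runs them together, but I never considered topping the batch up dynamically. Instead it naively waits for all 5 to finish before taking the next 5.

That wastes both time and CPU capacity.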
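To make the waste concrete, here is a small simulation that compares the two scheduling strategies on made-up task durations. It only does arithmetic and runs no real tasks; batchMakespan, refillMakespan, and the sample durations are illustrative assumptions.

```typescript
// Total time when each batch must finish completely before the next one starts.
function batchMakespan(durations: number[], concurrency: number): number {
  const size = Math.max(1, concurrency);
  let total = 0;
  for (let i = 0; i < durations.length; i += size) {
    // A batch lasts as long as its slowest task.
    total += Math.max(...durations.slice(i, i + size));
  }
  return total;
}

// Total time when a freed slot immediately picks up the next pending task.
function refillMakespan(durations: number[], concurrency: number): number {
  // slots[i] is the time at which slot i becomes free.
  const slots: number[] = new Array(Math.max(1, concurrency)).fill(0);
  for (const d of durations) {
    const earliest = slots.indexOf(Math.min(...slots));
    slots[earliest] += d;
  }
  return Math.max(...slots);
}

const sample = [5, 1, 1, 1, 1, 1]; // one slow task followed by five quick ones
console.log(batchMakespan(sample, 2));  // 7
console.log(refillMakespan(sample, 2)); // 5
```

In the batch version the quick task that shares a batch with the slow one leaves its slot idle for four time units, which is exactly the idle time that dynamic top-up recovers.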

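But this is an asynchronous concurrent queue, and dynamic top-up is not trivial to get right.

After weighing it up for a while I decided not to reinvent the wheel, and used p-limit, a library by sindresorhus, to rework this part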

import pLimit, { Limit } from 'p-limit';

export default class TaskQueue<T extends Task> {

  concurrence: number;
  queue: T[] = [];
  limit: Limit;

  constructor(concurrence: number = 1) {
    this.concurrence = concurrence;
    this.limit = pLimit(concurrence);
  }

  addTask(task: T | T[]) {
    if (Array.isArray(task)) { // check the argument itself, not a literal []
      this.queue = [...this.queue, ...task as T[]];
    } else {
      this.queue.push(task as T);
    }
    return this;
  }

  async run() {
    return await Promise.all(this.queue.map(task => this.limit(async () => await task.execute())));
  }
}

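Its main job is to cap the queue's concurrency for us; in effect, the queueing itself has been handed over to it.

Running the code again: it starts with 5 windows, and as soon as one finishes and closes, a new one takes its place. That is exactly the result we wanted.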
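For readers curious what dynamic top-up amounts to, here is a minimal hand-rolled sketch of the same idea using a fixed number of workers that keep pulling from a shared list. It is not how p-limit is implemented, and runWithLimit is a hypothetical helper rather than part of the article's code.

```typescript
// Run tasks with at most `concurrency` in flight; a freed slot immediately
// starts the next pending task. Results keep the order of the input tasks.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  concurrency: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task nobody has claimed yet

  // Each worker keeps claiming and running tasks until none are left.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Because JavaScript runs these callbacks on a single thread, claiming an index with next++ needs no locking. In the TaskQueue above, the equivalent call would be runWithLimit(this.queue.map(task => () => task.execute()), this.concurrence).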

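Is there still room for optimization?

Yes! Every incognito session is closed when its task finishes and a fresh one is created for the next task, which also wastes resources and time.

What to do?

I thought of database connection pools

I can put these incognito page objects into a pool, take one out when needed, and put it back when done.

Change AccountTask's execute method as follows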

async execute() {
    this.page = await pool.getMainPage();
    await login(this.page, this.account);
    await this.taskQueue.run();
    pool.recycle(this.page);
  }

I ran it again and timed it: for 8 accounts and 2 features, the run time dropped from 70 seconds to 48 seconds!
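The pool object used in execute is not shown in the article. A minimal generic sketch of the idea might look like the following; the Pool class, its acquire method, and the create callback are assumptions for illustration, with recycle playing the role of pool.recycle and acquire the role of pool.getMainPage.

```typescript
// A tiny async object pool: reuse idle items, create new ones only on demand.
class Pool<T> {
  private idle: T[] = [];

  // `create` is whatever builds a fresh item, e.g. opening an isolated page.
  constructor(private readonly create: () => Promise<T>) {}

  // Hand out an idle item if there is one, otherwise create a new one.
  async acquire(): Promise<T> {
    const item = this.idle.pop();
    return item !== undefined ? item : this.create();
  }

  // Return an item so that a later acquire() can reuse it.
  recycle(item: T): void {
    this.idle.push(item);
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

A real pool of pages would probably also need to reset session state, such as cookies, before an item is reused for a different account.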

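At this point we have not only achieved the original goal but also provide two modes, normal and crazy.

Normal mode is for development and debugging; crazy mode is for real-world use.

Crazy mode in particular is the essence of this article: it implements a two-dimensional concurrent queue, something front-end engineers rarely come across in their day-to-day work.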