Close Reading: Function Memoization

Date: 2020-07-30

1 Introduction

Function memoization is an important concept. In essence, it trades space (cache storage) for time (skipping the computation).

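For pure functions without side effects, memoization is well worth using in the right scenarios. Let's follow the article at https://whatthefork.is/memoiz... to understand function memoization in depth.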

2 Overview

Suppose there is a weather function, getChanceOfRain, and every call takes 100ms to compute:

import { getChanceOfRain } from "magic-weather-calculator";
function showWeatherReport() {
  let result = getChanceOfRain(); // Let the magic happen
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // (!) Triggers the calculation

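This clearly wastes computing resources. Once the weather has been calculated, there is no need to calculate it again. What we want is for later calls to reuse the cached result of the previous call, which saves a lot of computation. So we can write a memoizedGetChanceOfRain function that caches the result: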

import { getChanceOfRain } from "magic-weather-calculator";
let isCalculated = false;
let lastResult;
// We added this function!
function memoizedGetChanceOfRain() {
  if (isCalculated) {
    // No need to calculate it again.
    return lastResult;
  }
  // Gotta calculate it for the first time.
  let result = getChanceOfRain();
  // Remember it for the next time.
  lastResult = result;
  isCalculated = true;
  return result;
}
function showWeatherReport() {
  // Use the memoized function instead of the original function.
  let result = memoizedGetChanceOfRain();
  console.log("The chance of rain tomorrow is:", result);
}

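Each call checks the cache first. If nothing is cached, it calls the original function and records the result. That way, when we call it many times, every call except the first returns immediately from the cache: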

showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // Uses the calculated result
showWeatherReport(); // Uses the calculated result
showWeatherReport(); // Uses the calculated result

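However, this approach breaks once the function takes parameters, because the cache does not take them into account. If getChanceOfRain were memoized the same way as before, a call for a different city would wrongly return the first city's result: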

function showWeatherReport(city) {
  let result = getChanceOfRain(city); // Pass the city
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // Uses the calculated answer

Because there can be many possible parameter values, there are three approaches:

1. Cache only the last result

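Caching only the last result uses the least storage and never returns a wrong result. The drawback is that the cache is invalidated as soon as the parameter changes: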

import { getChanceOfRain } from "magic-weather-calculator";
let lastCity;
let lastResult;
function memoizedGetChanceOfRain(city) {
  if (city === lastCity) {
    // Notice this check!
    // Same parameters, so we can reuse the last result.
    return lastResult;
  }
  // Either we're called for the first time,
  // or we're called with different parameters.
  // We have to perform the calculation.
  let result = getChanceOfRain(city);
  // Remember both the parameters and the result.
  lastCity = city;
  lastResult = result;
  return result;
}
function showWeatherReport(city) {
  // Pass the parameters to the memoized function.
  let result = memoizedGetChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("London"); // Uses the calculated result

In the extreme case, when the parameter alternates on every call, this is equivalent to having no cache at all:

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // (!) Triggers the calculation

2. Cache all results

The second approach is to cache every result, using a Map as the cache store:

// Remember the last result *for every city*.
let resultsPerCity = new Map();
function memoizedGetChanceOfRain(city) {
  if (resultsPerCity.has(city)) {
    // We already have a result for this city.
    return resultsPerCity.get(city);
  }
  // We're called for the first time for this city.
  let result = getChanceOfRain(city);
  // Remember the result for this city.
  resultsPerCity.set(city, result);
  return result;
}
function showWeatherReport(city) {
  // Pass the parameters to the memoized function.
  let result = memoizedGetChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("London"); // Uses the calculated result
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("Paris"); // (!) Triggers the calculation

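The downside is unbounded memory growth. When there are too many possible parameter values, the cache keeps growing without limit, and in the worst case it hits the browser's memory limit or crashes the page.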

3. Other caching strategies

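Between caching only the last item and caching everything, there are other options. For example, an LRU (least recently used) cache keeps only the most recently used entries and evicts the least recently used one when it is full. Another option is to use a WeakMap instead of a Map so that the browser can garbage-collect entries more easily (note that a WeakMap cannot use strings as keys, so this only helps when the arguments are objects).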
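To make the LRU idea concrete, here is a minimal sketch, not taken from the original article. The helper name memoizeLRU, its maxSize parameter, and the stand-in getChanceOfRain are all assumptions made for illustration. It relies on the fact that a Map iterates its keys in insertion order, so the first key is the least recently used one.

```javascript
// Hypothetical helper: memoize a one-argument function, keeping at most
// `maxSize` results and evicting the least recently used one.
function memoizeLRU(fn, maxSize) {
  // A Map iterates keys in insertion order, so the first key is
  // always the least recently used entry.
  const cache = new Map();
  return function memoizedFn(arg) {
    if (cache.has(arg)) {
      const cached = cache.get(arg);
      // Re-insert the entry to mark it as the most recently used.
      cache.delete(arg);
      cache.set(arg, cached);
      return cached;
    }
    const result = fn(arg);
    cache.set(arg, result);
    if (cache.size > maxSize) {
      // Evict the least recently used entry (the oldest key).
      const oldestKey = cache.keys().next().value;
      cache.delete(oldestKey);
    }
    return result;
  };
}

// Stand-in for the expensive getChanceOfRain (assumed for this sketch).
let calculations = 0;
function getChanceOfRain(city) {
  calculations += 1;
  return city.length / 10;
}

const memoizedGetChanceOfRain = memoizeLRU(getChanceOfRain, 2);
memoizedGetChanceOfRain("Tokyo"); // calculates
memoizedGetChanceOfRain("London"); // calculates
memoizedGetChanceOfRain("Tokyo"); // cached, Tokyo becomes most recent
memoizedGetChanceOfRain("Paris"); // calculates, evicts London
memoizedGetChanceOfRain("Tokyo"); // still cached
memoizedGetChanceOfRain("London"); // calculates again
console.log(calculations); // 4
```

With maxSize set to 1 this degrades to "cache only the last result", and with a very large maxSize it approaches "cache all results", so the bound is the knob that trades memory for hit rate.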

Finally, the article points out a pitfall of memoization: the function must be pure. Consider the following case:

// Inside the magical npm package
function getChanceOfRain() {
  // Show the input box!
  let city = prompt("Where do you live?");
  // ... calculation ...
}
// Our code
function showWeatherReport() {
  let result = getChanceOfRain();
  console.log("The chance of rain tomorrow is:", result);
}

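Here getChanceOfRain computes its result from data the user types in on each call, so a cached result would be wrong. The reason is that "part of the function's input comes from user input" is a side effect, and we must not memoize functions that have side effects.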

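This is sometimes the very reason to split a function up: extract the side-effect-free part of an impure function so that this part can be memoized: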

// If this function only calculates things,
// we would call it "pure".
// It is safe to memoize this function.
function getChanceOfRain(city) {
  // ... calculation ...
}
// This function is "impure" because
// it shows a prompt to the user.
function showWeatherReport() {
  // The prompt is now here
  let city = prompt("Where do you live?");
  let result = getChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}

Finally, we can abstract the memoization logic into a higher-order function:

function memoize(fn) {
  let isCalculated = false;
  let lastResult;
  return function memoizedFn() {
    // Return the generated function!
    if (isCalculated) {
      return lastResult;
    }
    let result = fn();
    lastResult = result;
    isCalculated = true;
    return result;
  };
}

This makes it easy to generate new memoized functions:

let memoizedGetChanceOfRain = memoize(getChanceOfRain);
let memoizedGetNextEarthquake = memoize(getNextEarthquake);
let memoizedGetCosmicRaysProbability = memoize(getCosmicRaysProbability);

isCalculated and lastResult are both stored in the closure created by memoize, so they cannot be accessed from outside.
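The following sketch illustrates the closure point: each call to memoize creates its own isCalculated and lastResult, so the generated functions do not share state. The counters and the stand-in implementations of getChanceOfRain and getNextEarthquake are assumptions added for this example.

```javascript
// memoize as in the article: caches the result of a zero-argument function.
function memoize(fn) {
  let isCalculated = false;
  let lastResult;
  return function memoizedFn() {
    if (isCalculated) return lastResult;
    lastResult = fn();
    isCalculated = true;
    return lastResult;
  };
}

// Hypothetical stand-ins that count how often they actually run.
let rainCalls = 0;
let quakeCalls = 0;
const getChanceOfRain = () => { rainCalls += 1; return 0.3; };
const getNextEarthquake = () => { quakeCalls += 1; return "unknown"; };

const memoizedGetChanceOfRain = memoize(getChanceOfRain);
const memoizedGetNextEarthquake = memoize(getNextEarthquake);

memoizedGetChanceOfRain();
memoizedGetChanceOfRain(); // served from its own closure
memoizedGetNextEarthquake(); // separate closure, so it still calculates

console.log(rainCalls, quakeCalls); // 1 1
console.log(typeof isCalculated); // "undefined": not visible out here
```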

3 Close Reading

Implementing memoization with a general-purpose higher-order function

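The examples in the original article are fairly simple and do not consider how to handle functions with multiple parameters. Let's analyze the source code of Lodash's memoize function: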

function memoize(func, resolver) {
  if (
    typeof func != "function" ||
    (resolver != null && typeof resolver != "function")
  ) {
    throw new TypeError(FUNC_ERROR_TEXT);
  }
  var memoized = function () {
    var args = arguments,
      key = resolver ? resolver.apply(this, args) : args[0],
      cache = memoized.cache;

    if (cache.has(key)) {
      return cache.get(key);
    }
    var result = func.apply(this, args);
    memoized.cache = cache.set(key, result) || cache;
    return result;
  };
  memoized.cache = new (memoize.Cache || MapCache)();
  return memoized;
}

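The original article mentions that caching strategies come in many forms. Lodash reduces the strategy to the choice of a cache key and leaves that key for the user to manage. Look at this line: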

key = resolver ? resolver.apply(this, args) : args[0];

In other words, the cache key defaults to the first argument of the call, and a resolver can be supplied to turn the arguments into a custom cache key.

The arguments are also passed through when the original function is executed: func.apply(this, args).
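The effect of the default key can be seen in a small sketch. To stay self-contained, it uses a simplified stand-in for the Lodash function (same key logic, but a plain Map as the cache and no argument validation), and the JSON.stringify resolver is an assumption chosen for illustration; it only suits arguments that serialize to distinct JSON strings.

```javascript
// Simplified stand-in for Lodash's memoize: same key logic, plain Map cache.
function memoize(func, resolver) {
  const memoized = function (...args) {
    const key = resolver ? resolver.apply(this, args) : args[0];
    const cache = memoized.cache;
    if (cache.has(key)) {
      return cache.get(key);
    }
    const result = func.apply(this, args);
    cache.set(key, result);
    return result;
  };
  memoized.cache = new Map();
  return memoized;
}

const add = (a, b) => a + b;

// Default key: only the first argument, so (1, 2) and (1, 5) collide.
const byFirstArg = memoize(add);
console.log(byFirstArg(1, 2)); // 3
console.log(byFirstArg(1, 5)); // 3 (stale: the key is just `1`)

// Hypothetical resolver: build the key from all arguments.
const byAllArgs = memoize(add, (...args) => JSON.stringify(args));
console.log(byAllArgs(1, 2)); // 3
console.log(byAllArgs(1, 5)); // 6
```

So for a function of several parameters, omitting the resolver silently returns results that belong to another call, which is why the key is the part users need to think about.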

Finally, the cache is not tied to the default Map-like implementation either. Users can set lodash.memoize.Cache themselves, for example to WeakMap:

_.memoize.Cache = WeakMap;
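What a WeakMap-backed cache implies can be sketched without Lodash. The helper memoizeWeak and the sum example below are hypothetical names introduced here. The sketch shows two consequences: cache hits depend on object identity, and primitive keys such as strings are rejected.

```javascript
// Hypothetical helper: memoize a function of one object argument with a
// WeakMap, so a cached result can be collected together with its key.
function memoizeWeak(fn) {
  const cache = new WeakMap();
  return function memoizedFn(obj) {
    if (cache.has(obj)) {
      return cache.get(obj);
    }
    const result = fn(obj);
    cache.set(obj, result);
    return result;
  };
}

let sumCalls = 0;
const sum = (list) => {
  sumCalls += 1;
  return list.reduce((a, b) => a + b, 0);
};
const memoizedSum = memoizeWeak(sum);

const numbers = [1, 2, 3];
memoizedSum(numbers); // calculated
memoizedSum(numbers); // cached: same array reference
memoizedSum([1, 2, 3]); // calculated: a different array object
console.log(sumCalls); // 2

// A WeakMap rejects primitive keys such as strings.
try {
  new WeakMap().set("Tokyo", 0.3);
} catch (error) {
  console.log(error instanceof TypeError); // true
}
```

This is why a WeakMap cache would not work for the city-name examples earlier: the keys there are strings.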

When caching is not a good fit

Caching is not a good fit in the following two cases:

Functions that are rarely executed.
Functions that are already fast to execute.

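A function that is rarely executed does not need a cache to improve its efficiency, and the cache would occupy memory for a long time. As for functions that are already fast, most simple computations are quick, so caching brings no noticeable speedup. If the result is large, the cache also consumes storage for nothing.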

This matters especially when object references are involved, as in the following example:

function addName(obj, name) {
  return {
    ...obj,
    name,
  };
}

Adding a key to obj is very fast by itself, but adding a cache brings two drawbacks:

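If obj is very large, the closure holds on to the entire obj structure, doubling the memory footprint.
If obj is modified by mutation, a plain memoized function still returns the previous result (because the object reference has not changed), which causes errors.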

Forcing a deep comparison of the objects would avoid this edge case, but performance would then drop significantly.
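The mutation problem can be reproduced in a few lines. This is a sketch under assumptions: memoizeLast is a hypothetical helper that reuses the last result while every argument is identical under ===, which is how a plain reference-based cache behaves.

```javascript
function addName(obj, name) {
  return { ...obj, name };
}

// Hypothetical memoized wrapper that reuses the last result while
// all arguments are identical (compared with ===).
function memoizeLast(fn) {
  let lastArgs = null;
  let lastResult;
  return function memoizedFn(...args) {
    const sameArgs =
      lastArgs !== null &&
      lastArgs.length === args.length &&
      args.every((arg, i) => arg === lastArgs[i]);
    if (sameArgs) return lastResult;
    lastResult = fn(...args);
    lastArgs = args;
    return lastResult;
  };
}

const memoizedAddName = memoizeLast(addName);
const user = { age: 20 };

const first = memoizedAddName(user, "Ann");
user.age = 21; // mutate in place: the reference stays the same
const second = memoizedAddName(user, "Ann");

console.log(second.age); // 20 (stale; a fresh call would give 21)
console.log(first === second); // true
```

Passing a new object instead, for example memoizedAddName({ ...user, age: 21 }, "Ann"), changes the reference and therefore misses the cache, which is the behavior a reference-based cache relies on.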

4 Summary

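Function memoization is very useful, but it does not fit every scenario. Do not go to the extreme of memoizing every function. Reserve it for functions that are expensive to compute, likely to be called repeatedly with the same inputs, and pure.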

Discussion thread: 精读《函数缓存》· Issue #261 · dt-fe/weekly

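If you would like to join the discussion, a new topic is posted every week, published on weekends or Mondays. 前端精读 (Frontend Close Reading): helping you filter for reliable content.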

Copyright notice: free to repost, non-commercial, no derivatives, attribution required (Creative Commons 3.0 license)