Flask Source Code Analysis: Contexts

Date: 2019-11-09

This is one article in a series analyzing the Flask source code.

Contexts (application context and request context)

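Context has long been one of the harder concepts in computing to grasp. An answer to a Zhihu question explains it in plain terms:

Every piece of a program has many external variables. Only a simple function like Add has none. Once a piece of code depends on external variables, it is incomplete and cannot run on its own. To make it run, you have to supply a value for each of those external variables. The set of those values is called the context.
-- vzchapp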

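For example, in Flask a view function needs to know about the request it is handling (the requested URL, parameters, method, and so on) and about the application (for example the database initialized by the app) in order to run correctly.

The most direct approach is to wrap this information in an object and pass it to the view function as an argument. But then every view function has to declare the corresponding parameter, even if it never uses it.

Flask instead exposes this information as something resembling global variables; a view function that needs it can write from flask import request. Unlike ordinary globals, however, these objects must be dynamic: under multiple threads or coroutines, each thread or coroutine has to get its own object so that they do not interfere with one another.

How is this achieved? If you are familiar with Python multithreading, you will know a very similar concept, threading.local, which lets every thread that accesses a variable see only its own data. The principle is simple: conceptually, the object holds a dictionary that maps a thread id to that thread's data, and each read dynamically looks up the data for the current thread id. Flask's contexts are implemented in a similar way, as explained in detail below.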
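
As a quick illustration of the threading.local behavior described above, here is a minimal sketch using only the standard library. The names local_data, worker, and run_demo are made up for this example and are not part of Flask.

```python
import threading

# One shared object; each thread sees only the attributes it set itself.
local_data = threading.local()


def worker(value, results):
    local_data.value = value            # stored in this thread's private namespace
    results[value] = local_data.value   # read back what this thread stored


def run_demo():
    results = {}
    threads = [threading.Thread(target=worker, args=(i, results)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The main thread never set `value`, so it does not see the workers' data.
    return results, hasattr(local_data, "value")
```

Calling run_demo() returns each worker's own value plus a flag showing that the main thread has no value attribute at all, even though every thread used the same local_data object.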

Flask has two contexts: the application context and the request context. The context-related code lives in globals.py, which is very short:

def _lookup_req_object(name):
    top = _request_ctx_stack.top
    if top is None:
        raise RuntimeError(_request_ctx_err_msg)
    return getattr(top, name)


def _lookup_app_object(name):
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return getattr(top, name)


def _find_app():
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return top.app


# context locals
_request_ctx_stack = LocalStack()
_app_ctx_stack = LocalStack()
current_app = LocalProxy(_find_app)
request = LocalProxy(partial(_lookup_req_object, 'request'))
session = LocalProxy(partial(_lookup_req_object, 'session'))
g = LocalProxy(partial(_lookup_app_object, 'g'))

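Flask provides two contexts, the application context and the request context. The application context gives rise to two variables, current_app and g, while the request context gives rise to request and session.

The implementation relies on two things: LocalStack and LocalProxy. Together they let us fetch the contents of both contexts dynamically, so that in a concurrent program each view function sees its own context and nothing gets mixed up.

LocalStack and LocalProxy are both provided by werkzeug and defined in local.py. Before analyzing these two classes, we first introduce another basic class in that file, Local. Local provides an effect similar to threading.local: isolation of global variables across threads or coroutines. Here is its code: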

# since each thread has its own greenlet we can just use those as identifiers
# for the context.  If greenlets are not available we fall back to the
# current thread ident depending on where it is.
try:
    from greenlet import getcurrent as get_ident
except ImportError:
    try:
        from thread import get_ident
    except ImportError:
        from _thread import get_ident

class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
        # Data lives in __storage__; every later access operates on this attribute
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)

    def __call__(self, proxy):
        """Create a proxy for a name."""
        return LocalProxy(self, proxy)

    # Clear all data stored for the current thread/coroutine
    def __release_local__(self):
        self.__storage__.pop(self.__ident_func__(), None)

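    # The three methods below implement attribute access, assignment, and deletion.
    # Each one calls `self.__ident_func__` to get the id of the current thread or coroutine, then accesses the matching inner dictionary.
    # If the attribute being read or deleted does not exist, AttributeError is raised.
    # To the outside this looks like ordinary attribute access; the dictionary and the thread/coroutine switching stay hidden.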
    def __getattr__(self, name):
        try:
            return self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}

    def __delattr__(self, name):
        try:
            del self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

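As you can see, all data in a Local object is stored in the __storage__ attribute, which is a nested dictionary of the form map[ident]map[key]value. The key of the outer dictionary is the identity of the thread or coroutine, and its value is another dictionary holding the user-defined key-value pairs. When the user accesses an attribute of the instance, the access becomes a lookup in the inner dictionary, and the key of the outer dictionary is filled in automatically. __ident_func__ is greenlet's getcurrent for coroutines or get_ident for threads, and it returns the id of the thread or coroutine the current code is running in.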
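
To make the nested-dictionary layout concrete, the following is a stripped-down sketch, not werkzeug's actual class. MiniLocal, _storage, and storage_snapshot are hypothetical names, and threading.get_ident stands in for __ident_func__ (no greenlet support is assumed).

```python
import threading


class MiniLocal:
    """Simplified stand-in for werkzeug's Local (illustration only)."""

    def __init__(self):
        # Bypass our own __setattr__ so the storage dict lives on the instance.
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for user-defined names.
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Outer key: current thread id. Inner dict: the user's attributes.
        self._storage.setdefault(threading.get_ident(), {})[name] = value


def storage_snapshot():
    local = MiniLocal()
    barrier = threading.Barrier(2)

    def worker(user):
        local.user = user
        barrier.wait()  # keep both threads alive so their ids cannot be reused

    threads = [threading.Thread(target=worker, args=(u,)) for u in ("alice", "bob")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(local._storage)
```

storage_snapshot() returns a dictionary with two outer keys (one per thread id), each mapping to an inner dictionary that holds only that thread's user value.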

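Besides these basic operations, Local also implements __release_local__, which clears (destroys) the data (state) of the current thread or coroutine, and __call__, which creates a LocalProxy object; LocalProxy is discussed below.

With Local understood, let's return to the other two classes.

LocalStack is a stack built on top of Local. If Local provides attribute access isolated per thread or coroutine, then LocalStack provides isolated stack access. Below is its implementation; it provides the push and pop methods and the top property.

__release_local__ clears the stack data of the current thread or coroutine, and __call__ returns a proxy object for the top element of the current thread's or coroutine's stack.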

class LocalStack(object):
    """This class works similar to a :class:`Local` but keeps a stack
    of objects instead. """

    def __init__(self):
        self._local = Local()

    def __release_local__(self):
        self._local.__release_local__()

    def __call__(self):
        def _lookup():
            rv = self.top
            if rv is None:
                raise RuntimeError('object unbound')
            return rv
        return LocalProxy(_lookup)

    # push, pop, and top implement the stack operations;
    # the stack data is stored in the self._local.stack attribute
    def push(self, obj):
        """Pushes a new item to the stack"""
        rv = getattr(self._local, 'stack', None)
        if rv is None:
            self._local.stack = rv = []
        rv.append(obj)
        return rv

    def pop(self):
        """Removes the topmost item from the stack, will return the
        old value or `None` if the stack was already empty.
        """
        stack = getattr(self._local, 'stack', None)
        if stack is None:
            return None
        elif len(stack) == 1:
            release_local(self._local)
            return stack[-1]
        else:
            return stack.pop()

    @property
    def top(self):
        """The topmost item on the stack.  If the stack is empty,
        `None` is returned.
        """
        try:
            return self._local.stack[-1]
        except (AttributeError, IndexError):
            return None

We saw the request context stack defined earlier; it is simply an instance of LocalStack:

_request_ctx_stack = LocalStack()

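It keeps the request contexts of the current thread or coroutine on a stack, to be read back when they are needed. As for why a stack is used rather than a plain Local, the answer comes later; you may want to think about it first.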
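
The isolation that such a stack provides can be sketched with the standard library alone. MiniLocalStack below is a hypothetical simplification built on threading.local; it is not werkzeug's LocalStack and skips the release logic shown in the real pop.

```python
import threading


class MiniLocalStack:
    """Per-thread stack built on threading.local (illustration only)."""

    def __init__(self):
        self._local = threading.local()

    def push(self, obj):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            self._local.stack = stack = []
        stack.append(obj)

    def pop(self):
        stack = getattr(self._local, "stack", None)
        if not stack:
            return None
        return stack.pop()

    @property
    def top(self):
        stack = getattr(self._local, "stack", None)
        return stack[-1] if stack else None


def demo():
    stack = MiniLocalStack()
    stack.push("ctx-main")
    seen = {}

    def worker():
        seen["before"] = stack.top   # None: the main thread's item is invisible here
        stack.push("ctx-worker")
        seen["after"] = stack.top

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return seen, stack.top
```

In demo(), the worker thread starts with an empty stack and its push does not disturb the item the main thread pushed.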

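LocalProxy is a proxy object that forwards every operation performed on it to the object it wraps. Its constructor accepts either a Local instance together with an attribute name, or a callable. When given a callable, it calls it on every access to obtain the current object (Flask passes callables that return an attribute of the context at the top of a stack), and all attribute operations are forwarded to the object obtained this way.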

class LocalProxy(object):
    """Acts as a proxy for a werkzeug local.
    Forwards all operations to a proxied object. """
    __slots__ = ('__local', '__dict__', '__name__')

    def __init__(self, local, name=None):
        object.__setattr__(self, '_LocalProxy__local', local)
        object.__setattr__(self, '__name__', name)

    def _get_current_object(self):
        """Return the current object."""
        if not hasattr(self.__local, '__release_local__'):
            return self.__local()
        try:
            return getattr(self.__local, self.__name__)
        except AttributeError:
            raise RuntimeError('no object bound to %s' % self.__name__)

    @property
    def __dict__(self):
        try:
            return self._get_current_object().__dict__
        except RuntimeError:
            raise AttributeError('__dict__')

    def __getattr__(self, name):
        if name == '__members__':
            return dir(self._get_current_object())
        return getattr(self._get_current_object(), name)

    def __setitem__(self, key, value):
        self._get_current_object()[key] = value

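The key to the implementation is that the Local instance or callable passed to the constructor is stored in the __local attribute, and the _get_current_object() method uses it to obtain the object belonging to the current thread or coroutine.

NOTE: Inside a class body, an attribute whose name starts with two underscores (and does not end with two) is stored under the name _ClassName__variable. That is why a value set through "_LocalProxy__local" can later be read as self.__local. There is a Stack Overflow question that covers this point.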
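
The name-mangling rule can be checked with a small stand-alone class. Demo and its attribute are hypothetical names used only for illustration.

```python
class Demo:
    def __init__(self):
        # Written under the mangled name, the same trick LocalProxy.__init__ uses.
        object.__setattr__(self, "_Demo__secret", 42)

    def read(self):
        # Inside the class body, `self.__secret` is compiled to `self._Demo__secret`.
        return self.__secret
```

Demo().read() finds the value, while looking up the literal name "__secret" from outside the class fails because only the mangled name exists on the instance.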

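LocalProxy then overrides all the magic methods (methods whose names begin and end with two underscores), and each one forwards the operation to the proxied object. Only a few of them are shown here; if you are interested, see the source for the full set.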
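
The forwarding idea can be sketched in a few lines. MiniProxy is a hypothetical, heavily reduced stand-in that only supports the callable form and forwards a handful of operations; the real LocalProxy covers far more magic methods.

```python
class MiniProxy:
    """Forwards attribute and item access to whatever `lookup()` returns."""

    def __init__(self, lookup):
        # Store the callable without going through our own __setattr__.
        object.__setattr__(self, "_lookup", lookup)

    def _get_current_object(self):
        return self._lookup()

    def __getattr__(self, name):
        return getattr(self._get_current_object(), name)

    def __setattr__(self, name, value):
        setattr(self._get_current_object(), name, value)

    def __getitem__(self, key):
        return self._get_current_object()[key]

    def __len__(self):
        return len(self._get_current_object())


def demo():
    current = {"target": [1, 2, 3]}
    proxy = MiniProxy(lambda: current["target"])
    first = (len(proxy), proxy[0])
    current["target"] = "hello"       # rebind: the same proxy now resolves to a str
    second = (len(proxy), proxy.upper())
    return first, second
```

Because the lookup runs on every operation, the same proxy object follows whatever the callable currently returns, which is the property Flask relies on for request and current_app.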

Back to the implementation of the request context:

_request_ctx_stack = LocalStack()
request = LocalProxy(partial(_lookup_req_object, 'request'))
session = LocalProxy(partial(_lookup_req_object, 'session'))

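Reading this code again, it should now make sense: _request_ctx_stack is a stack isolated per thread or coroutine, and every access to request calls _lookup_req_object, which takes the request context at the top of the stack and returns the request object stored in it.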
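
The wiring between the stack, the lookup function, and the proxy can be imitated with the standard library. Everything in this sketch except the name _lookup_req_object is hypothetical (_ctx, Proxy, handle, and the fake context objects), and the stack is a plain list kept in threading.local rather than a real LocalStack.

```python
import threading
from functools import partial
from types import SimpleNamespace

_ctx = threading.local()   # hypothetical stand-in for _request_ctx_stack


def _top():
    stack = getattr(_ctx, "stack", [])
    return stack[-1] if stack else None


def _lookup_req_object(name):
    top = _top()
    if top is None:
        raise RuntimeError("working outside of request context")
    return getattr(top, name)


class Proxy:
    def __init__(self, lookup):
        object.__setattr__(self, "_lookup", lookup)

    def __getattr__(self, name):
        # Resolved on every access, so the result follows the stack top.
        return getattr(self._lookup(), name)


request = Proxy(partial(_lookup_req_object, "request"))


def handle(path):
    """Push a fake context, read through the proxy, then pop it again."""
    ctx = SimpleNamespace(request=SimpleNamespace(path=path))
    if not hasattr(_ctx, "stack"):
        _ctx.stack = []
    _ctx.stack.append(ctx)
    try:
        return request.path
    finally:
        _ctx.stack.pop()
```

Inside handle() the module-level request resolves to the context that was just pushed; outside of it, any attribute access raises RuntimeError, mirroring the error path in _lookup_req_object.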

So when is the request context pushed onto the stack? Remember these two lines from the wsgi_app() method introduced earlier?

ctx = self.request_context(environ)
ctx.push()

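Every time app.__call__ is invoked, the corresponding request information is pushed onto the stack, and it is popped once the request has been handled.

Let's look at request_context. The method is a single line of code: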

def request_context(self, environ):
    return RequestContext(self, environ)

It instantiates RequestContext, passing in self and the request environment dictionary environ. RequestContext is defined in ctx.py; the code is as follows:

class RequestContext(object):
    """The request context contains all request relevant information.  It is
    created at the beginning of the request and pushed to the
    `_request_ctx_stack` and removed at the end of it.  It will create the
    URL adapter and request object for the WSGI environment provided.
    """

    def __init__(self, app, environ, request=None):
        self.app = app
        if request is None:
            request = app.request_class(environ)
        self.request = request
        self.url_adapter = app.create_url_adapter(self.request)
        self.match_request()

    def match_request(self):
        """Can be overridden by a subclass to hook into the matching
        of the request.
        """
        try:
            url_rule, self.request.view_args = \
                self.url_adapter.match(return_rule=True)
            self.request.url_rule = url_rule
        except HTTPException as e:
            self.request.routing_exception = e

    def push(self):
        """Binds the request context to the current context."""
        # Before we push the request context we have to ensure that there
        # is an application context.
        app_ctx = _app_ctx_stack.top
        if app_ctx is None or app_ctx.app != self.app:
            app_ctx = self.app.app_context()
            app_ctx.push()
            self._implicit_app_ctx_stack.append(app_ctx)
        else:
            self._implicit_app_ctx_stack.append(None)

        _request_ctx_stack.push(self)

        self.session = self.app.open_session(self.request)
        if self.session is None:
            self.session = self.app.make_null_session()

    def pop(self, exc=_sentinel):
        """Pops the request context and unbinds it by doing that.  This will
        also trigger the execution of functions registered by the
        :meth:`~flask.Flask.teardown_request` decorator.
        """
        app_ctx = self._implicit_app_ctx_stack.pop()

        try:
            clear_request = False
            if not self._implicit_app_ctx_stack:
                self.app.do_teardown_request(exc)

                request_close = getattr(self.request, 'close', None)
                if request_close is not None:
                    request_close()
                clear_request = True
        finally:
            rv = _request_ctx_stack.pop()

            # get rid of circular dependencies at the end of the request
            # so that we don't require the GC to be active.
            if clear_request:
                rv.request.environ['werkzeug.request'] = None

            # Get rid of the app as well if necessary.
            if app_ctx is not None:
                app_ctx.pop(exc)

    def auto_pop(self, exc):
        if self.request.environ.get('flask._preserve_context') or \
           (exc is not None and self.app.preserve_context_on_exception):
            self.preserved = True
            self._preserved_exc = exc
        else:
            self.pop(exc)

    def __enter__(self):
        self.push()
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.auto_pop(exc_value)

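Each request context holds the information about the current request, such as the request object and the app object. At the end of initialization it also calls match_request, which performs route matching.

push stores the ApplicationContext for the request (a new app context is created if the top of _app_ctx_stack does not belong to the current request's app) and the RequestContext on their respective stacks, and after pushing it also sets up the session. pop does the reverse: it pops the request context and the application context and performs some cleanup.

At this point the implementation is fairly clear: whenever a request arrives, Flask first creates the two context objects that the current thread or coroutine needs and stores them on isolated stacks, so that the view function can read this information directly from the stacks while handling the request.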

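NOTE: Because there is only one app instance, the application contexts created for different requests all point to the same app object.

That covers most of how contexts are implemented and what they do. Two questions remain unanswered.

Why are the request context and the application context kept separate? Doesn't every request have both?
Why are both the request context and the application context implemented as stacks? Can a single request really have more than one request context or application context?

The first answer is "flexibility"; the second is "multiple applications". In normal operation each request corresponds to one request context and one application context, but in tests or in a Python shell the user can create a request context or an application context on its own, and this flexibility serves different usage scenarios. A stack also makes redirects easier to implement: a handler can obtain the information of several requests along a redirect path from the stack. The application context is a stack for similar reasons: several contexts can be pushed in tests, and in addition Flask allows multiple applications to run at the same time: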

# Note: newer Werkzeug releases provide this class in werkzeug.middleware.dispatcher
from werkzeug.wsgi import DispatcherMiddleware
from frontend_app import application as frontend
from backend_app import application as backend

application = DispatcherMiddleware(frontend, {
    '/backend':     backend
})

This example uses werkzeug's DispatcherMiddleware to dispatch between multiple apps; in this situation the _app_ctx_stack stack can end up holding two application contexts.
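
For readers who want to see what prefix-based dispatch amounts to, here is a simplified WSGI sketch using only the standard library. PrefixDispatcher, make_app, and call are hypothetical names; this is not werkzeug's DispatcherMiddleware, only an approximation of the idea under the assumption that a mount is matched by a simple path prefix.

```python
def make_app(name):
    """Build a tiny WSGI app that echoes its name and the path it received."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [f"{name}:{environ.get('PATH_INFO', '')}".encode()]
    return app


class PrefixDispatcher:
    def __init__(self, default_app, mounts):
        self.default_app = default_app
        self.mounts = mounts

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.mounts.items():
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        return self.default_app(environ, start_response)


def call(app, path):
    """Invoke a WSGI app with a minimal environ and return the body as text."""
    body = app({"PATH_INFO": path}, lambda status, headers: None)
    return b"".join(body).decode()


application = PrefixDispatcher(make_app("frontend"), {"/backend": make_app("backend")})
```

Requests under /backend reach the backend app with the prefix stripped, and everything else falls through to the frontend app, so two independent applications live in one process.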
