Python Beginner Tutorial: WSGI

Date: 2019-11-17

What is WSGI

WSGI stands for Python Web Server Gateway Interface. It is an interface specification that describes how a web server and a Python application interact. The specification itself is PEP 3333.

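The specification acts as a bridge between the web server and the Python application. For the web server, WSGI describes how to hand the user's request data to the Python application. For the Python application, WSGI describes how to obtain the request data and how to return the result of processing the request to the web server.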

WSGI application

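A Python application that conforms to the WSGI specification must:

Be a callable object (for example a function, a class, or an instance that implements the __call__ method)
Accept two arguments supplied by the WSGI server:
- environ: a dict holding environment variables and request data such as REQUEST_METHOD, PATH_INFO and QUERY_STRING
- start_response: a callback that begins the response; it is used to send the response status (HTTP status) and the response headers (HTTP headers)
Return an iterable whose elements are bytes (typically a list of byte strings), which is the response body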

The following examples show a WSGI-compliant Python application written as a function, as a class, and as an instance that implements the __call__ method:

# The callable is a function
def simple_app(environ, start_response):
    # Response status (status code and reason phrase)
    status = '200 OK'
    # Response body
    response_body = b"Hello WSGI"
    # Response headers: a list in which each name/value pair must be a tuple
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]

    # Call the start_response callback supplied by the WSGI server to pass back the status and headers
    start_response(status, response_headers)

    # Return the response body: an iterable of bytes objects
    return [response_body]


# The callable is a class
class AppClass:
    """The callable is the AppClass class itself. Usage:

        for body in AppClass(env, start_response):
            process_body(body)
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_body = b"Hello WSGI"
        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(response_body)))]
        self.start(status, response_headers)
        yield response_body


# The callable is an instance of a class
class AnotherAppClass:
    """The callable is an instance of AnotherAppClass. Usage:

        app = AnotherAppClass()
        for body in app(env, start_response):
            process_body(body)
    """

    def __init__(self):
        pass

    def __call__(self, environ, start_response):
        status = '200 OK'
        response_body = b"Hello WSGI"
        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(response_body)))]
        start_response(status, response_headers)
        yield response_body
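
To see the contract from both sides, the sketch below plays the server's role for one request without any network code. It is only an illustration under assumptions: greet_app and call_app are hypothetical names that do not appear in the original examples, and the environ is filled in with wsgiref.util.setup_testing_defaults rather than taken from a real HTTP request.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def greet_app(environ, start_response):
    # Hypothetical application: reads request data out of environ
    params = parse_qs(environ.get('QUERY_STRING', ''))
    name = params.get('name', ['WSGI'])[0]
    text = '%s %s: Hello %s' % (environ['REQUEST_METHOD'],
                                environ['PATH_INFO'], name)
    body = text.encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def call_app(app, path='/', query=''):
    """Hypothetical helper: act as the server for a single fake request."""
    environ = {'PATH_INFO': path, 'QUERY_STRING': query}
    # Fill in REQUEST_METHOD, the wsgi.* keys and other required variables
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers
        # The spec expects a write() callable to be returned; this one discards data
        return lambda data: None

    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # Servers must call close() on the result if it exists
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], body


status, headers, body = call_app(greet_app, path='/hi', query='name=Ada')
print(status, headers, body)
```

The same helper should also work with the class-based variants, since iterating over the returned object is what triggers start_response in those cases.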

WSGI server

The WSGI server that pairs with a WSGI application has to do the following:

Receive the HTTP request and return the HTTP response
Provide the environ data and implement the start_response callback
Call the WSGI application, passing environ and start_response as arguments

A simplified view of what happens inside a WSGI server:

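import os, sys

# String handling follows the CGI gateway example in PEP 3333
enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # os.environ values are already str in Python 3 (str has no decode());
    # convert them to WSGI "bytes-as-unicode" strings
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')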

# application is the WSGI application, a callable object
def run_with_cgi(application):
    # Prepare the environ argument.
    # It carries the data of this HTTP request, such as REQUEST_METHOD, PATH_INFO and QUERY_STRING
    environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
    # WSGI-specific variables
    environ['wsgi.input']        = sys.stdin.buffer
    environ['wsgi.errors']       = sys.stderr
    environ['wsgi.version']      = (1, 0)
    environ['wsgi.multithread']  = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once']     = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        out = sys.stdout.buffer
        
        if not headers_set:
            raise AssertionError("write() before start_response()")
        elif not headers_sent:
            # Before the first piece of the body is sent, send the headers that were set
            status, response_headers = headers_sent[:] = headers_set
            out.write(wsgi_to_bytes('Status: %s\r\n' % status))
            for header in response_headers:
                out.write(wsgi_to_bytes('%s: %s\r\n' % header))
            out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    # The start_response callback: records the HTTP status and response_headers
    # passed in by the WSGI application
    def start_response(status, response_headers, exc_info=None):
        # Error handling is omitted in this simplified version
        if exc_info:
            pass

        headers_set[:] = [status, response_headers]

        return write

    # Call the WSGI application with the prepared environ (request data) and start_response (the callback that begins the response)
    result = application(environ, start_response)

    # Process the response body
    try:
        for data in result:
            if data:
                write(data)
    finally:
        if hasattr(result, 'close'):
            result.close()
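
The standard library's wsgiref.handlers module implements a fuller version of this flow. As a rough sketch, and assuming a fake request built with wsgiref.util.setup_testing_defaults instead of a real one, SimpleHandler can run an application against in-memory streams so that the raw bytes a server would send become visible:

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    body = b"Hello WSGI"
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


# Build a fake request environment instead of parsing a real HTTP request
base_env = {}
setup_testing_defaults(base_env)

# In-memory stand-ins for the request body, the response stream and the error log
stdin, stdout, stderr = io.BytesIO(), io.BytesIO(), io.StringIO()

handler = SimpleHandler(stdin, stdout, stderr, base_env)
handler.run(simple_app)   # calls the app, then writes status line, headers and body

raw_response = stdout.getvalue()
print(raw_response.decode('iso-8859-1'))
```

The handler adds headers of its own (such as Date) before the ones set by the application, which is one of the details the simplified run_with_cgi leaves out.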

Middleware

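Middleware sits between the WSGI server and the WSGI application. To the WSGI application it looks like a WSGI server, and to the WSGI server it looks like a WSGI application. Like an application, it receives the request and can process it first; it can also process the response once the wrapped application has produced it. Middleware therefore has two characteristics: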

It is called by the WSGI server or by another middleware, to which it presents itself as a WSGI application
It calls the WSGI application, passing in environ and start_response

We use whitelist filtering and post-processing of the response to demonstrate middleware:

from wsgiref.simple_server import make_server

def app(environ, start_response):
    # Response status (status code and reason phrase)
    status = '200 OK'
    # Response body
    response_body = b"Hello WSGI"
    # Response headers: a list in which each name/value pair must be a tuple
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]

    # Call the start_response callback supplied by the WSGI server to pass back the status and headers
    start_response(status, response_headers)

    # Return the response body: an iterable of bytes objects
    return [response_body]


# Middleware that processes the request data
class WhitelistMiddleware(object):
    def __init__(self, app):
        self.app = app

    # When the instance is called, apply a whitelist based on the HTTP_HOST value of the request
    def __call__(self, environ, start_response):
        ip_addr = environ.get('HTTP_HOST', '').split(':')[0]
        # The trailing comma makes this a tuple; ('127.0.0.1') is just a string
        if ip_addr not in ('127.0.0.1',):
            start_response('403 Forbidden', [('Content-Type', 'text/plain')])

            return [b'Forbidden']

        return self.app(environ, start_response)


# Middleware that processes the response data
class UpperMiddleware(object):
    def __init__(self, app):
        self.app = app

    # When the instance is called, convert the response body to upper case
    def __call__(self, environ, start_response):
        for data in self.app(environ, start_response):
            yield data.upper()


if __name__ == '__main__':
    app = UpperMiddleware(WhitelistMiddleware(app))
    with make_server('', 8000, app) as httpd:
        print("Serving on port 8000...")

        httpd.serve_forever()

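The example above is complete, runnable code. The function app is the WSGI application, WhitelistMiddleware and UpperMiddleware are WSGI middleware, and the WSGI server comes from Python's built-in wsgiref module (wsgiref is the reference implementation of the WSGI specification that ships in the Python standard library; its server is fine for development and testing but should not be used in production).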
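
UpperMiddleware changes the response body. Middleware can also change the status or headers, which is usually done by wrapping start_response before handing it to the inner application. The following is a minimal sketch of that pattern, not part of the original example: HeaderMiddleware and the X-Demo header are made-up names, and the request is a fake environ from wsgiref.util.setup_testing_defaults.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    body = b"Hello WSGI"
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


class HeaderMiddleware:
    """Hypothetical middleware: appends one extra response header."""

    def __init__(self, app, name, value):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Forward to the real start_response with the extra header appended
            return start_response(status, list(headers) + [self.header], exc_info)

        # The inner application only ever sees the patched callback
        return self.app(environ, patched_start_response)


wrapped = HeaderMiddleware(app, 'X-Demo', 'yes')  # header name is invented for the demo

# Drive the wrapped application by hand with a fake request
environ = {}
setup_testing_defaults(environ)
seen = {}


def fake_start_response(status, headers, exc_info=None):
    seen['status'], seen['headers'] = status, headers
    return lambda data: None


body = b''.join(wrapped(environ, fake_start_response))
print(seen['status'], seen['headers'], body)
```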

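The WSGI specification lists several use cases for middleware. One of them, dispatching a request to different applications based on the request path, is exactly the most basic feature of a web framework. Here is a routing example implemented with middleware: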

from wsgiref.simple_server import make_server

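# Middleware that dispatches requests by path
class RouterMiddleware(object):
    def __init__(self):
        # Dict that maps each path to its application
        self.path_info = {}

    def route(self, environ, start_response):
        application = self.path_info.get(environ['PATH_INFO'])
        if application is None:
            # Unknown path: answer 404 instead of raising KeyError
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'Not Found']
        return application(environ, start_response)

    # Calling the instance registers the mapping from path to application
    def __call__(self, path):
        def wrapper(application):
            self.path_info[path] = application
            # Return the function so the decorated name is not rebound to None
            return application
        return wrapper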

router = RouterMiddleware()


@router('/hello')  # Call the RouterMiddleware instance to register the path and its application
def hello(environ, start_response):
    status = '200 OK'
    response_body = b"Hello"
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]


@router('/world')
def world(environ, start_response):
    status = '200 OK'
    response_body = b'World'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]


@router('/')
def hello_world(environ, start_response):
    status = '200 OK'
    response_body = b'Hello World'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]


def app(environ, start_response):
    return router.route(environ, start_response)


if __name__ == '__main__':
    with make_server('', 8000, app) as httpd:
        print("Serving on port 8000...")

        httpd.serve_forever()

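The WSGI application described by the WSGI specification is too low-level to be friendly to developers. People usually build a web application with a web framework and then deploy it on a web server intended for production.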

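Commonly used Python web frameworks:

Django
- A full-featured web framework with a large developer community and a rich set of third-party libraries.
Flask
- A microframework for building smaller applications, APIs and web services. It is often the default choice for Python web applications that Django does not suit.
Tornado
- An asynchronous web framework with native WebSocket support.
Bottle
- An even smaller web framework; the whole framework is a single Python file, which makes it a good case study for reading source code.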

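Commonly used WSGI web servers:

Gunicorn
- A WSGI server implemented in pure Python, with very simple configuration and sensible defaults; easy to use.
uWSGI
- A very powerful web server built around the uwsgi protocol that also supports the Python WSGI protocol. It performs well, but its configuration is complex.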