Streaming writes to an HTTP request body in Go

Date: 2020-06-03



Background

While developing a feature recently, I needed to report a large volume of log data over HTTP. However, the HTTP client API in the Go standard library looks like this:

http.NewRequest(method, url string, body io.Reader)

The body is passed in as an io.Reader; no io.Writer is exposed that could be written to. First, here is how a body is normally supplied:

buf := bytes.NewBuffer([]byte("hello"))
http.Post("http://localhost:8099/report", "text/plain", buf)

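The data has to be placed in a Buffer first, which means holding it in memory. I need to send a large amount of data, and keeping all of it in memory would certainly end in an OOM. The HTTP client offers no explicit streaming-write method, so a plain Buffer is not an option at this volume. After some searching on Google, I found a solution.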

Using io.Pipe

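Calling io.Pipe() returns a connected reader and writer: data written to the writer can be read from the reader. This makes streaming writes possible. Start a goroutine that writes to the writer, pass the reader to the request function, and the HTTP client streams the body as it is produced. Because the pipe is unbuffered, each Write blocks until the data has been read, so memory use stays bounded.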

Example code:

pr, pw := io.Pipe()
// Write a large amount of data from a separate goroutine.
go func() {
    for i := 0; i < 100000; i++ {
        fmt.Fprintf(pw, "line:%d\r\n", i)
    }
    pw.Close()
}()
// Pass the read end of the pipe as the request body.
http.Post("http://localhost:8099/report", "text/plain", pr)
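The snippet above can be turned into a complete program. The following is a minimal sketch, assuming Go 1.16 or later (for io.ReadAll and io.Discard). The helpers streamLines and newCountingServer are hypothetical names introduced here, and the local httptest server stands in for the real log endpoint.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// streamLines posts n generated lines to url through an io.Pipe and returns
// the response body, which the test server fills with the byte count it read.
func streamLines(url string, n int) (string, error) {
	pr, pw := io.Pipe()
	go func() {
		var err error
		for i := 0; i < n && err == nil; i++ {
			_, err = fmt.Fprintf(pw, "line:%d\r\n", i)
		}
		// With a nil error this behaves like Close: the reader sees io.EOF.
		pw.CloseWithError(err)
	}()
	resp, err := http.Post(url, "text/plain", pr)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// newCountingServer starts a local test server (an assumption for this
// sketch) that replies with the number of request body bytes it received.
func newCountingServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		n, _ := io.Copy(io.Discard, r.Body)
		fmt.Fprintf(w, "%d", n)
	}))
}

func main() {
	srv := newCountingServer()
	defer srv.Close()
	got, err := streamLines(srv.URL, 100000)
	fmt.Println("server received bytes:", got, "error:", err)
}
```

Closing the writer with CloseWithError lets a write failure reach the reading side instead of being silently dropped, which the shorter snippet does not handle.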

Reading the source

Goal

Understand how Go's HTTP client handles the transmission of the body.

Walkthrough

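When the Request is constructed, the body argument goes through a type switch. If it is a *bytes.Buffer, *bytes.Reader, or *strings.Reader, its length is read directly via Len() and used for the Content-Length request header. Relevant code, net/http/request.go#L872-L914: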

if body != nil {
    switch v := body.(type) {
    case *bytes.Buffer:
        req.ContentLength = int64(v.Len())
        buf := v.Bytes()
        req.GetBody = func() (io.ReadCloser, error) {
            r := bytes.NewReader(buf)
            return ioutil.NopCloser(r), nil
        }
    case *bytes.Reader:
        req.ContentLength = int64(v.Len())
        snapshot := *v
        req.GetBody = func() (io.ReadCloser, error) {
            r := snapshot
            return ioutil.NopCloser(&r), nil
        }
    case *strings.Reader:
        req.ContentLength = int64(v.Len())
        snapshot := *v
        req.GetBody = func() (io.ReadCloser, error) {
            r := snapshot
            return ioutil.NopCloser(&r), nil
        }
    default:
    }
    if req.GetBody != nil && req.ContentLength == 0 {
        req.Body = NoBody
        req.GetBody = func() (io.ReadCloser, error) { return NoBody, nil }
    }
}
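The effect of this type switch can be observed from outside the package. The sketch below is illustrative only: inferredLength, lengthOfBuffer, and lengthOfPipe are hypothetical helpers, and the URL is simply the one used elsewhere in this post. No request is actually sent.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// inferredLength builds a POST request around body and reports the
// ContentLength that http.NewRequest filled in, plus whether GetBody was set.
func inferredLength(body io.Reader) (int64, bool) {
	req, err := http.NewRequest(http.MethodPost, "http://localhost:8099/report", body)
	if err != nil {
		panic(err)
	}
	return req.ContentLength, req.GetBody != nil
}

// lengthOfBuffer uses a *bytes.Buffer, one of the types the switch recognizes.
func lengthOfBuffer(s string) (int64, bool) {
	return inferredLength(bytes.NewBufferString(s))
}

// lengthOfPipe uses an *io.PipeReader, which falls into the default case.
func lengthOfPipe() (int64, bool) {
	pr, _ := io.Pipe()
	return inferredLength(pr)
}

func main() {
	n, ok := lengthOfBuffer("hello")
	fmt.Println("bytes.Buffer:", n, ok)
	n, ok = lengthOfPipe()
	fmt.Println("io.PipeReader:", n, ok)
}
```

For the buffer the request carries a length of 5 and a GetBody function. For the pipe reader ContentLength stays 0 and GetBody stays nil, which is what later leads to chunked encoding.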

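When the request is about to be sent over the connection, body and the ContentLength obtained in the previous step are examined. If body != nil and ContentLength == 0, the length is treated as unknown and chunked encoding may be enabled. Relevant code, net/http/transfer.go#L82-L96: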

case *Request:
    if rr.ContentLength != 0 && rr.Body == nil {
        return nil, fmt.Errorf("http: Request.ContentLength=%d with nil Body", rr.ContentLength)
    }
    t.Method = valueOrDefault(rr.Method, "GET")
    t.Close = rr.Close
    t.TransferEncoding = rr.TransferEncoding
    t.Header = rr.Header
    t.Trailer = rr.Trailer
    t.Body = rr.Body
    t.BodyCloser = rr.Body
    // Returns -1 here when body is non-nil and ContentLength == 0.
    t.ContentLength = rr.outgoingLength()
    // If TransferEncoding was not set manually and the method is PUT, POST, or PATCH, chunked encoding is enabled.
    if t.ContentLength < 0 && len(t.TransferEncoding) == 0 && t.shouldSendChunkedRequestBody() {
        t.TransferEncoding = []string{"chunked"}
    }
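The decision described by the two comments can be summarized in a small standalone function. This is a simplified, hypothetical model written for this post, not the net/http implementation. It assumes a POST, PUT, or PATCH request with no manually set TransferEncoding, and it ignores the error case of a non-zero ContentLength with a nil body.

```go
package main

import "fmt"

// framing is a simplified model (an assumption, not library code) of how the
// outgoing body framing is chosen for POST, PUT, and PATCH requests.
func framing(hasBody bool, contentLength int64) string {
	switch {
	case !hasBody:
		// No body to send, so nothing to frame.
		return "no body"
	case contentLength > 0:
		// The caller or http.NewRequest supplied a length.
		return fmt.Sprintf("Content-Length: %d", contentLength)
	default:
		// Body present but length unknown: fall back to chunked encoding.
		return "Transfer-Encoding: chunked"
	}
}

func main() {
	fmt.Println(framing(true, 0))   // an io.Pipe body with no length set
	fmt.Println(framing(true, 600)) // the same body with ContentLength set
	fmt.Println(framing(false, 0))  // a request without a body
}
```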

Verification (1)

Based on this reading of the source, a request streamed through io.Pipe() should be sent with chunked encoding. The following code verifies this:

Server

func main(){
    http.HandleFunc("/report", func(writer http.ResponseWriter, request *http.Request) {

    })
    http.ListenAndServe(":8099", nil)
}

Client

func main() {
    pr, pw := io.Pipe()
    go func() {
        for i := 0; i < 100; i++ {
            fmt.Fprintf(pw, "line:%d\r\n", i)
        }
        pw.Close()
    }()
    http.Post("http://localhost:8099/report", "text/plain", pr)
}

Start the server first, then run the client, and capture the traffic with Wireshark.

The capture matches the expectation: the request body is sent with chunked encoding.
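The same check can be done without a packet capture by letting the server report how it saw the request. The sketch below is an alternative to the Wireshark step, not the original method. It assumes Go 1.16 or later; the observed type and the postPipe helper are hypothetical names, and an httptest server replaces the one listening on port 8099.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// observed records how the server saw the request body framing.
type observed struct {
	ContentLength    int64
	TransferEncoding string
}

// postPipe streams n lines through an io.Pipe without setting ContentLength
// and returns what the handler observed.
func postPipe(n int) (observed, error) {
	seen := make(chan observed, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		seen <- observed{r.ContentLength, strings.Join(r.TransferEncoding, ",")}
	}))
	defer srv.Close()

	pr, pw := io.Pipe()
	go func() {
		for i := 0; i < n; i++ {
			if _, err := fmt.Fprintf(pw, "line:%d\r\n", i); err != nil {
				break
			}
		}
		pw.Close()
	}()
	resp, err := http.Post(srv.URL, "text/plain", pr)
	if err != nil {
		return observed{}, err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	got, err := postPipe(100)
	fmt.Printf("%+v err=%v\n", got, err)
}
```

On the server side a chunked request shows up with TransferEncoding set to "chunked" and ContentLength reported as -1, meaning the length is unknown.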

Verification (2)

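With large volumes of data, chunked encoding adds overhead: the encoding and decoding work plus the extra bytes on the wire. Can the body be streamed without chunked encoding? According to the source, chunked encoding is only selected when ContentLength is 0. So if the size of the body can be computed in advance and ContentLength is set to it, chunked encoding should be avoided. The following code verifies that idea: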

Server

func main(){
    http.HandleFunc("/report", func(writer http.ResponseWriter, request *http.Request) {

    })
    http.ListenAndServe(":8099", nil)
}

Client

count := 100
line := []byte("line\r\n")
pr, pw := io.Pipe()
go func() {
    for i := 0; i < count; i++ {
        pw.Write(line)
    }
    pw.Close()
}()
// Build the request object.
request, err := http.NewRequest("POST", "http://localhost:8099/report", pr)
if err != nil {
    log.Fatal(err)
}
// Compute ContentLength in advance.
request.ContentLength = int64(len(line) * count)
// Send the request.
http.DefaultClient.Do(request)

Packet capture result: the request is indeed sent with a Content-Length header, and chunked encoding is no longer used.
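This result can also be confirmed from the server side instead of a capture. As before, this is a sketch under assumptions: Go 1.16 or later, a hypothetical observed type and postPipeWithLength helper, and an httptest server in place of the one on port 8099.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// observed records how the server saw the request body framing.
type observed struct {
	ContentLength    int64
	TransferEncoding string
}

// postPipeWithLength streams count fixed-size lines through an io.Pipe, sets
// ContentLength up front, and returns what the handler observed.
func postPipeWithLength(count int) (observed, error) {
	seen := make(chan observed, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		seen <- observed{r.ContentLength, strings.Join(r.TransferEncoding, ",")}
	}))
	defer srv.Close()

	line := []byte("line\r\n")
	pr, pw := io.Pipe()
	go func() {
		for i := 0; i < count; i++ {
			if _, err := pw.Write(line); err != nil {
				break
			}
		}
		pw.Close()
	}()

	req, err := http.NewRequest(http.MethodPost, srv.URL, pr)
	if err != nil {
		return observed{}, err
	}
	// The total size is known because every line has the same length.
	req.ContentLength = int64(len(line) * count)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return observed{}, err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	got, err := postPipeWithLength(100)
	fmt.Printf("%+v err=%v\n", got, err)
}
```

With 100 lines of 6 bytes each, the handler sees a ContentLength of 600 and an empty TransferEncoding.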

Summary

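This post records how to stream a request body with Go's HTTP client and, by reading the source, how the client handles writing the body internally. The two verifications show that when ContentLength can be computed in advance and performance requirements are strict, setting ContentLength manually avoids chunked encoding and its overhead.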