Elixir IO Internals (Part 1): Read Operations

Date: 2019-11-15

Tags: elixir, internals


Anyone who has played with Elixir may have noticed that File.open(path) returns neither a file descriptor (fd) nor a file handle, but {:ok, pid}.

Why on earth a pid? Does opening a file also start a new process? Exactly. The official explanation for this design is:

By modelling IO devices with processes, the Erlang VM allows different nodes in the same network to exchange file processes in order to read/write files in between nodes.

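Put another way (my own loose paraphrase): because IO devices are modeled as processes, the Erlang VM can pass file processes between different nodes (hosts) on the same network, so that nodes can read and write each other's files.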
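The point that a file is really a process is easy to check from a script. The following is only a minimal sketch; the temp file name is made up for illustration.

```elixir
# Hypothetical scratch file, used only for this illustration.
path = Path.join(System.tmp_dir!(), "io_internals_demo.txt")
File.write!(path, "hello\n")

# Without the :raw mode, File.open/1 hands back the pid of a file process.
{:ok, device} = File.open(path)
was_pid = is_pid(device)
was_alive = Process.alive?(device)

# Reading goes through that process rather than through a descriptor.
line = IO.read(device, :line)

File.close(device)
File.rm!(path)

IO.inspect({was_pid, was_alive, line})
```

Since `device` is an ordinary pid, it can be sent to other processes like any other term, which is what the quoted explanation relies on.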

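But this makes it quite a hassle to implement an IO device of our own. When a read is performed, what message does the file process receive? What message should it send back to the main process? What happens at EOF? What about other errors? And what does the message exchange look like for writes? None of this is documented anywhere I could find (let me state up front that I don't know Erlang, so please don't send me to the Erlang docs. I looked through the erlsom docs a while ago and nearly burst into tears).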

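So what do you do without documentation? Elixir doesn't let me monkey-patch things the way Ruby does. Then I remembered a module called StringIO, which I had heard was written in Elixir, so I went to GitHub and chewed through its source code. Only after finishing did I understand what an epiphany feels like. It's getting late today, so let's start with read operations.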

When the main process needs to read from an IO device, it sends the file process one of the following messages:

{:io_request, sender_pid, reference, {:get_chars, prompt, chunk_size}}
{:io_request, sender_pid, reference, {:get_chars, encoding, prompt, chunk_size}}
{:io_request, sender_pid, reference, {:get_line, prompt}}
{:io_request, sender_pid, reference, {:get_line, encoding, prompt}}
{:io_request, sender_pid, reference, {:get_until, prompt, mod, func, args}}
{:io_request, sender_pid, reference, {:get_until, encoding, prompt, mod, func, args}}

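The 1st message corresponds to IO.binread(device, n), where n is an integer greater than 0.
The 2nd message corresponds to IO.read(device, n) and IO.getn(device, prompt, n); the implementer of the IO device decides which character encoding the returned characters use.
The 3rd message corresponds to IO.binread(device, :line).
The 4th message corresponds to IO.read(device, :line) and IO.gets(device, prompt).
I don't yet know what the last two correspond to.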
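One way to check such a mapping yourself is to point an IO function at a fake device that simply reports what it received. The sketch below is an assumption-laden illustration: `IORequestSpy` is a made-up helper, and the exact shape of the request may vary between Elixir/OTP versions, so it is printed rather than asserted here.

```elixir
defmodule IORequestSpy do
  # Hypothetical helper: spawns a fake device that forwards the first
  # :io_request it receives to `owner`, then answers :eof so the caller
  # does not block forever.
  def start(owner) do
    spawn(fn ->
      receive do
        {:io_request, from, reply_as, request} ->
          send(owner, {:seen, request})
          send(from, {:io_reply, reply_as, :eof})
      end
    end)
  end
end

spy = IORequestSpy.start(self())
result = IO.read(spy, :line)

request =
  receive do
    {:seen, req} -> req
  after
    1_000 -> :nothing_received
  end

IO.inspect(request, label: "request sent by IO.read(device, :line)")
IO.inspect(result, label: "value returned to the caller")
```

Swapping `IO.read(spy, :line)` for another call (each needs a fresh spy) shows which request tuple that call produces on your own installation.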

Next, the message parameters. In the messages listed above:

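sender_pid is the pid of the sender, that is, the process the reply must be sent to.
reference is an opaque term chosen by the sender (in practice usually a reference). The file process must send it back unchanged in its reply so that the sender can match the reply to its request.
encoding is the character encoding the sender expects. It is an atom such as :latin1 or :unicode; when the message carries no encoding, :latin1 is assumed.
prompt is a hint for the receiver (the file process). It is usually useless, but if the receiver is the standard input stream, it can print the prompt on the console to tell the (human) operator to start typing.
chunk_size is how much should be read in one go (bytes, or characters when an encoding applies). A file process you implement yourself may ignore it, but it will normally respect it.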
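To make these parameters concrete, the request can be sent by hand to a StringIO device instead of going through the IO module. This is just a sketch of the client side under the message layout listed above; the one-second timeout is an arbitrary choice.

```elixir
{:ok, device} = StringIO.open("hello world")

# The sender picks the term that the device must echo back in its reply.
ref = make_ref()

# Ask for 5 characters, with :unicode encoding and an empty prompt.
send(device, {:io_request, self(), ref, {:get_chars, :unicode, "", 5}})

reply =
  receive do
    # Pinning ^ref ensures we only accept the reply to this very request.
    {:io_reply, ^ref, data} -> data
  after
    1_000 -> :timeout
  end

IO.inspect(reply)
```

The pinned `^ref` in the receive is the whole purpose of that field: a process with several outstanding requests can tell their replies apart.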

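Whichever message it receives, the file process should reply to the sender with one of the following three messages (at least I found no fourth kind in the StringIO source):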

{:io_reply, reference, chunk}
{:io_reply, reference, :eof}
{:io_reply, reference, {:error, reason}}

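The first is the reply when data was read successfully. Here reference is the one the sender passed in (the same one described above), and chunk is the piece of data that was read.

The second is the reply when no data was read because the end of the file was reached.

The third is, of course, the reply when the read failed. Here reason can be anything.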
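The three replies can be seen together in a tiny hand-rolled device. The sketch below uses a bare `receive` loop to keep the protocol visible; `LineDevice` is a made-up module, and answering unsupported requests with `{:error, :request}` follows the convention of the Erlang I/O protocol rather than anything stated in this article.

```elixir
defmodule LineDevice do
  # Hypothetical device that hands out the given lines one by one, then :eof.
  def start(lines), do: spawn(fn -> loop(lines) end)

  defp loop(lines) do
    receive do
      {:io_request, from, reply_as, {:get_line, _encoding, _prompt}} ->
        case lines do
          [line | rest] ->
            # Data available: reply with the chunk.
            send(from, {:io_reply, reply_as, line})
            loop(rest)

          [] ->
            # Nothing left: reply :eof.
            send(from, {:io_reply, reply_as, :eof})
            loop([])
        end

      {:io_request, from, reply_as, _unsupported} ->
        # Anything this device does not understand is answered with an error.
        send(from, {:io_reply, reply_as, {:error, :request}})
        loop(lines)
    end
  end
end

device = LineDevice.start(["first\n", "second\n"])
results = for _ <- 1..3, do: IO.read(device, :line)
IO.inspect(results)
```

A real device would also need a way to shut down; this loop runs until the calling script exits.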

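With all this understood, we can implement our own IO device (using GenServer, of course, unless you enjoy suffering). For example, Plug.Conn, the foundation of the Phoenix framework, does not implement the IO interface (that is, you cannot read the HTTP request body with functions like IO.read), so we can wrap a Conn to make it look like an IO device (supporting only IO.binread(device, n)):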

defmodule ConnIOWrapper do
  use GenServer

  # Wraps a Plug.Conn in a process that answers :io_request messages.
  def wrap(conn) do
    GenServer.start_link(__MODULE__, %{conn: conn, done: false})
  end

  @impl true
  def init(state) do
    {:ok, state}
  end

  # Body not exhausted yet: read the next chunk and reply with it.
  # Only the three-element :get_chars request is handled.
  @impl true
  def handle_info({:io_request, from, reply_as, {:get_chars, _, chunk_size}}, %{conn: conn, done: false} = state) do
    state = case Plug.Conn.read_body(conn, length: chunk_size) do
      # :more means there is more body to read; :ok means this was the last chunk.
      {status, data, conn} when status in [:ok, :more] ->
        send(from, {:io_reply, reply_as, to_string(data)})
        %{conn: conn, done: status == :ok}
      {:error, _} = reply ->
        send(from, {:io_reply, reply_as, reply})
        %{state | done: true}
    end
    {:noreply, state}
  end

  # Body already consumed: reply :eof and ask ourselves to shut down.
  def handle_info({:io_request, from, reply_as, {:get_chars, _, _}}, %{done: true} = state) do
    send(from, {:io_reply, reply_as, :eof})
    GenServer.cast(self(), :close)
    {:noreply, state}
  end

  @impl true
  def handle_cast(:close, state) do
    {:stop, :normal, state}
  end
end

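All right, that's it for this installment. To find out what happens next, tune in for the next one. (By the way, could this lousy Markdown editor please use a monospace font? I spent ages lining up the indentation, and it doesn't even have a Ctrl + K like Stack Overflow.)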