[Solo APUE Series] Chapter 3: File I/O

Date: 2019-11-17



Originally published on 静雅斋; please credit the source when reposting.

File Descriptors

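If you have learned C, you have probably also used the generic file operations provided by <stdio.h>. The C library wraps file handling in the FILE structure. Opening <stdio.h> shows: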

typedef struct __sFILE {
        unsigned char *_p;      /* current position in (some) buffer */
        int     _r;             /* read space left for getc() */
        int     _w;             /* write space left for putc() */
        short   _flags;         /* flags, below; this FILE is free if 0 */
        short   _file;          /* fileno, if Unix descriptor, else -1 */
        struct  __sbuf _bf;     /* the buffer (at least 1 byte, if !NULL) */
        int     _lbfsize;       /* 0 or -_bf._size, for inline putc */

        /* operations */
        void    *_cookie;       /* cookie passed to io functions */
        int     (*_close)(void *);
        int     (*_read) (void *, char *, int);
        fpos_t  (*_seek) (void *, fpos_t, int);
        int     (*_write)(void *, const char *, int);

        /* separate buffer for long sequences of ungetc() */
        struct  __sbuf _ub;     /* ungetc buffer */
        struct __sFILEX *_extra; /* additions to FILE to not break ABI */
        int     _ur;            /* saved _r when _r is counting ungetc data */

        /* tricks to meet minimum requirements even when malloc() fails */
        unsigned char _ubuf[3]; /* guarantee an ungetc() buffer */
        unsigned char _nbuf[1]; /* guarantee a getc() buffer */

        /* separate buffer for fgetln() when line crosses buffer boundary */
        struct  __sbuf _lb;     /* buffer for fgetln() */

        /* Unix stdio files get aligned to block boundaries on fseek() */
        int     _blksize;       /* stat.st_blksize (may be != _bf._size) */
        fpos_t  _offset;        /* current lseek offset (see WARNING) */
} FILE;

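It is easy to spot the member short _file; inside the FILE structure: this is the file descriptor used by Unix. Besides the necessary buffers, open flags and so on, the structure also carries "member functions" in the form of function pointers. Readers who once used TC 2.0 and looked at its <stdio.h> may be surprised at how much more there is in the Unix version.
Functions like these, which the C library offers to developers for operating on files, are collectively called buffered I/O functions. What the Unix system itself provides are unbuffered I/O functions, unbuffered in the sense that each read or write invokes a system call.
The kernel maintains a file descriptor table for every running process. A file descriptor is a non-negative integer; when a file is opened, the kernel returns a descriptor to the process. By Unix convention, descriptors 0, 1 and 2 are associated with standard input, standard output and standard error, although we can redirect them. POSIX provides the constants STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO in place of the literal numbers 0, 1 and 2, which makes code easier to read.
Historically every Unix system was expected to provide an OPEN_MAX limit as the maximum number of files a process may have open, so descriptors range from 0 to OPEN_MAX-1. Anyone who has used Linux, however, knows that the ulimit command can change the maximum number of open files, even to unlimited, so the compile-time definition of OPEN_MAX cannot be relied on to tell you the real limit. This is one of the limits mentioned in the previous chapter that cannot be fixed at compile time and has to be queried at run time.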
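Since the limit has to be discovered at run time, a query might look like the sketch below. It assumes getrlimit(RLIMIT_NOFILE) and sysconf(_SC_OPEN_MAX) are available, as they are on Linux and macOS; the helper names max_open_files and print_max_open_files are made up for this illustration.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Return the soft limit on open descriptors for this process,
 * or -1 if it is unlimited or cannot be determined.
 * (max_open_files is a hypothetical helper name.)
 */
long max_open_files(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY)
        return (long)rl.rlim_cur;

    /* Fall back to sysconf; it returns -1 when the limit is indeterminate. */
    return sysconf(_SC_OPEN_MAX);
}

/* Print the limit in readable form (also a hypothetical helper). */
void print_max_open_files(void)
{
    long n = max_open_files();

    if (n < 0)
        printf("open file limit: unlimited or indeterminate\n");
    else
        printf("open file limit: %ld (descriptors 0..%ld)\n", n, n - 1);
}
```

Running ulimit -n in the shell before starting the program should change the value reported, which is the point made above about OPEN_MAX.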

The open Function Family

int open(const char *path, int oflag, ...);
int openat(int fd, const char *path, int oflag, ...);

oflag can be set to the following constants:

O_RDONLY        open for reading only
O_WRONLY        open for writing only
O_RDWR          open for reading and writing
O_NONBLOCK      do not block on open or for data to become available
O_APPEND        append on each write
O_CREAT         create file if it does not exist
O_TRUNC         truncate size to 0
O_EXCL          error if O_CREAT and the file exists
O_SHLOCK        atomically obtain a shared lock
O_EXLOCK        atomically obtain an exclusive lock
O_NOFOLLOW      do not follow symlinks
O_SYMLINK       allow open of symlinks
O_EVTONLY       descriptor requested for event notifications only
O_CLOEXEC       mark as close-on-exec

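The two functions differ in how a relative path is resolved: open resolves it against the current working directory, while openat resolves it against the directory that fd refers to (or against the current working directory when fd is AT_FDCWD). With an absolute path, openat ignores fd and behaves like open. The oflag values can be combined with the bitwise OR operator | to form the final argument.
For the full details see the book, which covers them thoroughly.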
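The difference between the two calls can be sketched as follows. This is an illustration rather than code from the book: the helper name create_in_dir is hypothetical, and it assumes the platform defines O_DIRECTORY (POSIX.1-2008; present on Linux and macOS).

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations (openat, O_DIRECTORY) */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create (or truncate) `name` inside directory `dir` and open it for writing.
 * Returns a descriptor, or -1 with errno set.
 * (create_in_dir is a hypothetical helper name.)
 */
int create_in_dir(const char *dir, const char *name)
{
    int dirfd, fd, saved_errno;

    /* O_DIRECTORY makes open() fail if `dir` is not a directory. */
    dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;

    /* A relative `name` is resolved against dirfd, not the working directory.
     * The flags are combined with |, exactly as for open(). */
    fd = openat(dirfd, name, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    saved_errno = errno;    /* close() must not hide openat()'s error */
    close(dirfd);           /* the new descriptor does not depend on dirfd */
    errno = saved_errno;
    return fd;
}
```

Calling create_in_dir("/tmp", "demo.txt") would create /tmp/demo.txt no matter what the current working directory is.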
The book lists five constants:

O_RDONLY  open for reading only
O_WRONLY  open for writing only
O_RDWR    open for reading and writing
O_EXEC    open for execute only
O_SEARCH  open for search only

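It also states that exactly one of these five constants must be specified. When I looked at the actual header file, however, O_EXEC and O_SEARCH did not appear in it at all. The header only contains: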

/*
 * File status flags: these are used by open(2), fcntl(2).
 * They are also used (indirectly) in the kernel file structure f_flags,
 * which is a superset of the open/fcntl flags.  Open flags and f_flags
 * are inter-convertible using OFLAGS(fflags) and FFLAGS(oflags).
 * Open/fcntl flags begin with O_; kernel-internal flags begin with F.
 */
/* open-only flags */
#define O_RDONLY        0x0000          /* open for reading only */
#define O_WRONLY        0x0001          /* open for writing only */
#define O_RDWR          0x0002          /* open for reading and writing */
#define O_ACCMODE       0x0003          /* mask for above modes */

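The header does go on to define other flags that POSIX actually requires, but it guards them with the conditional #if !defined(_POSIX_C_SOURCE) || defined(_DARWIN_C_SOURCE), which files them under OS X's own extensions.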

Creating Files

int creat(const char *path, mode_t mode);

The creat() function is the same as: open(path, O_CREAT | O_TRUNC | O_WRONLY, mode);

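Comparing open and creat shows that creat has been completely superseded by open. It is a historical leftover, and because of its limitations (for example, the file is opened write-only) it is rarely used in real code. The brevity of the creat manual page also suggests that its use is not really encouraged.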

Closing Files

int close(int fildes);

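The book points out that the kernel automatically closes all open files when a process terminates, so many programs never close them explicitly. The Unix manual nevertheless recommends calling close: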

When a process exits, all associated file descriptors are freed, but since there is a limit on active descriptors per processes, the close() function call is useful when a large quantity of file descriptors are being handled. When a process forks (see fork(2)), all descriptors for the new child process reference the same objects as they did in the parent before the fork. If a new process is then to be run using execve(2), the process would normally inherit these descriptors. Most of the descriptors can be rearranged with dup2(2) or deleted with close() before the execve is attempted, but if some of these descriptors will still be needed if the execve fails, it is necessary to arrange for them to be closed if the execve succeeds. For this reason, the call ``fcntl(d, F_SETFD, 1)'' is provided, which arranges that a descriptor will be closed after a successful execve; the call ``fcntl(d, F_SETFD, 0)'' restores the default, which is to not close the descriptor.

The manual page also covers how file descriptors are inherited when a process forks.
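The fcntl(d, F_SETFD, 1) call quoted from the manual can be written a little more defensively. The sketch below uses the symbolic name FD_CLOEXEC instead of the literal 1 and keeps any other descriptor flags intact; set_cloexec is a hypothetical helper name.

```c
#include <fcntl.h>

/*
 * Mark fd so that it is closed automatically after a successful exec.
 * Returns 0 on success, -1 on error with errno set.
 * (set_cloexec is a hypothetical helper name.)
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);     /* read the current descriptor flags */

    if (flags < 0)
        return -1;

    /* OR the bit in rather than overwriting, so other flags survive. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

Passing O_CLOEXEC to open achieves the same thing in one step, which avoids a window in which another thread could fork and exec between the open and the fcntl.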

File Offset

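File offsets usually come up when learning FILE operations in C as well. The file offset is a non-negative integer, but it can be thought of as a pointer that records the number of bytes from the beginning of the file. Unless the file is opened with O_APPEND, the offset is set to 0 when it is opened.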

off_t lseek(int fildes, off_t offset, int whence);

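The whence argument takes one of three values, 0, 1 or 2, but just as with the standard file descriptors these have symbolic constants: SEEK_SET, SEEK_CUR and SEEK_END, which are short and self-explanatory.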
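Two common uses of whence can be sketched like this. The helper names are hypothetical; the only facts relied on are that lseek returns the resulting offset and fails with -1 on descriptors that cannot seek, such as pipes.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the current offset of fd, or -1 if fd cannot seek (pipe, FIFO, socket). */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);      /* move 0 bytes from where we are */
}

/*
 * Return the size of the file behind fd and put the offset back where it was.
 * Returns -1 on error. The three steps are not atomic as a group.
 */
off_t size_via_lseek(int fd)
{
    off_t here, end;

    here = lseek(fd, 0, SEEK_CUR);      /* remember the current position */
    if (here == -1)
        return -1;

    end = lseek(fd, 0, SEEK_END);       /* offset of end-of-file = size */
    if (end == -1)
        return -1;

    if (lseek(fd, here, SEEK_SET) == -1)    /* restore the position */
        return -1;
    return end;
}
```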
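The book mentions that lseek can create a hole in a file. How a hole is handled is a matter for the file system rather than the rest of the kernel: whether holes are allowed and how they are stored is up to the file system. On most current file systems, a hole reads back as null bytes.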

#include "include/apue.h"
#include <fcntl.h>

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";

int main(int argc, char *argv[])
{
    int fd;

    if ((fd = creat("file.hole", FILE_MODE)) < 0)
        err_sys("creat error");

    if (write(fd, buf1, 10) != 10)
        err_sys("buf1 write error");

    if (lseek(fd, 16384, SEEK_SET) == -1)
        err_sys("lseek error");

    if (write(fd, buf2, 10) != 10)
        err_sys("buf2 write error");

    close(fd);
    exit(0);
}

Compiling and running this produces file.hole, which we can inspect with od:

> od -c file.hole
0000000    a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
0000020   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0040000    A   B   C   D   E   F   G   H   I   J
0040012

The conversion is easy: 16384 in octal is 40000. Incidentally, od is very handy for inspecting binary files.
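od shows the logical contents, not how much disk space the hole takes. One way to compare the two is through stat, sketched below under the assumption that st_blocks counts 512-byte units. That holds on Linux and the BSDs, macOS included, but POSIX does not guarantee it. The helper name file_sizes is hypothetical.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Report the logical size of `path` and the space allocated for it on disk.
 * ASSUMPTION: st_blocks is in 512-byte units (true on Linux and the BSDs).
 * Returns 0 on success, -1 on error.
 * (file_sizes is a hypothetical helper name.)
 */
int file_sizes(const char *path, off_t *logical, off_t *allocated)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;

    *logical = st.st_size;                      /* what od and ls -l see */
    *allocated = (off_t)st.st_blocks * 512;     /* what the file system stores */
    return 0;
}
```

For file.hole the logical size is 16394 bytes (offset 16384 plus the 10 bytes of buf2). The allocated figure depends on the file system: if it stores holes sparsely, the number will usually be smaller than the logical size.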

Read and Write Functions

ssize_t read(int fildes, void *buf, size_t nbyte);
ssize_t pread(int d, void *buf, size_t nbyte, off_t offset);
ssize_t readv(int d, const struct iovec *iov, int iovcnt);

ssize_t write(int fildes, const void *buf, size_t nbyte);
ssize_t pwrite(int fildes, const void *buf, size_t nbyte, off_t offset);
ssize_t writev(int fildes, const struct iovec *iov, int iovcnt);

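The read and write families each contain three functions, but in practice read and write themselves are the ones used most; the others will be introduced later.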
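One detail worth knowing when using plain write is that it may transfer fewer bytes than requested, for example on pipes and sockets, or when interrupted by a signal. A common wrapper, sketched here with the hypothetical name write_all, loops until everything has been written.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write exactly n bytes from buf to fd, retrying after short writes
 * and after interruption by a signal.
 * Returns n on success, -1 on error with errno set.
 * (write_all is a hypothetical helper name.)
 */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);

        if (w < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;
        }
        p += w;                 /* skip past what was written */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```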

File Sharing

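If you have studied operating systems you will have met locks. A file is a resource: when one process has a file open and another process also gains access to it, data can easily be overwritten or misread. Fortunately the operating system already gives us quick and safe ways to share files.
First, here are the data structures the Unix kernel uses for open files: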

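Each process maintains its own list of open descriptors, mapping each file descriptor to a file pointer (a pointer to a file table entry).
The kernel maintains a file table for all open files. Note that this means every open: if a file is opened by several processes, there will be several file table entries for it, which is perfectly normal. Each file table entry holds the file status flags (read, write and so on), the current file offset, and a pointer to the file system level structure.
Finally there is the file system's own structure for the file, which that pointer refers to (the v-node in the book's description).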
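The three-level layout can be observed from user space: two separate open calls on the same file get two file table entries and therefore two independent offsets, while dup hands back a second descriptor for the same entry. The sketch below checks this on a regular file; the helper names shares_offset and sharing_demo are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if fd_a and fd_b share one file table entry (and so one offset),
 * 0 if they do not, -1 on error. Offsets are restored before returning.
 * Intended for descriptors on regular files.
 */
int shares_offset(int fd_a, int fd_b)
{
    off_t a0 = lseek(fd_a, 0, SEEK_CUR);
    off_t b0 = lseek(fd_b, 0, SEEK_CUR);
    off_t b1;

    if (a0 == -1 || b0 == -1)
        return -1;

    /* Move fd_a forward by one byte and see whether fd_b followed. */
    if (lseek(fd_a, a0 + 1, SEEK_SET) == -1)
        return -1;
    b1 = lseek(fd_b, 0, SEEK_CUR);

    if (lseek(fd_a, a0, SEEK_SET) == -1 || b1 == -1)    /* undo the move */
        return -1;

    return b0 == a0 && b1 == a0 + 1;
}

/* Open `path` twice, dup the first descriptor, and report both comparisons. */
int sharing_demo(const char *path, int *open_twice_shared, int *dup_shared)
{
    int a, b = -1, c = -1, rc = -1;

    a = open(path, O_RDONLY);           /* first file table entry */
    if (a >= 0) {
        b = open(path, O_RDONLY);       /* second, independent entry */
        c = dup(a);                     /* same entry as a */
    }
    if (a >= 0 && b >= 0 && c >= 0) {
        *open_twice_shared = shares_offset(a, b);
        *dup_shared = shares_offset(a, c);
        if (*open_twice_shared >= 0 && *dup_shared >= 0)
            rc = 0;
    }
    if (c >= 0) close(c);
    if (b >= 0) close(b);
    if (a >= 0) close(a);
    return rc;
}
```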

Atomic Operations

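Readers who have used databases will have heard of atomic operations. Because the operating system is multitasking, the kernel may suspend a thread after any instruction and switch to another thread, possibly in another process. There is therefore no guarantee that the result of one line of code is still valid when the next line runs, since another thread may have changed it in between. Atomic operations address this. Much like a database transaction, in which all the resources involved stay locked until commit, an atomic operation guarantees that the steps involved run without interruption.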

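Opening a file and then calling lseek to move to its end, versus opening it with O_APPEND, is exactly the difference between a non-atomic and an atomic operation: in the first case another process can append data between the lseek and the write, and the write then overwrites it.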
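A log-style append using O_APPEND might be sketched like this. The helper name append_record is hypothetical; the point is only that no separate lseek is needed, because the kernel moves to end-of-file as part of each write.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Append the string `rec` to `path`, creating the file if necessary.
 * With O_APPEND the "seek to end" and the write happen as one step.
 * Returns 0 on success, -1 on error.
 * (append_record is a hypothetical helper name.)
 */
int append_record(const char *path, const char *rec)
{
    size_t len = strlen(rec);
    ssize_t w;
    int fd, rc;

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    w = write(fd, rec, len);            /* no lseek(fd, 0, SEEK_END) beforehand */
    rc = (w == (ssize_t)len) ? 0 : -1;

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```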

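The pread and pwrite functions introduced with the read and write functions above are atomic too: they combine lseek with read or write. Note, however, that they do not update the file offset; they operate at the offset passed in, so the offset stored in the kernel is left unchanged.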

Likewise, open with both O_CREAT and O_EXCL is an atomic operation: the check for whether the file already exists happens as part of creating it.
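This is the basis of the classic lock-file idiom. A sketch, with the hypothetical helper names create_exclusive and try_claim:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create `path` only if it does not exist yet. The existence check and
 * the creation are a single step inside the kernel.
 * Returns a descriptor, or -1 (errno == EEXIST if the file was already there).
 */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}

/*
 * Lock-file style helper: returns 1 if this call created the file,
 * 0 if somebody else had created it first, -1 on any other error.
 */
int try_claim(const char *path)
{
    int fd = create_exclusive(path);

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}
```

Doing the same with a separate existence test followed by creat leaves a gap in which another process can create the file, which is exactly the race the flag pair removes.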

Duplicating File Descriptors

int dup(int fildes);
int dup2(int fildes, int fildes2);

dup returns the lowest-numbered descriptor available, whereas dup2 lets the caller choose the descriptor number. The manual says:

In dup2(), the value of the new descriptor fildes2 is specified.  If fildes and fildes2 are equal, then dup2() just returns fildes2; no other changes are made to the existing descriptor.  Otherwise, if descriptor fildes2 is already in use, it is first deallocated as if a close(2) call had been done first.

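In other words, if fildes2 is already in use it is closed first and then reassigned; if fildes equals fildes2, dup2 simply returns fildes2 without closing it.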
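A typical use of the pair is redirecting a descriptor to a file and restoring it afterwards, roughly what a shell does for > redirection. The sketch below is an illustration; redirect_to_file and restore_fd are hypothetical helper names.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Make target_fd refer to `path` (created or truncated).
 * Returns a duplicate of the old target_fd so it can be restored later,
 * or -1 on error.
 */
int redirect_to_file(int target_fd, const char *path)
{
    int saved, fd;

    saved = dup(target_fd);             /* lowest free number, same open file */
    if (saved < 0)
        return -1;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        close(saved);
        return -1;
    }

    /* dup2 closes whatever target_fd referred to, then points it at the file. */
    if (dup2(fd, target_fd) < 0) {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                          /* target_fd keeps the file open */
    return saved;
}

/* Undo redirect_to_file(). Returns 0 on success, -1 on error. */
int restore_fd(int target_fd, int saved)
{
    if (dup2(saved, target_fd) < 0)
        return -1;
    close(saved);
    return 0;
}
```

With target_fd set to STDOUT_FILENO, everything printed between the two calls ends up in the file, provided stdio buffers are flushed before restoring.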

Syncing Data to Disk

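To keep disk I/O fast, Unix systems maintain a buffer cache in the kernel, and most of the time we are also using buffered I/O functions. Sometimes, though, we need the buffered data written to disk immediately. Unix provides three functions for this: sync, fsync and fdatasync. At the time of writing, however, FreeBSD-derived Unix implementations, Mac OS X included, do not provide fdatasync. For details see the book and the Unix manual.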

void sync(void);
int fsync(int fd);

File Control Function

int fcntl(int fildes, int cmd, ...);

The book lists 11 cmd values; modern Unix implementations define more besides these 11, which are not discussed here.

#include "include/apue.h"
#include <fcntl.h>

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";

int main(int argc, char *argv[])
{
    int val;

    if (argc != 2)
        err_quit("usage: a.out <descriptor#>");

    if ((val = fcntl(atoi(argv[1]), F_GETFL)) < 0)
        err_sys("fcntl error for fd %d", atoi(argv[1]));

    switch (val & O_ACCMODE) {
        case O_RDONLY:
            printf("read only");
            break;
        case O_WRONLY:
            printf("write only");
            break;
        case O_RDWR:
            printf("read write");
            break;

        default:
            err_dump("unknown access mode");
    }

    if (val & O_APPEND)
        printf(", append");
    if (val & O_NONBLOCK)
        printf(", nonblocking");
    if (val & O_SYNC)
        printf(", synchronous writes");
#if !defined(_POSIX_C_SOURCE) && defined(O_FSYNC) && (O_FSYNC != O_SYNC)
    if (val & O_FSYNC)
        printf(", synchronous writes");
#endif

    putchar('\n');
    exit(0);
}

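The code is short and easy to follow, but readers unfamiliar with bit manipulation may find parts of it puzzling. O_ACCMODE, for example, is a mask: it has no meaning of its own and exists to extract the access-mode bits quickly. The access mode is a two-bit value rather than a set of independent flags, which is why the code compares val & O_ACCMODE against each mode in a switch. For the remaining flags, each bit has its own meaning, with 1 meaning the switch is on and 0 meaning it is off. In the header examined here O_APPEND is binary 1000 and O_NONBLOCK is binary 100; each occupies a single bit, so its state can be read with a bitwise AND.
The two wrapper functions set_fl and clr_fl that appear later in the book also come down to bit operations: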

val |= flags;
val &= ~flags;

The first is a bitwise OR; the second is a bitwise AND with the complement of the flags. Both are very handy little techniques.
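Put together with fcntl, the two operations give the following pair. This is a sketch in the spirit of the book's set_fl and clr_fl, not a copy of them: the book's versions report errors through err_sys, whereas these return -1.

```c
#include <fcntl.h>

/* Turn on the file status flags in `flags` for fd. Returns 0 or -1. */
int set_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);    /* fetch the current flags first */

    if (val < 0)
        return -1;

    val |= flags;                       /* OR: switch the requested bits on */
    return fcntl(fd, F_SETFL, val) < 0 ? -1 : 0;
}

/* Turn off the file status flags in `flags` for fd. Returns 0 or -1. */
int clr_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);

    if (val < 0)
        return -1;

    val &= ~flags;                      /* AND with complement: switch them off */
    return fcntl(fd, F_SETFL, val) < 0 ? -1 : 0;
}
```

Fetching the flags before setting them matters: calling F_SETFL with only the new flag would silently clear every other flag that was set.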

Device Control Function

int ioctl(int fildes, unsigned long request, ...);

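According to the Unix manual, ioctl is used to set and query parameters of underlying devices, and it can control special character device files. In practice it is the catchall for miscellaneous I/O operations. As the documentation notes, terminal handling is probably where it is used most, but as the standards have evolved, more dedicated terminal functions have been introduced to replace ioctl, so the function itself is seldom needed.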
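One terminal operation that is still commonly done through ioctl is asking for the window size. The sketch below assumes the TIOCGWINSZ request and struct winsize from <sys/ioctl.h>, which Linux and macOS both provide although POSIX does not specify them; terminal_size is a hypothetical helper name.

```c
#include <sys/ioctl.h>

/*
 * Store the terminal's size in *rows and *cols.
 * Returns 0 on success, -1 if fd is not a terminal or the request fails.
 * ASSUMPTION: TIOCGWINSZ is available (Linux, macOS, the BSDs).
 * (terminal_size is a hypothetical helper name.)
 */
int terminal_size(int fd, int *rows, int *cols)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) < 0)
        return -1;

    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```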

File Descriptor Devices

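Most Unix implementations provide a /dev/fd directory containing a number of files; opening one of them is equivalent to duplicating the corresponding file descriptor. Because Linux and other Unix systems implement this differently in many respects, these device files need to be handled with great care. In real development there are better ways to duplicate a descriptor; as the book says, /dev/fd is mostly used from shell scripts.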