File I/O

IO: input/output

In this article, IO refers specifically to file I/O.

1. Opening and closing files

In [1]: help(open)

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise IOError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
...
In [2]: f = open('./hello.py')

In [3]: f
Out[3]: <_io.TextIOWrapper name='./hello.py' mode='r' encoding='UTF-8'>

In [4]: f.read()
Out[4]: ''

In [5]: f.close()

In [6]: f.read()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-571e9fb02258> in <module>()
----> 1 f.read()

ValueError: I/O operation on closed file.

In [7]:
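
The session above closes the file by hand. A minimal sketch of the same open/read/close cycle using a `with` block, which closes the file even if an exception is raised inside it. The scratch path is an assumption made for illustration so the example does not touch a real hello.py:

```python
import os
import tempfile

# Hypothetical scratch file, not the hello.py from the session above.
path = os.path.join(tempfile.mkdtemp(), "hello.py")

with open(path, "w") as f:          # closed automatically when the block exits
    f.write("print('hello')\n")

with open(path) as f:               # default mode is 'r' (read-only, text)
    content = f.read()

print(content)
print(f.closed)                     # the file object is closed after the block
```

Reading from `f` after the block would raise the same `ValueError: I/O operation on closed file.` shown in In [6].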

2. Operations on file objects

In [7]: f = open('./hello.py')  

In [8]: f.write('test')           # mode='r' is the default: read-only
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-8-c2c133bc4fec> in <module>()
----> 1 f.write('test')

UnsupportedOperation: not writable

In [9]: f.close()
In [10]: f = open('./hello.py', mode='r')

In [11]: f.write('test')
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-11-c2c133bc4fec> in <module>()
----> 1 f.write('test')

UnsupportedOperation: not writable

In [12]: f.read()
Out[12]: ''

In [13]: f.close()

In [14]:
In [14]: f = open('./hello.py', mode='w')  # mode='w': not readable; the file is truncated on open, even if nothing is written afterwards

In [15]: %cat hello.py

In [16]: f.read()
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-16-571e9fb02258> in <module>()
----> 1 f.read()

UnsupportedOperation: not readable         # mode='w' also creates the file if it does not exist

In [17]: f.write('aaaaa')
Out[17]: 5

In [18]: f.close()

In [19]: %cat hello.py
aaaaa
In [20]: f = open('./hello.py', mode='w')

In [21]: f.close()

In [22]: %cat hello.py

In [23]:
In [28]: f = open('./hello.py', mode='x')   # mode='x' raises an error when the file already exists
---------------------------------------------------------------------------
FileExistsError                           Traceback (most recent call last)
<ipython-input-28-d2f14f50d8d0> in <module>()
----> 1 f = open('./hello.py', mode='x')

FileExistsError: [Errno 17] File exists: './hello.py'

In [29]: %rm hello.py

In [30]: f = open('./hello.py', mode='x')   # the file was just removed, so 'x' creates it; not readable, writable

In [31]: f.read()
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-31-571e9fb02258> in <module>()
----> 1 f.read()

UnsupportedOperation: not readable

In [32]: f.write('abcd')
Out[32]: 4

In [33]: f.close()

In [34]: %cat hello.py
abcd
In [35]:
In [40]: %rm hello.py

In [41]: f = open('./hello.py', mode='a')   # mode='a': not readable, writable; writes are appended

In [42]: f.write('abcd')
Out[42]: 4

In [43]: f.read()
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-43-571e9fb02258> in <module>()
----> 1 f.read()

UnsupportedOperation: not readable

In [44]: f.close()

In [45]: %cat hello.py
abcd
In [46]:

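Summary:

Modes that control reading and writing:

  • r read-only; the file must exist
  • w write-only; truncates the file, and creates it if it does not exist
  • x write-only; the file must not exist
  • a write-only; appends to the end of the file, and creates it if it does not exist

Looked at another way:

  • Reading and writing: only r is readable but not writable; all the others are writable but not readable
  • Missing file: only r raises an exception; all the others create a new file
  • Existing file: only x raises an exception
  • Effect on existing content: only w truncates the file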
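
The four rules above can be checked in one short script. This is a sketch under the assumption of a throwaway temporary directory; the file names `missing.txt` and `existing.txt` are made up for the example:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()                    # hypothetical scratch directory
missing = os.path.join(workdir, "missing.txt")
existing = os.path.join(workdir, "existing.txt")

with open(existing, "w") as f:                  # 'w' creates the file
    f.write("abcd")

results = {}

# 'r' on a missing file raises FileNotFoundError.
try:
    open(missing, "r")
except FileNotFoundError:
    results["r_missing"] = "FileNotFoundError"

# 'x' on an existing file raises FileExistsError.
try:
    open(existing, "x")
except FileExistsError:
    results["x_existing"] = "FileExistsError"

# 'a' keeps the existing content and appends to it.
with open(existing, "a") as f:
    f.write("efgh")
with open(existing) as f:
    results["after_a"] = f.read()

# 'w' truncates the file as soon as it is opened, even with no write.
with open(existing, "w"):
    pass
results["size_after_w"] = os.path.getsize(existing)

print(results)
```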
In [46]: f = open('./hello.py', mode='rt')  # t opens the file in text mode

In [47]: s = f.read()

In [48]: s
Out[48]: 'abcd'

In [49]: f.close()

In [50]: f = open('./hello.py', mode='rb')  # b opens the file in binary (bytes) mode

In [51]: s = f.read()

In [52]: s
Out[52]: b'abcd'

In [53]: f.close()

In [54]:
  • mode=t  operates on characters (str)
  • mode=b  operates on bytes
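
A small sketch of the character/byte distinction, assuming UTF-8 encoding and a scratch file: the same four-character string occupies twelve bytes on disk, which is why text-mode `write` reports 4 while the file pointer later reports 12.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")   # hypothetical scratch file
text = "马哥教育"

with open(path, "wt", encoding="utf-8") as f:
    written_chars = f.write(text)       # text mode counts characters

with open(path, "rb") as f:
    raw = f.read()                      # binary mode returns bytes

print(written_chars)                    # 4 characters
print(len(raw))                         # 12 bytes, 3 per character in UTF-8
print(raw.decode("utf-8") == text)      # decoding the bytes gives the text back
```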
In [55]: f = open('./hello.py', mode='r+')  # mode='r+': readable and writable; writes go to the current position, which is EOF here after read()

In [56]: f.read()
Out[56]: 'abcd'

In [57]: f.write('efgh')
Out[57]: 4

In [58]: f.read()
Out[58]: ''

In [59]: f.close()

In [60]: %cat hello.py
abcdefgh
In [61]:
In [62]: f = open('./hello.py', mode='w+')    # mode='w+': readable and writable, but truncates the file on open

In [63]: f.read()
Out[63]: ''

In [64]: f.write('马哥教育')
Out[64]: 4

In [65]: f.close()

In [66]: %cat hello.py
马哥教育
In [67]:
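  • Because w+ truncates the file first, r+ is the usual choice for opening an existing file for both reading and writing
  • Exactly one of r, w, x, a must appear in the mode string; + cannot be used on its own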
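
Two points here are easy to verify with a sketch (scratch file assumed). First, `r+` does not append by itself: in In [55] the write landed at the end only because `read()` had already moved the pointer to EOF, and without that read the write overwrites from position 0. Second, mode strings that break the "exactly one of r, w, x, a" rule are rejected with `ValueError`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")   # hypothetical scratch file
with open(path, "w") as f:
    f.write("abcdefgh")

# 'r+' with no prior read(): the pointer starts at 0, so the write overwrites.
with open(path, "r+") as f:
    f.write("XY")
with open(path) as f:
    after_overwrite = f.read()

# Invalid mode strings raise ValueError before any file object is returned.
rejected = []
for bad_mode in ("rw", "+"):
    try:
        open(path, bad_mode)
    except ValueError:
        rejected.append(bad_mode)

print(after_overwrite)
print(rejected)
```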
     

3. The file pointer

In [67]: help(f.tell)

Help on built-in function tell:

tell() method of _io.TextIOWrapper instance
    Return current stream position.
~

(END)
In [69]: f = open('./hello.py')             # mode=rt   

In [70]: f.tell()
Out[70]: 0

In [71]: f.read()
Out[71]: '马哥教育'

In [72]: f.tell()
Out[72]: 12

In [73]: f.close()

In [74]: f = open('./hello.py', mode='a')   

In [75]: f.tell()
Out[75]: 12

In [76]: f.close()

In [77]:
In [78]: f = open('./hello.py')

In [79]: help(f.seek)

Help on built-in function seek:

seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
    Change stream position.
    
    Change the stream position to the given byte offset. The offset is
    interpreted relative to the position indicated by whence.  Values
    for whence are:
    
    * 0 -- start of stream (the default); offset should be zero or positive
    * 1 -- current stream position; offset may be negative
    * 2 -- end of stream; offset is usually negative
    
    Return the new absolute position.
~
(END)

In [91]: f = open('./hello.py', 'w+')

In [92]: f.write('magedu')
Out[92]: 6

In [93]: f.close()

In [94]: %cat hello.py
magedu
In [95]: f = open('./hello.py')

In [96]: f.tell()
Out[96]: 0

In [97]: f.read()
Out[97]: 'magedu'

In [98]: f.tell()
Out[98]: 6

In [99]: f.seek(0, 0)
Out[99]: 0

In [100]: f.tell()
Out[100]: 0

In [101]: f.read()
Out[101]: 'magedu'

In [102]: f.seek(2, 0)
Out[102]: 2

In [103]: f.tell()
Out[103]: 2

In [104]: f.read()
Out[104]: 'gedu'

In [105]: f.seek(0, 0)
Out[105]: 0

In [106]: f.read()
Out[106]: 'magedu'

In [107]: f.seek(4, 1)
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-107-d74820aa25fe> in <module>()
----> 1 f.seek(4, 1)

UnsupportedOperation: can't do nonzero cur-relative seeks

In [108]: f.seek(0, 1)
Out[108]: 6

In [109]: f.seek(0, 2)
Out[109]: 6

In [110]: f.tell()
Out[110]: 6

In [111]: f.close()

In [112]:

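Summary:

mode=t

  • The file pointer is measured in bytes, not characters
  • When whence (the second argument) is start (0, the default), the pointer can move to any position; offset (the first argument) must be zero or a positive integer, and it should fall on a character boundary (the documentation only guarantees values returned by tell(), or zero)
  • When whence is current (1) or end (2), offset can only be 0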
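
A sketch of the text-mode rules, assuming a UTF-8 scratch file: a position obtained from `tell()` can be passed back to `seek()`, an offset of 0 relative to the end is accepted, and a nonzero end-relative offset raises `io.UnsupportedOperation`.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")   # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("马哥教育")

error = None
with open(path, encoding="utf-8") as f:
    f.read(1)                           # consume one character
    cookie = f.tell()                   # position just after that character
    rest = f.read()
    f.seek(cookie)                      # values returned by tell() are safe to seek to
    rest_again = f.read()
    end = f.seek(0, io.SEEK_END)        # offset 0 with SEEK_END is allowed
    try:
        f.seek(-3, io.SEEK_END)         # nonzero end-relative seek in text mode
    except io.UnsupportedOperation as exc:
        error = str(exc)

print(rest, rest_again, end)
print(error)
```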
In [113]: f = open('./hello.py', mode='w')

In [114]: f.write('马哥教育')
Out[114]: 4

In [115]: f.close()

In [116]: f = open('./hello.py', mode='rb')

In [117]: f.tell()
Out[117]: 0

In [118]: f.seek(3)
Out[118]: 3

In [119]: f.read()
Out[119]: b'\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'

In [120]: f.seek(3)
Out[120]: 3

In [121]: f.read().decode()
Out[121]: '哥教育'

In [122]: f.seek(3)
Out[122]: 3

In [123]: f.seek(3, 1)
Out[123]: 6

In [124]: f.seek(3, 2)
Out[124]: 15

In [125]: f.read()
Out[125]: b''

In [126]: f.read().decode()
Out[126]: ''

In [127]: f.seek(-3, 2)
Out[127]: 9

In [128]: f.read().decode()
Out[128]: '育'

In [129]: f.seek(13)
Out[129]: 13

In [130]: f.seek(-13, 2)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-130-6a41921c313a> in <module>()
----> 1 f.seek(-13, 2)

OSError: [Errno 22] Invalid argument

In [131]:

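Summary:

mode=b

  • The file pointer moves in bytes
  • When whence is start (0), the pointer can move to any position; offset must be zero or a positive integer, and may point past EOF
  • When whence is current (1) or end (2), offset can be any integer, as long as the resulting position is not negative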

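Moving the file pointer:

  • The file pointer is measured in bytes
  • The tell method returns the current position of the file pointer
  • The seek method moves the file pointer
  • The whence values start (0), current (1) and end (2) have named constants, available in the os and io modules: SEEK_SET (0), SEEK_CUR (1), SEEK_END (2)
    • SEEK_SET (0) moves offset bytes forward from the start of the file
    • SEEK_CUR (1) moves offset bytes from the current position
    • SEEK_END (2) moves offset bytes from EOF (end of file)
  • offset is an integer
  • When the mode is t and whence is SEEK_CUR or SEEK_END, offset can only be 0
  • The file pointer cannot be negative
  • Reading starts at the file pointer and moves forward
  • Writing (in modes other than a) starts at the file pointer and moves forward
  • When the file is opened in mode a, writes always start at EOF, wherever the file pointer is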
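
The binary-mode rules can be sketched with the named constants instead of the bare numbers 0, 1 and 2. The scratch file and its contents are assumptions for the example; the last part shows that a write in mode a lands at EOF even after seeking back to the start.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.bin")   # hypothetical scratch file
with open(path, "wb") as f:
    f.write(b"magedu")

with open(path, "rb") as f:
    f.seek(2, os.SEEK_SET)          # 2 bytes from the start
    from_start = f.read()           # reads to EOF, pointer is now 6
    f.seek(-4, os.SEEK_CUR)         # 4 bytes back from the current position
    from_current = f.read(2)
    f.seek(-2, os.SEEK_END)         # 2 bytes before EOF
    from_end = f.read()

with open(path, "ab") as f:
    f.seek(0)                       # move the pointer to the start...
    f.write(b"!")                   # ...the write still goes to EOF
with open(path, "rb") as f:
    final = f.read()

print(from_start, from_current, from_end, final)
```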

I. Flushing to disk (flush)

In [132]: f = open('./hello.py', 'wb')

In [133]: f.write(b'abc')
Out[133]: 3

In [134]: %cat hello.py

In [135]: f.flush()          # flush the buffer to disk

In [136]: %cat hello.py
abc
In [137]: f.write(b'apapapa')
Out[137]: 7

In [138]: %cat hello.py
abc
In [139]: f.close()          # close() flushes automatically

In [140]: %cat hello.py
abcapapapa
In [141]:
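
The same effect can be observed without `%cat` by checking the file size. This is a sketch on a scratch file with default buffering. Note that `flush` hands the data to the operating system; the `os.fsync` call is included as an extra step for cases where the data must also reach the storage device, and is not part of the session above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.bin")   # hypothetical scratch file

f = open(path, "wb")
f.write(b"abc")
size_before_flush = os.path.getsize(path)   # the 3 bytes are still in Python's buffer

f.flush()                                   # push the buffer to the operating system
size_after_flush = os.path.getsize(path)

os.fsync(f.fileno())                        # ask the OS to write its cache to the device
f.close()                                   # close() would also have flushed

print(size_before_flush, size_after_flush)
```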

II. buffering (the buffer)

In [145]: f = open('./hello.py', 'wb', buffering=5)

In [146]: f.write(b'abc')
Out[146]: 3

In [147]: %cat hello.py

In [148]: f.write(b'abc')
Out[148]: 3

In [149]: %cat hello.py
abc
In [150]: f.close()

In [151]: %cat hello.py
abcabc
In [152]: f = open('./hello.py', 'wb', buffering=5)

In [153]: f.write(b'a')
Out[153]: 1

In [154]: f.write(b'b')
Out[154]: 1

In [155]: f.write(b'q')
Out[155]: 1

In [156]: f.write(b'ab')
Out[156]: 2

In [157]: f.write(b'1')
Out[157]: 1

In [158]: %cat hello.py
abqab
In [159]: f.close()

In [160]:
In [161]: f = open('./hello.py', 'wb', buffering=0)

In [162]: f.write(b'a')
Out[162]: 1

In [163]: %cat hello.py
a
In [164]: f.write(b'b')
Out[164]: 1

In [165]: %cat hello.py
ab
In [166]: f.close()

In [167]: f = open('./hello.py', 'wb', buffering=0)

In [168]: f.write(b'abcdefg')
Out[168]: 7

In [169]: %cat hello.py
abcdefg
In [170]:
In [170]: import io

In [171]: io.DEFAULT_BUFFER_SIZE
Out[171]: 8192

In [172]: f = open('./hello.py', 'wt', buffering=1)   # line buffering applies only when buffering=1

In [173]: f.write('abc')
Out[173]: 3

In [174]: %cat hello.py

In [175]: f.write('\n')
Out[175]: 1

In [176]: %cat hello.py
abc

In [177]: f.close()
In [185]: f = open('./hello.py', 'wt', buffering=5)

In [186]: f.write('abc')
Out[186]: 3

In [187]: %cat hello.py

In [188]: f.write('\n')
Out[188]: 1

In [189]: %cat hello.py

In [190]: f.write('a' * io.DEFAULT_BUFFER_SIZE)
Out[190]: 8192

In [191]: f.close()

In [192]: f = open('./hello.py', 'wt', buffering=5)

In [193]: f.write('a' * io.DEFAULT_BUFFER_SIZE)
Out[193]: 8192

In [194]: %cat hello.py

In [195]: f.write('b')
Out[195]: 1

In [196]: %cat hello.py

In [197]: f.close()

In [198]:

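In text mode with buffering > 1, the buffer size observed above is io.DEFAULT_BUFFER_SIZE; once the buffer is full, its contents are flushed to disk together with the data from the current write.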

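Summary:

  • buffering = -1
    • Binary mode: io.DEFAULT_BUFFER_SIZE
    • Text mode: io.DEFAULT_BUFFER_SIZE
  • buffering = 0
    • Binary mode: buffering is turned off (unbuffered)
    • Text mode: not allowed
  • buffering = 1
    • Binary mode: line buffering is not available; the default buffer size is used
    • Text mode: line buffering
  • buffering > 1
    • Binary mode: a buffer of buffering bytes
    • Text mode: io.DEFAULT_BUFFER_SIZE
  • When the buffer is flushed
    • Binary mode: if the buffer does not have room for the incoming bytes, it is flushed first and the incoming bytes are then placed in the buffer; if the incoming bytes are larger than the whole buffer, they are flushed straight through
    • Text mode: with line buffering, a newline triggers a flush; without line buffering, when the incoming data plus the buffered data exceed the buffer size, the buffer and the incoming data are flushed together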

The flush and close methods force the buffer to be flushed.
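
Three of the cases in the summary can be sketched by watching the file size on a scratch file: unbuffered text mode is rejected, unbuffered binary writes are visible immediately, and line-buffered text is flushed when a newline is written. Passing `newline="\n"` is an assumption added here so that the byte count is the same on every platform.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")   # hypothetical scratch file

# buffering=0 is not allowed in text mode.
text_unbuffered_error = None
try:
    open(path, "wt", buffering=0)
except ValueError as exc:
    text_unbuffered_error = str(exc)

# buffering=0 in binary mode: every write goes straight to the OS.
f = open(path, "wb", buffering=0)
f.write(b"a")
unbuffered_size = os.path.getsize(path)
f.close()

# buffering=1 in text mode: data is held until a newline is written.
f = open(path, "wt", buffering=1, newline="\n")
f.write("abc")
size_before_newline = os.path.getsize(path)
f.write("\n")
size_after_newline = os.path.getsize(path)
f.close()

print(text_unbuffered_error)
print(unbuffered_size, size_before_newline, size_after_newline)
```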

 

In [217]: f = open('hello.py', 'w+', buffering=1)

In [218]: f.write('abcd')
Out[218]: 4

In [219]: f.read()
Out[219]: ''

In [220]: f.seek(0)
Out[220]: 0

In [221]: f.tell()
Out[221]: 0

In [222]: f.read()
Out[222]: 'abcd'

In [223]: f.close()
  • The readline method
In [242]: f = open('hello.py', 'w+', buffering=1)

In [243]: f.write('abc')
Out[243]: 3

In [244]: f = open('hello.py', 'r+')

In [245]: f.write('1234\n')
Out[245]: 5

In [246]: f.write('hjhk\n')
Out[246]: 5

In [247]: f.write('op976\n')
Out[247]: 6

In [248]: f.seek(0)
Out[248]: 0

In [249]: f.read(4)       # reads follow the file pointer
Out[249]: '1234'

In [250]: f.read(4)
Out[250]: '\nhjh'

In [251]: f.read(-1)
Out[251]: 'k\nop976\n'

In [252]: f.seek(0)
Out[252]: 0

In [253]: f.read(0)
Out[253]: ''

In [254]: f.readline()
Out[254]: '1234\n'

In [255]: help(f.readline)

Help on built-in function readline:

readline(size=-1, /) method of _io.TextIOWrapper instance
    Read until newline or EOF.
    
    Returns an empty string if EOF is hit immediately.
~

(END)
In [256]: f.seek(0)
Out[256]: 0

In [257]: f.readline(2)
Out[257]: '12'

In [258]: f.readlines()
Out[258]: ['34\n', 'hjhk\n', 'op976\n']

In [259]: f.seek(0)
Out[259]: 0

In [260]: for line in f:
     ...:     print(line)
     ...:     
1234

hjhk

op976


In [261]: f.writable()
Out[261]: True

In [262]: f.write('bnm\n')
Out[262]: 4

In [263]: f.writelines(['bnm\n', '098765fgh\n'])

In [264]: f.flush()

In [265]: %cat hello.py
1234
hjhk
op976
bnm
bnm
098765fgh

In [266]: f.seekable()
Out[266]: True

In [267]:
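
A compact sketch of the line-oriented calls used in this session, on a scratch file with the same three lines. `print(line)` above shows blank lines between entries because each line keeps its trailing newline; stripping it, as in the last step here, is one common way to avoid that.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")   # hypothetical scratch file

with open(path, "w") as f:
    # writelines does not add newlines; each item must carry its own.
    f.writelines(["1234\n", "hjhk\n", "op976\n"])

with open(path) as f:
    first = f.readline()        # one full line, newline included
    part = f.readline(2)        # at most 2 characters of the next line
    remaining = f.readlines()   # everything from the pointer to EOF, as a list

with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]   # iterate line by line

print(repr(first), repr(part), remaining)
print(stripped)
```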

III. The sys module

In [272]: import sys

In [273]: sys.stderr.seekable()     # check whether the stream supports seek
Out[273]: False

In [274]: f.fileno()                # returns the underlying file descriptor; raises OSError if the object does not use one
Out[274]: 12

In [275]: f.isatty()                # whether this is an interactive (tty) stream; False if that cannot be determined
Out[275]: False

In [276]: f.name                    # the name of f
Out[276]: 'hello.py'

In [277]: f.buffer                  # the underlying binary buffered object
Out[277]: <_io.BufferedRandom name='hello.py'>

In [278]: f.truncate()    # resizes the file to the current tell() position by default; the file pointer does not move; returns the new size
Out[278]: 34

In [279]: f.close()
In [322]: f = open('hello.py', 'r+b')

In [323]: f1 = open('test.txt', 'r+b')

In [324]: f1.write(b'lkjhgfdsajytr')
Out[324]: 13

In [325]: f1.close()

In [326]: f.seek(0)
Out[326]: 0

In [327]: f.readinto(f1)    # readinto(buffer, /) is a method of _io.BufferedRandom instances
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-327-d3164cf92344> in <module>()
----> 1 f.readinto(f1)

TypeError: readinto() argument must be read-write bytes-like object, not _io.BufferedRandom
# readinto() requires a writable bytes-like object such as a bytearray, not another file object
In [328]: 

In [328]: f.close()

In [329]: f1.close()

In [330]: f1 = open('test.txt', 'r+b')

In [331]: f = open('hello.py', 'r+b')

In [332]: buffer = bytearray()

In [333]: f.readinto(buffer)     # an empty bytearray has no room, so 0 bytes are read
Out[333]: 0

In [334]: f.close()

In [335]: f1.close()

In [336]:
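
`readinto` returned 0 above because `bytearray()` has length zero. A sketch with a pre-sized buffer, assuming a scratch file holding the same 13 bytes: each call fills the buffer with as many bytes as it can and returns the count.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")    # hypothetical scratch file
with open(path, "wb") as f:
    f.write(b"lkjhgfdsajytr")                           # 13 bytes

buffer = bytearray(8)                                   # room for 8 bytes per call
with open(path, "rb") as f:
    n1 = f.readinto(buffer)                             # fills the whole buffer
    first_chunk = bytes(buffer[:n1])
    n2 = f.readinto(buffer)                             # only 5 bytes are left
    second_chunk = bytes(buffer[:n2])

print(n1, first_chunk)
print(n2, second_chunk)
```

Slicing with the returned count matters on the second call: the tail of the buffer still holds bytes from the first read.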