1. Opening and closing files
1.1 Opening and closing a file
In [1]: help(open)
Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise IOError upon failure.

    ========= ===============================================================
    Character Meaning
    --------- ---------------------------------------------------------------
    'r'       open for reading (default)
    'w'       open for writing, truncating the file first
    'x'       create a new file and open it for writing
    'a'       open for writing, appending to the end of the file if it exists
    'b'       binary mode
    't'       text mode (default)
    '+'       open a disk file for updating (reading and writing)
    'U'       universal newline mode (deprecated)
    ========= ===============================================================

In [6]: f = open("/tmp/shell/test.txt")    # open a file and get a file object

In [7]: type(f)
Out[7]: _io.TextIOWrapper

In [8]: f
Out[8]: <_io.TextIOWrapper name='/tmp/shell/test.txt' mode='r' encoding='UTF-8'>

In [9]: f.mode    # the mode the file object was opened with
Out[9]: 'r'

In [11]: f.name    # the file name
Out[11]: '/tmp/shell/test.txt'

In [13]: f.read()    # read the file content
Out[13]: 'Hello World!\nI love python\n'

In [15]: f.readable()    # is it readable?
Out[15]: True

In [16]: f.writable()    # is it writable?
Out[16]: False

In [17]: f.closed    # is the file object closed?
Out[17]: False

In [20]: f.close()    # close the file object

In [21]: f.name
Out[21]: '/tmp/shell/test.txt'

In [22]: f.read()    # the content can no longer be read after closing
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-bacd0e0f09a3> in <module>()
----> 1 f.read()

ValueError: I/O operation on closed file.

In [25]: f.closed
Out[25]: True
The operations a file object supports depend on the mode it was opened with.
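1.2 The mode parameter of open() in detail
1) Modes that control reading and writing
'r': i.e. mode='r', the default. Opens read-only, not writable; raises FileNotFoundError if the file does not exist
'w': opens write-only, not readable; truncates the existing file (even if nothing is written after opening); creates the file if it does not exist
'x': creates a new file only, opens write-only, not readable; raises FileExistsError if the file already exists
'a': appends to the end of the file, write-only, not readable; creates the file if it does not exist
In terms of reading and writing: only r is readable and not writable; all the others are writable and not readable
When the file does not exist: only r raises an exception; all the others create a new file
When the file exists: only x raises an exception
In terms of the effect on existing content: only w truncates the file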
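The four read/write modes can be compared side by side. The following is a minimal sketch, assuming a throwaway temporary directory and a hypothetical file name demo.txt; it only records which exception each mode raises and what ends up on disk.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # 'r' on a missing file raises FileNotFoundError.
    try:
        open(path, "r")
        r_missing = "opened"
    except FileNotFoundError:
        r_missing = "FileNotFoundError"

    # 'x' creates the file; a second 'x' on the same path raises FileExistsError.
    with open(path, "x") as f:
        f.write("first\n")
    try:
        open(path, "x")
        x_existing = "opened"
    except FileExistsError:
        x_existing = "FileExistsError"

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a") as f:
        f.write("second\n")
    with open(path) as f:
        after_append = f.read()

    # 'w' truncates as soon as the file is opened, even if nothing is written.
    open(path, "w").close()
    size_after_w = os.path.getsize(path)

print(r_missing, x_existing, repr(after_append), size_after_w)
```

The last step is the one that most often surprises people: simply opening with 'w' and closing again leaves an empty file.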
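2) Modes that control how content is handled
't': text mode, the default; reads and writes use str, and operations work on characters
'b': binary mode; reads and writes use bytes, and operations work on bytes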
In [85]: f = open('/root/1.txt', mode='w')

In [86]: f.write('马哥Python')
Out[86]: 8    # written as characters: 8 characters

In [87]: f.close()

In [88]: cat /root/1.txt
马哥Python

In [92]: f = open('/root/1.txt', mode='wb')

In [93]: f.write('马哥 Python')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-93-582947144dc2> in <module>()
----> 1 f.write('马哥 Python')

TypeError: a bytes-like object is required, not 'str'

In [94]: f.write('马哥 Python'.encode())
Out[94]: 13    # written as bytes: 13 bytes

In [95]: f.close()

In [96]: f = open('/root/1.txt')

In [97]: f.read()
Out[97]: '马哥 Python'

In [98]: f.close()

In [99]: f = open('/root/1.txt', mode='rb')

In [100]: f.read()
Out[100]: b'\xe9\xa9\xac\xe5\x93\xa5 Python'
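Notes:
r and w cannot be combined, so mode='rw' is invalid. mode must contain exactly one of r, w, a, x; if mode is omitted it defaults to 'rt'
'+': readable and writable; + cannot be used on its own.
It adds the missing capability: a read-only mode also becomes writable, a write-only mode also becomes readable, and nothing else about the mode changes
r+: read and write; writes overwrite from the current pointer position
w+: read and write; the file is truncated first
a+: read and write; writes are appended at the end of the file, even if the pointer was moved before writing
'U': deprecated (and removed in Python 3.11)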
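The difference between r+, a+ and w+ is easiest to see on the same starting content. This is a small sketch under the assumption of a temporary file named plus.txt (a hypothetical name) that initially holds "abcdef".

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "plus.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("abcdef")

    # r+: the pointer starts at 0, so writing overwrites in place.
    with open(path, "r+") as f:
        f.write("XY")
        f.seek(0)
        after_r_plus = f.read()

    # a+: the write lands at the end even after seeking to the start.
    with open(path, "a+") as f:
        f.seek(0)
        f.write("!")
        f.seek(0)
        after_a_plus = f.read()

    # w+: the file is truncated on open, but can be read back after writing.
    with open(path, "w+") as f:
        f.write("new")
        f.seek(0)
        after_w_plus = f.read()

print(after_r_plus, after_a_plus, after_w_plus)
```

In each case seek(0) is needed before read(), because the write leaves the pointer at the end of what was just written.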
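1.3 The file position pointer
When open() opens a file, the file object keeps a pointer to a position in the file
Reads and writes always start at the pointer and move it forward
With mode=r (or w, which truncates the file first), the pointer starts at 0, the beginning of the file
With mode=a, the pointer starts at EOF (End Of File)
Checking the current pointer position: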
In [7]: help(f.tell)
Help on built-in function tell:

tell() method of _io.TextIOWrapper instance
    Return current stream position.

In [1]: f = open('/root/passwd')

In [2]: f.tell()
Out[2]: 0

In [3]: f.readline()
Out[3]: 'root:x:0:0:root:/root:/bin/bash\n'

In [4]: f.tell()
Out[4]: 32

In [5]: f.readline()
Out[5]: 'bin:x:1:1:bin:/bin:/sbin/nologin\n'

In [6]: f.tell()
Out[6]: 65
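Moving the pointer:
seek() takes 2 parameters:
cookie (the offset): where to move to
whence: the position the offset is measured from; it has 3 possible values
0: the start of the file (default)
1: the current pointer position
2: the end of the file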
In [23]: help(f.seek)
Help on built-in function seek:

seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
    Change stream position.

    Change the stream position to the given byte offset. The offset is
    interpreted relative to the position indicated by whence.  Values
    for whence are:

    * 0 -- start of stream (the default); offset should be zero or positive
    * 1 -- current stream position; offset may be negative
    * 2 -- end of stream; offset is usually negative

    Return the new absolute position.

In [35]: f.tell()
Out[35]: 65

In [36]: f.seek(0)
Out[36]: 0

In [37]: f.tell()
Out[37]: 0

In [38]: f.readline()
Out[38]: 'root:x:0:0:root:/root:/bin/bash\n'

In [39]: f.tell()
Out[39]: 32

In [40]: f.seek(5)
Out[40]: 5

In [41]: f.tell()
Out[41]: 5

In [42]: f.readline()
Out[42]: 'x:0:0:root:/root:/bin/bash\n'

# In text mode, when whence is not 0, cookie must be 0
In [95]: f.seek(10,1)
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-95-aa8451ab67ab> in <module>()
----> 1 f.seek(10,1)

UnsupportedOperation: can't do nonzero cur-relative seeks

In [96]: f.seek(0,2)
Out[96]: 1184
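With mode=t:
When whence is 0 (the default), cookie (the offset) can be any non-negative integer, but only 0 and values returned by tell() are guaranteed to be meaningful, since other offsets may land inside a multi-byte character
When whence is 1 or 2, cookie must be 0
With mode=b:
When whence is 0 (the default), cookie (the offset) can be any non-negative integer
When whence is 1 or 2, cookie can be any integer, as long as the resulting position is not before the start of the file
Summary:
seek() may move past the end of the file but not before its start. Seeking past the end raises no exception, and tell() then reports a position beyond the end of the file. In append mode ('a'), writes still go to the end of the file regardless of what tell() reports, i.e. the write starts at min(EOF, tell()). In the other writable modes, the write happens at the position tell() reports, and on most systems the gap between the old end and that position is filled with zero bytes
In binary mode, seek() and tell() count in bytes. In text mode, tell() returns an opaque value that usually matches the byte offset, not the number of characters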
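The whence rules above can be checked with a ten-byte file. This is a minimal sketch, assuming a temporary file with the hypothetical name seek.bin; io.SEEK_CUR and io.SEEK_END are the named constants for whence values 1 and 2.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "seek.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        f.seek(2)                   # absolute: offset 2 from the start
        f.seek(3, io.SEEK_CUR)      # relative: 3 bytes forward from position 2
        pos_cur = f.tell()
        f.seek(-4, io.SEEK_END)     # 4 bytes before the end
        tail = f.read()

    # Text mode rejects nonzero offsets relative to the current position.
    with open(path, "r") as f:
        try:
            f.seek(3, io.SEEK_CUR)
            text_seek = "ok"
        except io.UnsupportedOperation:
            text_seek = "UnsupportedOperation"

    # Seeking before the start of the file is an error.
    with open(path, "rb") as f:
        try:
            f.seek(-1)
            negative_seek = "ok"
        except (OSError, ValueError):
            negative_seek = "error"

print(pos_cur, tail, text_seek, negative_seek)
```

When relative seeks are needed on text data, one option is to open the file in binary mode and decode the bytes explicitly.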
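1.4 buffering
Controls the buffer
f.flush(), f.close() and f.seek() flush the buffer
buffering=-1, the default: the buffer size is typically io.DEFAULT_BUFFER_SIZE (8192 bytes here); once the buffered data exceeds this size, the buffer is flushed automatically before writing continues
Binary mode: buffer size is io.DEFAULT_BUFFER_SIZE
Text mode: buffer size is io.DEFAULT_BUFFER_SIZE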
In [48]: import io

In [49]: io.DEFAULT_BUFFER_SIZE
Out[49]: 8192
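buffering=0: buffering is turned off
Binary mode: unbuffered
Text mode: not allowed (raises ValueError)
buffering=1
Binary mode: line buffering is not supported, so the default buffer size is used (Python 3.8 and later emit a RuntimeWarning)
Text mode: line buffering; the buffer is flushed whenever a newline is written
buffering > 1
Binary mode: the value is the buffer size in bytes
Before writing, the buffer checks whether the remaining space can hold the new bytes. If not, it flushes first and then stores the new bytes in the buffer; if the new bytes are larger than the whole buffer, they are written through directly
Text mode: the text layer buffers up to io.DEFAULT_BUFFER_SIZE before handing data to the underlying binary buffer
If the new bytes plus the bytes already buffered exceed the buffer size, the buffer and the new bytes are flushed together
Notes:
Buffering rarely needs attention when reading and writing ordinary files, but it does matter when working with sockets
Special file objects have their own flushing behavior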
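The effect of the buffer can be observed by checking the file size on disk before and after flush(). This is a sketch assuming CPython's default behavior for a small write to a regular file, with a hypothetical temporary file name buffered.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "buffered.txt")  # hypothetical file name

    f = open(path, "w")             # default buffering
    f.write("hello")                # small write: stays in the buffer
    size_before_flush = os.path.getsize(path)
    f.flush()                       # push the buffer to the operating system
    size_after_flush = os.path.getsize(path)
    f.close()

    # buffering=0 is only allowed in binary mode; writes go straight to the OS.
    raw = open(path, "wb", buffering=0)
    raw.write(b"hi")
    size_unbuffered = os.path.getsize(path)
    raw.close()

    # In text mode, buffering=0 is rejected.
    try:
        open(path, "w", buffering=0)
        text_unbuffered = "ok"
    except ValueError:
        text_unbuffered = "ValueError"

print(size_before_flush, size_after_flush, size_unbuffered, text_unbuffered)
```

This is also why another process tailing a log file may see nothing until the writer flushes or closes the file.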
2. File objects
2.1 File objects are iterable
In [85]: cat /tmp/1.log
1234
5
5555

In [86]: f.tell()
Out[86]: 12

In [87]: for i in f:    # a file object is iterable, one line per iteration; the pointer is already at EOF here, so nothing is printed
    ...:     print(i)
    ...:

In [88]: f.seek(0)
Out[88]: 0

In [89]: for i in f:
    ...:     print(i)
    ...:
1234

5

5555

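The same behavior can be reproduced in a script. A minimal sketch, assuming a temporary file with the hypothetical name lines.log and the same three lines as above: iteration consumes the file, so a second pass yields nothing until the pointer is moved back with seek(0).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines.log")  # hypothetical file name
    with open(path, "w") as f:
        f.write("1234\n5\n5555\n")

    with open(path) as f:
        first_pass = [line.rstrip("\n") for line in f]   # one line per iteration
        second_pass = list(f)                            # pointer is at EOF: nothing left
        f.seek(0)                                        # rewind to iterate again
        third_pass = [line.rstrip("\n") for line in f]

print(first_pass, second_pass, third_pass)
```

Iterating line by line keeps only one line in memory at a time, which is usually preferable to f.read() for large files.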
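2.2 File objects as context managers
Context management: the file is closed automatically on leaving the with block, but with does not open a new scope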
In [141]: with open('/tmp/1.log') as f:
     ...:     print(f.read())
     ...:     f.wirte('sb')
     ...:
1234
5
5555aaa
bbb
aaa
bbb
[aaa
bbb
]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-141-d87fc5f222e1> in <module>()
      1 with open('/tmp/1.log') as f:
      2     print(f.read())
----> 3     f.wirte('sb')
      4

AttributeError: '_io.TextIOWrapper' object has no attribute 'wirte'

In [142]: f.closed
Out[142]: True
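The session above shows the file being closed even though the block raised an exception. A compact sketch of the same idea, assuming a hypothetical temporary file ctx.log and using a write on a read-only file as the failing operation:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ctx.log")  # hypothetical file name
    with open(path, "w") as f:
        f.write("1234\n")

    try:
        with open(path) as f:
            content = f.read()
            f.write("x")        # fails: the file was opened read-only
    except OSError as exc:      # io.UnsupportedOperation is a subclass of OSError
        error_name = type(exc).__name__

    # `with` does not open a new scope, so f is still visible here, and it is closed.
    closed_after_with = f.closed

print(content, error_name, closed_after_with)
```

Without with, the equivalent would be a try/finally that calls f.close() in the finally clause.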
2.3 File-like objects
StringIO
In [189]: from io import StringIO

In [190]: sio = StringIO()

In [191]: sio.readable()
Out[191]: True

In [193]: sio.writable()
Out[193]: True

In [194]: sio.write('python')
Out[194]: 6

In [195]: sio.seek(0)
Out[195]: 0

In [196]: sio.read()
Out[196]: 'python'

In [197]: sio.tell()
Out[197]: 6

In [198]: sio.getvalue()    # returns the whole content regardless of the pointer position, and can be called repeatedly
Out[198]: 'python'
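Other file-like objects include BytesIO and sockets
StringIO and BytesIO correspond to file objects opened in t and b mode respectively
Purpose:
They simulate a file object in memory, which is faster than going to disk
A file-like object does not have to be closed, but it holds memory until it is closed or garbage-collected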
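BytesIO behaves like a file opened in binary mode. The sketch below reuses the string from the earlier 'wb' example and assumes UTF-8 encoding; it also shows that StringIO can be created with initial content.

```python
from io import BytesIO, StringIO

bio = BytesIO()
bio.write("马哥 Python".encode("utf-8"))   # bytes in, bytes out
bio.seek(0)
first_three = bio.read(3)                  # the three UTF-8 bytes of the first character
everything = bio.getvalue()                # whole buffer, regardless of the pointer
bio.close()                                # releases the in-memory buffer

sio = StringIO("line1\nline2\n")           # StringIO can be pre-filled
lines = sio.readlines()
sio.close()

print(first_three, len(everything), lines)
```

Because both classes implement the usual read/write/seek methods, they can be passed to code that expects an open file, which is handy in tests.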
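3. pathlib
Before Python 3.4, os.path was the only standard way to manipulate paths
os.path manipulates paths as strings
Python 3.4 introduced the pathlib library, which manipulates paths in an object-oriented way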
In [196]: import pathlib

In [199]: pwd = pathlib.Path('.')    # . stands for the current directory

In [200]: pwd
Out[200]: PosixPath('.')

In [206]: pwd.absolute()    # absolute path
Out[206]: PosixPath('/root/magedu/python3')
3.1 Directory operations
In [209]: pwd
Out[209]: PosixPath('.')

In [210]: pwd.is_absolute()
Out[210]: False

In [211]: pwd.is_dir()
Out[211]: True

Iterating over a directory:
In [212]: pwd.iterdir()    # returns a generator over the directory entries
Out[212]: <generator object Path.iterdir at 0x7fee8a140e08>

In [213]: for i in pwd.iterdir():
     ...:     i
     ...:

In [214]: for i in pwd.iterdir():    # iterates over the direct children only, not recursively
     ...:     print(i)
     ...:
.ipynb_checkpoints
第一天.ipynb
nohup.out
.python-version
元组及其操做.ipynb
test
test.py

In [216]: for i in pwd.iterdir():
     ...:     print(type(i))
     ...:     print(i)
     ...:
<class 'pathlib.PosixPath'>    # each entry is itself a Path, so you can loop over it again to recurse
.ipynb_checkpoints
<class 'pathlib.PosixPath'>
第一天.ipynb
<class 'pathlib.PosixPath'>
nohup.out
<class 'pathlib.PosixPath'>
.python-version
<class 'pathlib.PosixPath'>
元组及其操做.ipynb
<class 'pathlib.PosixPath'>
test
<class 'pathlib.PosixPath'>
test.py

In [217]: type(pwd)
Out[217]: pathlib.PosixPath

In [218]: print(type(pwd))
<class 'pathlib.PosixPath'>
Creating a directory:
In [219]: d = pathlib.Path('/tmp/1.txt')

In [220]: d.exists()
Out[220]: True

In [221]: d.mkdir(755)    # fails because the path already exists; note that 755 here is decimal, permissions should be written in octal (0o755)
---------------------------------------------------------------------------
FileExistsError                           Traceback (most recent call last)
<ipython-input-221-ea2331bf5a68> in <module>()
----> 1 d.mkdir(755)

/root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in mkdir(self, mode, parents, exist_ok)
   1225         if not parents:
   1226             try:
-> 1227                 self._accessor.mkdir(self, mode)
   1228             except FileExistsError:
   1229                 if not exist_ok or not self.is_dir():

/root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in wrapped(pathobj, *args)
    388             @functools.wraps(strfunc)
    389             def wrapped(pathobj, *args):
--> 390                 return strfunc(str(pathobj), *args)
    391             return staticmethod(wrapped)
    392

FileExistsError: [Errno 17] File exists: '/tmp/1.txt'

In [223]: d1 = pathlib.Path('/tmp/11.txt')    # create a Path object first

In [224]: d1.exists()    # the path does not exist yet
Out[224]: False

In [225]: help(d1.mkdir)
Help on method mkdir in module pathlib:

mkdir(mode=511, parents=False, exist_ok=False) method of pathlib.PosixPath instance

# mode: the permission bits
# parents: when True, missing parent directories are created; when False, a missing parent raises FileNotFoundError
# exist_ok: when False, an existing directory raises FileExistsError; when True it does not. parents=True together with exist_ok=True behaves like the shell command mkdir -p

In [226]: d1.mkdir(0o755)    # create the directory with the given permissions; the default is 511 (0o777); the 0o prefix marks an octal literal

In [227]: d1.exists()
Out[227]: True

In [237]: ls -ld /tmp/11.txt
drwxr-xr-x 2 root root 4096 Nov 2 14:24 /tmp/11.txt/
Removing a directory:
In [239]: d1.rmdir()    # only an empty directory can be removed

In [240]: d1.exists()
Out[240]: False

In [246]: d2 = pathlib.Path('/tmp')

In [247]: d2.rmdir()
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-247-ffe14b7c0399> in <module>()
----> 1 d2.rmdir()

/root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in rmdir(self)
   1273         if self._closed:
   1274             self._raise_closed()
-> 1275         self._accessor.rmdir(self)
   1276
   1277     def lstat(self):

/root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in wrapped(pathobj, *args)
    388             @functools.wraps(strfunc)
    389             def wrapped(pathobj, *args):
--> 390                 return strfunc(str(pathobj), *args)
    391             return staticmethod(wrapped)
    392

OSError: [Errno 39] Directory not empty: '/tmp'

# removing a non-empty directory is covered later
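The mkdir parameters and the empty-directory restriction of rmdir can be exercised together. A minimal sketch, assuming a temporary base directory and the hypothetical nested path a/b/c:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    nested = base / "a" / "b" / "c"     # hypothetical directory names

    # Without parents=True a missing parent is an error.
    try:
        nested.mkdir()
        no_parents = "created"
    except FileNotFoundError:
        no_parents = "FileNotFoundError"

    # parents=True creates the missing parents; exist_ok=True makes a repeat call harmless.
    nested.mkdir(mode=0o755, parents=True, exist_ok=True)
    nested.mkdir(mode=0o755, parents=True, exist_ok=True)
    created = nested.is_dir()

    # rmdir only removes empty directories; "a" still contains "b".
    try:
        (base / "a").rmdir()
        rmdir_nonempty = "removed"
    except OSError:
        rmdir_nonempty = "OSError"

print(no_parents, created, rmdir_nonempty)
```

Note that the mode passed to mkdir is still filtered by the process umask, so the resulting permissions may be narrower than requested.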
3.2 Operations common to files and directories
In [250]: f = pathlib.Path('/tmp/xj/a.txt')

In [251]: f.exists()
Out[251]: False

In [252]: f.is_file()    # when the path does not exist, all the is_* methods return False
Out[252]: False

In [253]: f.is_dir()
Out[253]: False

In [257]: f = pathlib.Path('/tmp/src')

In [258]: f.exists()
Out[258]: True

In [259]: f.cwd()    # the current working directory
Out[259]: PosixPath('/root/magedu/python3')

In [260]: f.absolute()    # the absolute path
Out[260]: PosixPath('/tmp/src')

In [261]: f.as_uri()    # convert an absolute path to a URI
Out[261]: 'file:///tmp/src'

In [265]: pathlib.Path('~')
Out[265]: PosixPath('~')

In [266]: pathlib.Path('~').expanduser()    # expand ~ to the user's home directory
Out[266]: PosixPath('/root')

In [275]: f.home()    # the home directory
Out[275]: PosixPath('/root')

# If a path is a symbolic link, use lchmod to change the permissions of the link itself

In [267]: f.name    # the base name, like basename
Out[267]: 'src'

In [278]: f.parent    # like dirname
Out[278]: PosixPath('/tmp')

In [279]: f.parents    # a sequence of all ancestor directories
Out[279]: <PosixPath.parents>

In [273]: f.owner()    # the owner of the file
Out[273]: 'root'

In [274]: f.group()
Out[274]: 'root'

In [285]: f = pathlib.Path('/tmp/shell/not_exist.txt')

In [286]: f.exists()
Out[286]: True

In [287]: f.suffix    # the file name suffix (the last . and what follows it)
Out[287]: '.txt'

In [288]: f.suffixs
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-288-c25eaddee638> in <module>()
----> 1 f.suffixs

AttributeError: 'PosixPath' object has no attribute 'suffixs'

In [289]: f.suffixes
Out[289]: ['.txt']

In [290]: f = pathlib.Path('/tmp/shell/not_exist.txt.log')

In [291]: f.exists()
Out[291]: False

In [292]: f.suffixes    # available even when the file does not exist
Out[292]: ['.txt', '.log']

In [293]: f.suffix
Out[293]: '.log'
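Attributes such as name, parent and suffix are computed from the path string alone, which is why they work for files that do not exist. The sketch below uses PurePosixPath, the variant that never touches the file system, with the same path as above; stem and with_suffix are additional standard attributes not shown in the session.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/tmp/shell/not_exist.txt.log")

name = p.name                               # 'not_exist.txt.log'
stem = p.stem                               # 'not_exist.txt'
suffix = p.suffix                           # '.log'
suffixes = p.suffixes                       # ['.txt', '.log']
parent = str(p.parent)                      # '/tmp/shell'
ancestors = [str(a) for a in p.parents]     # ['/tmp/shell', '/tmp', '/']
renamed = str(p.with_suffix(".bak"))        # '/tmp/shell/not_exist.txt.bak'

print(name, stem, suffix, suffixes, parent, ancestors, renamed)
```

Methods that need the file system, such as exists(), stat() or owner(), are only available on the concrete Path classes.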
f.stat: file status
In [297]: f = pathlib.Path('/tmp/shell/not_exist.txt')

In [298]: f.stat()
Out[298]: os.stat_result(st_mode=33188, st_ino=530232, st_dev=2050, st_nlink=1, st_uid=0, st_gid=0, st_size=7, st_atime=1499241423, st_mtime=1499241417, st_ctime=1499241417)

In [299]: f.lstat()    # like stat(), but for a symbolic link it describes the link itself
Out[299]: os.stat_result(st_mode=33188, st_ino=530232, st_dev=2050, st_nlink=1, st_uid=0, st_gid=0, st_size=7, st_atime=1499241423, st_mtime=1499241417, st_ctime=1499241417)
f.glob: wildcard matching
f.rglob: recursive wildcard matching
In [307]: d = pathlib.Path('/tmp')

In [314]: for i in d.glob('*/*.txt'):    # * matches one path component, so this finds .txt files exactly one directory down
     ...:     print(i)
     ...:
/tmp/src/helloworld.txt
/tmp/shell/python.txt
/tmp/shell/test.txt
/tmp/shell/not_exist.txt

In [315]: for i in d.glob('**/*.txt'):    # ** matches this directory and all subdirectories, recursively
     ...:     print(i)
     ...:
/tmp/1.txt
/tmp/src/helloworld.txt
/tmp/src/cdn/t2/t2_201706150000_charge.txt
/tmp/src/cdn/t2/t2_201706150000_role.txt
/tmp/src/cdn/t2/t2_201706150000_activity_record.txt
/tmp/src/cdn/t2/t2_201706150000_role_face.txt
/tmp/src/cdn/t2/t2_201706150000_enter_role_successful.txt
/tmp/src/cdn/t2/t2_201706150000_task_record.txt
/tmp/src/cdn/t2/t2_201706150000_user_login_success.txt
/tmp/src/cdn/t2/t2_201706150000_consumption_record.txt
/tmp/src/cdn/t2/t2_201706150000_user_login.txt
/tmp/src/cdn/t2/t2_201706150000_enter_scene.txt
/tmp/src/cdn/t2/t2_201706150000_mall_transactions.txt
/tmp/src/cdn/t2/t2_201706150000_enter_role.txt
/tmp/src/cdn/t2/t2_201706150000_props_record.txt
/tmp/shell/python.txt
/tmp/shell/test.txt
/tmp/shell/not_exist.txt

In [320]: for i in d.rglob('*/*.txt'):    # rglob searches recursively, as if the pattern were prefixed with **/
     ...:     print(i)
     ...:
/tmp/src/helloworld.txt
/tmp/shell/python.txt
/tmp/shell/test.txt
/tmp/shell/not_exist.txt
/tmp/src/cdn/t2/t2_201706150000_charge.txt
/tmp/src/cdn/t2/t2_201706150000_role.txt
/tmp/src/cdn/t2/t2_201706150000_activity_record.txt
/tmp/src/cdn/t2/t2_201706150000_role_face.txt
/tmp/src/cdn/t2/t2_201706150000_enter_role_successful.txt
/tmp/src/cdn/t2/t2_201706150000_task_record.txt
/tmp/src/cdn/t2/t2_201706150000_user_login_success.txt
/tmp/src/cdn/t2/t2_201706150000_consumption_record.txt
/tmp/src/cdn/t2/t2_201706150000_user_login.txt
/tmp/src/cdn/t2/t2_201706150000_enter_scene.txt
/tmp/src/cdn/t2/t2_201706150000_mall_transactions.txt
/tmp/src/cdn/t2/t2_201706150000_enter_role.txt
/tmp/src/cdn/t2/t2_201706150000_props_record.txt
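The patterns are easier to compare on a tiny tree. The following sketch builds a hypothetical three-level layout in a temporary directory (the file and directory names are made up for the example) and runs each pattern against it.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    # Hypothetical layout: one .txt file at each of three depths.
    (root / "src" / "cdn").mkdir(parents=True)
    (root / "1.txt").write_text("top")
    (root / "src" / "hello.txt").write_text("one level down")
    (root / "src" / "cdn" / "deep.txt").write_text("two levels down")

    def rel(paths):
        """Return sorted POSIX-style paths relative to root."""
        return sorted(p.relative_to(root).as_posix() for p in paths)

    top_only = rel(root.glob("*.txt"))        # files directly in root
    one_down = rel(root.glob("*/*.txt"))      # files exactly one directory down
    recursive = rel(root.glob("**/*.txt"))    # root and every subdirectory
    via_rglob = rel(root.rglob("*.txt"))      # same result as glob('**/*.txt')

print(top_only, one_down, recursive, via_rglob)
```

glob and rglob return generators, so the results are consumed inside the with block, before the temporary directory is removed.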
Joining paths:
In [321]: '/' + 'xj' + '/sb'
Out[321]: '/xj/sb'

In [322]: pathlib.Path('/', 'home', 'xj', 'workspace')
Out[322]: PosixPath('/home/xj/workspace')

In [323]: pathlib.Path('home', 'xj', 'workspace')
Out[323]: PosixPath('home/xj/workspace')

In [324]: print(pathlib.Path('home', 'xj', 'workspace'))
home/xj/workspace

In [325]: print(pathlib.Path('/', 'home', 'xj', 'workspace'))
/home/xj/workspace

In [331]: print(pathlib.Path('/', '/home', 'xj', 'workspace'))
/home/xj/workspace

# Separators are added automatically and redundant ones are dropped (a later absolute component replaces what came before it), so joining paths is more convenient than with strings
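Besides passing several components to the constructor, pathlib also supports the / operator and joinpath(). A short sketch using PurePosixPath so that the result is the same on every platform:

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home")
joined = base / "xj" / "workspace"                  # operator form of joining
same = PurePosixPath("/", "home", "xj", "workspace")
reset = PurePosixPath("/etc", "/home", "xj")        # a later absolute part wins
via_joinpath = base.joinpath("xj", "workspace")

print(joined, same, reset, via_joinpath)
```

The reset case is worth remembering when a component comes from user input: an absolute value silently discards the base path.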
4. Copying, moving, and removing files
4.1 copy
The shutil standard library module
shutil.copyfileobj(fsrc, fdst[, length])    # copy content between 2 file objects
shutil.copyfile(src, dst, *, follow_symlinks=True)    # copy content only, no metadata
shutil.copymode(src, dst, *, follow_symlinks=True)    # copy permission bits only
shutil.copystat(src, dst, *, follow_symlinks=True)    # copy metadata only
shutil.copy(src, dst, *, follow_symlinks=True)    # copy content and permissions, equivalent to copyfile + copymode
shutil.copy2(src, dst, *, follow_symlinks=True)    # copy content and metadata, equivalent to copyfile + copystat
shutil.ignore_patterns(*patterns)    # builds the callable passed as the ignore argument of copytree
shutil.copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False)
shutil.rmtree(path, ignore_errors=False, onerror=None)
shutil.move(src, dst, copy_function=copy2)
Apart from copytree, rmtree and move, the functions above work on a single file only and do not recurse into directories
copyfileobj works on file objects; the functions after it work on paths
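shutil.copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False)
Recursively copies a directory. The copy_function parameter selects how each file is copied; it can be any of the path-based copy functions above (that is, anything except copyfileobj), typically copy or copy2
shutil.rmtree(path, ignore_errors=False, onerror=None)
Recursively deletes a directory.
ignore_errors says whether errors are ignored; onerror says how errors are handled and only takes effect when ignore_errors=False. When ignore_errors is True, failed removals are skipped silently; when it is False and onerror is not given, the error is raised as an exception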
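copytree and rmtree are typically used together with ignore_patterns. A minimal sketch, assuming a temporary directory and hypothetical names (src, dst, keep.txt, skip.log, sub/deep.txt):

```python
import pathlib
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    src = root / "src"                      # hypothetical directory names
    dst = root / "dst"
    (src / "sub").mkdir(parents=True)
    (src / "keep.txt").write_text("keep")
    (src / "skip.log").write_text("skip")
    (src / "sub" / "deep.txt").write_text("deep")

    # Copy the whole tree, leaving out anything that matches *.log.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.log"))
    copied = sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))

    # Remove the source tree, including its non-empty subdirectory.
    shutil.rmtree(src)
    src_exists = src.exists()

print(copied, src_exists)
```

By default copytree expects dst not to exist yet; rmtree is the answer to the earlier question of how to remove a non-empty directory.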
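4.2 move
shutil.move(src, dst, copy_function=copy2)
The implementation depends on the operating system and file system: when the destination is on the same file system, move uses the rename system call (os.rename) directly; otherwise it copies src to dst (with copytree for a directory, copy_function for a file) and then removes the source (with rmtree for a directory)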
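A short sketch of move, assuming a temporary directory with a hypothetical file a.txt and a hypothetical target directory archive. When dst is an existing directory, the source is moved into it and keeps its base name.

```python
import pathlib
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    src = root / "a.txt"                    # hypothetical file names
    dst_dir = root / "archive"
    src.write_text("payload")
    dst_dir.mkdir()

    # move returns the final destination path.
    new_path = pathlib.Path(shutil.move(str(src), str(dst_dir)))
    moved_name = new_path.name
    src_still_there = src.exists()
    moved_content = new_path.read_text()

print(moved_name, src_still_there, moved_content)
```

Both paths here are on the same file system, so this run would normally take the rename route described above.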
shutil.disk_usage(path)
Return disk usage statistics about the given path as a named tuple with the attributes total, used and free, which are the amount of total, used and free space, in bytes.
New in version 3.3.
Availability: Unix, Windows.
shutil.chown(path, user=None, group=None)
Change owner user and/or group of the given path.
user can be a system user name or a uid; the same applies to group. At least one argument is required.
See also os.chown(), the underlying function.
Availability: Unix.
New in version 3.3.
shutil.which(cmd, mode=os.F_OK | os.X_OK, path=None)
Returns the path of the executable that would run for cmd, or None if no such command is found.
In [68]: shutil.which('python')
Out[68]: '/root/.pyenv/versions/commy/bin/python'

In [69]: shutil.which('ls')
Out[69]: '/bin/ls'
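disk_usage and which can be tried without any setup. The sketch below queries the current directory and looks up a command name that is assumed not to exist (the name is made up for the example).

```python
import shutil

usage = shutil.disk_usage(".")      # named tuple: total, used, free (in bytes)
free_gib = usage.free / 2**30       # convert bytes to GiB for display

# which() returns None when the command is not found on PATH.
missing = shutil.which("surely-not-a-real-command-12345")   # hypothetical name

print(f"free: {free_gib:.1f} GiB, missing command: {missing}")
```

Checking the result of which() for None before launching a subprocess gives a clearer error than letting the launch fail.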