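Preface: the quick everyday way to get the file names in a folder from Python is something like os.system('ll /data/'), but once the folder holds a huge number of files that approach simply does not work.
Nearly 6 million files were generated in the /dd directory; the sections below compare the performance of the different methods. Shell script for generating the files quickly: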
for i in $(seq 1 1000000);do echo text >>$i.txt;done
1. Performance of the system command ls -l

# System command: ls -l
import time
import subprocess

start = time.time()
result = subprocess.Popen('ls -l /dd/', stdout=subprocess.PIPE, shell=True)
for file in result.stdout:
    pass
print(time.time() - start)
# Result: hung outright
2. The glob module

# glob module
import glob
import time

start = time.time()
result = glob.glob("/dd/*")
for file in result:
    pass
print(time.time() - start)
# Output: 49.60481119155884 (seconds)
3. os.walk

# os.walk
import os
import time

start = time.time()
for root, dirs, files in os.walk("/dd/", topdown=False):
    pass
print(time.time() - start)
# Output: 8.906772375106812 (seconds)
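os.walk recurses into every subdirectory, which is more than needed when only the names directly inside one folder matter. The following is a minimal sketch, not part of the original benchmark, that takes just the first tuple produced by os.walk; the helper name top_level_files is hypothetical.

```python
import os
import tempfile


def top_level_files(directory):
    """Return the file names directly inside `directory`, without recursing.

    `top_level_files` is an illustrative helper name, not a standard API.
    """
    # With the default topdown=True, the first tuple yielded by os.walk
    # describes `directory` itself. The fallback tuple covers the case where
    # the directory cannot be read and os.walk yields nothing.
    _root, _dirs, files = next(os.walk(directory), (directory, [], []))
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            with open(os.path.join(tmp, f"{i}.txt"), "w") as fh:
                fh.write("text\n")
        os.mkdir(os.path.join(tmp, "subdir"))
        print(sorted(top_level_files(tmp)))
```

Stopping after the first tuple avoids descending into subdirectories, although the whole top-level listing is still read into memory as a list.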
4. os.scandir

# os.scandir
import os
import time

start = time.time()
path = os.scandir("/dd/")
for i in path:
    pass
print(time.time() - start)
# Output: 4.118424415588379 (seconds)
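os.scandir yields os.DirEntry objects lazily instead of building a full list, so it can also filter entries as it goes. The sketch below is an illustration rather than the benchmark code above; it assumes only regular files are wanted, and the generator name iter_file_names is hypothetical.

```python
import os
import tempfile


def iter_file_names(directory):
    """Yield the names of regular files in `directory`, one at a time.

    `iter_file_names` is an illustrative helper name, not a standard API.
    """
    # The context manager closes the underlying directory handle promptly.
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() can usually rely on type information obtained while
            # reading the directory, so it often needs no extra stat() call.
            if entry.is_file(follow_symlinks=False):
                yield entry.name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(5):
            with open(os.path.join(tmp, f"{i}.txt"), "w") as fh:
                fh.write("text\n")
        os.mkdir(os.path.join(tmp, "subdir"))
        print(sorted(iter_file_names(tmp)))
```

Because the function is a generator, a caller can process names as they arrive without holding millions of them in memory at once.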
5. The shell find command

# shell find command
import time
import subprocess

start = time.time()
result = subprocess.Popen('find /dd/', stdout=subprocess.PIPE, shell=True)
for file in result.stdout:
    pass
print(time.time() - start)
# Output: 6.205533027648926 (seconds)
6. The shell command ls -1 -f (no sorting)

# shell ls -1 -f command
import time
import subprocess

start = time.time()
result = subprocess.Popen('ls -1 -f /dd/', stdout=subprocess.PIPE, shell=True)
for file in result.stdout:
    pass
print(time.time() - start)
# Output: 3.3476643562316895 (seconds)
7. os.listdir

# os.listdir
import os
import time

start = time.time()
result = os.listdir('/dd')
for file in result:
    pass
print(time.time() - start)
# Output: 2.6720399856567383 (seconds)
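To repeat the comparison on another machine without touching /dd, the pure-Python methods can be run through one small harness. This is a rough sketch under stated assumptions: it builds its own temporary directory with a small, arbitrary number of files, and the names make_files, METHODS, and benchmark are hypothetical. Timings from it will not match the figures above, which were measured on a directory of millions of files.

```python
import glob
import os
import tempfile
import time


def make_files(directory, count):
    """Create `count` small text files named 0.txt, 1.txt, ... in `directory`."""
    for i in range(count):
        with open(os.path.join(directory, f"{i}.txt"), "w") as fh:
            fh.write("text\n")


def count_listdir(directory):
    return len(os.listdir(directory))


def count_scandir(directory):
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def count_glob(directory):
    # The "*" pattern skips names starting with a dot; none exist here.
    return len(glob.glob(os.path.join(directory, "*")))


def count_walk(directory):
    return sum(len(dirs) + len(files) for _root, dirs, files in os.walk(directory))


# Hypothetical registry mapping a label to each counting function.
METHODS = {
    "os.listdir": count_listdir,
    "os.scandir": count_scandir,
    "glob.glob": count_glob,
    "os.walk": count_walk,
}


def benchmark(directory):
    """Return {label: (entry_count, elapsed_seconds)} for every method."""
    results = {}
    for label, func in METHODS.items():
        start = time.perf_counter()
        found = func(directory)
        results[label] = (found, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_files(tmp, 10000)  # arbitrary size chosen for a quick run
        for label, (found, elapsed) in benchmark(tmp).items():
            print(f"{label:12s} {found:8d} entries  {elapsed:.4f} s")
```

Each method runs once, in a fixed order, so filesystem caching can favor the later ones; averaging several runs or rotating the order gives a fairer picture.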