Reading a log file in real time with Python

Date: 2019-11-13


Recommended log-processing project: https://github.com/olajowon/loggrove

First, try iterating over a large log file with Python's open().

readlines() or readline()?

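Overall, readlines() is no slower than calling readline() over and over from Python, because the loop inside readlines() runs at the C level, while a readline() loop runs at the Python level.

However, readlines() reads all remaining data into memory at once, so memory usage can become too high. readline() reads only one line per call. For large files you have to trade speed against memory.

If you do not need seek() to jump to an offset, iterating with for line in open('file') is usually faster, and it also reads the file one line at a time.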
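As a small sketch of the for line in open('file') approach: the file object is its own iterator and hands back one line at a time, so memory use stays roughly constant regardless of file size. The function name count_matching and its parameters are hypothetical, chosen only for illustration.

```python
def count_matching(path, keyword):
    """Count the lines that contain keyword, reading the file lazily."""
    count = 0
    # Iterating over the file object yields one line at a time,
    # so the whole file is never loaded into memory at once.
    with open(path, 'r', encoding='utf-8', errors='replace') as f:
        for line in f:
            if keyword in line:
                count += 1
    return count
```

For example, count_matching('logs.txt', 'ERROR') would scan logs.txt once and return the number of lines mentioning ERROR.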

Using readlines(), suitable for smaller log files

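import time

filepath = 'logs.txt'  # placeholder path to the log file
p = 0                  # offset to resume reading from
with open(filepath, 'r') as f:
    f.seek(p, 0)
    while True:
        lines = f.readlines()  # everything appended since the last pass
        if lines:
            for line in lines:
                print(line, end='')
            p = f.tell()       # remember where we stopped
            f.seek(p, 0)
        time.sleep(1)          # poll once per second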
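The offset bookkeeping above (tell() followed by seek()) can also be kept outside the open file, which makes it possible to resume after a restart. The following is a minimal sketch under a few assumptions: the log is UTF-8, lines end with a newline, and the file is only ever appended to. read_new_lines is a hypothetical helper name, not part of any library. Like readlines(), it reads everything new in one go, so it fits small to medium logs.

```python
def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for the complete lines found after offset."""
    # Binary mode keeps offsets as plain byte counts.
    with open(path, 'rb') as f:
        f.seek(offset)
        data = f.read()
    # Only consume up to the last newline, so a line that is still
    # being written is left for the next call.
    end = data.rfind(b'\n') + 1
    lines = data[:end].decode('utf-8', errors='replace').splitlines()
    return lines, offset + end
```

A polling loop would call read_new_lines(path, offset) once per interval and feed the returned offset into the next call.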

Using readline(), to avoid excessive memory usage

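import time

with open('logs.txt', 'r') as f:
    while True:
        line = f.readline()  # returns '' when there is nothing new
        if line:
            print(line, end='')
        else:
            time.sleep(1)    # wait for new data instead of busy-looping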
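The readline() loop can be packaged as a generator so that the caller decides what to do with each line. This is only a sketch: follow, poll_interval, max_idle_polls and from_start are hypothetical names, and the idle limit exists mainly so the loop can terminate in scripts and tests. Note that readline() may return an incomplete line if the writer has not finished it yet.

```python
import os
import time


def follow(path, poll_interval=1.0, max_idle_polls=None, from_start=False):
    """Yield lines appended to path.

    Stops after max_idle_polls consecutive empty reads (None means never).
    """
    idle = 0
    with open(path, 'r', encoding='utf-8', errors='replace') as f:
        if not from_start:
            f.seek(0, os.SEEK_END)  # skip existing content, like tail -f
        while max_idle_polls is None or idle < max_idle_polls:
            line = f.readline()     # '' means no new data yet
            if line:
                idle = 0
                yield line
            else:
                idle += 1
                time.sleep(poll_interval)
```

With the defaults, for line in follow('logs.txt'): print(line, end='') behaves much like the loop above, but only prints lines written after it started.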

################## divider ##########################

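Now try using tail -F log.txt to stream the output as the file grows.

os.system() and commands.getstatusoutput() run a command once and only return after it exits, so they do not suit a command that never finishes (commands is also a Python 2 module; Python 3 offers subprocess.getstatusoutput() instead). In the end I chose subprocess.Popen().

The subprocess module is meant for starting a new process and communicating with it. Its most commonly used piece is the Popen class, which creates a process and lets you interact with it while it runs.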

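import subprocess

# tail -F keeps following log.txt even if the file is rotated or recreated
p = subprocess.Popen(['tail', '-F', 'log.txt'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = p.stdout.readline()  # blocks until tail emits a line
    if line:
        print(line.decode(errors='replace'), end='')
    elif p.poll() is not None:
        break                   # tail has exited, stop reading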
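The Popen loop above could also be wrapped in a generator that works for any line-oriented command and cleans up the child process when the caller stops reading. This is a sketch assuming Python 3.7 or later (for text=True); stream_lines is a hypothetical name.

```python
import subprocess


def stream_lines(cmd):
    """Start cmd (a list of arguments) and yield its stdout lines as they arrive."""
    # text=True decodes the output to str; bufsize=1 asks for line
    # buffering on our side of the pipe.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)
    try:
        for line in proc.stdout:
            yield line.rstrip('\n')
    finally:
        proc.stdout.close()
        if proc.poll() is None:
            proc.terminate()  # stop a long-running child such as tail -F
        proc.wait()
```

For the log case, for line in stream_lines(['tail', '-F', 'log.txt']): print(line) would print each new line until the loop is interrupted.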