Below is the problem statement and requirements for a Python data-processing exercise:
The attachment is a log file that shows the running status of a set-top box. Each line in the file follows the format “LineNumber + Time + ProcessName + (ProcessID) + Logs”, and the logs are currently displayed in time order. Please write a Python script that supports the following features:
This is the log text file produced by the set-top box. Part of it looks like this when opened:
[Screenshot: part of the log file]
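At first glance it looks like a mess. In fact, it should not be opened with Microsoft's plain-text editor. Opened in Notepad++ instead, the structure becomes much clearer. Part of it looks like this: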
[Screenshot: part of the log file opened in Notepad++]
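To make the line format concrete, here is a minimal Python 3 sketch that splits one line into its five fields. It is only an illustration: the exact spacing between fields, the character set allowed in process names, the helper name `parse_line`, and the sample line are all assumptions, not taken from the real log.

```python
import re

# Assumed layout: "<line no> <HH:MM:SS.mmm> <process> (<pid>) <message>".
# The separators and the process-name character set are guesses.
LINE_RE = re.compile(
    r"^\s*(?P<lineno>\d+)\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<process>[A-Za-z0-9_.\-]+)\s*"
    r"\((?P<pid>\d+)\)\s*"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Hypothetical sample line, invented for this example.
sample = "12 08:15:30.123 media-server (431) decoder started\n"
fields = parse_line(sample)
print(fields["process"], fields["pid"])
```

Returning `None` for lines that do not match makes it easy to spot lines where the assumed layout is wrong before relying on the parsed fields.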
The code is given below.
Code for question 1:
#coding=utf-8
import re

f1 = open('stblog.txt', 'r')
f2 = open('cc1.txt', 'w')
list1 = f1.readlines()
list_process = []  # process name extracted from each line
res = r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z]+)'
for i in range(len(list1)):
    list_process.append(re.findall(res, str(list1[i])))
for i in range(len(list_process)):  # check that the regex matches at most once per line
    if len(list_process[i]) > 1:
        print 'zheng ze fail'  # i.e. "regex failed"
#print len(list_process)
#print len(list1)
#print list_process[141]
#print list1[141]
for m in range(len(list1)):  # sort lines by process name (simple exchange sort)
    for n in range(m+1, len(list1)):
        if cmp(list_process[m], list_process[n]) > 0:
            list_process[m], list_process[n] = list_process[n], list_process[m]
            list1[m], list1[n] = list1[n], list1[m]
f2.writelines(list1)
f1.close()
f2.close()
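The script above is Python 2 (`print` statement, `cmp`) and sorts with a hand-written O(n²) exchange loop. As a rough alternative, the same grouping by process name can be sketched in Python 3 with the built-in `sorted`, which is stable, so lines of the same process keep their original time order. The names `process_name` and `sort_by_process`, the widened process-name pattern, and the sample lines are assumptions made for this illustration.

```python
import re

# Time stamp followed by the process name. The character set is an assumption.
PROCESS_RE = re.compile(r"\d{2}:\d{2}:\d{2}\.\d{3}\s+([A-Za-z0-9_.\-]+)")


def process_name(line):
    """Return the process name found in a log line, or '' if none is found."""
    match = PROCESS_RE.search(line)
    return match.group(1) if match else ""


def sort_by_process(lines):
    """Sort lines by process name, keeping the existing order within a process."""
    # sorted() is stable, so equal keys stay in their input (time) order.
    return sorted(lines, key=process_name)


# Hypothetical sample lines, invented for this example.
sample_lines = [
    "1 08:00:00.001 ui (20) boot\n",
    "2 08:00:00.002 audio (10) init\n",
    "3 08:00:00.003 ui (20) ready\n",
    "4 08:00:00.004 audio (10) playing\n",
]
for line in sort_by_process(sample_lines):
    print(line, end="")
```

On a real file, the list could come from `open('stblog.txt').readlines()` and the result could be written out with `writelines`, as in the original script.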
Code for questions 2 and 3:
#coding=utf-8
import re

f1 = open('stblog.txt', 'r')
f2 = open('cc2.txt', 'w')
list1 = f1.readlines()
list_process = []  # process name extracted from each line
list2 = []  # lines that belong to the requested process
count = 0
res = r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z\.\-]+)'
for i in range(len(list1)):
    list_process.append(re.findall(res, str(list1[i])))
for i in range(len(list_process)):  # check that the regex matches at most once per line
    if len(list_process[i]) > 1:
        print 'zheng ze fail'  # i.e. "regex failed"
s = raw_input("please input the log you interested:")
for i in range(len(list_process)):
    if list_process[i] == s.split():
        list2.append(list1[i])  # collect the matching line for cc2.txt
        count += 1
print count
f2.writelines(list2)
f1.close()
f2.close()
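For comparison, the filter-and-count step can be sketched in Python 3 as a small function that takes the lines and the process name as arguments, which makes it testable without a file or keyboard input. The names `process_name` and `filter_by_process`, the process-name pattern, and the sample lines are assumptions made for this illustration.

```python
import re

# Time stamp followed by the process name. The character set is an assumption.
PROCESS_RE = re.compile(r"\d{2}:\d{2}:\d{2}\.\d{3}\s+([A-Za-z0-9_.\-]+)")


def process_name(line):
    """Return the process name found in a log line, or '' if none is found."""
    match = PROCESS_RE.search(line)
    return match.group(1) if match else ""


def filter_by_process(lines, name):
    """Return only the lines whose process name equals `name`."""
    wanted = name.strip()
    return [line for line in lines if process_name(line) == wanted]


# Hypothetical sample lines, invented for this example.
sample_lines = [
    "1 08:00:00.001 ui (20) boot\n",
    "2 08:00:00.002 audio (10) init\n",
    "3 08:00:00.003 ui (20) ready\n",
]
matched = filter_by_process(sample_lines, "ui")
print(len(matched))
```

In an interactive script, the name would come from `input(...)` and the matched lines would be written to the output file, with `len(matched)` serving as the count.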