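Background:
Recently I needed to parse some tree-structured debug trace text. To make it easier to read, I wanted to flatten each leaf into an a.b.c path.
Each child node is indented one Tab deeper than its parent, and leaf nodes start with ptr (ignoring the leading Tabs).
Core idea:
First find a leaf node, then walk backwards line by line to find its parent nodes (a parent has one Tab fewer than the current node). When "}" is encountered, the tree has ended.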
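The backward walk described above can be sketched roughly as follows. This is only a minimal illustration, not the program listed in this post: the names leaf_paths, first_bracket and sample are made up here, the lines are assumed to be already loaded into a list, and the walk stops once the ancestor with zero Tabs has been found instead of testing for a brace.

```python
import re


def first_bracket(text):
    """Return the text inside the first [...] pair, or '' if there is none."""
    match = re.search(r"\[([^\]]*)\]", text)
    return match.group(1) if match else ""


def leaf_paths(lines, leaf_marker="ptr"):
    """Build one dotted path per leaf by walking backwards to its ancestors."""
    paths = []
    for idx, line in enumerate(lines):
        body = line.lstrip("\t")
        if not body.startswith(leaf_marker):
            continue
        depth = len(line) - len(body)      # number of leading Tabs
        parts = [first_bracket(body)]
        want = depth - 1                   # Tab count of the parent we need next
        back = idx - 1
        while back >= 0 and want >= 0:
            prev = lines[back]
            prev_body = prev.lstrip("\t")
            prev_depth = len(prev) - len(prev_body)
            # The nearest earlier "name" line with exactly `want` Tabs is the parent.
            if prev_depth == want and prev_body.startswith("name"):
                parts.append(first_bracket(prev_body))
                want -= 1
            back -= 1
        paths.append(".".join(reversed(parts)))
    return paths


# Hypothetical miniature trace, written with explicit "\t" escapes.
sample = [
    "service[hi]",
    "name: [1]",
    "{",
    "\tname:[11]",
    "\t{",
    "\t\tptr -- [111]--[value0]",
    "\t}",
    "\tname:[12]",
    "\t{",
    "\t\tptr -- [121]--[value3]",
    "\t}",
    "}",
]

if __name__ == "__main__":
    for path in leaf_paths(sample):
        print(path)
```

Comparing the exact Tab count, rather than using startswith on a Tab prefix, keeps a deeper line from being mistaken for a shallower one.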
To simulate a debug trace, create a text file named 1.txt.
Its contents are as follows:
service[hi]
name: [1]
{
	name:[11]
	{
		name: [111]
		{
			ptr -- [1111]--[value0]
			ptr -- [1112]--[value1]
		}
		name: [112]
		{
			name: [1121]
			{
				ptr -- [111211]--[value2]
			}
		}
	}
	name:[12]
	{
		ptr -- [121]--[value3]
	}
	name:[13]
	{
		ptr -- [131]--[value4]
	}
}
service[Jeff]
name: [1]
{
	name:[11]
	{
		name: [111]
		{
			ptr -- [1111]--[value0]
			ptr -- [1112]--[value1]
		}
		name: [112]
		{
			name: [1121]
			{
				ptr -- [111211]--[value2]
			}
		}
	}
	name:[12]
	{
		ptr -- [121]--[value3]
	}
	name:[13]
	{
		ptr -- [131]--[value4]
	}
}
The parsing program is as follows:
1. common.py
'''
Created on 2012-5-26

author: Jeff
'''


def getValue(string, key1, key2):
    """
    get the value between key1 and key2 in string
    """
    index1 = string.find(key1)
    # look for key2 only after key1, so an earlier key2 cannot be picked up
    index2 = string.find(key2, index1 + len(key1))
    value = string[index1 + len(key1):index2]
    return value


def getFiledNum(string, key, begin):
    """
    get the number of key in string from begin position
    """
    keyNum = 0
    start = begin
    while True:
        index = string.find(key, start)
        if index == -1:
            break
        keyNum = keyNum + 1
        start = index + 1
    return keyNum
2. main.py
'''
Created on 2012-5-26

author: Jeff
'''
import common
import linecache

fileName = "1.txt"
fileNameWrite = "result.txt"
leafNode = "ptr"
curLine = 0

f = open(fileName, 'r')
fw = open(fileNameWrite, 'w')

# read the file line by line
while True:
    data = f.readline()
    if not data:
        break
    curLine = curLine + 1

    # a service line is copied to the output unchanged
    if data.startswith("service"):
        header = data.rstrip('\n')
        print(header)
        fw.write(header + '\n')
        continue

    # find the leaf node
    if data.find(leafNode) != -1:
        # value of the leaf node
        value = common.getValue(data, '[', ']')
        string = value

        # get the number of tabs
        tabNum = common.getFiledNum(data, '\t', 0)

        # i is the number of the previous line to read
        # j is the number of tabs used to build the prefix
        i = curLine - 1
        j = tabNum - 1
        while True:
            # guard: never walk past the first line of the file
            if i < 1:
                break
            prefix = '\t' * j + 'name'
            # get the previous line
            preline = linecache.getline(fileName, i)
            if preline.startswith("{"):
                break
            if preline.startswith(prefix):
                # this line is a parent: prepend its value
                value = common.getValue(preline, '[', ']')
                string = value + "." + string
                i = i - 1
                j = j - 1
            else:
                i = i - 1
        print(string)
        fw.write(string + '\n')

fw.close()
f.close()
Parsing result, result.txt:
service[hi]
1.11.111.1111
1.11.111.1112
1.11.111.1121.111211
1.12.121
1.13.131
service[jeff]
1.1.11.111.1111
1.1.11.111.1112
1.1.11.111.1121.111211
1.1.12.121
1.1.13.131
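Optimizations:
1. Replace the string concatenation with the form all = '%s%s%s%s' % (str0, str1, str2, str3), or with ''.join.
2. Keep the content to be written in a list, and write it all at once at the end with f.writelines(list).
3. For the algorithm optimization, see my blog post "Python 解析树状结构文件(算法优化)" (Parsing a tree-structured file in Python: algorithm optimization).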
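Points 1 and 2 above might look roughly like this. It is only a sketch under assumptions: build_path and write_results are hypothetical helper names, and the path parts are assumed to be collected in a list from the root down to the leaf.

```python
def build_path(parts):
    """Join the collected node values once instead of concatenating repeatedly."""
    return ".".join(parts)


def write_results(path, results):
    """Collect every output line first, then write them in a single call."""
    with open(path, "w") as fw:
        # writelines does not add line endings, so append them here
        fw.writelines(line + "\n" for line in results)
```

Note that writelines writes the strings exactly as given, which is why the newline is appended explicitly.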