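I wanted to do some simple file operations. Java is too heavyweight for that, and Python is a good fit.
Because local resources are limited, deploying to an Alibaba Cloud server is the better choice.
The requirement: extract the contents of every file in a folder and put each file's contents into its own Excel cell.
The os module is enough to traverse the files and read them: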

    # Python 2 syntax (print statement).
    import os

    # Get all files and directories under the specified directory (path).
    def get_recursive_file_list(path):
        current_files = os.listdir(path)
        all_files = []
        for file_name in current_files:
            full_file_name = os.path.join(path, file_name)
            all_files.append(full_file_name)
            if os.path.isdir(full_file_name):
                next_level_files = get_recursive_file_list(full_file_name)
                all_files.extend(next_level_files)
        return all_files

    # Raw string so the backslashes in the Windows path are taken literally.
    all_files = get_recursive_file_list(r'C:\Users\green_pasture\Desktop\korea\key_words')
    for filename in all_files:
        if os.path.isdir(filename):
            continue  # the list also contains directories, which open() cannot read
        print filename
        f1 = open(filename, 'r+')
        for line1 in f1:
            print "\n"
            print line1,
        f1.close()

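However, I ran into a problem: IndentationError: unindent does not match any outer indentation level. So I looked up how Python handles indentation:
http://www.secnetix.de/olli/Python/block_indentation.hawk
The article, on myths about Python indentation, says that only whitespace at the indentation level (the far left of a statement) is significant, and that the exact amount of indentation does not matter, only the relative indentation of nested blocks.
Indentation is also ignored on explicit or implicit continuation lines.
You can write the statements of an inner block on one line, separated by semicolons. If you put them on separate lines, Python forces you to follow its indentation rules. In Python, the indentation level always matches the logical structure of the code.
Do not mix tabs and spaces. When comparing indentation, Python 2 expands each tab to the next multiple of eight columns, which rarely matches what an editor displays; Python 3 rejects inconsistent mixing with a TabError.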

    # Python 2 syntax (print statement).
    import os

    # Get all files and directories under the specified directory (path).
    def get_recursive_file_list(path):
        current_files = os.listdir(path)
        all_files = []
        for file_name in current_files:
            full_file_name = os.path.join(path, file_name)
            all_files.append(full_file_name)
            if os.path.isdir(full_file_name):
                next_level_files = get_recursive_file_list(full_file_name)
                all_files.extend(next_level_files)
        return all_files

    # Raw string so the backslashes in the Windows path are taken literally.
    all_files = get_recursive_file_list(r'C:\Users\green_pasture\Desktop\korea\key_words')
    for filename in all_files:
        if os.path.isdir(filename):
            continue  # the list also contains directories, which open() cannot read
        print filename
        f1 = open(filename, 'r+')
        for line1 in f1: print line1,
        f1.close()

I put the statement inside the inner for loop on the same line as the for, and the error went away.
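
The indentation rules quoted above can be checked without running a whole script, by handing small snippets to the built-in compile(). The following is a minimal sketch, assuming Python 3 (where inconsistent mixing of tabs and spaces raises TabError, a subclass of IndentationError). The helper check_indentation and the snippet names are hypothetical, chosen only for illustration.

```python
def check_indentation(source):
    """Return "ok" if source compiles, else the name of the error raised."""
    try:
        compile(source, "<snippet>", "exec")
        return "ok"
    except SyntaxError as exc:  # IndentationError and TabError are subclasses
        return type(exc).__name__


# Dedent to a level that was never opened (the kind of error described above).
bad_dedent = "if True:\n        x = 1\n    y = 2\n"

# One line indented with a tab, the next with eight spaces.
mixed_tabs = "if True:\n\tx = 1\n        y = 2\n"

# Loop body on the same line as the for, statements separated by semicolons.
one_liner = "for i in range(3): x = i; y = i * 2\n"

# Indentation inside an implicit continuation (open parenthesis) is ignored.
continuation = "total = (1 +\n  2 +\n        3)\n"

for name, src in [("bad_dedent", bad_dedent), ("mixed_tabs", mixed_tabs),
                  ("one_liner", one_liner), ("continuation", continuation)]:
    print(name, "->", check_indentation(src))
```

The first two snippets are rejected at compile time, while the one-line loop and the continuation line are accepted, which is why moving the loop body onto the for line sidesteps the problem.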
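Next, I wanted to write the printed file contents into Excel cells.
Libraries that can do this:
XlsxWriter https://xlsxwriter.readthedocs.org/
xlwt http://www.python-excel.org/
openpyxl http://pythonhosted.org/openpyxl/
XlsxWriter's documentation is quite complete, so I decided to use it.
Its sample code is: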

    ##############################################################################
    #
    # A simple example of some of the features of the XlsxWriter Python module.
    #
    # Copyright 2013-2014, John McNamara, jmcnamara@cpan.org
    #
    import xlsxwriter

    # Create a new Excel file and add a worksheet.
    workbook = xlsxwriter.Workbook('demo.xlsx')
    worksheet = workbook.add_worksheet()

    # Widen the first column to make the text clearer.
    worksheet.set_column('A:A', 20)

    # Add a bold format to use to highlight cells.
    bold = workbook.add_format({'bold': True})

    # Write some simple text.
    worksheet.write('A1', 'Hello')

    # Text with formatting.
    worksheet.write('A2', 'World', bold)

    # Write some numbers, with row/column notation.
    worksheet.write(2, 0, 123)
    worksheet.write(3, 0, 123.456)

    # Insert an image.
    worksheet.insert_image('B5', 'logo.png')

    workbook.close()

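
The sample mixes two ways of addressing a cell: A1-style names such as 'A2' and zero-based (row, column) pairs such as (2, 0). The mapping between them can be sketched in plain Python as below. The function rowcol_to_a1 is a hypothetical helper written for illustration only; XlsxWriter ships its own converter, xl_rowcol_to_cell, in xlsxwriter.utility.

```python
def rowcol_to_a1(row, col):
    """Convert zero-based (row, col) indices to an A1-style cell name."""
    if row < 0 or col < 0:
        raise ValueError("row and col must be non-negative")
    letters = ""
    n = col + 1  # switch to 1-based for the base-26 letter scheme
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters + str(row + 1)  # Excel rows are 1-based


# The two numeric writes in the sample target these cells.
print(rowcol_to_a1(2, 0))
print(rowcol_to_a1(3, 0))
```

Numeric indices are the more convenient form when the row comes from a loop counter, as it does when writing one file per row.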
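How to install it? pip can do this. My Python directory, C:\Python33\Scripts, already contained pip.exe, so I added that directory to the PATH environment variable and ran pip install XlsxWriter on the command line.
It failed with an error:
Fatal error in launcher: Unable to create process using C:\Python33\Scripts\pip.exe install XlsxWriter
So I installed from GitHub instead: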

    $ git clone https://github.com/jmcnamara/XlsxWriter.git
    $ cd XlsxWriter
    $ sudo python setup.py install

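
The "Fatal error in launcher" message is produced by the pip.exe wrapper, not by pip itself, so a commonly suggested workaround is to skip the wrapper and run pip as a module of the interpreter (python -m pip install XlsxWriter). The sketch below does the same from Python. The function names are hypothetical, and whether this would have resolved the error on this particular machine is an assumption, since it also requires pip to be importable by that interpreter.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the argument list for running pip via the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only print the command here; call pip_install("XlsxWriter") to run it.
    print(" ".join(pip_install_command("XlsxWriter")))
```

Using sys.executable ensures the package is installed for the same interpreter that will later import it, which matters when several Python versions are installed side by side.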
I wrote a test program:

    import xlsxwriter

    workbook = xlsxwriter.Workbook('hello.xlsx')
    worksheet = workbook.add_worksheet()
    worksheet.write('A1', 'Hello world')
    workbook.close()

The test succeeded. The final script:

    # Python 2 syntax (print statement, str.decode).
    import os
    import xlsxwriter

    # Get all files and directories under the specified directory (path).
    def get_recursive_file_list(path):
        current_files = os.listdir(path)
        all_files = []
        for file_name in current_files:
            full_file_name = os.path.join(path, file_name)
            all_files.append(full_file_name)
            if os.path.isdir(full_file_name):
                next_level_files = get_recursive_file_list(full_file_name)
                all_files.extend(next_level_files)
        return all_files

    workbook = xlsxwriter.Workbook('keywords.xlsx')
    worksheet = workbook.add_worksheet()
    # Raw string so the backslashes in the Windows path are taken literally.
    all_files = get_recursive_file_list(r'C:\Users\green_pasture\Desktop\korea\key_words')
    row = 0
    for filename in all_files:
        if os.path.isdir(filename):
            continue  # the list also contains directories, which open() cannot read
        print filename
        f1 = open(filename, 'r+')
        # Strip the newline each line already carries before re-joining.
        lines = [line.rstrip("\n") for line in f1]
        f1.close()
        keywords = "\n".join(lines)
        print keywords
        # Column A: file path; column B: file contents, decoded from UTF-8.
        worksheet.write(row, 0, filename)
        worksheet.write(row, 1, keywords.decode("utf-8"))
        row = row + 1
    workbook.close()

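
The traversal and reading steps can also be sketched with the standard library's os.walk, which yields only file names in its third element and so needs no directory check. This is a minimal sketch assuming Python 3 and UTF-8 encoded input files. The function collect_file_rows is a hypothetical name, not part of the script above.

```python
import os


def collect_file_rows(root, encoding="utf-8"):
    """Return a list of (path, text) pairs for every regular file under root."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # The with block closes the file even if reading fails.
            with open(path, "r", encoding=encoding) as handle:
                text = handle.read()
            rows.append((path, text.rstrip("\n")))
    return rows
```

Each pair maps onto one spreadsheet row: iterating with enumerate(collect_file_rows(folder)) gives the row index, the path for column 0, and the text for column 1, which can be passed to worksheet.write in the same way as in the script.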