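When exporting database data to an Excel file, a very large data set can consume enough memory to cause an OOM (OutOfMemoryError). Even without an OOM, generating the Excel file may take so long that the request times out. This is where POI's SXSSF (org.apache.poi.xssf.streaming) comes in.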
HSSF: Excel 97(-2007) file format
XSSF: Excel 2007 OOXML (.xlsx) file format
HSSF is the POI Project's pure Java implementation of the Excel '97(-2007) file format. XSSF is the POI Project's pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.
HSSF and XSSF provide ways to create, modify, read and write XLS spreadsheets.
Since 3.8-beta3, POI provides a low-memory footprint SXSSF API built on top of XSSF.
SXSSF is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk.
In auto-flush mode the size of the access window can be specified, to hold a certain number of rows in memory. When that value is reached, the creation of an additional row causes the row with the lowest index to be removed from the access window and written to disk. Or, the window size can be set to grow dynamically; it can be trimmed periodically by an explicit call to flushRows(int keepRows) as needed.
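The window behaviour described above can be pictured with a small toy model. This is only a sketch of the idea and not POI's implementation: the class RowWindow and all of its methods are hypothetical names invented for illustration, and rows are stored as plain strings rather than spreadsheet rows.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a row access window (hypothetical, NOT POI's code). */
class RowWindow implements AutoCloseable {
    private final int windowSize; // negative means "never auto-flush"
    private final TreeMap<Integer, String> rows = new TreeMap<>();
    private final Path tempFile;
    private final BufferedWriter spill;

    RowWindow(int windowSize) throws IOException {
        this.windowSize = windowSize;
        this.tempFile = Files.createTempFile("rows", ".tmp");
        this.spill = Files.newBufferedWriter(tempFile, StandardCharsets.UTF_8);
    }

    /** Adds a row; in auto-flush mode the lowest-indexed rows beyond the window go to disk. */
    void createRow(int index, String content) throws IOException {
        rows.put(index, content);
        if (windowSize >= 0) {
            flushRows(windowSize);
        }
    }

    /** Writes the oldest rows to the temp file until at most keepRows remain in memory. */
    void flushRows(int keepRows) throws IOException {
        int keep = Math.max(keepRows, 0);
        while (rows.size() > keep) {
            Map.Entry<Integer, String> oldest = rows.pollFirstEntry();
            spill.write(oldest.getKey() + "\t" + oldest.getValue());
            spill.newLine();
        }
    }

    /** A row is only accessible while it is still inside the in-memory window. */
    boolean isAccessible(int index) { return rows.containsKey(index); }

    int rowsInMemory() { return rows.size(); }

    Path tempFile() { return tempFile; }

    /** Flushes whatever is left in memory and closes the temp file. */
    @Override
    public void close() throws IOException {
        flushRows(0);
        spill.close();
    }
}
```

With a window of 3, creating rows 0 to 4 leaves only rows 2, 3 and 4 accessible. With a window of -1 nothing is written to disk until flushRows is called explicitly.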
Due to the streaming nature of the implementation, SXSSF has limitations when compared to XSSF; most notably, only a limited number of rows are accessible at any point in time.
How does SXSSF reduce memory consumption? It writes data to a temporary file, which lowers memory usage and the chance of an OOM error.
// turn off auto-flushing and accumulate all rows in memory
SXSSFWorkbook wb = new SXSSFWorkbook(-1);
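You can also pass -1 to the constructor to turn off auto-flushing and keep all data in memory.
Although this addresses the OOM problem, the entire data set must still be written to a temporary file before the response can be sent, so the request timeout problem remains unsolved.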
The Excel 2007 OOXML (.xlsx) file format is essentially a zip file. We can rename an .xlsx file to .zip and then unzip it:
$ mv output.xlsx output.zip
$ unzip output.zip -d output
$ tree output/
output/
├── [Content_Types].xml
├── _rels
├── docProps
│   ├── app.xml
│   └── core.xml
└── xl
    ├── _rels
    │   └── workbook.xml.rels
    ├── sharedStrings.xml
    ├── styles.xml
    ├── workbook.xml
    └── worksheets
        └── sheet1.xml
5 directories, 8 files
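As we can see, the unzipped Excel file contains the files listed above. styles.xml holds the styles we defined (font, text size, color, alignment and so on), and the worksheets directory holds our data.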
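The same inspection can be done from Java with the standard java.util.zip classes. The following is a minimal sketch; the class name XlsxEntryLister and the method listEntries are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper that treats an .xlsx stream as a plain zip archive. */
class XlsxEntryLister {
    /** Returns the names of all zip entries in the given stream, in archive order. */
    static List<String> listEntries(InputStream xlsx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

Calling listEntries on a stream opened over output.xlsx should return entry paths such as xl/styles.xml and xl/worksheets/sheet1.xml, matching the tree shown above.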
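By analyzing the data format in detail, we can control how the xlsx file is written ourselves: writing the data directly to the response stream instead of a temporary file solves the request timeout problem.
Sample code: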
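// Build an empty workbook that only defines the styles and the sheets.
// sheets, genHeaderStyle and genTemplateFile are the author's own helpers (not shown).
XSSFWorkbook wb = new XSSFWorkbook()
XSSFCellStyle headerStyle = genHeaderStyle(wb)
sheets.each { sheet ->
    def xssfSheet = wb.createSheet(sheet.name)
    sheet.setXSSFSheet(xssfSheet)
    sheet.setHeaderStyle(headerStyle)
}
// Save the empty workbook to a file that serves as the template.
File template = genTemplateFile(wb)

ZipOutputStream zos = new ZipOutputStream(responseStream);
ZipFile templateZip = new ZipFile(template);
Enumeration<? extends ZipEntry> templateEntries = templateZip.entries();
try {
    while (templateEntries.hasMoreElements()) {
        ZipEntry entry = templateEntries.nextElement();
        // copy all template content to the ZipOutputStream zos
        // except the sheet itself
    }
    // now the sheet; sheetName is its entry path inside the zip, e.g. "xl/worksheets/sheet1.xml"
    zos.putNextEntry(new ZipEntry(sheetName));
    OutputStreamWriter sheetOut = new OutputStreamWriter(zos, "UTF-8");
    try {
        sheetOut.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        sheetOut.write("<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\"><sheetData>");
        // write the content: rows and cells
        sheetOut.write("</sheetData></worksheet>");
    } finally { sheetOut.close(); } // closing the writer also closes zos
} finally {
    zos.close();
    templateZip.close();
}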
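Here, template contains the index-like information, such as which styles were created and how many sheets there are. This information goes at the front of the ZIP file, and the sheet data comes last.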
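To make the ordering concrete, here is a self-contained sketch of the same approach using only java.util.zip. It is an illustration under assumptions rather than the author's actual code: the class StreamingXlsxWriter, its method names, and the choice to write every cell as an inline string are all hypothetical, and styles are ignored.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch: copy the template entries first, then stream the sheet XML last. */
class StreamingXlsxWriter {
    static final String NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Writes an xlsx archive to out. sheetEntry is the sheet's path inside the zip,
     * for example "xl/worksheets/sheet1.xml". Closing the zip stream also closes out.
     */
    static void write(File template, String sheetEntry, Iterable<String[]> rows, OutputStream out)
            throws IOException {
        try (ZipFile templateZip = new ZipFile(template);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            // 1. Copy every template entry except the (empty) sheet itself.
            byte[] buf = new byte[8192];
            Enumeration<? extends ZipEntry> entries = templateZip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.getName().equals(sheetEntry)) {
                    continue;
                }
                zos.putNextEntry(new ZipEntry(entry.getName()));
                try (InputStream in = templateZip.getInputStream(entry)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        zos.write(buf, 0, n);
                    }
                }
                zos.closeEntry();
            }
            // 2. Stream the sheet data as the last entry, row by row.
            zos.putNextEntry(new ZipEntry(sheetEntry));
            Writer w = new OutputStreamWriter(zos, StandardCharsets.UTF_8);
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            w.write("<worksheet xmlns=\"" + NS + "\"><sheetData>");
            int rowNumber = 1;
            for (String[] row : rows) {
                w.write("<row r=\"" + rowNumber++ + "\">");
                for (String cell : row) {
                    // Inline strings avoid having to maintain sharedStrings.xml.
                    w.write("<c t=\"inlineStr\"><is><t>" + escape(cell) + "</t></is></c>");
                }
                w.write("</row>");
            }
            w.write("</sheetData></worksheet>");
            w.flush(); // push buffered characters into the zip entry before closing it
            zos.closeEntry();
        }
    }
}
```

Because rows are pulled from an Iterable, they could come straight from a database cursor, so bytes start reaching the response stream before the whole result set has been read.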
Original post on my personal blog: blog.yu000hong.com/2018/07/24/…