excel_import()解析php
前言:由于业务须要,我开始的需求是:既拿到正确的日期格式,读取到的excel数据格式也跟以前相同,后期能尽可能减小代码的修改。可是,我后期虽然拿到了相同的数据格式,最后由于执行速度不佳,被迫放弃,用了createdreader->getActiveSheet->toArray方法来达到目的。可是这是我第一次有修改/研究源码的行为,仍是值得mark一下的。数据库
-----------------------------------------------------------------------------------------------post
如下是前期想法:性能
缺陷一:spa
drupal 利用 PHPExcel_IOFactory 对象封装了一个excel_import的方法,可是这个方法有个比较神奇的属性, setReadDataOnly。excel
看了看源码,这个属性大意是,setReadDataOnly为true的时候就不会读excel中每一个cell的style,例如border、color,从而增长了读取速度。orm
开启这个模式的优点:大概是性能比较好。具体运行速度我试了一下,1000行的excel读取速度是7s,下面我将这个方法的setReadDataOnly设置为false,1000行的excel读取速度为30秒,差异有点大哦。对象
setReadDataOnly的影响:get
(1):丢掉了excel中每一个cell单独的formatCode,并统一将他们的formatCode设置为general。源码
(2):删掉了无效的cell,即便cell有border这样的干扰因素。
这样作的结果是,某些特殊的格式在倒入数据库中得不到正确的格式/值。例如2019-03-06,获得的结果是43530
缺陷二:
没有转换格式的操做。
这是缺陷一带来的必然结果,由于每一个cell的formatCode都是general了,还换啥格式?
解决方法(我修改了一下excel_import方法):
思路很简单。
(1):拿掉setReadDataOnly模式。
(2):增长转换格式操做。
function phpexcel_import_change_format($path, $keyed_by_headers = TRUE, $keyed_by_worksheet = FALSE) {
if (is_readable($path)) {
$library = libraries_load('PHPExcel');
if (!empty($library['loaded'])) {
// Determine caching method.
list($cache_method, $cache_settings) = _phpexcel_get_cache_settings();
// Is it available ? If not, return an error.
if (empty($cache_method)) {
return PHPEXCEL_CACHING_METHOD_UNAVAILABLE;
}
PHPExcel_Settings::setCacheStorageMethod($cache_method, $cache_settings);
$xls_reader = PHPExcel_IOFactory::createReaderForFile($path);
//注释掉setReadDataOnly模式操做
// if (!empty($custom_calls)) {
// foreach ($custom_calls as $method => $args) {
// if (method_exists($xls_reader, $method)) {
// call_user_func_array(array($xls_reader, $method), $args);
// }
// }
// }
$xls_data = $xls_reader->load($path);
$data = array();
$headers = array();
$options = array(
'path' => $path,
'keyed_by_headers' => $keyed_by_headers,
'keyed_by_worksheet' => $keyed_by_worksheet,
);
$i = 0;
phpexcel_invoke('import', 'full', $xls_data, $xls_reader, $options);
foreach ($xls_data->getWorksheetIterator() as $worksheet) {
$j = 0;
phpexcel_invoke('import', 'sheet', $worksheet, $xls_reader, $options);
foreach ($worksheet->getRowIterator() as $row) {
if ($keyed_by_worksheet) {
$i = $worksheet->getTitle();
}
$k = 0;
$cells = $row->getCellIterator();
$cells->setIterateOnlyExistingCells(FALSE);
phpexcel_invoke('import', 'row', $row, $xls_reader, $options);
foreach ($cells as $cell) {
//转换格式操做
$style=$xls_data->getActiveSheet()->getParent()->getCellXfByIndex($cell->getXfIndex());
$value_t = $cell->getValue();
$value=PHPExcel_Style_NumberFormat::toFormattedString(
$value_t,
($style && $style->getNumberFormat()) ?
$style->getNumberFormat()->getFormatCode() :
PHPExcel_Style_NumberFormat::FORMAT_GENERAL
); $value = drupal_strlen($value) ? trim($value) : ''; if (!$j && $keyed_by_headers) { $value = drupal_strlen($value) ? $value : $k; phpexcel_invoke( 'import', 'pre cell', $value, $cell, $options, $k, $j ); $headers[$i][] = $value; } elseif ($keyed_by_headers) { phpexcel_invoke( 'import', 'pre cell', $value, $cell, $options, $k, $j ); $data[$i][$j - 1][$headers[$i][$k]] = $value; phpexcel_invoke( 'import', 'post cell', $data[$i][$j - 1][$headers[$i][$k]], $cell, $options, $k, $j ); } else { $col_index = $k; if ($cells->getIterateOnlyExistingCells()) { $col_index = PHPExcel_Cell::columnIndexFromString($cell->getColumn()) - 1; } phpexcel_invoke( 'import', 'pre cell', $value, $cell, $options, $col_index, $j ); $data[$i][$j][$col_index] = $value; phpexcel_invoke( 'import', 'post cell', $data[$i][$j][$col_index], $cell, $options, $col_index, $j ); } $k++; } $j++; } if (!$keyed_by_worksheet) { $i++; } } // Free up memory. $xls_data->disconnectWorksheets(); unset($xls_data); return $data; } else { watchdog('phpexcel', "Couldn't find the PHPExcel library. Excel import aborted.", array(), WATCHDOG_ERROR); return PHPEXCEL_ERROR_LIBRARY_NOT_FOUND; } } else { watchdog('phpexcel', "The path '@path' is not readable. Excel import aborted.", array('@path' => $path)); return PHPEXCEL_ERROR_FILE_NOT_READABLE; }}