Splitting a file with two shell commands

Date: 2020-04-26


On Linux, the head and tail commands show the first few and the last few lines of a file, respectively. For example:

head -n 100 test shows the first 100 lines of the file test……
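As a quick illustration, the sketch below builds a throwaway 10-line file and prints its first and last three lines. The file name demo.txt and the use of seq and mktemp are assumptions made for this example; they do not come from the original script.

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)" || exit 1
seq 1 10 > demo.txt          # demo.txt holds the numbers 1..10, one per line

first=$(head -n 3 demo.txt)  # first three lines: 1, 2, 3
last=$(tail -n 3 demo.txt)   # last three lines: 8, 9, 10

echo "$first"
echo "$last"
```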

Here we use head and tail to split a file: a file with, say, 100 lines is cut into n pieces of equal length. For example:

./cutfile.sh test 4 splits test into n files, each holding 4 lines of test. If test has 100 lines, n is 25. If test has 10 lines, n is 3, and the three files hold 4, 4, and 2 lines respectively.
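The piece count described above is a ceiling division. A minimal sketch of that arithmetic, using the 10-line example; the variable names total, size, n, and last are chosen for illustration and do not appear in cutfile.sh:

```shell
total=10   # number of lines in the input (example value)
size=4     # lines per piece, i.e. the second argument

# Ceiling division: adding size-1 first makes a partial piece count as one.
n=$(( (total + size - 1) / size ))

# Lines in the last piece: the remainder, or a full piece if it divides evenly.
last=$(( total % size ))
if [ "$last" -eq 0 ]; then
  last=$size
fi

echo "pieces: $n, last piece: $last lines"
```

With total=100 the same arithmetic gives 25 pieces, all of them full.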

We assume the output files are named by appending 0, 1, 2, … to the original file name. For example:

splitting test into 3 files produces test0, test1, and test2.

Now let's look at how cutfile.sh is implemented:

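#!/bin/bash
# usage: ./cutfile.sh FILE LINES_PER_PIECE

# print arguments
echo "argument 1 is:$1"
echo "argument 2 is:$2"

# variable initialization
file_no=0                    # suffix of the next output file
curr_len=$2                  # number of leading lines covered so far
total_len=$(wc -l < "$1")    # total number of lines in the input

echo "total length is:$total_len"

# cut the full pieces of $2 lines each
while [ "$curr_len" -le "$total_len" ]
do
  echo "current length is:$curr_len"
  head -n "$curr_len" "$1" | tail -n "$2" > "$1$file_no"
  file_no=$(expr $file_no + 1)
  curr_len=$(expr $curr_len + $2)
done

# if total_len is an exact multiple of $2, nothing is left over;
# stop here so that no empty trailing file is created
if [ "$(expr $curr_len - $2)" -eq "$total_len" ]
then
  exit
fi

# cut the remaining lines (fewer than $2)
if [ "$curr_len" -gt "$total_len" ]
then
  echo "cut rest file"
  head -n "$curr_len" "$1" | tail -n "$(expr $2 - $curr_len + $total_len)" > "$1$file_no"
fi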

The basic idea, using ./cutfile.sh test 4 as an example, where test contains the following, one number per line:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

First pass: head -n 4 test|tail -n 4 > test0

test0 contains: 1 2 3 4

Second pass: head -n 8 test|tail -n 4 > test1

test1 contains: 5 6 7 8

Third pass: head -n 12 test|tail -n 4 > test2

test2 contains: 9 10 11 12
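The three passes follow one pattern: head -n END | tail -n SIZE selects lines END-SIZE+1 through END. The sketch below replays them in a loop on the same sample data. It assumes seq and mktemp are available, and the variables size, i, and end are names introduced here for illustration.

```shell
# Work in a scratch directory so no existing file named test is overwritten.
cd "$(mktemp -d)" || exit 1
seq 1 15 > test                     # same sample data as in the walkthrough

size=4
for i in 0 1 2; do
  end=$(( (i + 1) * size ))         # last line of piece i: 4, 8, 12
  head -n "$end" test | tail -n "$size" > "test$i"
done

# Print each piece on a single line for easy comparison.
paste -s -d ' ' test0 test1 test2
```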

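The last pass may have fewer than 4 lines left. If we simply ran

head -n 16 test|tail -n 4 > test3

test3 would contain 12 13 14 15, because head can only return the 15 lines that exist and tail then takes the last 4 of them. It should actually contain 13 14 15.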
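The overlap is easy to reproduce. A small sketch, again assuming seq and mktemp are available; the variables got and overlap exist only in this example:

```shell
cd "$(mktemp -d)" || exit 1
seq 1 15 > test

# head -n 16 can only deliver the 15 lines that exist ...
got=$(head -n 16 test | wc -l)
# ... so tail -n 4 reaches back to line 12, which test2 already holds.
overlap=$(head -n 16 test | tail -n 4 | paste -s -d ' ' -)

echo "lines delivered by head: $got"
echo "naive last piece: $overlap"
```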

So we handle the final, partial piece separately, as follows:

head -n 16 test|tail -n 4-16+15 > test3

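Here 4-16+15 stands for: argument 2 - the current running total (16) + the total length of test, which comes to 3. In the code this is written as: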

$(expr $2 - $curr_len + $total_len )

So test3 ends up containing:

13 14 15
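The corrected last step can be checked in isolation. This is a sketch under the same assumptions as the walkthrough (a 15-line file, 4 lines per piece, running total 16). The variable rest is a name introduced here, and shell arithmetic $(( )) is used in place of expr.

```shell
cd "$(mktemp -d)" || exit 1
seq 1 15 > test

size=4
curr_len=16                               # running total after three full pieces
total_len=$(wc -l < test)

# 4 - 16 + 15 = 3; whenever a partial piece exists this equals total_len mod size.
rest=$(( size - curr_len + total_len ))
head -n "$curr_len" test | tail -n "$rest" > test3

paste -s -d ' ' test3
```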