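Symptom: the hourly data in a Hive table was missing one hour every few days. The cause turned out to be a failure in the `cat` step that merges the 5-minute files into the hourly file.
I modified the script as follows, which solved the problem: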
##merge 5min data into hour data
cat $datapath/news_5min_$xhour* > $localpath/data/channelnews_$hour.txt

#####check
tmppath="${localpath}/data/channelnews_${hour}.txt"
i=0
# poll the merged file up to 10 times, 5 seconds apart
while (( i < 10 ))
do
    # size of the merged file in bytes
    m=`du -b "$tmppath" | awk '{print int($1)}'`
    if [ "${m:-0}" -lt 1024 ]; then
        echo "${tmppath} is small, is $m"
        sleep 5
    else
        break
    fi
    let "i++"
done
echo "i is:$i"
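The check above only waits for the merged file to grow; it does not run `cat` again. If the failure is transient (for example, the 5-minute files arriving late, which is an assumption and not something confirmed here), a variant that repeats the merge on each attempt may be more robust. The following is only a rough sketch of that idea, not the script that was actually deployed: the function name `merge_with_retry` and its parameters are hypothetical, and it writes to a temporary file first so a partial result never replaces the hourly file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script):
#   merge_with_retry OUT MIN_BYTES MAX_TRIES SLEEP_SECS FILE...
# Concatenates FILE... into OUT, repeating the merge until the result
# is at least MIN_BYTES bytes or MAX_TRIES attempts have been made.
merge_with_retry() {
    local out=$1 min_bytes=$2 max_tries=$3 sleep_secs=$4
    shift 4
    # Refuse to run without input files (cat would read stdin and hang).
    (( $# > 0 )) || return 2

    local tmp="${out}.tmp" size i
    for (( i = 1; i <= max_tries; i++ )); do
        # Re-run the merge on every attempt, into a temporary file.
        if cat -- "$@" > "$tmp" 2>/dev/null; then
            size=$(wc -c < "$tmp")
            if (( size >= min_bytes )); then
                # Rename only once the merged file is large enough.
                mv -- "$tmp" "$out"
                return 0
            fi
        fi
        # Wait before the next attempt, but not after the last one.
        if (( i < max_tries )); then
            sleep "$sleep_secs"
        fi
    done

    # All attempts failed: drop the partial file and report failure.
    rm -f -- "$tmp"
    return 1
}

# Possible call with the variables from the script above (left commented out):
# merge_with_retry "$localpath/data/channelnews_$hour.txt" 1024 10 5 \
#     "$datapath"/news_5min_"$xhour"* || echo "merge failed for $hour"
```

The 1024-byte threshold, 10 attempts, and 5-second sleep in the commented call mirror the values in the script above. Because the function returns a non-zero status when every attempt fails, the caller can alert or skip the Hive load instead of loading an empty hour.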