Text tools: cut, wc, sort, uniq, tr

Date: 2019-12-07

8.10 Shell special symbols and the cut command

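Special symbols

“*” matches zero or more characters
“?” matches exactly one character
“#” starts a comment
“\” escape character: the character that follows loses its special meaning
“|” pipe
“$” combined with “!” as “!$”, it stands for the last argument of the previous command
“;” separator, used to run two or more commands on one line
“~” the user's home directory (“/root” for root, “/home/username” for ordinary users)
“&” appended to a command line, it runs the command in the background (typically used for long-running commands)
“[]” a set of characters between the brackets; matches any single one of them.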
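
The wildcard, escape, and home-directory symbols above can be tried out safely in a scratch directory. The following is a minimal sketch; the file names and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)          # scratch directory, so the globs only see our files
cd "$demo_dir" || exit 1
touch a.txt b.txt ab.txt       # hypothetical sample files

set -- *.txt                   # "*" matches zero or more characters
star_count=$#                  # number of names the glob expanded to
one_char=$(echo ?.txt)         # "?" matches exactly one character
in_set=$(echo [ab].txt)        # "[ab]" matches a single a or a single b
literal=$(echo \*.txt)         # "\" escapes "*", so no expansion happens
home_dir=$(echo ~)             # "~" expands to the current user's home directory

echo "star: $star_count files" # this line has a "#" comment at the end
echo "one char: $one_char" ; echo "set: $in_set"   # ";" separates two commands
echo "literal: $literal"
echo "home: $home_dir"
```

Note that “!$” is a history expansion, so it is mainly useful at an interactive prompt rather than in a script.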

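The cut command

cut extracts selected parts of each line of a file (or of standard input) and writes them to standard output. The parts can be specific characters or delimiter-separated fields.
Note: cut does not display or concatenate whole files. That is the job of cat: it reads the files named as arguments in order and writes their contents to standard output, so cat f1 f2 > f3 merges the contents of f1 and f2 and, through the output redirection “>”, stores them in f3.

Syntax: cut -d 'delimiter' -f n [filename] or cut -c n [filename] (n is a positive integer, a comma-separated list such as 1,3, or a range such as 1-3)
-d: the field delimiter (a single character)
-f: which field(s) to select
-c: which character(s) to select; a single number selects that character, a range selects from one position to another (this option is not combined with -d and -f)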

[root@3 tmp]# cut -c1 1.txt |head -n2
r
b
[root@3 tmp]# cut -c1,3 1.txt |head -n2
ro
bn
[root@3 tmp]# cut -f1,3 -d ':' 1.txt |head -n2
root:0
bin:1
[root@3 tmp]# cut -f1-3 -d ':' 1.txt |head -n2
root:x:0
bin:x:1
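
The same options can be tried without a 1.txt on disk. Here is a small sketch that feeds two passwd-style lines to cut through a pipe; the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Two sample lines in /etc/passwd format.
sample=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin')

# -c with a range: characters 1 through 4 of every line.
first_four=$(printf '%s\n' "$sample" | cut -c 1-4)

# -d/-f with a list: the login name (field 1) and the shell (field 7).
name_and_shell=$(printf '%s\n' "$sample" | cut -d ':' -f 1,7)

# An open-ended range: field 6 through the end of the line.
from_sixth=$(printf '%s\n' "$sample" | cut -d ':' -f 6-)

printf '%s\n' "$first_four" "$name_and_shell" "$from_sixth"
```

Selected fields are printed joined by the same delimiter that was used to split them.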

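8.11 The sort, wc, and uniq commands

The sort command

sort is one of the most useful commands on Linux: it sorts the lines of its input and writes the result to standard output. It can read from named files or from stdin.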

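Syntax: sort [-t delimiter] [options] [filename]
Options:
-t: the field delimiter
-n: sort numerically (a key that does not start with a number is treated as 0)
-r: reverse the sort order
-u: unique; output only one line from each group of equal lines
-k n1,n2: the sort key runs from field n1 through field n2 (n1 ≤ n2); with -k n1 alone the key runs from field n1 to the end of the line, so use -k n1,n1 to sort on a single field
With no options, sort compares lines character by character from the first character onward and outputs them in ascending order. The ordering follows the current locale's collation rules; in the C locale it is plain ASCII order.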

[root@3 tmp]# head -n3 1.txt
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@3 tmp]# head -n3 1.txt |sort
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
root:x:0:0:root:/root:/bin/bash
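
The transcript above only shows the default ordering. A short sketch of -t, -k, -n, -r, and -u on made-up passwd-style data follows; the sample lines and variable names are assumptions chosen to show the difference between text and numeric ordering.

```shell
#!/usr/bin/env bash
# Sample data: name:x:uid:gid
passwd_sample=$(printf '%s\n' \
  'daemon:x:2:2' 'root:x:0:0' 'games:x:12:100' 'bin:x:1:1')

# Field 3 compared as text: "12" sorts before "2".
by_uid_text=$(printf '%s\n' "$passwd_sample" | sort -t ':' -k 3,3 | cut -d ':' -f 1)

# Field 3 compared as a number: 2 sorts before 12.
by_uid_num=$(printf '%s\n' "$passwd_sample" | sort -t ':' -n -k 3,3 | cut -d ':' -f 1)

# Same key, highest uid first.
by_uid_desc=$(printf '%s\n' "$passwd_sample" | sort -t ':' -n -r -k 3,3 | cut -d ':' -f 1)

# -u keeps one copy of each duplicated line.
unique_lines=$(printf '%s\n' b a b | sort -u)

printf '%s\n' "text: $by_uid_text" "numeric: $by_uid_num" "reverse: $by_uid_desc"
printf '%s\n' "$unique_lines"
```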

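The wc command

wc counts things: with it we can get the number of bytes, characters, words, or lines in a file.

Syntax: wc [options] [filename]
Options:
-l: lines; count lines
-m: count characters (-c counts bytes instead)
-w: words; count words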

[root@3 tmp]# wc -l !$
wc -l 2.txt
2 2.txt
[root@3 tmp]# wc -m 2.txt
10 2.txt
[root@3 tmp]# cat !$
cat 2.txt
1234
qwer
[root@3 tmp]# cat -A 2.txt
1234$
qwer$
[root@3 tmp]# wc -w 2.txt
4 2.txt
[root@3 tmp]# cat 2.txt
1234 456 789,10
qwer

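Note: wc -m counts every character in the file, including the invisible newline at the end of each line (shown as “$” by cat -A); wc -w counts words separated by whitespace.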
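
To check those counts by hand, the sketch below recreates the second version of 2.txt in a temporary file (the path comes from mktemp, not from the original session) and reads it through standard input so that wc prints only the number.

```shell
#!/usr/bin/env bash
tmp_file=$(mktemp)
printf '1234 456 789,10\nqwer\n' > "$tmp_file"

# Reading from stdin makes wc omit the file name; tr -d ' ' strips
# the padding that some wc implementations put before the number.
lines=$(wc -l < "$tmp_file" | tr -d ' ')
words=$(wc -w < "$tmp_file" | tr -d ' ')
chars=$(wc -m < "$tmp_file" | tr -d ' ')

# 15 visible characters + newline on line 1, 4 + newline on line 2.
echo "lines=$lines words=$words chars=$chars"
rm -f "$tmp_file"
```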

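The uniq command (unique)

uniq reports or omits repeated lines in a file. It is usually combined with sort to remove duplicates.

Syntax: uniq [options] [filename]
Options:
-c: count; prefix each line with the number of times it occurs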

[root@3 tmp]# cat !$
cat 2.txt
1234
456 789,10
1234
qwer
456
[root@3 tmp]# uniq -c 2.txt
      1 1234
      1 456 789,10
      1 1234
      1 qwer
      1 456
[root@3 tmp]# sort 2.txt |uniq -c
      2 1234
      1 456
      1 456 789,10
      1 qwer

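Note: uniq only compares adjacent lines, so running it directly on 2.txt changes nothing. After sorting, identical lines sit next to each other and are merged. In other words, sort a file before de-duplicating it!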
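
The adjacency rule is easy to confirm with the same five lines. The sketch below also adds a second sort to rank lines by frequency and uses uniq -d to list only the repeated lines; the variable names are illustrative.

```shell
#!/usr/bin/env bash
input=$(printf '%s\n' 1234 '456 789,10' 1234 qwer 456)

# No two equal lines are adjacent, so uniq alone removes nothing.
unsorted_count=$(printf '%s\n' "$input" | uniq | wc -l | tr -d ' ')

# After sorting, the two "1234" lines are adjacent and collapse into one.
sorted_count=$(printf '%s\n' "$input" | sort | uniq | wc -l | tr -d ' ')

# Count occurrences, then sort the counts numerically, highest first.
most_common=$(printf '%s\n' "$input" | sort | uniq -c | sort -rn | head -n 1)

# -d prints only the lines that occur more than once.
duplicates=$(printf '%s\n' "$input" | sort | uniq -d)

echo "without sort: $unsorted_count lines, with sort: $sorted_count lines"
echo "most common: $most_common"
echo "duplicated: $duplicates"
```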

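8.12 The tee, tr, and split commands

The tee command

tee reads standard input and writes it both to a file and to standard output. By default it overwrites the file's existing contents; unlike “>”, it also shows what is being written.

Syntax: tee [options] [filename]
Options:
-a: append to the file instead of overwriting it (like “>>”)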

[root@3 tmp]# cat 3.txt
00000000000
[root@3 tmp]# sort 2.txt |uniq -c |tee 3.txt
      2 1234
      1 456
      1 456 789,10
      1 qwer
[root@3 tmp]# cat 3.txt
      2 1234
      1 456
      1 456 789,10
      1 qwer
[root@3 tmp]# sort 2.txt |uniq -c |tee -a 3.txt
      2 1234
      1 456
      1 456 789,10
      1 qwer
[root@3 tmp]# cat 3.txt
      2 1234
      1 456
      1 456 789,10
      1 qwer
      2 1234
      1 456
      1 456 789,10
      1 qwer
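
The overwrite and append behaviour can be reproduced in a few lines. This is a sketch that uses a temporary file from mktemp in place of 3.txt.

```shell
#!/usr/bin/env bash
log_file=$(mktemp)
echo 'old content' > "$log_file"

# Without -a, tee replaces the old content and still passes the text through.
shown=$(echo 'first run' | tee "$log_file")

# With -a, the new text is added after what is already in the file.
echo 'second run' | tee -a "$log_file" > /dev/null

saved=$(cat "$log_file")
echo "passed through: $shown"
echo "file now contains:"
echo "$saved"
rm -f "$log_file"
```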

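The tr command

tr replaces, squeezes, or deletes characters read from standard input. It can turn one character into another, or one set of characters into another set.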

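Syntax: tr [source characters] [target characters]
Characters are mapped by position: the first source character becomes the first target character, and so on. The sets do not need square brackets around them; tr treats brackets as ordinary characters.

[root@3 tmp]# echo "adailinux" |tr 'a' 'A'    # replace a single character
AdAilinux
[root@3 tmp]# echo "adailinux" |tr 'al' 'AL'    # replace several characters
AdAiLinux
[root@3 tmp]# echo "adailinux" |tr 'a-z' 'A-Z'
ADAILINUX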
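
The examples above cover replacement only. Deleting and squeezing use the -d and -s options, sketched below together with the POSIX character classes, which avoid locale surprises with ranges; the sample strings are made up.

```shell
#!/usr/bin/env bash
# Replace: map every lowercase letter to its uppercase counterpart.
upper=$(echo 'adailinux' | tr '[:lower:]' '[:upper:]')

# Delete: -d removes every character that is in the set.
no_digits=$(echo 'a1b2c3' | tr -d '[:digit:]')

# Squeeze: -s collapses runs of the same character into one.
squeezed=$(echo 'a    b   c' | tr -s ' ')

printf '%s\n' "$upper" "$no_digits" "$squeezed"
```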

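The split command

split breaks a large file into many smaller files. This is useful when a file has to be handled in smaller pieces, for example to make a large log easier to read.

Syntax: split [options] [filename] [prefix]
-b: the size of each output file; the default unit is bytes, and a unit suffix can be given, as in split -b 100M filename
-l: the number of lines in each output file
eg1: split by size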

[root@3 tmp]# split -b 100 1.txt
[root@3 tmp]# ls
xaa
xab
xac
xad
[root@3 tmp]# rm -rf x*
[root@3 tmp]# split -b 100 1.txt adai.
A prefix for the output file names can be given as the last argument!
[root@3 tmp]# ls
adai.aa
adai.ab
adai.ac
adai.ad

eg2: split by line count

[root@3 tmp]# wc -l 1.txt
20 1.txt
[root@3 tmp]# split -l 5 1.txt
[root@3 tmp]# ls
xaa
xab
xac
xad
[root@3 tmp]# wc -l x*
  5 xaa
  5 xab
  5 xac
  5 xad
 20 total
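
Because the pieces are named in alphabetical order, they can be put back together with cat. Below is a minimal sketch in a scratch directory; numbers.txt, rebuilt.txt, and the part. prefix are names invented for this example.

```shell
#!/usr/bin/env bash
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

seq 1 20 > numbers.txt          # a 20-line sample file
split -l 5 numbers.txt part.    # produces part.aa, part.ab, part.ac, part.ad

set -- part.*                   # the glob lists the pieces in order
piece_count=$#

cat part.* > rebuilt.txt        # concatenate the pieces in the same order

# cmp -s is silent and succeeds only if the two files are byte-for-byte equal.
if cmp -s numbers.txt rebuilt.txt; then
  round_trip='identical'
else
  round_trip='different'
fi
echo "$piece_count pieces, rebuilt file is $round_trip"
```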

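8.13 Shell special symbols (part 2)

Command connectors: “||”, “&&”, “;”

command1 ; command2 : command2 runs whether or not command1 succeeds
command1 && command2 : command2 runs only if command1 succeeds
command1 || command2 : command2 runs only if command1 fails; if command1 succeeds, command2 is skipped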
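
The three connectors can be compared side by side using true (always succeeds) and false (always fails) as stand-ins for real commands. In this sketch each right-hand side appends one letter to a log variable, so the final value shows which ones ran; the variable name and letters are arbitrary.

```shell
#!/usr/bin/env bash
log=''

true  && log="${log}A"              # runs: the left side succeeded
false && log="${log}B"              # skipped: the left side failed
false || log="${log}C"              # runs: the left side failed
true  || log="${log}D"              # skipped: the left side succeeded
log="${log}$(false ; echo 'E')"     # ";" runs echo even though false failed

echo "$log"
```

A common use of these is a pattern like mkdir -p dir && cd dir, where the second step only makes sense if the first one worked.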