day03: Shell special symbols and common text tools: cut, sort, wc, uniq, tr

Date: 2019-11-26


This section introduces the text-processing tools cut, sort, wc, uniq, tee, tr, and split.

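1. Shell special symbols. The commonly used ones are:

*: matches zero or more arbitrary characters.

?: matches exactly one arbitrary character.

#: starts a comment (used constantly in shell scripts).

\: the escape character (the special character after it loses its special meaning).

$: prefixes a variable name to read its value; combined with "!", "!$" stands for the last argument of the previous command.

;: separates two or more commands written on one line.

~: the user's home directory (root's home directory is /root; a regular user's is usually /home/username).

&: placed at the end of a command, runs that command in the background.

[ ]: brackets enclose a set or range of characters and match any single character from it.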

[root@localhost ~]# ls *.txt               # "*" matches any number of arbitrary characters
11.txt  222.txt  2.txt  3.txt
[root@localhost ~]# ls ?.txt               # "?" matches exactly one arbitrary character
2.txt  3.txt
[root@localhost ~]# c='$a$b'               # single quotes also escape special characters
[root@localhost ~]# echo $c
$a$b
[root@localhost ~]# c=\$a\$b               # a backslash escapes too, with the same effect
[root@localhost ~]# echo $c
$a$b
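
The wildcard and quoting rules above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original session: the directory comes from mktemp, and the variable names (star_matches, single_quoted, and so on) are made up for illustration.

```shell
# Work in a scratch directory so the patterns only see files created here.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 2.txt 3.txt 11.txt 222.txt

star_matches=$(echo *.txt)          # "*" matches any number of characters
question_matches=$(echo ?.txt)      # "?" matches exactly one character
bracket_matches=$(echo [23].txt)    # "[ ]" matches one character from the set

a=1
single_quoted='$a'    # single quotes keep "$" literal
backslashed=\$a       # a backslash escapes only the next character
double_quoted="$a"    # double quotes still expand the variable

echo "$star_matches"        # all four files; their order depends on the locale
echo "$question_matches"    # 2.txt 3.txt
echo "$bracket_matches"     # 2.txt 3.txt
echo "$single_quoted $backslashed $double_quoted"    # $a $a 1
```

Single quotes and the backslash both stop the shell from expanding "$a"; double quotes only protect against word splitting and wildcard expansion, so the variable is still replaced by its value.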

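2. The cut command: extracts part of each line.

Format: cut -d 'delimiter' [-c|-f] n    # n is a positive integer

-d: specifies the field delimiter, a single character that should be quoted (used together with -f).

-f: followed by the number of the field to extract.

-c: followed by the position of the character to extract.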

[root@localhost ~]# cat /etc/passwd|cut -d ':' -f1|head -n2      # -f is normally used together with -d
root
bin

[root@localhost ~]# cat /etc/passwd|cut -c1|head -n2             # -c selects by character position
r
b
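
Beyond a single field or character, -f and -c also accept lists and ranges. The sketch below runs cut on two passwd-style lines held in a variable, so it does not depend on the local /etc/passwd; the variable names are illustrative only.

```shell
# Two passwd-style lines used as sample input.
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# Field 1 only: the user names.
usernames=$(printf '%s\n' "$sample" | cut -d ':' -f1)

# Fields 1 and 7: user name and login shell, still joined by the delimiter.
name_and_shell=$(printf '%s\n' "$sample" | cut -d ':' -f1,7)

# Characters 1 through 3 of every line.
first_three=$(printf '%s\n' "$sample" | cut -c1-3)

echo "$usernames"         # root, then bin
echo "$name_and_shell"    # root:/bin/bash, then bin:/sbin/nologin
echo "$first_three"       # roo, then bin
```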

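3. The sort command: sorts lines. Options covered: -n -r -u -k -t

Format: sort [-t 'delimiter'] [options] [filename]

options:

-t: specifies the field delimiter; usually combined with -k.

-n: sorts by numeric value (lines that do not start with a number count as 0, so they sort first).

-r: reverses the sort order.

-u: removes duplicate lines from the output.

-kn1,n2: sorts on the key that runs from field n1 through field n2; -kn1 alone sorts on the key that runs from field n1 to the end of the line (write -kn1,n1 to sort on field n1 only).

With no options, sort compares lines character by character, starting from the first character, and outputs them in ascending order. In the C locale the comparison is by ASCII value; in other locales it follows the locale's collation rules.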

[root@localhost ~]# cat 2.txt |sort            # default sort, character by character
^*
，
#$
11111
2222
22222

-n and -nr: numeric sort and reverse numeric sort:

[root@localhost ~]# cat 2.txt |sort -n           # numeric sort; lines of special characters count as 0
*
#$
2222
11111
22222
55555
[root@localhost ~]# cat 2.txt |sort -nr         # also a numeric sort, but shown in reverse order
55555
22222
11111
2222
#$
*
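
The sessions above do not exercise -t, -k, or -u, so here is a small sketch of them on made-up colon-separated data. LC_ALL=C is set on the plain sort only to make the character-by-character order predictable; all variable names are illustrative.

```shell
# Made-up sample: two colon-separated numeric fields per line.
data='11111:222
2222:111
33333:50'

# Plain sort compares text, so "10" sorts before "2" and "9".
text_order=$(printf '10\n9\n2\n' | LC_ALL=C sort)

# -n compares numeric values instead.
numeric_order=$(printf '10\n9\n2\n' | sort -n)

# -t sets the delimiter, -k2,2 limits the key to field 2, n makes it numeric.
by_second_field=$(printf '%s\n' "$data" | sort -t ':' -k2,2n)

# Adding r to the key reverses the order.
by_second_field_desc=$(printf '%s\n' "$data" | sort -t ':' -k2,2nr)

# -u drops duplicate lines while sorting.
unique_sorted=$(printf '22222\n55555\n22222\n' | sort -u)

echo "$text_order"              # 10, 2, 9
echo "$numeric_order"           # 2, 9, 10
echo "$by_second_field"         # 33333:50, 2222:111, 11111:222
echo "$by_second_field_desc"    # 11111:222, 2222:111, 33333:50
echo "$unique_sorted"           # 22222, 55555
```

Writing the key as -k2,2 rather than -k2 keeps the comparison inside field 2; with -k2 the key would extend to the end of the line.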

4. The wc command: counts the lines, words, and characters in a file.

wc [ options ] filename

options:

-l: counts lines.

-m: counts characters.

-w: counts words.

[root@localhost ~]# cat 3.txt          # the file content used for this experiment
123
abc
[root@localhost ~]# wc -l 3.txt        # count the lines in the file
2 3.txt
[root@localhost ~]# wc -w 3.txt        # count the words in the file
2 3.txt

[root@localhost ~]# wc -m 3.txt           # count the characters: six are visible, yet wc reports 8
8 3.txt
[root@localhost ~]# cat -A 3.txt          # every line ends with a newline, which cat -A shows as '$'
123$
abc$

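As shown above, wc -m reports eight characters even though only six are visible, because each of the two lines ends with a newline character, and wc counts it too.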
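
The effect of the trailing newline can be checked directly. This is a minimal sketch that feeds wc through a pipe, which leaves the file name out of the output; the tr -d ' ' step is an added precaution because some wc implementations pad the number with spaces. Variable names are illustrative.

```shell
# Same content as 3.txt above: two lines, each ending with a newline.
lines=$(printf '123\nabc\n' | wc -l | tr -d ' ')
words=$(printf '123\nabc\n' | wc -w | tr -d ' ')
chars=$(printf '123\nabc\n' | wc -m | tr -d ' ')

# Without the final newline there is one character less,
# and wc -l sees only one complete line.
chars_no_final_newline=$(printf '123\nabc' | wc -m | tr -d ' ')
lines_no_final_newline=$(printf '123\nabc' | wc -l | tr -d ' ')

echo "$lines $words $chars"                                # 2 2 8
echo "$lines_no_final_newline $chars_no_final_newline"     # 1 7
```

wc -l counts newline characters rather than visible lines, which is why a last line without a newline is not counted.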

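5. The uniq command: removes duplicate lines and can count how often each line occurs.

Note: uniq only merges duplicate lines that are adjacent, so the input has to be sorted first; it is normally combined with sort.

uniq [ options ] filename

options:

-c: (count) prefixes each line with the number of times it occurred.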

[root@localhost ~]# uniq -c 3.txt           # uniq on its own: not all duplicates are merged
      2 123
      1 abc
      1 def
      1 456
      1 123
      1 456
[root@localhost ~]# sort -n 3.txt |uniq -c     # sort first, then uniq: now it works
      1 abc
      1 def
      3 123
      2 456

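As shown above, the first uniq run still leaves duplicates in the output because the equal lines are not adjacent; sorting with sort first fixes this, as the second command shows.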
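
The same behaviour can be reproduced without a file. The sketch below uses the seven lines implied by the output above; the awk step is an addition that only strips the padding uniq -c puts in front of the counts, since its width differs between implementations. Variable names are illustrative.

```shell
# Seven lines: 123 three times, 456 twice, abc and def once each.
data='123
123
abc
def
456
123
456'

# uniq alone merges only neighbouring duplicates, so six lines remain.
adjacent_only=$(printf '%s\n' "$data" | uniq)

# After sorting, equal lines are neighbours and collapse to four lines.
deduped=$(printf '%s\n' "$data" | sort | uniq)

# -c adds the counts; awk reprints them without the leading padding.
counts=$(printf '%s\n' "$data" | sort | uniq -c | awk '{print $1, $2}')

echo "$adjacent_only"
echo "$deduped"    # 123, 456, abc, def
echo "$counts"     # 3 123, 2 456, 1 abc, 1 def
```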

6. tee: redirects output to a file and, at the same time, still displays it on the screen.

[root@localhost ~]# sort -n 2.txt |uniq -c |tee a.txt     # show the result and also write it to a.txt
      1 2222:111
      1 11111:222
      2 22222
      1 33333
      1 44444
      1 55555
[root@localhost ~]# cat 2.txt                       # view the source file
11111:222
2222:111
33333
44444
22222
22222
55555
[root@localhost ~]#

-a: appends to the file instead of overwriting it:

[root@localhost ~]# sort -n 2.txt |uniq -c |tee -a a.txt     # append the same output again
      1 2222:111
      1 11111:222
      2 22222
      1 33333
      1 44444
      1 55555
[root@localhost ~]# cat a.txt                         # view the file: the content now appears twice
      1 2222:111
      1 11111:222
      2 22222
      1 33333
      1 44444
      1 55555
      1 2222:111
      1 11111:222
      2 22222
      1 33333
      1 44444
      1 55555
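
The difference between tee and tee -a comes down to overwrite versus append. A minimal sketch, assuming a scratch directory from mktemp and illustrative variable names:

```shell
# Scratch location for the output file.
tee_dir=$(mktemp -d)
out="$tee_dir/a.txt"

# tee copies its input both to standard output (captured here) and to the file.
shown=$(printf '22222\n55555\n' | tee "$out")

# -a appends; the copy on standard output is discarded this time.
printf '33333\n' | tee -a "$out" > /dev/null
saved_after_append=$(cat "$out")

# Without -a the file is overwritten from scratch.
printf '44444\n' | tee "$out" > /dev/null
saved_after_overwrite=$(cat "$out")

echo "$shown"                    # 22222, 55555
echo "$saved_after_append"       # 22222, 55555, 33333
echo "$saved_after_overwrite"    # 44444
```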

7. tr: replaces characters.

[root@localhost ~]# cat 1.txt |head -n2|tr '[a-z]' '[A-Z]'    # replace every lowercase letter with its uppercase form
ROOT:X:0:0:ROOT:/ROOT:/BIN/BASH
BIN:X:1:1:BIN:/BIN:/SBIN/NOLOGIN
[root@localhost ~]# cat 1.txt |head -n2|tr 'r' 'R'          # replace a single character
Root:x:0:0:Root:/Root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
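
tr can also delete characters (-d) and squeeze runs of a repeated character into one (-s), neither of which is shown above. A short sketch with illustrative variable names; the ranges are written as 'a-z' and 'A-Z' here, because with the bracketed form the brackets are simply treated as two more characters that map to themselves.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Map each lowercase letter to the uppercase letter at the same position.
upper=$(printf '%s\n' "$line" | tr 'a-z' 'A-Z')

# Replace one character with another.
spaced=$(printf '%s\n' "$line" | tr ':' ' ')

# -d deletes every occurrence of the listed characters.
no_colons=$(printf '%s\n' "$line" | tr -d ':')

# -s squeezes each run of the listed character down to a single one.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

echo "$upper"        # ROOT:X:0:0:ROOT:/ROOT:/BIN/BASH
echo "$spaced"       # root x 0 0 root /root /bin/bash
echo "$no_colons"    # rootx00root/root/bin/bash
echo "$squeezed"     # a b c
```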

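8. split: splits a file into pieces. Options: -b -l

split -b SIZE filename    # splits by size; the default unit is bytes, and a unit suffix such as K or M can be given

split -l N filename    # splits every N lines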

[root@localhost test]# split -b 100 a.txt            # split by size, 100 bytes per piece
[root@localhost test]# ls
1.txt  a.txt  xaa  xab  xac
[root@localhost test]# split -b 100 a.txt tt        # a custom prefix for the piece names can be given
[root@localhost test]# ls
1.txt  a.txt  ttaa  ttab  ttac
[root@localhost test]# split -l 7 a.txt             # split by number of lines
[root@localhost test]# ls
1.txt  a.txt  xaa  xab  xac  xad
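
How many pieces split produces, and that they can be put back together with cat, is easy to verify on a tiny file. The sketch below is an illustration under assumptions: the prefixes part_ and byte_ and the variable names are invented, and everything happens in a scratch directory.

```shell
# Scratch directory with a seven-line, 14-byte file.
split_dir=$(mktemp -d)
cd "$split_dir" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 > a.txt

# Three lines per piece: 3 + 3 + 1 lines.
split -l 3 a.txt part_
line_pieces=$(echo part_*)
last_line_piece=$(cat part_ac)

# Four bytes per piece: 4 + 4 + 4 + 2 bytes.
split -b 4 a.txt byte_
byte_pieces=$(echo byte_*)

# Concatenating the pieces in name order restores the original content.
rejoined=$(cat part_*)
original=$(cat a.txt)

echo "$line_pieces"        # part_aa part_ab part_ac
echo "$last_line_piece"    # 7
echo "$byte_pieces"        # byte_aa byte_ab byte_ac byte_ad
```

The suffixes aa, ab, ac and so on sort in creation order, which is what makes cat part_* a reliable way to rejoin the pieces.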

9. Special characters: ";" "||" "&&"

command1 ; command2: command2 runs whether or not command1 succeeds.

command1 && command2: command2 runs only if command1 succeeds.

command1 || command2: command2 runs only if command1 fails; if command1 succeeds, command2 is skipped.

[root@localhost test]# touch 1.txt;ls 1.txt         # both commands run
1.txt
[root@localhost test]# ls
1.txt
[root@localhost test]# ls 1.txt && mkdir dir    # the second command runs only because the first succeeded
1.txt
[root@localhost test]# ls
1.txt  dir
[root@localhost test]# ls a.txt && mkdir dir2   # the first command fails, so the second does not run
ls: cannot access a.txt: No such file or directory
[root@localhost test]# ls
1.txt  dir
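
The three separators can be compared side by side by using the built-in commands true (always succeeds) and false (always fails) in place of ls. This is a minimal sketch; the variable names are illustrative, and set +e is included only to make sure a failing command cannot abort the script.

```shell
set +e    # errexit off (the default), so a failing command does not stop the script

# && runs the right-hand side only after success.
true  && and_after_success=ran
false && and_after_failure=ran

# || runs the right-hand side only after failure.
true  || or_after_success=ran
false || or_after_failure=ran

# ; runs the next command regardless of the previous result.
false ; after_semicolon=ran

echo "&& after success: ${and_after_success:-skipped}"    # ran
echo "&& after failure: ${and_after_failure:-skipped}"    # skipped
echo "|| after success: ${or_after_success:-skipped}"     # skipped
echo "|| after failure: ${or_after_failure:-skipped}"     # ran
echo ";  after failure: ${after_semicolon:-skipped}"      # ran
```

A common use of this is a fallback such as "cd somedir || exit 1", which stops a script as soon as a required step fails.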