Linux Regular Expressions

Date: 2019-11-07


1. What is a regular expression?

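Simply put, a regular expression is a set of rules and methods defined for processing large numbers of strings. For example, suppose “@” stands for nishishei and “!” stands for linzhongniao; then “@!” stands for “nishisheilinzhongniao”.
With the help of these special symbols, a system administrator can quickly filter, replace, or output the strings they need. Linux regular expressions are normally applied line by line. See man grep for more detail.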

2. Why learn regular expressions?

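In day-to-day Linux operations work we constantly deal with large amounts of text: configuration files, programs, command output, log files and so on. We often urgently need to find specific strings in all of that text, and this is what regular expressions are for. For example: printing only the IP address from the output of ifconfig, or extracting only the IPs from an access.log file.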

3. Basic regular expressions, part 1

3.1 Sample data

[root@linzhongniao ~]# cat linzhongniao.log 
I am linzhongniao!
I like linux.
(blank line)
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
(blank line)
my god,i am not linzhongniao,But to the birds of the forest!!!

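3.2 The “^” (caret) anchor

“^” matches lines that begin with the given characters. In the vi/vim editors, “^” likewise stands for the start of a line.

Example: filter the lines that begin with the letter m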

[root@linzhongniao ~]# grep "^m" linzhongniao.log 
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
my god,i am not linzhongniao,But to the birds of the forest!!!

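3.3 The “$” anchor

“$” matches lines that end with the given characters. In the vi/vim editors, “$” likewise stands for the end of a line.

Example: filter the lines that end in 8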

[root@linzhongniao ~]# grep "8$" linzhongniao.log 
my qq num is 1200098

3.4 The “^$” combination

“^$” matches an empty line: the start of the line immediately followed by its end.
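As a minimal sketch of how “^$” is typically used, the snippet below counts the empty lines in a file and then prints the file without them. The temporary file and its contents are assumptions made up for illustration, loosely modelled on the sample data in section 3.1.

```shell
# Hypothetical sample file (not the article's real linzhongniao.log).
sample=$(mktemp)
printf '%s\n' 'I am linzhongniao!' 'I like linux.' '' 'my qq num is 1200098' '' > "$sample"

# Count the empty lines: start of line immediately followed by end of line.
empty_count=$(grep -c '^$' "$sample")

# Print every line except the empty ones (-v inverts the match).
non_empty=$(grep -v '^$' "$sample")

echo "empty lines: $empty_count"
echo "$non_empty"
rm -f "$sample"
```

Note that a line containing only spaces is not empty in this sense, so “^$” does not match it.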

4. Basic regular expressions, part 2

4.1 The “.” (dot)

“.” stands for exactly one arbitrary character.

Example:

[root@linzhongniao ~]# grep "." linzhongniao.log
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
my god,i am not linzhongniao,But to the birds of the forest!!!

Match text that starts with linzhongni and ends with o, with any number of characters in between.

[root@linzhongniao ~]# grep "linzhongni.*o" linzhongniao.log 
I am linzhongnieo!
my god,i am not linzhongniao,But to the birds of the FOREST!!!!

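4.2 The “\” (backslash)

The backslash is the escape character. For example, “\.” stands for a literal dot: it strips a character of its special meaning and restores its literal form. Likewise, \$ stands for a literal $ sign.

To match only the lines that end in a dot, the dot has to be escaped: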

[root@linzhongniao ~]# grep "\.$" linzhongniao.log 
I like linux.
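To make the effect of the escape visible, here is a small sketch comparing the unescaped and escaped dot on two hard-coded lines (the input is an assumption chosen for illustration).

```shell
# Hypothetical two-line input: one line ends in a dot, the other in a digit.
input=$(printf '%s\n' 'I like linux.' 'my qq num is 1200098')

# Unescaped: "." matches any character, so both lines end in "some character".
unescaped=$(printf '%s\n' "$input" | grep -c '.$')

# Escaped: "\." matches only a literal dot at the end of the line.
escaped=$(printf '%s\n' "$input" | grep '\.$')

echo "unescaped matches: $unescaped"
echo "escaped match: $escaped"
```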

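4.3 The “*” (asterisk)

Repeats the preceding character zero or more times. For example, o* matches zero or more occurrences of the letter o. Because zero occurrences also count, a pattern such as n* matches every line, including empty ones.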

[root@linzhongniao ~]# grep "linzhongniao*" linzhongniao.log
my god,i am not linzhongniao,But to the birds of the forest!!!
[root@linzhongniao ~]# grep "n*" linzhongniao.log
I am linzhongnieo!
I like linux.

I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098

my god,i am not linzhongniao,But to the birds of the forest!!!
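The following sketch shows why "n*" prints every line while "nn*" (one n followed by zero or more n) only prints lines that really contain an n. The four input lines are assumptions made up for the comparison.

```shell
# Hypothetical input: two lines contain an n, one is empty, one has no n.
input=$(printf '%s\n' 'I like linux.' '' 'my qq num is 1200098' 'abc')

# "n*" also matches zero n's, so every line matches, even the empty one.
star_count=$(printf '%s\n' "$input" | grep -c 'n*')

# "nn*" requires at least one n.
at_least_one=$(printf '%s\n' "$input" | grep -c 'nn*')

echo "n* matches $star_count lines, nn* matches $at_least_one lines"
```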

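4.4 The “.*” combination

“.*” matches any number of arbitrary characters (including none). By extension, “^.*” matches any run of characters from the start of a line, and “.*$” matches any run of characters up to the end of a line.

Examples (the outputs show that extra lines such as goodi and gd had been added to linzhongniao.log by this point):

Match goo followed by any number of characters (the pattern is not anchored, so goo may appear anywhere in the line)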

[root@linzhongniao ~]# grep "goo.*" linzhongniao.log 
goodi
very good 
goood
good

Match lines that end in the letter d

[root@linzhongniao ~]# grep ".*d$" linzhongniao.log  
gd
goood
glad
good

Match lines that end in the digit 2

[root@linzhongniao ~]# grep ".*2$" linzhongniao.log  
my blog id https://blog.51cto.com/10642812

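Match lines that end in an exclamation mark. Note the backslash: “!” is not a regex metacharacter, but inside double quotes an interactive bash would otherwise try to apply history expansion to it.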

[root@linzhongniao ~]#  grep ".*\!$" linzhongniao.log
I am linzhongnieo!
I like badminton ball,billiard ball and chinese chess !
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
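A simpler way to avoid the history-expansion problem is to put the pattern in single quotes. The sketch below, on made-up input lines, also shows that the leading ".*" is optional when the pattern is only anchored at the end.

```shell
# Hypothetical input: two lines end in "!", one does not.
input=$(printf '%s\n' 'I am linzhongnieo!' 'I like linux.' 'chinese chess !')

# Single quotes keep the shell from interpreting "!", so no backslash is needed.
bang_lines=$(printf '%s\n' "$input" | grep '!$')

# ".*" in front of the anchored part selects the same lines.
with_prefix=$(printf '%s\n' "$input" | grep '.*!$')

echo "$bang_lines"
```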

5. Basic regular expressions, part 3

5.1 The [abc] bracket expression

Matches any single character in the set, for example [a-zA-Z], [0-9], [A-Z].

[root@linzhongniao ~]# grep "[A-Z]" linzhongniao.log
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
LINZHONGNIAO
[root@linzhongniao ~]# grep "[a-z]" linzhongniao.log
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
not 1200000098
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
goodi
good
gd
goood
glad
[root@linzhongniao ~]# grep -i "[A-Z]" linzhongniao.log 
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
not 1200000098
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
goodi
good
gd
goood
glad
LINZHONGNIAO

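5.2 The [^abc] negated bracket expression

Inside brackets, the caret “^” means negation: [^abc] matches any single character that is not one of those listed after the ^. This differs from ^ outside brackets, where it anchors the match to the start of a line.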
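The article gives no transcript for the negated form, so here is a minimal sketch on made-up input. It contrasts ^ inside the brackets (negation) with ^ outside them (start of line).

```shell
# Hypothetical input lines.
input=$(printf '%s\n' 'abc' '12345' 'abc123' 'LINZHONGNIAO')

# Lines containing at least one character that is NOT a digit.
has_non_digit=$(printf '%s\n' "$input" | grep '[^0-9]')

# Inverting that match leaves the lines made up of digits only.
digits_only=$(printf '%s\n' "$input" | grep -v '[^0-9]')

# Outer ^ anchors to the line start, inner ^ negates: first character is not "a".
not_starting_a=$(printf '%s\n' "$input" | grep '^[^a]')

echo "$has_non_digit"
echo "$digits_only"
echo "$not_starting_a"
```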

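5.3 The `a\{n,m\}` interval

Repeats the preceding character n to m times (here, the letter a, n to m times). With egrep (grep -E) or sed -r, which understand extended regular expressions, the backslashes can be dropped.

5.4 The `a\{n,\}` interval

Repeats the preceding character at least n times (here, a at least n times). With egrep (grep -E) or sed -r the backslashes can be dropped.

5.5 The `a\{n\}` interval

Repeats the preceding character exactly n times. With egrep (grep -E) or sed -r the backslashes can be dropped.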

[root@linzhongniao ~]# egrep "0{3}" linzhongniao.log
my qq num is 1200098
not 1200000098

5.6 The `a\{,m\}` interval

Repeats the preceding character at most m times. With egrep (grep -E) or sed -r the backslashes can be dropped.
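To tie the interval forms together, the sketch below runs the same "three zeros" search in basic and extended syntax and then an "at least five zeros" search. The three input lines are taken from outputs shown in this article, but the variable names are made up for illustration.

```shell
# Input lines borrowed from the transcripts above.
input=$(printf '%s\n' 'my qq num is 1200098' 'not 1200000098' 'id 10642812')

# Basic syntax: the braces need backslashes.
bre=$(printf '%s\n' "$input" | grep '0\{3\}')

# Extended syntax (grep -E): no backslashes.
ere=$(printf '%s\n' "$input" | grep -E '0{3}')

# At least five consecutive zeros.
five_plus=$(printf '%s\n' "$input" | grep -E '0{5,}')

echo "$bre"
echo "$five_plus"
```

Because the pattern is not anchored, "0{3}" also selects lines with more than three consecutive zeros; they simply contain a run of three.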

6. Extended regular expressions

grep -E and egrep

(For reference only.)

(1) “+”: repeats the preceding character one or more times (* is zero or more).

[root@linzhongniao ~]# egrep "g+d" linzhongniao.log 
gd
[root@linzhongniao ~]# egrep "go+d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good

(2) “*”: repeats the preceding character zero or more times

[root@linzhongniao ~]# egrep "go*d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good
gd

(3) “?”: repeats the preceding character zero or one time (“.” is exactly one arbitrary character)

[root@linzhongniao ~]# egrep "go?d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
gd
[root@linzhongniao ~]# egrep "go.d" linzhongniao.log
good

(4) “|” (pipe)

Alternation: filters for any of several strings at once.

[root@linzhongniao ~]# egrep "3306|1521" /etc/services 
mysql           3306/tcp                # MySQL
mysql           3306/udp                # MySQL
ncube-lm        1521/tcp                # nCube License Manager
ncube-lm        1521/udp                # nCube License Manager
[root@linzhongniao ~]# egrep "god|good" linzhongniao.log 
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good

(5) “(|)”: grouping with parentheses (also used for back-references).

[root@linzhongniao ~]# egrep "g(la|oo)d" linzhongniao.log   
good
glad
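The four quantifier and grouping forms can be compared side by side on one fixed word list. This is only a sketch: the word list is made up, and each pattern is anchored with ^ and $ so that it must match the whole word.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' gd god good goood glad)

# Helper: run an extended regex over the list and join the hits with spaces.
match() { printf '%s\n' "$words" | grep -E "$1" | paste -s -d ' ' -; }

plus=$(match '^go+d$')          # one or more o
star=$(match '^go*d$')          # zero or more o
question=$(match '^go?d$')      # zero or one o
group=$(match '^g(la|oo)d$')    # either "la" or "oo" between g and d

echo "+ : $plus"
echo "* : $star"
echo "? : $question"
echo "(|): $group"
```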

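7. Metacharacters

The metacharacter escape shown here is a Perl-style regular-expression feature; only some text-processing tools support it, not all of them.

\b word boundary

Example: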

[root@linzhongniao ~]# grep "good" linzhongniao.log 
goodi
good

To match only good and not goodi, use \b to mark a word boundary, or use grep -w to search by whole word.

[root@linzhongniao ~]# grep "good\b" linzhongniao.log 
good
[root@linzhongniao ~]# grep -w "good" linzhongniao.log 
good
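As a small sketch of the two approaches, assuming GNU grep (which supports \b as an extension), the snippet below runs both on a made-up list. A hyphen is not a word character, so good-bye still counts as containing the word good.

```shell
# Hypothetical input lines.
input=$(printf '%s\n' 'goodi' 'good' 'very good' 'good-bye')

# \b requires a word boundary right after "good" (GNU grep extension).
boundary=$(printf '%s\n' "$input" | grep 'good\b')

# -w requires the whole match to be a word on its own.
whole_word=$(printf '%s\n' "$input" | grep -w 'good')

echo "$boundary"
```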

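8. Summary of regular expressions

9. Hands-on practice: Linux regular expressions with grep, sed and awk

9.1 Extract the IP address from the ifconfig output

Solution:

sed -n 's#regex goes here##gp' file

Method 1: first select the line, then match the text before the target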

[root@linzhongniao ~]# ifconfig eth0|sed -n '2'p|sed 's#^.*dr:##g'
 192.168.0.117  Bcast:192.168.0.255  Mask:255.255.255.0

Then match the text after the target

[root@linzhongniao ~]# ifconfig eth0|sed -n '2p'|sed 's#^.*dr:##g'|sed 's#  B.*$##g'  <== note the two spaces before B in “#  B.*$”; copying and pasting is safest
 192.168.0.117

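Tips:

To remove what comes before the target (the string you want, such as the IP above), a pattern usually starts with ^ (^.*) and ends with literal characters; for example, “^.*addr:” matches everything from the start of the line through addr:. To remove what comes after the target, a pattern usually starts with literal characters and ends with $ (.*$); for example, “  B.*$” matches from the two spaces and the capital B through the end of the line. Replace the matched text with nothing and what remains is the content you want.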
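The two-step trimming can be tried without a network interface by feeding sed a hard-coded line. The line below is an assumption shaped like the second line of the older ifconfig output used in this article; real output may differ by distribution and version.

```shell
# Hypothetical line in the old net-tools ifconfig format.
line='          inet addr:192.168.0.117  Bcast:192.168.0.255  Mask:255.255.255.0'

# Step 1: delete everything from the line start through "addr:".
# Step 2: delete everything from the two spaces before "Bcast" to the line end.
ip=$(printf '%s\n' "$line" | sed 's#^.*addr:##' | sed 's#  Bcast.*$##')

echo "$ip"
```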

Method 2:

[root@linzhongniao ~]# ifconfig eth0|sed -n '2s#^.*dr:##gp'|sed  's#  B.*$##g'
 192.168.0.117

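Method 3:

sed back-references:

sed -nr 's#()()#\1\2#gp' file

Options:

-n suppress the default output

-r use extended regular expressions, so ( ) need no backslash escaping

sed back-reference demo: extract linzhongniao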

[root@linzhongniao ~]# echo "I am linzhongniao linux" >f.txt
[root@linzhongniao ~]# cat f.txt
I am linzhongniao linux
[root@linzhongniao ~]# cat f.txt|sed -nr 's#^.*m (.*) l.*$#\1#gp'
linzhongniao

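When part of the pattern is wrapped in parentheses, the text matched by the first pair of parentheses can be output in the replacement as \1, the text matched by the second pair as \2, and so on.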

[root@linzhongniao ~]# ifconfig eth0|sed -nr '2s#^.*dr:(.*)  B.*$#\1#gp'
 192.168.1.106
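To show \1 and \2 working together, here is a sketch that captures both the address and the netmask from one hard-coded line. The sample line is an assumption in the old ifconfig format, and -E is used in place of -r (GNU sed treats the two options the same).

```shell
# Hypothetical line in the old net-tools ifconfig format.
line='          inet addr:192.168.0.117  Bcast:192.168.0.255  Mask:255.255.255.0'

# Group 1 captures the address after "addr:", group 2 the value after "Mask:".
# The replacement prints both, separated by a space; -n plus p prints only on a match.
pair=$(printf '%s\n' "$line" | sed -nE 's#^.*addr:([0-9.]+) .*Mask:([0-9.]+)$#\1 \2#p')

echo "$pair"
```

Using [0-9.]+ rather than .* inside the groups keeps each capture from swallowing the surrounding spaces.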

Method 4:

[root@linzhongniao ~]# ifconfig eth0|awk -F "[ :]+" 'NR==2{print $4}'
 192.168.0.106

Method 5:

[root@linzhongniao ~]# ifconfig eth0|sed -nr '/inet addr/s#^.*dr:(.*) B.*$#\1#gp' 
 192.168.0.117

Method 6:

[root@linzhongniao ~]# ifconfig bond0|awk -F "(addr:| Bcast:)" 'NR==2{print $2}'  
 192.168.1.225

Extract the IP address from the output of ip addr

[root@linzhongniao ~]# ip addr|awk -F "[ /]+" 'NR==8 {print $3}'
 192.168.0.106
[root@linzhongniao ~]# ip addr|sed -nr '8s#^.*inet ##gp'|sed 's#/24 b.*$##g'
 192.168.1.106
[root@linzhongniao ~]# ip addr|sed -nr '8s#^.*inet (.*)/24.*$#\1#gp'
 192.168.1.106
[root@linzhongniao ~]# ip addr|awk -F "(inet |/24 brd)" 'NR==8{print $2}'
 192.168.1.106
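The commands above depend on the address being on line 8 and on a /24 prefix. As a hedged alternative, the sketch below selects the line by its content instead of its number. The sample text is a made-up excerpt in the style of ip addr output, not captured from a real host.

```shell
# Hypothetical excerpt of "ip addr" output for one interface.
sample=$(printf '%s\n' \
  '2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP' \
  '    link/ether 00:0c:29:aa:bb:cc brd ff:ff:ff:ff:ff:ff' \
  '    inet 192.168.1.106/24 brd 192.168.1.255 scope global eth0' \
  '    inet6 fe80::20c:29ff:feaa:bbcc/64 scope link')

# Split on runs of spaces and slashes; on the line containing "inet " the
# leading spaces give an empty $1, "inet" is $2 and the address is $3.
ip=$(printf '%s\n' "$sample" | awk -F '[ /]+' '/inet /{print $3}')

echo "$ip"
```

The pattern /inet / (with the trailing space) skips the inet6 line; on a host with several IPv4 addresses it would print one line per address.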

9.2 Swap the first and last columns of /etc/passwd

[root@linzhongniao ~]# tail /etc/passwd|awk -F "[:]+" '{print $6":"$2":"$3":"$4"::"$5":"$1}'
/bin/bash:x:855:855::/home/stu1:stu1
/bin/bash:x:856:856::/home/stu2:stu2
/bin/bash:x:857:857::/home/stu3:stu3
/bin/bash:x:858:858::/home/stu4:stu4
/bin/bash:x:859:859::/home/stu5:stu5
/bin/bash:x:860:860::/home/stu6:stu6
/bin/bash:x:861:861::/home/stu7:stu7
/bin/bash:x:862:862::/home/stu8:stu8
/bin/bash:x:863:863::/home/stu9:stu9
/bin/bash:x:864:864::/home/stu10:stu10
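The command above works because "[:]+" merges the empty fifth field into one separator, which is then re-inserted by hand as "::". A sketch that does not depend on which fields are empty is to split on a single colon and swap $1 with $NF. The passwd entry below is a hard-coded example matching the first line of the output above.

```shell
# Hypothetical passwd entry with an empty comment (GECOS) field.
entry='stu1:x:855:855::/home/stu1:/bin/bash'

# Split on single colons, swap the first and last fields, and rejoin with ":".
swapped=$(printf '%s\n' "$entry" | awk -F: 'BEGIN{OFS=":"} {tmp=$1; $1=$NF; $NF=tmp; print}')

echo "$swapped"
```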

9.3 Extract file permissions

Extract 644

[root@linzhongniao ~]# stat /etc/hosts
  File: `/etc/hosts'
  Size: 218 Blocks: 8  IO Block: 4096   regular file
Device: 804h/2052d  Inode: 260125  Links: 2
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-18 10:09:51.759042316 +0800
Modify: 2018-07-11 16:18:38.646992646 +0800
Change: 2018-07-11 16:18:38.646992646 +0800

Solution

Method 1:

[root@linzhongniao ~]# stat /etc/hosts|sed -nr 's#^.*0(.*)/-rw.*$#\1#gp' 
644
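A slightly stricter variant of the same idea is sketched below on hard-coded lines copied from the stat output above. It anchors on "Access: (" so that the timestamp line, which also starts with "Access:", cannot match, and it uses -E in place of -r (equivalent in GNU sed).

```shell
# Two lines copied from the stat output shown above.
sample=$(printf '%s\n' \
  'Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)' \
  'Access: 2018-07-18 10:09:51.759042316 +0800')

# \( is a literal parenthesis; the group captures the octal digits after the leading 0.
perm=$(printf '%s\n' "$sample" | sed -nE 's#^Access: \(0([0-7]+)/.*$#\1#p')

echo "$perm"
```

With GNU coreutils, stat -c %a /etc/hosts prints the octal permissions directly, which avoids parsing altogether.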