grep (global search regular expression (RE) and print out the line) searches text with a regular expression and prints every line that matches. The regular expression is the core of the tool.
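-E: use extended regular expressions, written as grep -E "" or egrep "" (recent GNU grep releases mark egrep as obsolescent in favor of grep -E).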
-P (--perl-regexp): use Perl-compatible regular expressions. Bug bounty tip: a one-line crt.sh subdomain discovery command. To match the content between <TD> and </TD>, use lookbehind and lookahead: (?<=<TD>).*(?=</TD>)
    curl -fsSL -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:69.0) Gecko/20100101 Firefox/69.0" "https://crt.sh/?CN=%25.target.com" | grep -o -P '(?<=<TD>).*(?=</TD>)' | sed -e '/white-space:normal/d' | sort -u

Deduplication (sort -u) runs last so that it applies to the extracted names rather than to the raw HTML lines.
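The lookaround pattern can be tried offline before pointing it at a live site. The snippet below is a minimal sketch: the HTML rows are made-up sample data, not real crt.sh output, and it uses the lazy form `.*?` (a small change from the pattern above) so that a line holding several cells would not be swallowed as one match.

```shell
# Hypothetical sample that imitates a few rows of an HTML result table.
html='<TR><TD>a.target.com</TD></TR>
<TR><TD>b.target.com</TD></TR>
<TR><TD>a.target.com</TD></TR>'

# -o prints only the match; the lookbehind and lookahead keep the tags out of it.
names=$(printf '%s\n' "$html" | grep -o -P '(?<=<TD>).*?(?=</TD>)' | sort -u)
printf '%s\n' "$names"
```

This requires a grep built with PCRE support, which is the usual case for GNU grep on Linux.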
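-v (--invert-match): print the lines that do not match. For comparison, a plain search without -v prints the matching lines:

    mark@mark-Pc:~$ grep 'root' /etc/passwd
    root:x:0:0:root:/root:/bin/bash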
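To see the effect of -v side by side with a plain search, here is a small sketch on two sample passwd-style lines (the sample data is an assumption made for illustration):

```shell
# Two sample lines in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
mark:x:1000:1000:mark,,,:/home/mark:/bin/bash'

matched=$(printf '%s\n' "$sample" | grep 'root')      # lines containing root
inverted=$(printf '%s\n' "$sample" | grep -v 'root')  # lines without root
printf 'matched:  %s\ninverted: %s\n' "$matched" "$inverted"
```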
-l: list the names of the files that contain a match.

    mark@mark-Pc:~$ grep -l 'root' /etc/passwd
    /etc/passwd
    mark@mark-Pc:~$ grep -l 'root' ./1.txt
    mark@mark-Pc:~$
-o: print only the matched part of each line.

    mark@mark-Pc:~$ grep -o 'root' /etc/passwd
    root
    root
    root
-c: count the lines that contain a match (lines, not individual matches).

    mark@mark-Pc:~$ grep -c 'root' /etc/passwd
    1
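Because -c counts lines, a line with several matches still counts once. One way to count individual matches instead is to combine -o with wc -l, sketched here on a single sample line:

```shell
line='root:x:0:0:root:/root:/bin/bash'

# -c counts matching lines: one line, so 1.
line_count=$(printf '%s\n' "$line" | grep -c 'root')

# -o prints each match on its own line, so wc -l counts matches: 3.
match_count=$(printf '%s\n' "$line" | grep -o 'root' | wc -l | tr -d ' ')
echo "lines=$line_count matches=$match_count"
```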
-n: prefix each matching line with its line number.

    mark@mark-Pc:~$ grep -n 'root' /etc/passwd
    1:root:x:0:0:root:/root:/bin/bash
-r: search recursively through a directory tree; -i ignores case.

    mark@mark-Pc:~$ grep "root" ./shell -r -i -n
    ./shell/test.sh:3: echo 'root'
    ./shell/test.sh:5: echo 'not root'
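--include / --exclude: restrict the search to, or exclude, files whose names match a glob. Quote each glob so the shell does not expand it before grep sees it.

    # search every php and html file under the current directory for the string `target`
    grep -r "target" . --include='*.php' --include='*.html'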
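The following sketch checks both options against a throwaway directory. The file names and contents are made up for the demonstration.

```shell
# Build a temporary tree with three files that all contain the word.
dir=$(mktemp -d)
echo 'target here' > "$dir/a.php"
echo 'target here' > "$dir/b.html"
echo 'target here' > "$dir/c.txt"

# Only php and html files are searched.
included=$(cd "$dir" && grep -r -l 'target' . --include='*.php' --include='*.html' | sort)

# Everything except txt files is searched.
excluded=$(cd "$dir" && grep -r -l 'target' . --exclude='*.txt' | sort)

printf '%s\n' "$included"
rm -rf "$dir"
```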
sed (stream editor) is a stream editor: it processes text programmatically.
-e: apply several edits in one run; each -e supplies one script to run against the text: sed -e "" -e ""
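As a quick illustration of chaining scripts, the sketch below applies two substitutions in order to one sample line:

```shell
# Each -e script runs in turn on every input line.
result=$(printf 'hello name\n' | sed -e 's/hello/Hello/' -e 's/name/mark/')
echo "$result"
```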
s: substitute.

    mark@mark-Pc:~$ cat 1.txt
    hello name
    mark@mark-Pc:~$ sed "s/name/mark/g" 1.txt
    hello mark
    mark@mark-Pc:~$ sed "s/l/L/" 1.txt
    heLlo name
    mark@mark-Pc:~$ sed "s/l/L/g" 1.txt
    heLLo name
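The s command replaces name with mark. The g flag replaces every match on each line; without g, only the first match on each line is replaced. None of this changes the file: sed only writes the processed text to standard output, which can be redirected to a file.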
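A small sketch of the redirection approach, using a temporary file as stand-in data: the original file stays untouched and the edited text lands in a second file.

```shell
tmp=$(mktemp)
printf 'hello name, name\n' > "$tmp"

first_only=$(sed 's/name/mark/' "$tmp")   # first match on the line only
all=$(sed 's/name/mark/g' "$tmp")         # every match on the line

# Write the result to a new file; the source file is not modified.
sed 's/name/mark/g' "$tmp" > "$tmp.new"
original=$(cat "$tmp")
rewritten=$(cat "$tmp.new")
rm -f "$tmp" "$tmp.new"
echo "$first_only | $all"
```

Redirecting straight back onto the input file (sed ... file > file) would truncate it before sed reads it, which is why a separate output file is used here.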
-i: edit the file in place.

    mark@mark-Pc:~$ cat 1.txt
    hello name
    mark@mark-Pc:~$ sed -i "s/name/mark/" 1.txt
    mark@mark-Pc:~$ cat 1.txt
    hello mark
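Since in-place editing overwrites the file, it can be worth keeping a backup. A suffix attached to -i asks sed to save the original under that suffix; the sketch below uses .bak on a temporary file.

```shell
tmp=$(mktemp)
printf 'hello name\n' > "$tmp"

# -i.bak edits the file in place and keeps the original as "$tmp.bak".
sed -i.bak 's/name/mark/' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
rm -f "$tmp" "$tmp.bak"
echo "edited: $edited / backup: $backup"
```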
    mark@mark-Pc:~$ nl pets.txt
         1  This is my cat
         2  my cat's name is betty
         3  This is my dog
         4  my dog's name is frank
         5  This is my fish
         6  my fish's name is george
         7  This is my goat
         8  my goat's name is adam
    mark@mark-Pc:~$ sed "1,3s/T/t/g" pets.txt
    this is my cat
    my cat's name is betty
    this is my dog
    my dog's name is frank
    This is my fish
    my fish's name is george
    This is my goat
    my goat's name is adam
Here the s command replaces T with t, and the 1,3 address restricts it to lines 1 through 3.
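Addresses are not limited to line ranges. The sketch below, on made-up three-line input, shows a numeric range, a regular-expression address, and `$` for the last line:

```shell
text=$(printf 'This is line one\nThis is line two\nThis is line three\n')

range=$(printf '%s\n' "$text" | sed '1,2s/T/t/')        # lines 1 to 2
by_pattern=$(printf '%s\n' "$text" | sed '/two/s/T/t/') # lines matching /two/
last=$(printf '%s\n' "$text" | sed '$s/T/t/')           # last line only
printf '%s\n' "$by_pattern"
```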
    mark@mark-Pc:~$ cat pets.txt
    This is my cat,my cat's name is betty
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
    mark@mark-Pc:~$ sed "s/m/M/3g" pets.txt
    This is my cat,my cat's naMe is betty
    This is my dog,my dog's naMe is frank
    This is my fish,my fish's naMe is george
    This is my goat,my goat's naMe is adaM
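Here the s command replaces m with M, and the 3g flag starts at the third match on each line, replacing it and every later match (combining a number with g is a GNU sed extension).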
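The difference between a bare number and a number followed by g can be seen on a single sample line. This sketch assumes GNU sed for the `2g` form:

```shell
line='my cat,my name'

only_second=$(printf '%s\n' "$line" | sed 's/m/M/2')   # the 2nd match only
second_on=$(printf '%s\n' "$line" | sed 's/m/M/2g')    # the 2nd match and all later ones
printf '%s\n%s\n' "$only_second" "$second_on"
```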
-n: suppress the default output, so that only explicitly printed lines appear; the p command prints the current line (the pattern space).

    mark@mark-Pc:~$ cat -n pets.txt
         1  This is my cat,my cat's name is betty
         2  This is my dog,my dog's name is frank
         3  This is my fish,my fish's name is george
         4  This is my goat,my goat's name is adam
    mark@mark-Pc:~$ nl pets.txt | sed -n "1,2p"
         1  This is my cat,my cat's name is betty
         2  This is my dog,my dog's name is frank
    mark@mark-Pc:~$ nl pets.txt | sed -n "s/cat/Cat/p"
         1  This is my Cat,my cat's name is betty
The first command prints lines 1 and 2; the second prints only the lines on which a substitution was made.
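To make the role of -n concrete, this sketch runs the same p command with and without it on three sample lines. Without -n, the selected line appears twice: once from p and once from the default output.

```shell
text=$(printf 'one\ntwo\nthree\n')

with_n=$(printf '%s\n' "$text" | sed -n '2p')   # only line 2
without_n=$(printf '%s\n' "$text" | sed '2p')   # all lines, line 2 doubled
printf '%s\n' "$without_n"
```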
a (append) and i (insert): a adds text below the current line, i adds text above it.

    mark@mark-Pc:~$ nl pets.txt
         1  This is my cat,my cat's name is betty
         2  This is my dog,my dog's name is frank
         3  This is my fish,my fish's name is george
         4  This is my goat,my goat's name is adam
    # insert
    mark@mark-Pc:~$ sed "1 i insert test" pets.txt
    insert test
    This is my cat,my cat's name is betty
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
    # append
    mark@mark-Pc:~$ sed "1 a append test" pets.txt
    This is my cat,my cat's name is betty
    append test
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
    # append after every line that matches a pattern
    mark@mark-Pc:~$ sed "/cat/a match append test" pets.txt
    This is my cat,my cat's name is betty
    match append test
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
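The one-line form used above (text on the same line as a or i) is a GNU sed convenience. A sketch of the more traditional spelling, with a backslash and the text on the next line, is shown below on two sample lines; it also uses the `$` address to append after the last line.

```shell
text=$(printf 'first\nsecond\n')

# Insert before line 1: command, backslash, newline, then the text.
inserted=$(printf '%s\n' "$text" | sed '1i\
insert test')

# Append after the last line.
appended=$(printf '%s\n' "$text" | sed '$a\
append test')

printf '%s\n' "$inserted"
```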
d: delete lines.

    # delete blank lines
    sed '/^$/d'
    # delete line 2
    sed '2d'
    # delete from line 2 to the end
    sed '2,$d'
    # delete the last line
    sed '$d'
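The delete commands above take their input from a file or a pipe. Here is a minimal sketch that feeds them made-up input containing one blank line:

```shell
text=$(printf 'one\n\ntwo\nthree\n')

no_blank=$(printf '%s\n' "$text" | sed '/^$/d')   # blank line removed
first_only=$(printf '%s\n' "$text" | sed '2,$d')  # everything from line 2 removed
no_last=$(printf '%s\n' "$text" | sed '$d')       # last line removed
printf '%s\n' "$no_blank"
```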
c: replace the whole matching line.

    mark@mark-Pc:~$ sed "/cat/c change test" pets.txt
    change test
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
&: in the replacement part of an s command, & stands for the text that was matched.

    mark@mark-Pc:~$ cat pets.txt
    This is my cat,my cat's name is betty
    This is my dog,my dog's name is frank
    This is my fish,my fish's name is george
    This is my goat,my goat's name is adam
    mark@mark-Pc:~$ sed "s/my/[&]/g" pets.txt
    This is [my] cat,[my] cat's name is betty
    This is [my] dog,[my] dog's name is frank
    This is [my] fish,[my] fish's name is george
    This is [my] goat,[my] goat's name is adam
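While & refers to the whole match, parenthesized groups can be referenced individually as \1, \2 and so on. The sketch below shows both on sample strings; it uses -E for extended regular expressions so the parentheses need no escaping.

```shell
# & is the entire matched text.
wrapped=$(printf 'my cat\n' | sed 's/cat/[&]/')

# \1 and \2 are the first and second captured groups; here they are swapped.
swapped=$(printf 'cat dog\n' | sed -E 's/([a-z]+) ([a-z]+)/\2 \1/')
printf '%s\n%s\n' "$wrapped" "$swapped"
```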
awk is a powerful text-analysis tool. It has many built-in features, such as arrays and functions, which it shares with C; flexibility is its greatest strength.
awk 'BEGIN{ print "start" } pattern{ commands } END{ print "end" }' file
An awk script usually has three parts: a BEGIN block, a general block that can use pattern matching, and an END block. All three are optional.
    mark@mark-Pc:~$ cat 0.txt
    test
    mark@mark-Pc:~$ awk '{print}' 0.txt
    test
    mark@mark-Pc:~$ awk 'BEGIN{ print "Start" } { print } END{ print "End" }' 0.txt
    Start
    test
    End
With no arguments, print outputs the current line.
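The three blocks are often used together to accumulate a value: initialise in BEGIN, update per line, report in END. A minimal sketch that counts input lines (the variable name `count` is chosen here for illustration):

```shell
summary=$(printf 'a\nb\nc\n' | awk '
    BEGIN { count = 0 }            # runs once, before any input
    { count++ }                    # runs for every input line
    END { print "lines:", count }  # runs once, after all input
')
echo "$summary"
```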
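The user information file, /etc/passwd:

    root:x:0:0:root:/root:/bin/bash
    account:password:UID:GID:GECOS:directory:shell
    user name:password:user ID:group ID:user description:home directory:login shell

Note: an account with an empty password can typically log in only locally; remote login is not allowed.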
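Print the user name, UID, and home directory. The last command sets OFS to a tab, so its output fields are tab-separated: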
    mark@mark-Pc:~$ cat 3.txt
    mark:x:1000:1000:mark,,,:/home/mark:/bin/bash
    mark@mark-Pc:~$ awk 'BEGIN{FS=":"} {print $1,$3,$6}' 3.txt
    mark 1000 /home/mark
    mark@mark-Pc:~$ awk -F':' '{print $1,$3,$6}' 3.txt
    mark 1000 /home/mark
    mark@mark-Pc:~$ awk -F: '{print $1,$3,$6}' 3.txt
    mark 1000 /home/mark
    mark@mark-Pc:~$ awk -F: '{print $1,$3,$6}' OFS="\t" 3.txt
    mark    1000    /home/mark
    mark@mark-Pc:~$
    mark@mark-Pc:~$ echo 'this is a test' | awk '{print $NF}'
    test
    mark@mark-Pc:~$ awk -F ':' '{print NR,$1}' /etc/passwd
    1 root
    2 daemon
    3 bin
    4 sys
    5 sync
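The variable NF holds the number of fields in the current line, so $NF is the last field. The variable NR holds the number of the line currently being processed.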
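NR, NF, and $NF can be combined in a single print statement. A small sketch on two sample lines with different field counts:

```shell
out=$(printf 'a b c\nd e\n' | awk '{ print NR ": " NF " fields, last=" $NF }')
printf '%s\n' "$out"
```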
Built-in variables
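    $n           the n-th field of the current record; $1 is the first field, $2 the second.
    $0           the full text of the current line.
    ARGC         number of command-line arguments.
    ARGIND       index in ARGV of the file currently being processed (gawk only).
    ARGV         array of command-line arguments.
    CONVFMT      number conversion format (default %.6g).
    ENVIRON      associative array of environment variables.
    ERRNO        description of the last system error (gawk only).
    FIELDWIDTHS  space-separated list of field widths (gawk only).
    FILENAME     name of the current input file.
    FNR          like NR, but relative to the current file.
    FS           field separator (default is a single space, which splits on any run of whitespace).
    IGNORECASE   if true, matching ignores case (gawk only).
    NF           number of fields in the current record.
    NR           number of records read so far, i.e. the current line number.
    OFMT         output format for numbers (default %.6g).
    OFS          output field separator (default is a single space).
    ORS          output record separator (default is a newline).
    RS           record separator (default is a newline).
    RSTART       position of the first character matched by the match function.
    RLENGTH      length of the string matched by the match function.
    SUBSEP       array subscript separator (default "\034").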
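The difference between NR and FNR only shows up with more than one input file. The sketch below creates two temporary files (hypothetical names one.txt and two.txt) and prints both counters for each line: NR keeps counting across files, while FNR restarts at 1.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n' > "$dir/two.txt"

# Columns: NR (global count), FNR (per-file count), the line itself.
out=$(awk '{ print NR, FNR, $0 }' "$dir/one.txt" "$dir/two.txt")
printf '%s\n' "$out"
rm -rf "$dir"
```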
    mark@mark-Pc:~$ echo 'this is a test' | awk -F ':' '{ print toupper($1) }'
    THIS IS A TEST
The toupper() function converts a string to upper case.
Other built-in functions: https://www.gnu.org/software/gawk/manual/html_node/Built_002din.html#Built_002din
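A few of the other string functions in action, as a brief sketch on a sample string. length, tolower, and substr are part of standard awk, so this should not depend on gawk.

```shell
# length($0): characters in the line; tolower($1): first field in lower case;
# substr($2, 1, 3): the first three characters of the second field.
out=$(echo 'Hello World' | awk '{ print length($0), tolower($1), substr($2, 1, 3) }')
echo "$out"
```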
    # list the users whose UID is 0
    awk -F: '{if ($3==0) print $1}' /etc/passwd
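The same check can also be written with the condition as the pattern, which avoids the explicit if. This sketch runs it on two sample passwd-style lines rather than the real /etc/passwd:

```shell
sample='root:x:0:0:root:/root:/bin/bash
mark:x:1000:1000:mark,,,:/home/mark:/bin/bash'

# The action runs only for lines where the pattern $3 == 0 is true.
uid0=$(printf '%s\n' "$sample" | awk -F: '$3 == 0 { print $1 }')
echo "$uid0"
```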
    #!/usr/bin/awk -f
    # before any input is read
    BEGIN {
        math = 0
        english = 0
        computer = 0
        printf "NAME NO. MATH ENGLISH COMPUTER TOTAL\n"
        printf "---------------------------------------------\n"
    }
    # for every input line
    {
        math += $3
        english += $4
        computer += $5
        printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3, $4, $5, $3+$4+$5
    }
    # after all input has been read
    END {
        printf "---------------------------------------------\n"
        printf " TOTAL:%10d %8d %8d \n", math, english, computer
        printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
    }
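The script expects input with a name, a number, and three scores per line. The sketch below applies the same accumulation logic to two invented records (the names and scores are made up) and prints the three column totals followed by the math average.

```shell
scores=$(mktemp)
printf 'Marry 2143 78 84 77\nJack 2321 66 78 45\n' > "$scores"

# Sum columns 3 to 5 per line, then report totals and the math average in END.
totals=$(awk '
    { math += $3; english += $4; computer += $5 }
    END { printf "%d %d %d %.2f\n", math, english, computer, math / NR }
' "$scores")
rm -f "$scores"
echo "$totals"
```

Dividing by NR in END assumes at least one input line; on empty input the division would fail.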