正则表达式.docx (Regular Expressions)

Document ID: 4669866
Uploaded: 2022-12-07
Format: DOCX
Pages: 18
Size: 23.22 KB

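Regular Expressions

This material is essentially a prerequisite for learning shell scripting.

The better you master it, the stronger your shell scripting will become.

So do not dismiss it as long-winded or tedious; study it carefully.

Practice often, because fluency comes with repetition.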

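In computer science, a regular expression is defined as follows:

It is a single string that describes or matches a set of strings conforming to some syntactic rule.

In many text editors and other tools, regular expressions are used to search for and/or replace text that fits a given pattern.

Many programming languages support string manipulation with regular expressions.

For system administrators, regular expressions run through everyday operations work: whether you are looking for a document or analyzing the contents of a log file, you will use them.

A regular expression is really just an idea, a notation.

Any tool that supports this notation can process strings with regular expressions.

Commonly used tools include grep, sed, and awk. Aming introduces each of these three tools below.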

grep/egrep

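Aming has mentioned and used the grep command many times in earlier chapters, which shows how important it is.

So it is worth studying this command properly.

Keep in mind that grep, like sed and awk discussed below, operates on text line by line.

Syntax: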

grep [-cinvABC] 'word' filename

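-c: print the number of matching lines

-i: ignore case

-n: print each matching line together with its line number

-v: print the lines that do not match

-A: followed by a number (with or without a space); for example, -A2 prints each matching line plus the two lines after it

-B: followed by a number; for example, -B2 prints each matching line plus the two lines before it

-C: followed by a number; for example, -C2 prints each matching line plus the two lines before and the two lines after it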
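The first four options can be tried on a throwaway file. This is a minimal sketch: the file contents and the variable names are made up for illustration and do not come from the original text.

```shell
# Build a small sample file (hypothetical content).
sample=$(mktemp)
printf '%s\n' 'Root login' 'root shell' 'guest account' 'ROOT backup' > "$sample"

# -c: count matching lines (matching is case-sensitive by default).
count_exact=$(grep -c 'root' "$sample")

# -i combined with -c: count matching lines, ignoring case.
count_nocase=$(grep -ic 'root' "$sample")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'root' "$sample")

# -v (here with -i): print the lines that do NOT match.
inverted=$(grep -iv 'root' "$sample")

echo "$count_exact"    # 1
echo "$count_nocase"   # 3
echo "$numbered"       # 2:root shell
echo "$inverted"       # guest account

rm -f "$sample"
```

Single-letter options can be combined, so -ic behaves the same as -i -c.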

[root@localhost ~]# grep -A2 'halt' /etc/passwd
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin

This prints the line containing 'halt' together with the two lines after it.

[root@localhost ~]# grep -B2 'halt' /etc/passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt

This prints the line containing 'halt' together with the two lines before it.

[root@localhost ~]# grep -C2 'halt' /etc/passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin

This prints the line containing 'halt' together with the two lines before it and the two lines after it.

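Below Aming gives a few typical examples to help you understand grep more thoroughly.

1. Select lines containing a keyword and print their line numbers

[root@localhost ~]# grep -n 'root' /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
11:operator:x:11:0:operator:/root:/sbin/nologin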

2. Select lines that do not contain a keyword and print their line numbers

[root@localhost ~]# grep -nv 'nologin' /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
6:sync:x:5:0:sync:/sbin:/bin/sync
7:shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
8:halt:x:7:0:halt:/sbin:/sbin/halt
26:test:x:511:511::/home/test:/bin/bash
27:test1:x:512:511::/home/test1:/bin/bash

3. Select all lines that contain a digit

[root@localhost ~]# grep '[0-9]' /etc/inittab
# upstart works, see init(5), init(8), and initctl(8).
#   0 - halt (Do NOT set initdefault to this)
#   1 - Single user mode
#   2 - Multiuser, without NFS (The same as 3, if you do not have networking)
#   3 - Full multiuser mode
#   4 - unused
#   5 - X11
#   6 - reboot (Do NOT set initdefault to this)
id:3:initdefault:

4. Select all lines that contain no digit

[root@localhost ~]# grep -v '[0-9]' /etc/inittab
# inittab is only used by upstart for the default runlevel.
#
# ADDING OTHER CONFIGURATION HERE WILL HAVE NO EFFECT ON YOUR SYSTEM.
#
# System initialization is started by /etc/init/rcS.conf
#
# Individual runlevels are started by /etc/init/rc.conf
#
# Ctrl-Alt-Delete is handled by /etc/init/control-alt-delete.conf
#
# Terminal gettys are handled by /etc/init/tty.conf and /etc/init/serial.conf,
# with configuration in /etc/sysconfig/init.
#
# For information on how to write upstart event handlers, or how
#
# Default runlevel. The runlevels used are:
#

5. Remove all lines that begin with '#'

[root@localhost ~]# grep -v '^#' /etc/inittab
id:3:initdefault:

6. Remove all empty lines and all lines that begin with '#'

[root@localhost ~]# grep -v '^#' /etc/crontab | grep -v '^$'
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
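The two-stage pipeline above can also be written as a single grep call by passing several patterns with -e. The sketch below runs both forms on a made-up configuration file (the file contents and variable names are assumptions for illustration) so the results can be compared.

```shell
# Hypothetical config file with comments and blank lines.
conf=$(mktemp)
printf '%s\n' '# comment' '' 'SHELL=/bin/bash' '' '# another comment' 'MAILTO=root' > "$conf"

# Two greps chained with a pipe, as in the example above.
piped=$(grep -v '^#' "$conf" | grep -v '^$')

# One grep: with -v, a line is printed only if it matches none of the -e patterns.
single=$(grep -v -e '^#' -e '^$' "$conf")

echo "$piped"
echo "$single"

rm -f "$conf"
```

Both variables end up holding the same two lines, SHELL=/bin/bash and MAILTO=root.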

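In a regular expression, "^" marks the start of a line and "$" marks the end, so an empty line can be written as "^$". How, then, do you print the lines that do not start with an English letter?

[root@localhost ~]# vim test.txt
[root@localhost ~]# cat test.txt
123
abc
456
abc2323
#laksdjf
Alllllllll

Aming first writes a few lines into test.txt to experiment with.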

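[root@localhost ~]# grep '^[^a-zA-Z]' test.txt
123
456
#laksdjf

[root@localhost ~]# grep '[^a-zA-Z]' test.txt
123
456
abc2323
#laksdjf

The first command requires the first character of the line to be a non-letter. The second only requires a non-letter somewhere in the line, which is why abc2323 also appears.

Aming mentioned the '[]' syntax earlier: for digits, write [0-9]. You can also write something like [15], which matches a single character that is either 1 or 5; note that it does not mean the string 15.

To match digits as well as upper-case and lower-case letters, write [0-9a-zA-Z].

Another form is [^chars], which matches any character other than those listed inside the brackets.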
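A short sketch of the bracket forms just described, run on a made-up file (its contents and the variable names are assumptions, not part of the original text):

```shell
# Hypothetical sample lines.
sample=$(mktemp)
printf '%s\n' '15' '1' '5' '7' 'a7' '-x' > "$sample"

# [15] matches ONE character that is either 1 or 5, not the string "15".
# The line "15" still matches because it contains a 1.
one_or_five=$(grep '[15]' "$sample")

# ^[^0-9]: the first ^ anchors to the start of the line, the ^ inside the
# brackets negates the set, so this keeps lines whose first character is
# not a digit.
non_digit_start=$(grep '^[^0-9]' "$sample")

echo "$one_or_five"       # 15, 1, 5 (one per line)
echo "$non_digit_start"   # a7, -x (one per line)

rm -f "$sample"
```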

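7. Match any single character and repeated characters

[root@localhost ~]# grep 'r..o' /etc/passwd
operator:x:11:0:operator:/root:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin

. matches any single character; the example above selects lines in which r and o are separated by exactly two arbitrary characters. * matches zero or more occurrences of the preceding character.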

[root@localhost ~]# grep 'ooo*' /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin

'ooo*' matches oo, ooo, oooo, and so on. Can you now guess what the combination '.*' means?

[root@localhost ~]# grep '.*' /etc/passwd | wc -l
27

[root@localhost ~]# wc -l /etc/passwd
27 /etc/passwd

'.*' matches zero or more of any character, so every line matches, empty lines included.
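The behaviour of * is easy to check on a small file that includes an empty line. This is only a sketch; the sample lines and variable names are invented for illustration.

```shell
# Hypothetical sample: the fourth line is empty.
sample=$(mktemp)
printf '%s\n' 'rot' 'root' 'rooot' '' 'xyz' > "$sample"

# 'ooo*' is "oo" followed by zero or more "o", i.e. at least two in a row.
two_or_more=$(grep -c 'ooo*' "$sample")

# '.*' can match the empty string, so every line counts, even the empty one.
everything=$(grep -c '.*' "$sample")

# Total number of lines; tr strips the padding some wc versions add.
total=$(wc -l < "$sample" | tr -d ' ')

echo "$two_or_more"   # 2 (root and rooot)
echo "$everything"    # 5
echo "$total"         # 5

rm -f "$sample"
```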

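8. Specify how many times a character must occur

[root@localhost ~]# grep 'o\{2\}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin

This uses {}, with a number inside, to state how many times the preceding character must repeat.

The example above selects lines containing two consecutive o characters, that is, 'oo'.

Note that in grep both braces must be escaped with a backslash '\'. {} can also express a range, in the form '{n1,n2}', where n1 is no greater than n2 and the preceding character must repeat between n1 and n2 times; if n2 is left out, as in '{n1,}', it means n1 or more times.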
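The sketch below tries the brace (interval) syntax on a made-up file. The letters r and t are placed around the interval on purpose: without them, 'o\{2\}' would match any line that merely contains two or more consecutive o characters. File contents and variable names are assumptions for illustration.

```shell
# Hypothetical sample lines with one to four o characters.
sample=$(mktemp)
printf '%s\n' 'rot' 'root' 'rooot' 'roooot' > "$sample"

# Between two and three o characters between r and t.
two_to_three=$(grep 'ro\{2,3\}t' "$sample")

# Three or more o characters: the upper bound is left out.
three_plus=$(grep 'ro\{3,\}t' "$sample")

# With grep -E (extended syntax) the braces need no backslashes.
same_with_ere=$(grep -E 'ro{2,3}t' "$sample")

echo "$two_to_three"    # root, rooot (one per line)
echo "$three_plus"      # rooot, roooot (one per line)
echo "$same_with_ere"   # root, rooot (one per line)

rm -f "$sample"
```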

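That covers grep. Aming also often uses egrep, which, simply put, is an extended version of grep: egrep can do things grep cannot, and anything grep can do, egrep can do as well.

If that sounds like too much trouble, a passing familiarity with egrep is enough, because grep alone is sufficient for everyday work.

Below Aming introduces a few ways in which egrep differs from grep.

To make experimenting easier, Aming edits test.txt so that it contains the following: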

rot:x:0:0:/rot:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

1. Match one or more of the preceding character

[root@localhost ~]# egrep 'o+' test.txt
rot:x:0:0:/rot:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

[root@localhost ~]# egrep 'oo+' test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

[root@localhost ~]# egrep 'ooo+' test.txt
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

Unlike plain grep, egrep treats '+' as a repetition operator here.

2. Match zero or one of the preceding character

[root@localhost ~]# egrep 'o?' test.txt
rot:x:0:0:/rot:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

[root@localhost ~]# egrep 'ooo?' test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

[root@localhost ~]# egrep 'oooo?' test.txt
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

3. Match string 1 or string 2

[root@localhost ~]# egrep 'aaa|111|ooo' test.txt
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

4. Using () in egrep

[root@localhost ~]# egrep 'r(oo)|(at)o' test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

() groups characters into a single unit; for example, (oo)+ means one or more occurrences of 'oo'.

[root@localhost ~]# egrep '(oo)+' test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
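The four egrep features above (+, ?, | and grouping) can be checked side by side. The sketch uses grep -E, which accepts the same extended syntax as egrep, and a made-up file; the sample lines and variable names are assumptions for illustration.

```shell
# Hypothetical sample lines.
sample=$(mktemp)
printf '%s\n' 'rot' 'root' 'rooot' 'operator' 'aaaa' > "$sample"

# +  : one or more o between r and t.
plus=$(grep -E 'ro+t' "$sample")

# ?  : the second o is optional, so "rot" and "root" match but "rooot" does not.
optional=$(grep -E 'roo?t' "$sample")

# |  : either alternative may match.
either=$(grep -E 'aaa|ato' "$sample")

# () : the group "oo" is repeated as a unit, so only an even number of o fits.
grouped=$(grep -E 'r(oo)+t' "$sample")

echo "$plus"       # rot, root, rooot (one per line)
echo "$optional"   # rot, root (one per line)
echo "$either"     # operator, aaaa (one per line)
echo "$grouped"    # root

rm -f "$sample"
```

Anchoring the pattern with r and t makes the effect of each operator visible; an unanchored '(oo)+' matches any line with two consecutive o characters, as the output above shows.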

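Using the sed tool

grep is still limited: it only searches and cannot replace what it finds.

vim can both search and replace, but only inside the file being edited; it does not write the result to the screen.

sed, and awk which is covered later, can print the replaced text to the screen and offer many other features as well.

Both sed and awk are stream-oriented tools that process a document line by line.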

1. Print a given line

sed -n 'n'p filename, where the n inside the single quotes is a number giving the line to print:

[root@localhost ~]# sed -n '2'p /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin

To print every line, use sed -n '1,$'p filename

[root@localhost ~]# sed -n '1,$'p test.txt
rot:x:0:0:/rot:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

A range of lines can also be given:

[root@localhost ~]# sed -n '1,3'p test.txt
rot:x:0:0:/rot:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin

2. Print lines containing a given string

[root@localhost ~]# sed -n '/root/'p test.txt
operator:x:11:0:operator:/root:/sbin/nologin

The special characters used with grep, such as ^ $ . * and so on, work in sed as well.

[root@localhost ~]# sed -n '/^1/'p test.txt
111111*********1111111111111111

[root@localhost ~]# sed -n '/in$/'p test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin

[root@localhost ~]# sed -n '/r..o/'p test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

[root@localhost ~]# sed -n '/ooo*/'p test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash

3. -e allows several actions in one command

[root@localhost ~]# sed -e '1'p -e '/111/'p -n test.txt
rot:x:0:0:/rot:/bin/bash
111111*********1111111111111111
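One detail worth knowing about several -e expressions: each expression is applied to every line, so a line that satisfies two print expressions is printed twice. A minimal sketch on a made-up file (contents and variable names are assumptions for illustration):

```shell
# Hypothetical sample: line 1 is both "line number 1" and a match for /111/.
sample=$(mktemp)
printf '%s\n' 'first 111' 'second' 'third 111' > "$sample"

# -n suppresses the default output; each -e adds one print action.
both=$(sed -n -e '1p' -e '/111/p' "$sample")

echo "$both"
# first 111
# first 111
# third 111

rm -f "$sample"
```

In the example from the text this does not happen, because the first line of test.txt does not contain 111.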

4. Delete one or more lines

[root@localhost ~]# sed '1'd test.txt
operator:x:11:0:operator:/root:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

[root@localhost ~]# sed '1,3'd test.txt
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

[root@localhost ~]# sed '/oot/'d test.txt
rot:x:0:0:/rot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

The 'd' command performs deletion. It can delete a single line or a range of lines, delete lines matching a pattern, and also delete from a given line through the end of the document.
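The last case, deleting from some line through the end of the document, is not shown in the examples above. A minimal sketch, assuming a made-up five-line file (contents and variable names are for illustration only); $ as an address stands for the last line:

```shell
# Hypothetical five-line sample.
sample=$(mktemp)
printf '%s\n' 'line1' 'line2' 'line3' 'line4' 'line5' > "$sample"

# Delete from line 3 through the last line.
head_only=$(sed '3,$d' "$sample")

# A pattern can serve as the starting address: delete from the first
# line matching /line4/ through the last line.
before_match=$(sed '/line4/,$d' "$sample")

echo "$head_only"      # line1, line2 (one per line)
echo "$before_match"   # line1, line2, line3 (one per line)

rm -f "$sample"
```

As in the examples from the text, sed only prints the result; the file itself is left unchanged.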

5. Replace a character or string

[root@localhost ~]# sed '1,2s/ot/to/g' test.txt
rto:x:0:0:/rto:/bin/bash
operator:x:11:0:operator:/roto:/sbin/nologin
operator:x:11:0:operator:/rooot:/sbin/nologin
roooot:x:0:0:/rooooot:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

In the example above, 's' is the substitute command and 'g' makes the replacement global within each line; without 'g', only the first occurrence on each line is replaced.

Besides '/', other special characters such as '#' or '@' can be used as the delimiter.
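An alternative delimiter is most useful when the text being replaced contains slashes, for example a path. The sketch below also shows the effect of leaving out 'g'. The sample string and variable names are made up for illustration.

```shell
# Hypothetical input containing slashes.
line='/usr/bin:/usr/local/bin'

# Without g: only the first /usr on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's#/usr#/opt#')

# With g: every /usr on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's#/usr#/opt#g')

# The same substitution with / as the delimiter needs each slash escaped.
escaped=$(printf '%s\n' "$line" | sed 's/\/usr/\/opt/g')

echo "$first_only"   # /opt/bin:/usr/local/bin
echo "$all"          # /opt/bin:/opt/local/bin
echo "$escaped"      # /opt/bin:/opt/local/bin
```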

[root@localhost ~]# sed 's#ot#to#g' test.txt
rto:x:0:0:/rto:/bin/bash
operator:x:11:0:operator:/roto:/sbin/nologin
operator:x:11:0:operator:/rooto:/sbin/nologin
roooto:x:0:0:/rooooto:/bin/bash
111111*********1111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

[root@localhost ~]# sed
