Regular Expressions: An Introduction

Regular Expressions

1 Where Regular Expressions Apply

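The regular expression support in the .NET Framework is designed for searching and pattern matching in string data. The methods of the string class, such as IndexOf(), can also locate the text you are looking for, but the code is tedious to write and is less convenient and less flexible than a regular expression.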
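To make the comparison concrete, here is a minimal sketch that finds every case-insensitive occurrence of a word twice: once with a hand-written IndexOf() loop and once with Regex.Matches(). The class name SearchComparison, both helper methods and the sample sentence are made up for illustration and are not part of the original article.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SearchComparison
{
    // Manual approach: call IndexOf repeatedly, restarting after each hit.
    public static List<int> FindWithIndexOf(string text, string value)
    {
        var positions = new List<int>();
        if (string.IsNullOrEmpty(value)) return positions; // avoid an endless loop

        int start = 0;
        while (start < text.Length)
        {
            int hit = text.IndexOf(value, start, StringComparison.OrdinalIgnoreCase);
            if (hit < 0) break;
            positions.Add(hit);
            start = hit + value.Length; // continue after this (non-overlapping) match
        }
        return positions;
    }

    // Regex approach: a single call returns every match.
    public static List<int> FindWithRegex(string text, string value)
    {
        var positions = new List<int>();
        // Regex.Escape treats the search value as literal text.
        foreach (Match m in Regex.Matches(text, Regex.Escape(value), RegexOptions.IgnoreCase))
            positions.Add(m.Index);
        return positions;
    }

    public static void Main()
    {
        const string sample = "This is it: this and THIS.";
        Console.WriteLine(string.Join(",", FindWithIndexOf(sample, "this")));
        Console.WriteLine(string.Join(",", FindWithRegex(sample, "this")));
    }
}
```

Both methods report the same positions for this sample. The difference shows up once the search needs more than literal text (word boundaries, optional characters and so on), where the IndexOf() loop has to grow extra logic while the regex version only needs a different pattern.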

2 Introductory Regular Expressions

2.1 Plain-Text Search

Consider the following string:

const string myText = @"This comprehensive compendium provides a broad and thorough investigation of all aspects of programming with ASP.NET. Entirely revised and updated for the fourth release of .NET,this book will give you the information you need to master ASP.NET and build a dynamic,successful,enterprise Web application.";

How do you write a regular expression that finds every position where the substring "this" occurs? The pattern is simply the text itself:

const string pattern = @"this";

Then use the Regex class provided by .NET to find every occurrence of "this":

// Requires: using System; using System.Text.RegularExpressions;
MatchCollection myMatches = Regex.Matches(myText, pattern, RegexOptions.IgnoreCase |
    RegexOptions.ExplicitCapture); // Matches() returns every match in the input

foreach (Match nextMatch in myMatches)
{
    Console.WriteLine(nextMatch.Index); // zero-based position of the match
}

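The search finds two matches. For myText exactly as written above, they are at indexes 0 and 178; the values shift if the whitespace in myText differs. Index 0 is the T of the first word, This, which matches because of RegexOptions.IgnoreCase, and index 178 is the t of the second this.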

A regular expression such as pattern = "this" consists only of literal text. This kind of search is called a plain-text search.

2.2 Searching with Metacharacters

Regular expressions can also contain metacharacters, which are special characters that give commands, and escape sequences such as \b, which work in much the same way as C# escape sequences: they are characters preceded by a backslash (\) that have a special meaning.
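Because \b is also a C# escape sequence (backspace), regex patterns are normally written as verbatim strings (@"..."), as in the examples of this article. The following sketch shows what happens otherwise; the class name EscapeDemo, the helper CountMatches and the sample text are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class EscapeDemo
{
    // Number of matches of pattern in text.
    public static int CountMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern).Count;
    }

    public static void Main()
    {
        const string text = "take two";

        string verbatim = @"\bt"; // backslash, b, t: regex word boundary followed by t
        string doubled = "\\bt";  // the same three characters, backslash escaped for C#
        string mistaken = "\bt";  // C# converts \b to a backspace: only two characters

        Console.WriteLine(verbatim == doubled);          // the two spellings are identical
        Console.WriteLine(mistaken.Length);              // 2
        Console.WriteLine(CountMatches(text, verbatim)); // matches the t of "take" and "two"
        Console.WriteLine(CountMatches(text, mistaken)); // looks for a literal backspace + t
    }
}
```

With the verbatim string the regex engine receives the backslash and interprets \b as a word boundary. Without it, the C# compiler consumes the escape first and the engine searches for a backspace character that the text does not contain.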

Example 1: to find all words that begin with the letter t:

const string pattern = @"\bt";
MatchCollection myMatches = Regex.Matches(myText, pattern, RegexOptions.IgnoreCase |
    RegexOptions.ExplicitCapture);

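For myText exactly as written above, the search finds six words, at indexes 0, 51, 151, 178, 202 and 227 (This, thorough, the, this, the, to).
Example 2: to find words that end in ion, use: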

const string pattern = @"ion\b";
MatchCollection myMatches = Regex.Matches(myText, pattern, RegexOptions.IgnoreCase |
    RegexOptions.ExplicitCapture);

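For myText exactly as written above, the matches are at indexes 70, 214 and 299, each the start of the trailing ion in investigation, information and application. The most commonly used metacharacters are: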

Symbol | Description | Example | Matches
^ | Beginning of input text | ^B | B, but only if first character in text
$ | End of input text | X$ | X, but only if last character in text
. | Any single character except the newline character (\n) | i.ation | isation, ization
* | Preceding character may be repeated zero or more times | ra*t | rt, rat, raat, raaat, and so on
+ | Preceding character may be repeated one or more times | ra+t | rat, raat, raaat, and so on, but not rt
? | Preceding character may be repeated zero or one time | ra?t | rt and rat only
\s | Any whitespace character | \sa | [space]a, \ta, \na (\t and \n have the same meanings as in C#)
\S | Any character that isn't whitespace | \SF | aF, rF, cF, but not \tF
\b | Word boundary | ion\b | Any word ending in ion
\B | Any position that isn't a word boundary | \BX\B | Any X in the middle of a word
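The three repetition rows (*, + and ?) are easy to mix up, so here is a small sketch that checks the ra*t, ra+t and ra?t examples from the table against the inputs rt, rat and raat. The class QuantifierDemo and the helper FullMatch are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class QuantifierDemo
{
    // True when the whole input matches the pattern:
    // ^ and $ anchor the match to the start and end of the input.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        string[] patterns = { "ra*t", "ra+t", "ra?t" };
        string[] inputs = { "rt", "rat", "raat" };

        foreach (string pattern in patterns)
        {
            foreach (string input in inputs)
            {
                Console.WriteLine("{0} on {1}: {2}", pattern, input, FullMatch(input, pattern));
            }
        }
    }
}
```

The anchors are needed because Regex.IsMatch() otherwise succeeds as soon as any part of the input matches, which would hide the difference between the quantifiers on longer inputs.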

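The metacharacters above can be combined freely. For example, to find all words that begin with a and end with ion, with no whitespace in between: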

const string pattern = @"\ba\S*ion\b";
MatchCollection myMatches = Regex.Matches(myText, pattern, RegexOptions.IgnoreCase |
    RegexOptions.ExplicitCapture);

The search finds a single match, the word application. For myText exactly as written above, it starts at index 291.
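The snippets in this article can be combined into one console program that runs all four patterns against myText and prints the index and text of every match, which makes it easy to check the reported positions on your own copy of the string. This is a sketch rather than the original author's code: the class MetacharacterSearch and the method Find are hypothetical names, and RegexOptions.ExplicitCapture is left out because none of the patterns uses groups.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, for illustration only.
public static class MetacharacterSearch
{
    // Same text as myText in the article, with no space after the commas.
    public const string MyText = @"This comprehensive compendium provides a broad and thorough investigation of all aspects of programming with ASP.NET. Entirely revised and updated for the fourth release of .NET,this book will give you the information you need to master ASP.NET and build a dynamic,successful,enterprise Web application.";

    // Returns every case-insensitive match of pattern in MyText.
    public static MatchCollection Find(string pattern)
    {
        return Regex.Matches(MyText, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string[] patterns = { @"this", @"\bt", @"ion\b", @"\ba\S*ion\b" };

        foreach (string pattern in patterns)
        {
            Console.WriteLine("Pattern: " + pattern);
            foreach (Match m in Find(pattern))
            {
                // Index is the zero-based start of the match, Value the matched text.
                Console.WriteLine("  index {0}: {1}", m.Index, m.Value);
            }
        }
    }
}
```

Printing m.Value next to m.Index shows what each pattern actually matched: \bt matches only the single letter t at the start of a word, while \ba\S*ion\b matches the whole word.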
