perl diamond
我的perl直觉
diamond “<>” is a very important operator in perl, which can input the file,such as “<STDIN>”,"<STDOUT>","<…>" (源自intermediate perl)
“$",蛇杖,象征着权管,在perl所有操作最后都可以归结到reference, 包括subroutine reference, file reference, object reference, hash reference, array reference(源自intermediate perl)
“%",禅杖,一种hash散列字典,一种数据结构(源自唐玄奘)
“[]”, 立方体,cube, “@“array对象的具体内容(源自黑客帝国)
“{}”, 万花筒,,”%“禅杖对象的具体内容
perl万事万物模式的匹配均从”/ /“双斜杠开始(第一,首先),python一般简化为字符串或者regex字符串(源自pattern胡晖)
所有语言的正则
正则在任何语言均有实现,他是一个内语言,具有自己的一套语法,perl中从双斜杠开始(源自Mr Wang)。
在双斜杠”//“内部包含五类对象:
-
字符内容,比如 “\d"等效于[0-9] ,"\w” 等效于 [a-zA-Z0-9], “\s”, “\D"等效于[^0-9],"\W"等效于[^a-zA-Z0-9],"§”
-
元字符,具备特殊意义, “\”,”.”,”*”,"?","{3,}","{3,12}"((?#注意:1和2可以组合))
为了获得真正的元字符需要转义,"\\","\.","\*","\?","\{","\(","\)"
-
位置符,"^","$","(?<=…)", “(?<!…)”, “\b”,"(?=…)", “(?!…)”
-
分组和反引用, “(?<=<(\w+)>).*(?=<\/\1>)”,
开始于左零宽断言(或者叫"^"),结束语右零宽断言(或者叫"\("),经典应用囊括着1、2、3、4所有内容,隐含这所有正则表达式,最终归结于0!或者perl的"\)"(具备)
-
分支,"?(group)yes|no" 如果存在分组group,则继续执行yes,否则no “\w+|\d+” 分支
-
“(?<groupname>\w+)\b\s+\k<groupname>\b” 其中groupname指定情况下,需要使用"\k" 把捕获的内容命名为groupname中,并压入堆栈(以为着还可以弹出来),(?<-groupname>)弹出groupname捕获内容,如果groupname不存在则匹配失败(出错)
-
“(?‘groupName’\w+)\b\s+\k’groupname’\b” 其中groupname指定情况下,需要使用"\k"
< #最外层的左括号
[^<>]* #它后面非括号的内容
(
(
(?'Open'<) #左括号,压入"Open"
[^<>]* #左括号后面的内容
)+
(
(?'-Open'>) #右括号,弹出一个"Open"
[^<>]* #右括号后面的内容
)+
)*
(?(Open)(?!)) #最外层的右括号前检查
#若还有未弹出的"Open"
#则匹配失败
> #最外层的右括号
#平衡组的一个最常见的应用就是匹配HTML,下面这个例子可以匹配嵌套的<div>标签:<div[^>]*>[^<>]*(((?'Open'<div[^>]*>)[^<>]*)+((?'-Open'</div>)[^<>]*)+)*(?(Open)(?!))</div>.
例子
#!/usr/bin/env perl
#===============================================================================
#
# FILE: testRegex.pl
#
# USAGE: ./testRegex.pl
#
# DESCRIPTION:
#
# OPTIONS: ---
# REQUIREMENTS: ---
# BUGS: ---
# NOTES: ---
# AUTHOR: Ye Zhaoliang (Ye Zhaoliang), zl_ye@qny.chng.com.cn
# ORGANIZATION:
# VERSION: 1.0
# CREATED: 2022-02-19 0:36:50
# REVISION: ---
#===============================================================================
use strict;
use warnings;
use utf8;
no warnings qw(experimental::vlb);
## PerlSupport is very good vim pluasn
## \ra 可以设置命令行参数
## \rs 执行语法检查
## \ry 格式化程序
## \rr执行程序
## \rt save buffer to file with timestamp
## \rk显示配置信息
#
#my $arg=shift @_;
print "command line argument is $arg \n";
my $he = "abc df jje";
print "Hello regex\n" if $he =~ m/\b\d{2}\b/xm;
print "Hello regex \\w works\n" if $he =~ m/\b\w{2}\b/xm;
my $html = "<head>nothing</head>";
#print "label is " if $html=~m/(?<=<(\w{0,20})>).{0,20}(?=<\/\1)/;
print "label is $1 content is $2\n" if $html =~ m/
(?<=<(\w{0,20})>)
(.{0,20})
(?=<\/\1>)
/x;
print "labe2 is $1 content is $2\n" if $html =~ m/
(?<=<(?<groupname>\w{0,20})>)
(.{0,20})
(?=<\/\k<groupname>>)
/x;
my $line1 =
"Score = 161 bits (409), Expect = 1e-43, Method: Compositional matrix adjust.";
my $line2 =
"Identities = 141/471 (29%), Positives = 227/471 (48%), Gaps =41/471 (8%)";
if ( $line1 =~
m/Score\s*=\s*([\d\.\,]+)\s*(?:bits|Bits)\s*\(\d+\)\,\s*Expect\s*=\s*(\d+e-\d+)\.*/xm
)
{
print "$1 \t $2\n";
}
if ( $line2 =~
m/Identities\s*=\s*([\d\,\.]+)(?#141)\/([\d\,\.]+)\s*\((\S+)\%\).*/xm )
{
print "$1 \t $2 \t $3\n";
}