A regular expression pitfall: alternation (|)

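I had always misunderstood alternation ("|") in regular expressions. Given StatementA|StatementB, I assumed that once StatementA matched successfully, StatementB would never be tried, and the result for StatementA would be returned directly.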

From the Python re documentation, on A|B: "When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy."
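A minimal sketch of what the quoted rule means when the starting position is fixed. The patterns and the variable names `first` and `second` are made up for illustration; re.match is used because it only attempts a match at the beginning of the string.

```python
import re

# Alternatives are tried left to right at the same starting position.
# 'a' succeeds first, so the longer alternative 'ab' is never tried.
first = re.match(r"a|ab", "ab")
print(first.group())

# Swapping the order of the alternatives changes which branch is accepted.
second = re.match(r"ab|a", "ab")
print(second.group())
```

Here the order of the branches decides the result, because both branches are attempted at the same position.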

However, today I found that my reading above is wrong...

Example:

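import re

pattern1 = 'regular|gular'
pattern2 = 'gular|regular'
text = 'regular expression'

re.search(pattern1, text).group()  # returns 'regular'
re.search(pattern2, text).group()  # still returns 'regular'

In the second call, re.search(pattern2, text), my mistaken reading predicts that 'gular' is tried first, so the result should be 'gular'. What actually happens is this:

re.search tries each starting position in the string from left to right, and at each position it tries the alternatives in order. At position 0, 'gular' fails because the text starts with 're'. 'regular' is then tried at the same position and succeeds, so 'regular' is returned. The engine never reaches position 2, where 'gular' would have matched, because the leading 're' had not been consumed yet when 'regular' matched.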
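The behaviour described above can be checked with a short sketch. It assumes the corrected patterns 'regular|gular' and 'gular|regular'; the names `results`, `compiled`, and `late` are illustrative only. The optional `pos` argument of a compiled pattern's search method is used to start scanning after the leading 're'.

```python
import re

text = "regular expression"

# Whatever the order of the alternatives, the match that starts
# furthest to the left in the string is the one that is returned.
results = {}
for pattern in ("regular|gular", "gular|regular"):
    m = re.search(pattern, text)
    results[pattern] = (m.group(), m.span())
    print(pattern, m.group(), m.span())

# Only when scanning starts at index 2, past the leading 're',
# does the 'gular' branch get a chance to match first.
compiled = re.compile("gular|regular")
late = compiled.search(text, 2)
print(late.group(), late.span())
```

Both patterns report a match spanning indices 0 to 7, which shows that the starting position takes priority over the order of the branches.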

2 replies • 2019-08-27 17:16:24 +08:00

Rorysky

2019-08-06 19:42:35 +08:00

"completely matches" means matching to the end of the string...

Rorysky

2019-08-27 17:16:24 +08:00

@livid, have I been down-ranked? I can't even find the post I just made...

Please give me a chance to turn over a new leaf!
