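I had been misreading regex alternation ("|") all along. I assumed that in StatementA|StatementB, once StatementA matches, StatementB is never tried and StatementA's result is returned directly.
And when I checked the Python docs, the re module describes it the same way: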
A|B, When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy.
Today, however, I ran into a case that seems to contradict this...
Example:
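import re
pattern1 = 'regular|gular'
pattern2 = 'gular|regular'
text = 'regular expression'
re.search(pattern1, text)  # Match object, .group() is 'regular'
re.search(pattern2, text)  # still a match on 'regular'
For the second call, re.search(pattern2, text), the mistaken reading says 'gular' is tried first, so the match should be 'gular'. What actually happens is:
'gular' on its own would match at index 2, but re.search starts scanning at index 0, where the leading 're' of 'regular expression' has not been consumed yet. At that position 'gular' fails and 'regular' succeeds, so the match is 'regular'.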
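A minimal sketch for checking this behaviour yourself. The 'reg|regular' pair is an extra illustration that is not part of the example above; it is chosen so that both alternatives can match at the same starting position, which is the only situation where the order of alternatives should matter.

```python
import re

text = 'regular expression'

# re.search tries every alternative at index 0 before moving on to index 1.
# 'gular' fails at index 0 and 'regular' succeeds there, so both orderings agree.
first = re.search('regular|gular', text)
second = re.search('gular|regular', text)
print(first.group(), first.span())    # regular (0, 7)
print(second.group(), second.span())  # regular (0, 7)

# When both alternatives can match at the same position, order decides.
short_first = re.search('reg|regular', text)
long_first = re.search('regular|reg', text)
print(short_first.group())  # reg
print(long_first.group())   # regular
```

Read this way, the second pair looks like the left-to-right rule from the docs, and the first pair suggests that the starting position of the scan takes priority over the order of the alternatives.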
Rorysky OP "completely matches" means matching to the end of the string...