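As everyone knows, once you have a lot of doujinshi, organizing them becomes a hassle =.=
I want to sort them by artist, by circle, by event, and so on, so I wrote a regex that tries to extract all of that information from a book's name.
But the titles come in all sorts of formats, for example: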
(event) (tag) [group (artist)] title (form) [addition1] [addition2]
(event) [group (artist)] title (form) [addition1]
[event] [group (artist)] title (form) (addition1)
(tag) [group (artist)] title
[group (artist)] title
title
Here is my attempt:
import re
regex_patern = ur'([\(\[](?P<event>[^\)\]]*)[\)\]])?\s*([\(\[](?P<type>[^\)\](\)\])]*)[\)\]])?\s*(\[(?P<group>[^\(\]]*)(\((?P<artist>[^\)]*)\))?\])?(?P<title>[^\(\)\[\]]*)([\(\[](?P<from>[^\)\]]*)[\)\]])?(\s*[\(\[](?P<more1>[^\)\]]*)[\)\]])'
p = re.compile(regex_patern)
rows = [
    '(event) (tag) [group (artist)] title (form) [addition1] [addition2]',
    '(event) [group (artist)] title (form) [addition1]',
    '[event] [group (artist)] title (form) (addition1)',
    '(tag) [group (artist)] title',
    '[group (artist)] title',
    'title',
]
for r in rows:
    r = re.search(p, r)
    print r.groupdict()
# Output:
{u'from': 'form', u'more1': 'addition1', u'artist': 'artist', u'title': ' title ', u'group': 'group ', u'type': 'tag', u'event': 'event'}
{u'from': 'form', u'more1': 'addition1', u'artist': 'artist', u'title': ' title ', u'group': 'group ', u'type': None, u'event': 'event'}
{u'from': 'form', u'more1': 'addition1', u'artist': 'artist', u'title': ' title ', u'group': 'group ', u'type': None, u'event': 'event'}
{u'from': None, u'more1': 'group (artist', u'artist': None, u'title': '', u'group': None, u'type': None, u'event': 'tag'}
{u'from': None, u'more1': 'group (artist', u'artist': None, u'title': '', u'group': None, u'type': None, u'event': None}
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-831c548bc3f0> in <module>()
     15 for r in rows:
     16     r = re.search(p, r)
---> 17     print r.groupdict()
AttributeError: 'NoneType' object has no attribute 'groupdict'
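From the fourth row on, the results are wrong. I suspect the regex should match the simple rule in the middle first and only then extend to the most complex form,
but I don't know how to write that... Any advice would be appreciated.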
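One observation about the pattern above: the last group (`more1`) has no trailing `?`, so it is required. That would explain both symptoms: a bare `title` cannot match at all (`re.search` returns `None`), and in rows 4 and 5 the engine backtracks until `more1` can swallow `group (artist`. Below is a minimal sketch of one possible alternative, written for Python 3: split the name into leading blocks, title, and trailing blocks with a small anchored regex, then assign fields by position. The names `parse_title`, `strip_block`, `BLOCK`, `SPLIT_RE`, `GROUP_RE` and the `additions` key are made up for illustration. It also assumes that a lone leading block is the event, since `(tag) [group (artist)] title` and `(event) [group (artist)] title` cannot be told apart by shape alone.

```python
import re

# One bracketed block: "(...)" with no nested parens, or "[...]" (which may hold "(...)").
BLOCK = r'(?:\([^()]*\)|\[[^\[\]]*\])'

# Leading blocks, bare title text, trailing blocks; every part may be empty.
SPLIT_RE = re.compile(
    r'^\s*(?P<head>(?:' + BLOCK + r'\s*)*)'
    r'(?P<title>[^()\[\]]*?)\s*'
    r'(?P<tail>(?:' + BLOCK + r'\s*)*)$'
)

# Contents of the group block: "group (artist)" or just "group".
GROUP_RE = re.compile(r'^(?P<group>[^()]*?)\s*(?:\((?P<artist>[^()]*)\))?\s*$')


def strip_block(block):
    """Drop the surrounding brackets and any padding."""
    return block[1:-1].strip()


def parse_title(name):
    m = SPLIT_RE.match(name)
    if m is None:  # e.g. unbalanced brackets
        return None
    head = re.findall(BLOCK, m.group('head'))
    tail = [strip_block(b) for b in re.findall(BLOCK, m.group('tail'))]
    info = {'event': None, 'tag': None, 'group': None, 'artist': None,
            'title': m.group('title').strip() or None,
            'form': tail[0] if tail else None, 'additions': tail[1:]}
    # Assumption: a "[...]" block right before the title is the group block.
    if head and head[-1].startswith('['):
        inner = strip_block(head.pop())
        g = GROUP_RE.match(inner)
        if g:
            info['group'] = g.group('group').strip() or None
            info['artist'] = (g.group('artist') or '').strip() or None
        else:
            info['group'] = inner
    # Assumption: remaining leading blocks are event, then tag; extras are ignored.
    labels = [strip_block(b) for b in head]
    info['event'], info['tag'] = (labels + [None, None])[:2]
    return info


if __name__ == '__main__':
    for row in ['(event) [group (artist)] title (form) [addition1]', 'title']:
        print(parse_title(row))
```

Doing the split first keeps each regex small, and every part is optional, so a bare `title` still yields a result instead of `None`. Telling a tag from an event would need extra knowledge, for example a list of known event names, which this sketch does not attempt.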