Hit a bug I can't solve while scraping CNKI (知网), asking the experts for help, urgent

2018-06-07 00:47:14 +08:00
 wei6666

Traceback (most recent call last):
  File "China_hownet_journal_end.py", line 296, in <module>
    china_hownet.run()
  File "China_hownet_journal_end.py", line 281, in run
    url_list = self.parse_content_html(html3str)
  File "China_hownet_journal_end.py", line 212, in parse_content_html
    html = etree.HTML(html3str)
  File "lxml.etree.pyx", line 2945, in lxml.etree.HTML (src/lxml/lxml.etree.c:62546)
  File "parser.pxi", line 1617, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:93194)
  File "parser.pxi", line 1488, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:91938)
  File "parser.pxi", line 969, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:88328)
  File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84385)
  File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85488)
  File "parser.pxi", line 625, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:84945)
lxml.etree.XMLSyntaxError: line 1046: htmlParseEntityRef: expecting ';'

1070 views
Node: Q&A
1 reply
wei6666
2018-06-07 00:48:34 +08:00
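I assumed my XPath was wrong, so I rewrote the XPath expressions many times, but the error still shows up. I don't know how to fix it anymore. Asking the experts for help.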
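
For what it's worth, the traceback stops inside etree.HTML(html3str), so the failure happens while the page is being parsed, before any XPath is evaluated. The message "htmlParseEntityRef: expecting ';'" is what libxml2 reports when it meets an "&" that looks like the start of an entity reference but has no closing ";", which is common in unescaped query strings such as ?a=1&b=2. One possible workaround, offered only as a sketch, is to escape those bare ampersands before handing the string to the parser. The helper name escape_bare_ampersands and the sample string are made up for illustration and are not from the original script; whether this is really what sits at line 1046 of the fetched page is an assumption.

```python
import re

# Matches "&" that is NOT the start of a well-formed entity reference
# (named like &amp;, decimal like &#38;, or hex like &#x26;).
_BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(html_text):
    """Hypothetical helper: rewrite stray '&' as '&amp;' and leave
    existing entity references untouched."""
    return _BARE_AMP.sub("&amp;", html_text)


# Illustrative input only; the real page content is not shown in the thread.
sample = '<a href="?a=1&b=2">x &amp; y &#38; &#x26;</a>'
cleaned = escape_bare_ampersands(sample)
print(cleaned)

# Assumed usage in the original script (not verified against it):
#     html = etree.HTML(escape_bare_ampersands(html3str))
```

Another option that may be worth trying is passing an explicit parser, etree.HTML(html3str, etree.HTMLParser(recover=True)), or upgrading lxml, since the traceback suggests a fairly old build. Printing the lines around line 1046 of html3str should confirm whether a stray "&" is really the cause.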

https://www.v2ex.com/t/461049