Problem parsing HTML with lxml in Python

2015-06-15 16:43:25 +08:00

wico77

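import requests
from lxml import html

s = requests.get(url)  # url is defined elsewhere (not shown in the post)
d = html.fromstring(s.text)
# Select every child element of the post body
q = d.xpath('//td[@class="postcell"]/div/div[@class="post-text"]/*')
for i in q:
    print(i)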
Output:
<Element p at 0x10ae26f70>
<Element p at 0x10ae26fc8>
<Element pre at 0x10ae32050>
<Element p at 0x10ae320a8>
<Element p at 0x10ae32100>
<Element p at 0x10ae32158>
<Element p at 0x10ae321b0>
<Element p at 0x10ae32208>
How can I get the actual paragraph content to display instead of the element objects?


4 replies

JasperYanky

2015-06-15 16:50:08 +08:00

/text()

wico77

2015-06-15 16:56:12 +08:00

@JasperYanky Using text() only gives me the text inside the elements. I want each block as a whole, including the <p> and </p> tags.
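
To illustrate the limitation described here, a minimal sketch follows. The SAMPLE markup is a hypothetical stand-in for the real page, which the thread does not show. It suggests that text() returns only the text nodes sitting directly under each matched element, so the tags and any text nested in child elements are lost.

```python
from lxml import html

# Hypothetical stand-in for the page in the question (assumption, not from the thread).
SAMPLE = '<div class="post-text"><p>Hello <b>world</b></p><pre>x = 1</pre></div>'

doc = html.fromstring(SAMPLE)

# text() selects only text nodes that are direct children of each matched element,
# so "world" (which lives inside <b>) is not included.
direct_text = doc.xpath('//div[@class="post-text"]/*/text()')
print(direct_text)

# text_content() gathers all nested text, but still drops the markup.
first_block = doc.xpath('//div[@class="post-text"]/*')[0]
flattened = first_block.text_content()
print(flattened)
```

Neither form keeps the surrounding tags, which is why a serialization step is needed to get each block back as markup.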

blueset

2015-06-15 17:14:55 +08:00

etree.tostring(i)

wico77

2015-06-15 17:17:18 +08:00

@blueset That solved it. Thanks.


https://www.v2ex.com/t/198725
