With BS4, how do I search for text content and then get the tag that contains it?

2016-10-02 22:29:48 +08:00

omg21

For example, given markup like this:

<p id="a1">新闻</p>
<p id="a2">娱乐</p>

I need to search for "娱乐" and, if it is found, get the tag `<p id="a2">娱乐</p>`. I tried soup.find_all(), but as far as I can tell it only searches tags, not text content.

4122 views

Node

Python

27 replies

xucuncicero

2016-10-03 14:49:27 +08:00

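```python3
from bs4 import BeautifulSoup

html = '<p id="a1">新闻</p><p id="a2">娱乐</p>'
soup = BeautifulSoup(html, 'html.parser')
# With only text= given, find_all returns the matching text nodes, not tags
theTag = soup.find_all(text='娱乐')
for tag in theTag:
    print(tag.find_parent("p"))
```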

xucuncicero

2016-10-03 14:50:35 +08:00

The code in reply #2 works fine. You said it raises an error; what exactly is the error?

Kisesy

2016-10-03 16:52:36 +08:00

```python3
from bs4 import BeautifulSoup

b = BeautifulSoup("""<p id="a1">新闻</p><p id="a2">娱乐</p><div id="a3">娱乐</div>""", 'html.parser')
print(b.find_all(True, text='娱乐'))
```

Uh, if the tag name isn't fixed, can't you just pass True?
Output: [<p id="a2">娱乐</p>, <div id="a3">娱乐</div>]
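
As a minimal sketch of the same idea, the snippet below wraps the lookup in a small helper. The helper name `tags_with_text` and the constant `HTML` are made up for illustration, and it assumes Beautiful Soup 4.4.0 or later, where the `text` argument is also available under the name `string`. Note that this only matches tags whose entire content is exactly the given string.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the reply above
HTML = '<p id="a1">新闻</p><p id="a2">娱乐</p><div id="a3">娱乐</div>'


def tags_with_text(html, text):
    """Return every tag whose only content is exactly `text` (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # True matches any tag name; string= filters on each tag's .string
    return soup.find_all(True, string=text)


matches = tags_with_text(HTML, "娱乐")
ids = [tag["id"] for tag in matches]
print(ids)
```

Passing a tag name such as `"p"` instead of `True` would restrict the result to `<p>` tags only.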

sherlocktheplant

2016-10-03 18:37:49 +08:00

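@omg21 Yes, you can. The string you get back is not an ordinary string but a NavigableString, so you can reach the tag directly through its parent attribute. Run that gist of mine and you'll see.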
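
The gist itself is not reproduced on this page, so the following is only a rough sketch of the point about `parent`, not the original gist. It assumes Beautiful Soup 4.4.0 or later (for the `string=` argument); the variable names are illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

soup = BeautifulSoup('<p id="a1">新闻</p><p id="a2">娱乐</p>', "html.parser")

# Searching by string alone yields the text node itself, not a tag
node = soup.find(string="娱乐")
is_nav = isinstance(node, NavigableString)
is_str = isinstance(node, str)  # NavigableString is a subclass of str

# .parent is the tag that directly contains the text node
parent = node.parent
print(parent.name, parent["id"])
```

Because the text node remembers where it sits in the tree, there is no need to search a second time to recover the enclosing tag.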

omg21

2016-10-03 18:47:16 +08:00

@practicer Solved. I used your method, thanks.

omg21

2016-10-03 18:47:54 +08:00

@xucuncicero Yes, it was my own mistake.

practicer

2016-10-03 20:05:27 +08:00

@omg21 You're welcome.


https://www.v2ex.com/t/310321
