Running a Douban crawler script: it keeps printing "Downloading" and I don't know what that means

2016-07-02 18:54:49 +08:00
 grey5659

It's this one: http://blog.csdn.net/lanbing510/article/details/45887075. After I run $ python doubanSpider.py it just keeps downloading. What does that mean?

/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "html.parser")

markup_type=markup_type))
Downloading Information From Page 1
Downloading Information From Page 2
Downloading Information From Page 3
Downloading Information From Page 4
Downloading Information From Page 5
Downloading Information From Page 6
WARNING:root:Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
Downloading Information From Page 7
Downloading Information From Page 8
Downloading Information From Page 9
Downloading Information From Page 10
Downloading Information From Page 11
Downloading Information From Page 12
Downloading Information From Page 13
Downloading Information From Page 14
Downloading Information From Page 15
Downloading Information From Page 16
Downloading Information From Page 17
Downloading Information From Page 18
Downloading Information From Page 19
Downloading Information From Page 20
Downloading Information From Page 21
Downloading Information From Page 22
Downloading Information From Page 23
Downloading Information From Page 24

1712 views
Node: Q&A
1 reply
woniu127
2016-07-02 19:28:09 +08:00
BeautifulSoup([your markup], "lxml")
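
The reply applies the fix that the warning itself describes: pass the parser name as the second argument so BeautifulSoup does not have to guess. Below is a minimal sketch, assuming the bs4 package is installed. The `html` string and the `h2.title` lookup are hypothetical stand-ins and are not taken from doubanSpider.py. The sketch uses "html.parser" because it ships with Python; "lxml", as suggested above, is passed the same way but requires the lxml package to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a page the spider has downloaded.
html = "<html><body><h2 class='title'>Example Book</h2></body></html>"

# Naming the parser explicitly avoids the
# "No parser was explicitly specified" UserWarning.
# Swap in "lxml" here if the lxml package is installed.
soup = BeautifulSoup(html, "html.parser")

# Pull the text out of the (hypothetical) title element.
title = soup.find("h2", class_="title").get_text()
print(title)
```

The "Downloading Information From Page N" lines in the question appear to be the script's own progress messages, one per page, so they most likely mean the crawl is proceeding normally rather than being stuck.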
