
Middleware in the Scrapy framework

    Ewig · 2019-01-16 09:41:54 +08:00 · 1178 views
    from scrapy.http.headers import Headers
    from Espider.tools.get_cookies import get_cookies
    import pymongo, random
    from Espider.tools.user_agents import user_agents
    from fake_useragent import UserAgent


    class zhipincookiemiddleware():

        def __init__(self, mongodbHost, mongodbPort, mongodbName):
            self.mongodbHost = mongodbHost
            self.mongodbPort = mongodbPort
            self.mongodbName = mongodbName

        @classmethod
        def from_crawler(cls, crawler):
            return cls(mongodbHost=crawler.settings.get('MONGODB_HOST'),
                       mongodbPort=crawler.settings.get('MONGODB_PORT'),
                       mongodbName=crawler.settings.get('MONGODB_DBNAME'))

        def process_request(self, request, spider):
            ua = UserAgent()

            self.client = pymongo.MongoClient(self.mongodbHost, self.mongodbPort)
            self.mongodb = self.client[self.mongodbName]

            self.collection = self.mongodb[spider.name + '_cookie']
            self.cookies_str = self.collection.find_one()['cookie']
            self.headers = {
                "User-Agent": ua.random,
                "cookie": random.choice(self.cookies_str)}

            request.headers = Headers(self.headers)

    I wrote a cookie middleware for the framework.
    Here I pick a random cookie. Will the cookie be re-randomized on every request?
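
A minimal sketch of the behaviour in question, assuming the downloader calls `process_request` once for every request that passes through the middleware (this is how Scrapy documents downloader middlewares). To keep it runnable without Scrapy or MongoDB, `FakeRequest`, `RandomCookieMiddleware`, and `simulate` are hypothetical stand-ins, and the cookie pool is a plain list rather than a MongoDB document.

```python
import random


class FakeRequest:
    """Hypothetical stand-in for scrapy.Request; it only carries headers."""

    def __init__(self, url):
        self.url = url
        self.headers = {}


class RandomCookieMiddleware:
    """Sketch of the middleware above, with the MongoDB lookup replaced by a list."""

    def __init__(self, cookies):
        self.cookies = list(cookies)  # assumed: a list of complete cookie strings
        self.calls = 0                # counts how often process_request runs

    def process_request(self, request, spider=None):
        self.calls += 1
        # random.choice is evaluated on every call, so each request gets a fresh pick.
        request.headers["Cookie"] = random.choice(self.cookies)
        return None  # None tells the framework to keep processing the request


def simulate(middleware, urls):
    """Mimic the downloader: pass each request through process_request once."""
    requests = [FakeRequest(url) for url in urls]
    for request in requests:
        middleware.process_request(request)
    return requests


if __name__ == "__main__":
    mw = RandomCookieMiddleware(["sid=a", "sid=b", "sid=c"])
    for req in simulate(mw, ["https://example.com/%d" % i for i in range(5)]):
        print(req.url, req.headers["Cookie"])
```

Under that assumption the pick is repeated for each request, although two consecutive requests can still draw the same cookie by chance. One thing worth checking in the original code: if the stored `cookie` field is a single string rather than a list of strings, `random.choice` returns one character of it, not a whole cookie.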
    1 reply    2019-01-17 09:21:54 +08:00
    #1 · holajamc · 2019-01-17 09:21:54 +08:00
    Is your company still hiring?