Switching between multiple IPs in Scrapy

I recently ran into a problem while deploying Scrapy crawlers at work. The company provided 3 machines, each with 10 IP addresses, and to get around anti-crawling measures the crawler has to switch to a different IP when a request fails. The official Scrapy documentation on this is very brief, and it took a lot of debugging before it finally worked, so I am recording my notes here:

1. Create the middleware file HeaderMidWare.py

#!/usr/bin/python
# coding: utf-8
import random
import base64

from scrapy.utils.project import get_project_settings
from user_agent import generate_user_agent

settings = get_project_settings()


class IPProxyMiddleware(object):
    def process_request(self, request, spider):
        # Pick one of the local IPs listed under BindAddress in settings.py
        # and attach it to the request before it is downloaded.
        bindaddress = random.choice(settings.get('BindAddress'))
        request.meta['bindaddress'] = bindaddress
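The bindaddress meta key is one of the special Request.meta keys recognized by Scrapy's downloader: it is used as the local (source) address of the outgoing connection, so each request leaves the machine from whichever of its IPs was chosen.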

2. Enable the middleware in settings.py and set BindAddress
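As a rough sketch only (the module path depends on your project layout; "myproject" and the addresses below are placeholders), the settings.py entries look roughly like this:

# settings.py -- sketch: "myproject" and the IPs are placeholders
DOWNLOADER_MIDDLEWARES = {
    'myproject.HeaderMidWare.IPProxyMiddleware': 543,
}

# Local IPs available on this machine; the middleware picks one per request
BindAddress = [
    '192.168.1.101',
    '192.168.1.102',
    '192.168.1.103',
]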