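Xdaili (讯代理) dynamic forwarding

```python
# Xdaili dynamic forwarding source
import requests
import hashlib
import time


class IP(object):
    def __init__(self, orderno, secret):
        self.orderno = orderno
        self.secret = secret

    def Headers(self):
        timestamp = str(int(time.time()))  # current Unix timestamp
        # Join the order number, secret, and timestamp into one string
        planText = 'orderno=' + self.orderno + ',' + 'secret=' + self.secret + ',' + 'timestamp=' + timestamp
        string = planText.encode()  # MD5 works on bytes, so encode first
        md5_string = hashlib.md5(string).hexdigest()  # MD5 of the joined string
        sign = md5_string.upper()  # convert to upper case
        # Join the signature, order number, and timestamp
        auth = 'sign=' + sign + '&' + 'orderno=' + self.orderno + '&' + 'timestamp=' + timestamp
        headers = auth  # value of the authentication header
        return headers


if __name__ == '__main__':
    # Note: adjust the proxy scheme (http/https) to match the target site
    ip = IP('order number', 'secret')
    proxy = {'http': 'http://forward.xdaili.cn:80'}
    headers = {'Proxy-Authorization': ip.Headers()}
    print(headers)
    r = requests.get(url="http://httpbin.org/ip", headers=headers, proxies=proxy).json()
    print(r)
```

Random IP

Enable the downloader middleware

```python
# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'IpApp.middlewares.IpappDownloaderMiddleware': 543,
}
```

Spider source

```python
import scrapy
from scrapy import Request


class IpspiderSpider(scrapy.Spider):
    name = 'ipSpider'
    allowed_domains = ['httpbin.org']
    start_urls = ['http://httpbin.org/ip']

    def parse(self, response):
        print(response.text)
        # Request the same URL again; dont_filter=True bypasses the duplicate filter
        yield Request(url=self.start_urls[0], dont_filter=True)
```

Set the random IP

In middlewares.py, edit the DownloaderMiddleware() class:

```python
from IpApp.xundaili import IP


class IpappDownloaderMiddleware(object):
    def process_request(self, request, spider):
        ip = IP('order number', 'secret')
        headers = ip.Headers()
        proxy = 'http://forward.xdaili.cn:80'
        request.headers['Proxy-Authorization'] = headers
        request.meta['proxy'] = proxy
        return None
```

Test results

[Figure: test results]

A small error note

When using an authenticated proxy, you may find that the Proxy-Authorization header is missing from the request.

Cause: Scrapy removes Proxy-Authorization automatically. As the condition in the code below shows, this happens for requests that go through the tunneling agent.

Fix: go to this path in the Scrapy source: scrapy\core\downloader\handlers\http11.py and comment out:

```python
# if isinstance(agent, self._TunnelingAgent):
#     headers.removeHeader(b'Proxy-Authorization')
```

Alternatively, use the downloader middleware provided by the framework.

Related: 关于下载中间件的了解 (an introduction to downloader middlewares)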
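
As a rough way to check the signing steps without sending anything to the proxy, the sketch below pulls them into a standalone function that accepts a fixed timestamp. The name `build_proxy_auth` and its `timestamp` parameter are hypothetical, introduced here only for illustration; the string layout is copied from the `Headers()` method above, and the credentials are placeholders.

```python
import hashlib
import time


def build_proxy_auth(orderno, secret, timestamp=None):
    """Build the Proxy-Authorization value the same way IP.Headers() does.

    `timestamp` is a hypothetical parameter added so the result can be
    reproduced; when omitted, the current Unix time is used.
    """
    if timestamp is None:
        timestamp = str(int(time.time()))
    # Same layout as the original: orderno, secret, timestamp joined by commas.
    plain_text = 'orderno=' + orderno + ',secret=' + secret + ',timestamp=' + timestamp
    # MD5 works on bytes, so encode first, then upper-case the hex digest.
    sign = hashlib.md5(plain_text.encode()).hexdigest().upper()
    return 'sign=' + sign + '&orderno=' + orderno + '&timestamp=' + timestamp


if __name__ == '__main__':
    # Placeholder credentials; a fixed timestamp makes the output repeatable.
    print(build_proxy_auth('your-order-number', 'your-secret', '1600000000'))
```

Passing a fixed timestamp yields the same string on every call, which makes the signature easy to inspect. In the middleware the argument would presumably be left out so that each request gets a fresh signature.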