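Yesterday I added a torrent lookup feature to my WeChat official account backend: it takes the keyword a user replies with, searches a BT search site for torrent addresses, and replies with the results. I used XPath to extract the data, but it feels too slow and tends to time out, which makes WeChat show the "This official account is temporarily unable to provide service" message. I am considering two solutions: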
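1. Optimize the crawler so results come back sooner;
2. Optimize the WeChat reply logic. (The solution given in the WeChat API documentation is to reply with an empty string or "success" by default and then send the content through the customer service message interface, but my official account cannot enable that interface, so this approach is not available to me.)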
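For accounts that do have the customer service message interface, the second option can be sketched roughly as follows. This is only a minimal illustration of the "acknowledge first, send later" pattern, not WeChat's API: `handle_message`, `search`, and `send` are hypothetical names, and `send` stands in for whatever call actually delivers the customer service message.

```python
import threading


def handle_message(user_id, keyword, search, send):
    """Acknowledge immediately and run the slow search on a worker thread.

    `search(keyword)` and `send(user_id, text)` are hypothetical callables
    supplied by the caller; `send` stands in for the customer service
    message call, which is not implemented here.
    """
    def work():
        try:
            text = search(keyword)
        except Exception:
            # Tell the user something instead of failing silently.
            text = "Search failed, please try again later."
        send(user_id, text)

    worker = threading.Thread(target=work, daemon=True)
    worker.start()
    # "success" is the acknowledgement body mentioned in the WeChat docs.
    return "success", worker
```

The handler returns before the search finishes, so the passive reply no longer depends on how long the crawl takes. The worker is returned only so that a caller (or a test) can wait for it.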
For now I can only leave things as they are. Here is the code:
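PS: The code takes the keyword the user entered and fetches the magnet links from the search results; i counts how many of the top records are fetched. Search results are sorted by popularity by default, so the magnet links near the top are more likely to be usable.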
#-*- coding:utf-8 -*-
import xml.etree.ElementTree as ET
import json
import requests
from lxml import etree

# Fetch the search results for a keyword and build the reply text
def getUrl(key):
    url = 'https://www.torrentso.com/s/' + key + '/'
    html_content = requests.get(url).content
    html = etree.HTML(html_content)
    titles = html.xpath('//div[@class="bt_list"]/a/@title')
    urls = html.xpath('//div[@class="bt_list"]/a/@href')
    replytext = ''
    # Take at most the first 6 results; the page may return fewer
    for i in range(min(6, len(titles), len(urls))):
        downurl = getDown(urls[i])
        # Reply with the magnet link rather than the detail-page URL
        replytext = replytext + '\n' + titles[i] + '\n' + downurl + '\n'
    return replytext

# Fetch the magnet link from a result's detail page
def getDown(url):
    html_content = requests.get(url).content
    html = etree.HTML(html_content)
    downurl = html.xpath('//div[@id="hash_newdown"]/textarea')
    return downurl[0].text
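One possible direction for the first option: getUrl makes one request for the search page and then one request per result, all in sequence. Fetching the detail pages in parallel should bring the total closer to two round trips. The sketch below assumes Python 3 and uses the standard library's thread pool; `fetch_all` is a hypothetical helper, not part of the original code, and `fetch` would be something like getDown.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=6):
    """Call fetch(url) for every URL in parallel, preserving input order.

    A fetch that raises yields None for that URL instead of aborting
    the whole batch.
    """
    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception:
            return None

    if not urls:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(safe_fetch, urls))
```

With this helper the loop body could become something like `magnets = fetch_all(urls[:6], getDown)`, skipping entries that come back as None. Passing a `timeout` argument to `requests.get` would also keep a single slow page from stalling the whole reply.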