System: CentOS 7.4
Install scrapyd: pip install scrapyd
Python 2 and Python 3 coexist on my Tencent Cloud server, so the command I actually ran was: pip3 install scrapyd
After installation, create a configuration file:
sudo mkdir /etc/scrapyd
sudo vim /etc/scrapyd/scrapyd.conf
Write the following content (it can be found at https://scrapyd.readthedocs.io/en/stable/config.html):
[scrapyd]
eggs_dir = eggs
logs_dir = logs
items_dir =
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 10
finished_to_keep = 100
poll_interval = 5.0
bind_address = 0.0.0.0
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher
webroot = scrapyd.website.Root

[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus
The main change is bind_address = 0.0.0.0, which makes scrapyd listen on all network interfaces instead of only the local one.
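To double-check that the file really carries the intended listening address, the config can be read back with Python's standard configparser. This is only a sketch: the helper name listens_on_all_interfaces is hypothetical, and the sample text is a shortened stand-in for the real file, not the full config.

```python
import configparser

# Shortened sample of /etc/scrapyd/scrapyd.conf (assumption: only the keys
# needed for this check are reproduced here).
SAMPLE_CONF = """\
[scrapyd]
bind_address = 0.0.0.0
http_port = 6800
"""


def listens_on_all_interfaces(conf_text):
    """Return (bind_address, port, True if bound to 0.0.0.0)."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fallbacks mirror scrapyd's documented defaults.
    address = parser.get("scrapyd", "bind_address", fallback="127.0.0.1")
    port = parser.getint("scrapyd", "http_port", fallback=6800)
    return address, port, address == "0.0.0.0"


print(listens_on_all_interfaces(SAMPLE_CONF))
```

To run the same check against the real file, replace read_string(conf_text) with parser.read("/etc/scrapyd/scrapyd.conf").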
After creating the file, start scrapyd with: (scrapyd > /dev/null &) To keep the output as a log instead: (scrapyd > /root/scrapyd.log &)
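Once the daemon is running, one way to confirm it is reachable is to request daemonstatus.json, which is one of the services listed in the config. A minimal sketch using only the standard library; the function names are hypothetical, and the host and port are assumed to match the config (port 6800).

```python
import json
import urllib.request


def daemon_status_url(host="127.0.0.1", port=6800):
    """Build the URL of scrapyd's daemonstatus.json endpoint."""
    return "http://{}:{}/daemonstatus.json".format(host, port)


def fetch_daemon_status(host="127.0.0.1", port=6800, timeout=5):
    """Return the decoded JSON reply; raises URLError if scrapyd is down."""
    url = daemon_status_url(host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running scrapyd, so it is left commented out):
# print(fetch_daemon_status())
```

When calling from another machine, pass the server's public address as host; that only works because bind_address is 0.0.0.0 and assumes the port is open in the firewall.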
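Pitfall 1: after running the command I got an error saying the command could not be found.
That is because Python 2 and Python 3 coexist on my system, so the executable is not on the PATH. The fix is to create a symlink:
My Python 3 path: /usr/local/python3
Create the symlink: ln -s /usr/local/python3/bin/scrapyd /usr/bin/scrapyd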
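Before creating the symlink, it can help to confirm where pip actually placed the executable. The sketch below looks on PATH first and then in extra directories; find_executable is a hypothetical helper, and /usr/local/python3/bin is simply the install path mentioned above.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Return the first path to `name` on PATH or in extra_dirs, else None."""
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        # Only accept regular files that are marked executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_executable("scrapyd", extra_dirs=["/usr/local/python3/bin"]))
```

If this prints a path under /usr/local/python3/bin, that path is the one to use as the symlink source.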
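After creating the symlink, I ran the command above again and hit another error:
Pitfall 2:
This seems to come from the last line of the config file, though I am not sure of the exact cause. I deleted the last line, ran the command again, and scrapyd started.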
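The post does not identify what was wrong with that last line. If the failure is a parse error, one way to narrow it down is to feed the file to Python's configparser and print the message, which names the offending line. A sketch under that assumption; check_conf is a hypothetical helper and the broken sample is invented for illustration.

```python
import configparser


def check_conf(conf_text):
    """Return None if the text parses as INI, else the parser's error message."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(conf_text)
    except configparser.Error as exc:
        return str(exc)
    return None


# Invented example: the final line has no "key = value" form.
BROKEN = "[services]\nlistjobs.json = scrapyd.webservice.ListJobs\nstray text\n"
print(check_conf(BROKEN))
```

A trailing line pasted in by accident, such as leftover text from the web page the config was copied from, would show up this way.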