Installing Scrapy on Windows

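Over the past few days I needed to build a web crawler. Python was the first thing that came to mind, and there seems to be plenty of Python crawling material around, so I decided to write the crawler in Python. I then found Scrapy, an open-source Python library that provides a crawling framework, and chose it without hesitation. The first step is to install Scrapy, which I decided to do on Windows.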

 

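Introduction to Scrapy

Scrapy is a fast, efficient web crawling framework for Python. It is mainly used to crawl websites, extract information and format the extracted data, and is commonly applied to data mining, monitoring and testing.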

Required software

Installation steps

1. Install Python
Download Python from the official site (http://www.python.org/ftp/python/2.7.5/python-2.7.5.msi) and double-click the msi file to install it. Then add the Python directories (D:\Python27;D:\Python27\Scripts;) to the PATH environment variable.
Verify the installation:
C:\Users\admin>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
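If the python command is not found, the usual cause is a missing PATH entry. The following is a minimal sketch for checking that, assuming the install directory D:\Python27 used in this step; the helper name missing_path_entries is made up for illustration and is not part of any library.

```python
from __future__ import print_function
import os
import sys


def missing_path_entries(required, path_value, sep=";"):
    """Return the entries of `required` that are absent from a PATH-style string."""
    # Windows paths are case-insensitive, so compare lower-cased entries
    # with any trailing backslash removed.
    present = set(
        p.strip().rstrip("\\").lower() for p in path_value.split(sep) if p.strip()
    )
    return [r for r in required if r.rstrip("\\").lower() not in present]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    # Assumed install location, taken from the step above
    required = [r"D:\Python27", r"D:\Python27\Scripts"]
    missing = missing_path_entries(required, os.environ.get("PATH", ""))
    print("Missing from PATH:", missing or "nothing")
```

Run it with the full path to the interpreter (for example D:\Python27\python.exe) if python itself is not yet on PATH.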
2. Install setuptools
Go to the setuptools page on PyPI (http://pypi.python.org/pypi/setuptools), download the ez_setup.py file and run it; it completes the installation automatically:
python ez_setup.py
3. Install Zope.Interface
Go to the Zope.Interface page on PyPI (http://pypi.python.org/pypi/zope.interface/), download the msi installer that matches your Python version and double-click it to install. Verify the installation:
C:\Users\admin>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
4. Install Twisted
Go to the Twisted downloads page (http://twistedmatrix.com/trac/wiki/Downloads), download the msi file for your Python version and double-click it to install.
5. Install w3lib
Go to the w3lib page on PyPI (http://pypi.python.org/pypi/w3lib), download the w3lib-1.9.0.tar.gz file and extract it.
# Enter the extracted directory and run the install command
D:\python-plugin\w3lib-1.3>python setup.py install

Verify:

D:\python-plugin\w3lib-1.3>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>
6. Install libxml2
Go to the libxml2 Python bindings page (http://users.skynet.be/sbi/libxml-python/), download the exe installer that matches your Python version and double-click it.
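Instead of typing each import into the interpreter by hand, the checks for steps 3 to 6 can be batched. This is only a sketch: check_imports is a hypothetical helper, and the module names listed are the import names I would expect for these packages (zope.interface, twisted, w3lib, libxml2).

```python
from __future__ import print_function
import importlib


def check_imports(module_names):
    """Try to import each module; return a dict mapping name -> True/False."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Import names of the packages installed in steps 3 to 6
    wanted = ["zope.interface", "twisted", "w3lib", "libxml2"]
    for name, ok in sorted(check_imports(wanted).items()):
        print("%-16s %s" % (name, "OK" if ok else "MISSING"))
```

Any package reported as MISSING needs its installation step repeated before continuing.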
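7. Install pyOpenSSL
Go to the pyOpenSSL page on PyPI (https://pypi.python.org/pypi/pyOpenSSL), download the pyOpenSSL-0.14.tar.gz file, extract it and change into the extracted directory.
Then run:
python setup.py build
python setup.py install

At this point the build fails with: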

error: Unable to find vcvarsall.bat

This happens because building pyOpenSSL requires a Visual C++ compiler. Python 2.7 was built with Visual Studio 2008, so its build tools look for the VS90COMNTOOLS environment variable. If a newer Visual Studio is already installed, point that variable at its tools directory:

If Visual Studio 2010 is installed, run:

SET VS90COMNTOOLS=%VS100COMNTOOLS%

If Visual Studio 2012 (Visual Studio version 11) is installed, run:

SET VS90COMNTOOLS=%VS110COMNTOOLS%

If Visual Studio 2013 (Visual Studio version 12) is installed, run:

SET VS90COMNTOOLS=%VS120COMNTOOLS%

For reference, see: http://blog.csdn.net/secretx/article/details/17472107
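If you are not sure which Visual Studio version is installed, the choice among the three commands above can be made by looking at which of those variables is set. A rough sketch, where pick_vs_tools is a made-up helper name and only the three variables named above are considered:

```python
from __future__ import print_function
import os

# Candidate variables, newest first; these are the ones named in the steps above
CANDIDATES = ["VS120COMNTOOLS", "VS110COMNTOOLS", "VS100COMNTOOLS"]


def pick_vs_tools(environ, candidates=CANDIDATES):
    """Return the first candidate variable that is set and non-empty, or None."""
    for name in candidates:
        if environ.get(name):
            return name
    return None


if __name__ == "__main__":
    name = pick_vs_tools(os.environ)
    if name is None:
        print("No VS100/VS110/VS120 COMNTOOLS variable found")
    else:
        # Print the command to run in the same console before building
        print("SET VS90COMNTOOLS=%%%s%%" % name)
```

The script only prints the SET command. It has to be run by hand in the console used for the build, because a child process cannot change its parent's environment.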

The build still fails, this time with:
Cannot open include file: 'openssl/asn1.h': No such file or directory

This is because the OpenSSL library itself must be installed on Windows. It can be downloaded from http://slproweb.com/products/Win32OpenSSL.html:
Win32 OpenSSL v1.0.1i
Then point the compiler at its directories:

> set LIB=C:\OpenSSL-Win32\lib\VC\static;%LIB%

> set INCLUDE=C:\OpenSSL-Win32\include;%INCLUDE%

The build now succeeds.
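If the 'openssl/asn1.h' error persists, it may help to confirm that the header is actually reachable through INCLUDE in the current console. A small sketch under that assumption; find_header is a hypothetical helper, not part of distutils or pyOpenSSL:

```python
from __future__ import print_function
import os


def find_header(include_value, header, sep=os.pathsep):
    """Return the first directory in an INCLUDE-style string containing `header`, or None."""
    for directory in include_value.split(sep):
        directory = directory.strip()
        # The header is given with forward slashes, as in the compiler error
        if directory and os.path.isfile(os.path.join(directory, *header.split("/"))):
            return directory
    return None


if __name__ == "__main__":
    found = find_header(os.environ.get("INCLUDE", ""), "openssl/asn1.h")
    print("openssl/asn1.h found in:", found or "nowhere on INCLUDE")
```

Run it in the same console where the set commands were issued, since those settings apply only to that session.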

 
 
 
8. Install Scrapy
Download Scrapy from its page on PyPI (https://pypi.python.org/pypi/Scrapy) and install it.
# Enter the Scrapy directory and run the installation
D:\python-plugin\Scrapy-0.16.5>python setup.py install

Verify:

D:\python-plugin\Scrapy-0.16.5>scrapy
Scrapy 0.16.5 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

D:\python-plugin\Scrapy-0.16.5>

Installation complete.
