Learning Scrapy

Date: 2019-12-20

Tags: scrapy, learning  Category: Python


import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            #'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

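    def parse(self, response):
        # Each quote on the page sits in its own <div class="quote">.
        for quote in response.xpath('//div[@class="quote"]'):
            # The leading "." keeps each query inside the current quote node.
            # extract() returns a list of every matching string.
            yield {
                'text': quote.xpath('.//span[@class="text"]/text()').extract(),
                'author': quote.xpath('.//small[@class="author"]/text()').extract(),
                'tags': quote.xpath('./div/meta/@content').extract(),
            }

        # Follow the "next" link until the last page, which has none.
        next_page = response.xpath('//li[@class="next"]/a/@href').extract_first()
        if next_page is not None:
            # The href is relative, so resolve it against the current page URL.
            next_page = response.urljoin(next_page)
            yield scrapy.Request(next_page, callback=self.parse)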

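Key points:

1. Inside a loop over selected nodes, start the XPath with './/' to search only under the current node. A leading '//' searches the whole document again.

2. Values under the current loop node can also be reached with an explicit, step-by-step path that starts at that node, for example './div/meta'.

3. A link extracted from the current page may be relative. response.urljoin(url) resolves it against the current page's URL to produce an absolute URL.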
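
The URL resolution in point 3 can be tried without running a spider. This is a minimal sketch using urljoin from Python's standard library, which, as far as I know, is what response.urljoin relies on with the response's own URL as the base. The href values here are illustrative assumptions, not output captured from the site.

```python
from urllib.parse import urljoin

# URL of the page currently being parsed (taken from the spider above).
page_url = 'http://quotes.toscrape.com/page/1/'

# A root-relative href, as a "next" link might look (assumed value).
next_url = urljoin(page_url, '/page/2/')

# A path-relative href resolves against the current directory.
sibling_url = urljoin(page_url, '../2/')

# An already absolute href is returned unchanged.
external_url = urljoin(page_url, 'http://example.com/other/')

print(next_url)      # http://quotes.toscrape.com/page/2/
print(sibling_url)   # http://quotes.toscrape.com/page/2/
print(external_url)  # http://example.com/other/
```

Because absolute links pass through unchanged, calling response.urljoin on every extracted href is safe even when the page mixes relative and absolute links.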
