Common Methods for Extracting Selector Data in Scrapy

Scrapy selectors

XPath selectors

An XPath query returns a list: a SelectorList whose items are Selector objects, one per match.

Extracting values from the list with get() and getall()

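# extract() and extract_first() have been superseded by getall() and get()

def parse(self, response):
    # "//a/text()" is a placeholder expression; substitute your own.
    # Get the first match; if nothing matches, return the value passed as default (None when omitted)
    first = response.xpath("//a/text()").get(default=0)
    # Get every match as a list of strings
    all_matches = response.xpath("//a/text()").getall()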

Extracting values from the list with regular expressions

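def parse(self, response):
    # "//a" is a placeholder expression; substitute your own.
    # Apply the regex and return the first match; if nothing matches, return the default
    first = response.xpath('//a').re_first(r'.', default=0)

    # Apply the regex to every item in the list and return all matches
    hrefs = response.xpath('//a').re(r'href="(.+?)"')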

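Other methods

index(value) returns the position of value in the list. It is an ordinary list method, so call it on the list of strings returned by getall(): a SelectorList holds Selector objects rather than strings, so searching it directly for a string finds nothing.

def parse(self, response):
    # "//a/text()" is a placeholder expression; index() raises ValueError if the value is absent
    response.xpath('//a/text()').getall().index("大数据男孩")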

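count(value) returns the number of times value occurs in the list. As with index(), call it on the list of strings returned by getall().

def parse(self, response):
    # "//a/text()" is a placeholder expression; substitute your own
    response.xpath('//a/text()').getall().count("大数据男孩")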

CSS selectors

Results matched by a CSS selector are extracted with the same methods as above.

def parse(self, response):
    response.css('a::attr(href)').getall()


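Copyright notice: "Common Methods for Extracting Selector Data in Scrapy" is an original article by 明非. Please credit the source when reposting.
Last edited: 2020-5-20 09:05:08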
