Commonly Used Python Modules (3): shelve, pickle, json, xml, configparser

Date: 2019-11-19


What is serialization/deserialization?

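Serialization converts an in-memory data structure into an intermediate format that can be stored on disk or transmitted over a network. Deserialization is the reverse: it converts data in that format, read from disk or received over the network, back into an in-memory data structure.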

Why do we need serialization/deserialization?

1. It lets a program save its running state.

2. It lets data be exchanged across platforms.
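The round trip described above can be sketched in a few lines. This is only an illustration: the `state` dict is a hypothetical piece of program state, and JSON is used as the intermediate format simply because it is easy to read.

```python
import json

# Hypothetical program state to preserve between runs.
state = {"level": 3, "score": 1250, "items": ["key", "map"]}

# Serialize: in-memory dict -> intermediate format (a JSON string).
payload = json.dumps(state)

# Deserialize: JSON string -> a new in-memory dict.
restored = json.loads(payload)

print(type(payload).__name__)  # str
print(restored == state)       # True: same content
print(restored is state)       # False: a separate object
```

The string in `payload` is what would be written to disk or sent over the network.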

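shelve module: also used for serialization. Unlike pickle, you do not need to care about file modes; you simply treat the opened shelf as a dictionary. It can modify stored data directly without overwriting everything else, whereas with pickle the only way to change the data is to overwrite the file in wb mode.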

import shelve

dic1 = {'pwd': 'qiu123', 'age': 22}
dic2 = {'pwd': 'xi456', 'age': 22}

d = shelve.open('db.txt')
d['qiu'] = dic1
d['xi'] = dic2
print(d['qiu']['pwd'])
d.close()

Serializing with shelve

import shelve

d = shelve.open('db.txt')
print(d['qiu'])
print(d['qiu']['pwd'])
d.close()

Deserializing with shelve

import shelve

# Modifying a nested dict in place requires writeback=True
d = shelve.open('db.txt', writeback=True)
d['qiu']['age'] = 20
print(d['qiu'])
d.close()

Modifying a nested dict requires the writeback parameter
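To see why the parameter matters, here is a minimal sketch that compares the two modes. It assumes a throwaway temporary directory and a hypothetical file name `demo_db` so that it does not touch the `db.txt` used above.

```python
import os
import shelve
import tempfile

# Work in a temporary directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_db")  # hypothetical file name

    with shelve.open(path) as d:
        d["qiu"] = {"pwd": "qiu123", "age": 22}

    # Without writeback, d["qiu"] returns a fresh copy, so the change is lost.
    with shelve.open(path) as d:
        d["qiu"]["age"] = 20
        age_without_writeback = d["qiu"]["age"]

    # With writeback=True, accessed entries are cached and saved on close.
    with shelve.open(path, writeback=True) as d:
        d["qiu"]["age"] = 20

    with shelve.open(path) as d:
        age_with_writeback = d["qiu"]["age"]

print(age_without_writeback)  # 22
print(age_with_writeback)     # 20
```

An alternative that avoids writeback is to read the inner dict into a variable, change it, and assign it back to `d["qiu"]`.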

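pickle module: a module for serialization

Its main functions are dump (serialize), load (deserialize), dumps, and loads. dumps/loads convert to and from a bytes object, while dump/load wrap the file write and read calls as well, so they are more convenient when working with files.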

import pickle

dic = {'a': 1, 'b': 2, 'c': 3}

# 1. Serialize
pkl = pickle.dumps(dic)
print(pkl, type(pkl))

# 2. Write to a file
with open('db.pkl', 'wb') as f:
    f.write(pkl)

# Steps 1 and 2 can be combined into one
with open('db.pkl', 'wb') as f:
    pickle.dump(dic, f)

Serializing with pickle

import pickle

# 1. Read the pickled bytes from the file
with open('db.pkl', 'rb') as f:
    pkl = f.read()

# 2. Convert the pickled bytes back into an in-memory data type
dic = pickle.loads(pkl)
print(dic)

# Steps 1 and 2 can be combined into one
with open('db.pkl', 'rb') as f:
    dic = pickle.load(f)
print(dic)

Deserializing with pickle

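Advantage: it supports almost all Python data types.

Disadvantage: the format is only understood by Python, so it cannot be used to exchange data with other languages.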
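As a small illustration of both points, the sketch below round-trips a few built-in types through pickle. The sample values are made up for the example.

```python
import datetime
import pickle

# Sample values chosen only for illustration.
original = {
    "when": datetime.date(2019, 11, 19),
    "tags": {"a", "b"},
    "pair": (1, 2),
    "raw": b"\x00\x01",
}

data = pickle.dumps(original)
restored = pickle.loads(data)

print(type(data).__name__)              # bytes: a Python-specific binary format
print(restored == original)             # True
print(type(restored["pair"]).__name__)  # tuple: the exact type is preserved
```

The date, set, tuple, and bytes values all come back with their original types, but the serialized result is a binary blob that other languages are not expected to read.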

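json module: a module for serialization

Before using the json module, you need to understand JSON (JavaScript Object Notation). JSON represents data as a string that any language can parse and read, which makes it convenient to use.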

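JSON's syntax is derived from JavaScript object literals. JSON types and Python's built-in data types correspond as follows:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None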
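The type correspondence can be observed directly with a short sketch. The `data` dict below is a made-up sample, not part of the original article.

```python
import json

# Made-up sample covering several Python types.
data = {"name": "qiuxi", "scores": (90, 85), "active": True, "nickname": None, "ratio": 0.5}

text = json.dumps(data)
print(text)
# {"name": "qiuxi", "scores": [90, 85], "active": true, "nickname": null, "ratio": 0.5}

back = json.loads(text)
print(back["scores"])  # [90, 85]
```

Note that the tuple is written as a JSON array and comes back as a list, so a round trip does not always restore the exact Python type.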

import json

dic = {'name': 'qiuxi', 'age': 22}

# Serialize: convert the in-memory data type into the intermediate JSON format
json_str = json.dumps(dic)
print(json_str, type(json_str))

# Output:
# {"name": "qiuxi", "age": 22} <class 'str'>

JSON does not recognize single quotes; strings are always written with double quotes.
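A quick way to check the quoting rule is to parse both variants. This is a minimal sketch; the two strings are illustrative inputs.

```python
import json

valid = '{"name": "qiuxi"}'    # double quotes: valid JSON
invalid = "{'name': 'qiuxi'}"  # single quotes: a Python dict literal, not JSON

print(json.loads(valid))  # {'name': 'qiuxi'}

try:
    json.loads(invalid)
    error = None
except json.JSONDecodeError as exc:
    error = exc

print(type(error).__name__)  # JSONDecodeError
```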

import json

dic = {'name': 'qiuxi', 'age': 22}

# 1. Serialize to get json_str
json_str = json.dumps(dic)

# 2. Write json_str to a file
# JSON is always text, so open the file in wt mode and specify the encoding
with open('db.json', 'wt', encoding='utf-8') as f:
    f.write(json_str)

# Steps 1 and 2 combined into one
with open('db.json', 'wt', encoding='utf-8') as f:
    json.dump(dic, f)

Serializing with json

import json

# 1. Read the JSON text from the file
with open('db.json', 'rt', encoding='utf-8') as f:
    json_str = f.read()

# 2. Convert json_str into an in-memory data type
dic = json.loads(json_str)
print(dic)

# Steps 1 and 2 combined into one
with open('db.json', 'rt', encoding='utf-8') as f:
    dic = json.load(f)
print(dic)

Deserializing with json

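When you write a JSON file by hand, you can check it with an online JSON formatter/validator.

Advantage: highly portable across platforms and languages.

Disadvantage: it supports only some of Python's data types.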
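The limitation shows up as a TypeError when an unsupported type is encountered. The sketch below demonstrates this with a set and then one possible workaround using the `default` parameter of `json.dumps`. The helper `encode_extra` is a hypothetical name introduced for this example.

```python
import json

record = {"name": "qiuxi", "tags": {"python"}}

# A set has no JSON equivalent, so dumps raises TypeError.
try:
    json.dumps(record)
    failed = False
except TypeError:
    failed = True
print(failed)  # True


def encode_extra(obj):
    """Hypothetical fallback: represent sets as sorted lists."""
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


text = json.dumps(record, default=encode_extra)
print(text)  # {"name": "qiuxi", "tags": ["python"]}
```

The conversion is one-way: loading this text gives back a list, not a set, so the reader has to know which fields to convert back.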

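xml module: another module used for serialization

Before using the xml module, you need to understand XML (Extensible Markup Language), which is also a general-purpose data format.

Syntax rules: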

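1. Every start tag must have a matching end tag.

2. A simplified syntax can express the start tag and end tag in a single tag, by placing a slash (/) immediately before the greater-than sign, for example <百度百科词条/>. An XML parser treats this as <百度百科词条></百度百科词条>.

3. Tags must be nested in the proper order, so end tags must match start tags in mirror-image order. Think of start and end tags as left and right parentheses in mathematics: you cannot close an outer parenthesis before all the inner ones have been closed.

4. Every attribute must have a value.

5. Every attribute value must be enclosed in quotes. Double quotes are the usual choice, although XML also accepts single quotes.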
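These rules can be checked with the standard-library parser, which raises `ParseError` for input that is not well-formed. The following is a minimal sketch; the helper `is_well_formed` and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Hypothetical helper: True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "closed tag": "<a><b>text</b></a>",
    "self-closing tag": "<a><b/></a>",
    "single-quoted attribute": "<a flag='1'/>",
    "missing end tag": "<a><b>text</a>",
    "wrong nesting": "<a><b></a></b>",
    "attribute without value": "<a flag></a>",
    "unquoted attribute": "<a flag=1></a>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

The first three samples are accepted, and the remaining four each break one of the rules above and are rejected.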

Structure of a tag

<tagname attributename="value">text <subtags/>
</tagname>


<tag-name attribute="attribute value">text <child-tag/>
</tag-name>

Tag structure format

import xml.etree.ElementTree    # represents the node tree

<studentinfo>
    <stu age="20" name="Zhang San">
        <phone name="Huawei">This is a Huawei phone</phone>
        <computer name="Mac">14888</computer>
    </stu>
    <stu age="22" name="Li Si">
        <phone name="Huawei">This is a Huawei phone</phone>
        <computer name="Lenovo">4888</computer>
    </stu>
</studentinfo>

Initial data (saved as d.xml)

A tag has three features: its name (tag), its attributes (attrib), and its text content (text).

import xml.etree.ElementTree as ET

# Parse d.xml
tree = ET.parse('d.xml')
print(tree)

# Get the root tag
rootTree = tree.getroot()

# First way to get tags:
# iter searches the whole document for tags with the given name
for item in rootTree.iter('phone'):
    print(item.tag)     # tag name
    print(item.attrib)  # tag attributes
    print(item.text)    # text content

# Second way to get tags:
# find looks among the root's child tags for one named stu;
# if there are several, it returns the first
print(rootTree.find('stu').attrib)

# Third way to get tags:
# findall returns every direct child with the given name;
# phone is a child of stu, not of the root, so search from a stu tag
print(rootTree.find('stu').findall('phone'))

# 1. Read: walk the whole document
for stu in rootTree:
    for item in stu:
        print(item.tag)
        print(item.attrib)
        print(item.text)

# 2. Update
for phone in rootTree.iter('phone'):
    print(phone.tag)
    phone.attrib = {'name': 'Huawei'}
    phone.text = 'This is a Huawei phone'
tree.write('d.xml', encoding='utf-8')

# 3. Add
for stu in rootTree:
    computer = stu.find('computer')
    if int(computer.text) > 5000:
        print('User of a computer priced above 5000:', stu.attrib)
        tag = ET.Element('qiuxi')
        tag.attrib = {'hobby': 'music'}
        tag.text = 'Likes music'
        stu.append(tag)
tree.write('d.xml', encoding='utf-8')

# 4. Delete
for stu in rootTree:
    tag = stu.find('qiuxi')
    if tag is not None:
        print("========")
        stu.remove(tag)
tree.write('d.xml', encoding='utf-8')

Using the xml module
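The difference between iter, find, and findall is easy to get wrong, so here is a small in-memory sketch. It parses a shortened, made-up document from a string instead of reading d.xml.

```python
import xml.etree.ElementTree as ET

# Shortened, made-up document parsed from a string.
xml_text = """
<studentinfo>
    <stu age="20" name="A">
        <phone name="P1">first</phone>
    </stu>
    <stu age="22" name="B">
        <phone name="P2">second</phone>
    </stu>
</studentinfo>
"""
root = ET.fromstring(xml_text)

# iter: every matching tag anywhere below the root.
all_phones = [p.get("name") for p in root.iter("phone")]

# find: the first matching direct child only.
first_stu = root.find("stu").get("name")

# findall: all matching direct children; phone is not a direct child of root.
direct_phones = root.findall("phone")

# A path expression reaches one level deeper.
nested_phones = [p.text for p in root.findall("stu/phone")]

print(all_phones)     # ['P1', 'P2']
print(first_stu)      # A
print(direct_phones)  # []
print(nested_phones)  # ['first', 'second']
```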

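configparser module: a module for parsing configuration files

A configuration file holds a program's settings: information that needs to be changed occasionally but not often, such as the path to a data file.

A configuration file contains only two kinds of content: sections and options. An option is written in the form key=value.

The most commonly used method is get, which reads one option from the configuration file.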
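As a minimal sketch of get, the example below parses a configuration from a string with `read_string` instead of a file. The `email` option is a hypothetical missing key, used only to show the `fallback` argument.

```python
import configparser

config_text = """
[path]
db_path = C://myfile/test.txt

[user]
name = qiuxi
age = 22
"""

config = configparser.ConfigParser()
config.read_string(config_text)

name = config.get("user", "name")        # values are returned as strings
age_text = config.get("user", "age")     # '22', still a string
age = config.getint("user", "age")       # 22, converted to int
# "email" is not defined, so the fallback value is returned instead of an error.
email = config.get("user", "email", fallback="unknown")

print(name, age_text, age, email)  # qiuxi 22 22 unknown
```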

# Path-related settings
[path]
db_path = C://myfile/test.txt

# User-related settings
[user]
name = qiuxi
age = 22

Initial configuration file (test.cfg)

import configparser

# Create a parser
config = configparser.ConfigParser()

# Read and parse test.cfg
config.read('test.cfg', encoding='utf-8')

# Get the information you need
print(config.sections())        # list the sections
print(config.options('user'))   # list the options in a section

# Get the value of an option
print(config.get('path', 'db_path'))
print(config.get('user', 'age'))

# get always returns a string; to convert the type, use get plus the type name
print(type(config.getint("user", "age")))

# Check whether a section exists
print(config.has_section('user'))

# Check whether an option exists
print(config.has_option('user', 'name'))

# Some less commonly used operations
# Add
config.add_section("server")
config.set("server", "url", "192.168.1.2")

# Delete
config.remove_option("user", "age")

# Modify
config.set("server", "url", "192.168.1.2")

# After adding, deleting, or modifying, write the result back to the file
with open("test.cfg", "wt", encoding="utf-8") as f:
    config.write(f)

Working with the configparser module