Python Threads, Processes, and Coroutines

Date: 2019-12-11

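Contents of this section

A brief history of operating systems
Processes vs. threads
The Python GIL (Global Interpreter Lock)
Threads

Syntax
join
Thread locks: Lock, RLock, and Semaphore
Turning a thread into a daemon thread
Event
queue
The producer-consumer model
Queue
Building a thread pool

Processes
1. Syntax
2. Inter-process communication
3. Process pools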

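A brief history of operating systems

Manual operation (no operating system)

From the first computer in 1946 until the mid-1950s there were no operating systems; computers were operated by hand.

Manual operation
The programmer loaded punched paper tape (or cards) holding the program and data into the input device, started the input device to read the program and data into memory, and then used the console switches to start the program running against the data. When the computation finished, the printer output the result. Only after the user had collected the result and removed the tape (or cards) could the next user take over the machine.

Manual operation had two characteristics:
(1) One user had the whole machine. Nobody ever waited because a resource was held by another user, but resource utilization was low.
(2) The CPU waited on manual operations, so it was poorly used.

By the late 1950s this had become a man-machine conflict: the slowness of manual operation clashed sharply with the speed of the computer. Manual operation dragged resource utilization down to a few percent or lower, which was intolerable. The only way out was to take the human out of the loop and make the transition from one job to the next automatic. That led to batch processing.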

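Batch processing systems

A batch processing system is system software loaded on the computer; under its control the computer automatically processes the jobs of one or more users in batches (a job consists of a program, data, and commands).

Online batch processing
Online batch processing came first: job input and output were handled by the CPU.
A storage device, magnetic tape, was added between the host and the input device. Under the automatic control of a monitor program running on the host, the computer read users' jobs from the input device onto tape in batches, then loaded the jobs from tape into memory one after another, executed them, and sent the results to the output device. When one batch was finished, the monitor read the next batch from the input device onto tape and repeated the same steps.

The monitor processed job after job without stopping, so the hand-off from one job to the next became automatic. This cut job setup time and manual operation time, resolved the man-machine conflict, and raised machine utilization.

However, during job input and result output the host's fast CPU was still idle, waiting for the slow I/O devices to finish: the host was "busy waiting".

Offline batch processing
To ease the conflict between the fast host and slow peripherals and raise CPU utilization, offline batch processing was introduced: input and output were taken out of the host's control.
The distinguishing feature of this approach is an extra satellite machine that is not directly connected to the host and deals only with the I/O devices.
Its job is to:
(1) read user jobs from the input device and put them on an input tape;
(2) read results from an output tape and pass them to the output device.

The host no longer dealt with slow I/O devices directly; it worked with the comparatively fast tape drives, which eased the conflict between host and devices. The host and the satellite machine could work in parallel with a clear division of labor, letting the host's computing speed be used to the full.

Offline batch processing was widely used in the 1960s and greatly eased both the man-machine conflict and the conflict between host and peripherals. The monitor shipped with the IBM 7090/7094 was an offline batch system and is a prototype of the modern operating system.

Shortcoming: memory held only one job at a time. Whenever that job issued an I/O request, the fast CPU sat waiting for the slow I/O to complete, leaving the CPU idle.

Multiprogramming systems were introduced to improve CPU utilization.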

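Multiprogramming systems

Multiprogramming

Multiprogramming means allowing several programs to be in memory and running at the same time. Several programs are loaded into memory together and take turns on the CPU, sharing the system's hardware and software resources. When one program pauses for an I/O request, the CPU immediately switches to another program.

How single-program execution works:
While program A computes, the I/O devices are idle; while A does I/O, the CPU is idle (the same goes for B). B can enter memory and start only after A has finished, so the two run serially and the total time is T1 + T2.

How multiprogrammed execution works:
Programs A and B are both in memory and, under the system's control, interleave on the CPU. When A gives up the CPU to wait for I/O, B takes the CPU. The CPU is no longer idle, and neither is the I/O device serving A. With both the CPU and the I/O devices busy, resource utilization and system efficiency rise considerably, and the time for A and B to both finish is noticeably less than T1 + T2.

Multiprogramming not only keeps the CPU busy but also improves utilization of I/O devices and memory, raising overall resource utilization and throughput (the number of jobs processed per unit of time) and, in the end, the efficiency of the whole system.

Characteristics of multiprogramming on a single-processor system:
(1) Multiple programs: memory holds several independent programs at once.
(2) Parallel at the macro level: all the programs that have entered the system are in progress, that is, each has started running but none has finished.
(3) Serial at the micro level: in reality the programs take turns on the CPU and run alternately.

The arrival of multiprogramming marks the point where operating systems began to mature; job scheduling, processor management, memory management, peripheral device management, and file system management appeared one after another.

Multiprogrammed batch systems
In the mid-1960s, multiprogramming was added to the batch systems described above, giving the multiprogrammed batch system (usually just called a batch system).
It has two characteristics:
(1) Multiple programs: the system holds many jobs at once. The jobs sit in external storage in a backlog queue. Following some scheduling policy, the system picks one or more jobs from the backlog to load into memory and run. Finishing a job, retiring it, and bringing in a backlog job all happen automatically, producing a continuous, automatically sequenced stream of jobs.
(2) Batch: while the system is running, users cannot interact with their jobs. Once a job has entered the system, the user cannot directly intervene in its execution.

Goals of a batch system: higher resource utilization and throughput, and an automated job flow.

A major drawback of batch systems: they offer no interaction, which makes the computer inconvenient to use.
Manual operation was the opposite: the user had the whole machine, controlled the program directly, and could see what it was doing at any time, but monopolizing the machine made resource efficiency extremely low.

Hence a new goal: keep the machine efficient and also make it convenient to use. By the mid-1960s, advances in hardware and software made this possible.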

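Time-sharing systems

As CPUs kept getting faster and time-sharing was adopted, one computer could serve many user terminals at once, and each user could work online at their own terminal as if they had the machine to themselves.

Time-sharing: the processor's running time is divided into very short time slices, and the processor is handed to each online job in turn, one slice at a time.

If a job cannot finish within its time slice, it is suspended, the processor goes to another job, and the suspended job continues in the next round. Because the computer is fast and the rotation is quick, each user has the impression of owning a whole computer. Each user can issue commands from their own terminal and complete their job with full interaction.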
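The time-slice rotation described above can be sketched as a small simulation. This is only an illustration, not how any real scheduler is written: the function name round_robin, the job names, and the CPU times are all made up for the example.

```python
from collections import deque

def round_robin(jobs, quantum):
    """Simulate time slicing. `jobs` maps a job name to the CPU time it needs.

    Returns (finish_time, name) pairs in the order the jobs complete.
    """
    ready = deque(jobs.items())          # queue of (name, remaining_time)
    clock = 0
    finished = []
    while ready:
        name, remaining = ready.popleft()
        used = min(quantum, remaining)   # run one slice, or less if the job ends early
        clock += used
        remaining -= used
        if remaining > 0:
            ready.append((name, remaining))   # slice used up: back of the line
        else:
            finished.append((clock, name))
    return finished

schedule = round_robin({"A": 5, "B": 2, "C": 4}, quantum=2)
print(schedule)   # [(4, 'B'), (10, 'C'), (11, 'A')]
```

The short job B finishes early even though A arrived first, which is why every terminal user gets a prompt response.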

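A computer system with these properties is called a time-sharing system; it lets many users work online at the same time.

Characteristics:
(1) Multiplexing. Several users use one computer simultaneously. At the micro level the users take turns; at the macro level they work in parallel.
(2) Interactivity. Based on the system's response to a request, the user can issue a new one. This dialogue between user and system is clearly different from batch processing, so time-sharing systems are also called interactive systems.
(3) Independence. Users operate independently without interfering with each other. The system protects the integrity of each user's program, so programs are never mixed up or corrupted by one another.
(4) Timeliness. The system responds promptly to user input. One of the main performance measures of a time-sharing system is response time: the time from a command being issued at the terminal to the system answering it.

The main goal of a time-sharing system is prompt response, so that users never wait too long for any command to be processed.

A time-sharing system can serve dozens or even hundreds of users at once. Because memory is limited, it usually relies on swapping: jobs whose turn has not come are kept on disk and brought into memory when their turn arrives; when the time slice is used up, the job is written back to disk (informally "roll in" and "roll out"), so the same memory region serves many users in turn.

Multi-user time-sharing systems are the most widely used kind of operating system today.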

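Real-time systems

Multiprogrammed batch systems and time-sharing systems deliver reasonable resource utilization and response times, but they cannot meet the needs of two application areas: real-time control and real-time information processing. Real-time systems were created for these: the system responds promptly to external events that occur at random and finishes handling each event within a strict time limit.
In a given application, a real-time system is often used as a control device.

Real-time systems fall into two classes:
(1) Real-time control systems. When used for automatic control of aircraft flight or missile launches, the computer must process data from the measurement systems as fast as possible and control the aircraft or missile in time, or present the relevant information to decision makers on a display. When used for industrial process control such as steel rolling or petrochemicals, the computer must likewise process data from the various sensors in time and then drive the corresponding actuators.
(2) Real-time information processing systems. When used for booking airline tickets or looking up flights, routes, and fares, or in banking and information retrieval systems, the computer must answer service requests from terminals promptly and correctly. The timing requirements here are somewhat looser than in the first class.

Main characteristics of a real-time operating system:
(1) Prompt response. Every step of receiving, analyzing, processing, and sending information must finish within strict time limits.
(2) High reliability. Redundancy is required, for example a dual-machine setup with one system in the foreground and one in the background, along with the necessary security measures.

[Figure: timeline of operating system development]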

Processes and threads

What is a process?

An executing instance of a program is called a process.

Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.

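A program cannot run by itself. It runs only once it has been loaded into memory and the system has allocated resources to it, and a program in execution is what we call a process. The difference between a program and a process: a program is a set of instructions, the static description of what a process will run; a process is one execution of a program, which is a dynamic concept.

With multiprogramming we allow several programs to be loaded into memory at once and, under the operating system's scheduling, to execute concurrently. This design greatly improves CPU utilization. Processes make each user feel they have the CPU to themselves, so the process was introduced precisely to implement multiprogramming on the CPU.

If we have processes, why do we need threads?

Processes have many advantages. They give us multiprogramming, make each of us feel we own the CPU and other resources, and improve machine utilization. So people ask: if processes are this good, why threads? Look closely and you will find that processes have real limitations, mainly two:

A process can do only one thing at a time. If you want to do two or more things at once, a process cannot help.
If a process blocks while running, for example while waiting for input, the whole process is suspended. Even work inside the process that does not depend on that input cannot proceed.

For example, take the QQ chat client. If QQ, as a single process, could do only one thing at a time, how could it listen for keyboard input, listen for messages from other people, and show those messages on screen, all at the same moment? You might say: doesn't the operating system do time-sharing? But time-sharing happens between processes: the OS works on your QQ task for a moment and then switches to your Word document. Whenever a CPU time slice is given to QQ, QQ can still do only one thing at a time.

To put it more plainly, an operating system is like a factory with many workshops, each producing a different product. Each workshop is a process. The factory is poor and short on electricity, so it can power only one workshop at a time. To keep every workshop producing, the electrician powers the workshops in turn. But when it is the QQ workshop's turn, there is only one worker on the floor, so output is very low. How do you fix that? You guessed it: add more workers and let them work side by side. Each of those workers is a thread.

What is a thread?

A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process and is the unit that actually does the process's work. A thread is a single sequential flow of control within a process; one process can run several threads concurrently, each carrying out a different task.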

A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.

Suppose you're reading a book, and you want to take a break right now, but you want to be able to come back and resume reading from the exact point where you stopped. One way to achieve that is by jotting down the page number, line number, and word number. So your execution context for reading a book is these 3 numbers.

If you have a roommate, and she's using the same technique, she can take the book while you're not using it, and resume reading from where she stopped. Then you can take it back, and resume it from where you were.

Threads work in the same way. A CPU is giving you the illusion that it's doing multiple computations at the same time. It does that by spending a bit of time on each computation. It can do that because it has an execution context for each computation. Just like you can share a book with your friend, many tasks can share a CPU.

On a more technical level, an execution context (therefore a thread) consists of the values of the CPU's registers.

Last: threads are different from processes. A thread is a context of execution, while a process is a bunch of resources associated with a computation. A process can have one or many threads.

Clarification: the resources associated with a process include memory pages (all the threads in a process have the same view of the memory), file descriptors (e.g., open sockets), and security credentials (e.g., the ID of the user who started the process).

How do processes and threads differ?

Threads share the address space of the process that created them; processes have their own address space.
Threads have direct access to the data segment of their process; processes have their own copy of the data segment of the parent process.
Threads can directly communicate with other threads of their process; processes must use interprocess communication to communicate with sibling processes.
New threads are easily created; new processes require duplication of the parent process.
Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
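The first two points can be checked with a few lines. The sketch below is illustrative only (the names shared, seen_pids, and worker are made up): every thread appends to the same list object and reports the same process id, because all of them live inside one process.

```python
import os
import threading

shared = []          # lives in the process's address space
seen_pids = set()

def worker(tag):
    shared.append(tag)            # every thread writes to the same list object
    seen_pids.add(os.getpid())    # and reports the same process id

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

seen_pids.add(os.getpid())
print(sorted(shared))    # all four writes are visible to the main thread
print(len(seen_pids))    # a single pid: the threads never left the process
```

Separate processes would each get their own copy of the list, which is why they need explicit inter-process communication.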

Python GIL(Global Interpreter Lock)　　

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

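In short: no matter how many threads you start or how many CPUs you have, CPython calmly lets only one thread execute Python bytecode at any given moment. So how can that be called multithreading? Don't jump to conclusions just yet.

First, be clear that the GIL is not a feature of the Python language; it is a concept introduced by one implementation of the Python interpreter, CPython. Compare C++, which is a language (syntax) standard that different compilers can turn into executable code: GCC, Intel C++, Visual C++, and so on. Python is the same: the same code can run on different implementations such as CPython, PyPy, and Jython, and Jython, for example, has no GIL. But because CPython is the default Python in most environments, many people equate CPython with Python and take it for granted that the GIL is a flaw of the language. So, to repeat: the GIL is not a feature of the Python language, and Python does not have to depend on a GIL.

This presentation dissects the GIL's effect on Python threads in depth and is highly recommended: http://www.dabeaz.com/python/UnderstandingGIL.pdf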
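A rough way to see the GIL at work is to time a CPU-bound loop run twice in a row and then run in two threads. This is only a sketch (count_down and N are made up for the example), and the numbers depend on your machine and interpreter: on a standard CPython build the two timings usually come out close, while a free-threaded build or another implementation may behave differently.

```python
import sys
import threading
import time

def count_down(n):
    # Pure-Python loop: CPU-bound, so it needs the GIL the whole time it runs
    while n > 0:
        n -= 1

N = 2_000_000

start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print("switch interval: %s s" % sys.getswitchinterval())
print("sequential: %.3f s, two threads: %.3f s" % (sequential, threaded))
```

Threads still pay off for I/O-bound work, because a thread releases the GIL while it waits on I/O or sleeps.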

The Python threading module

There are two ways to use a thread:

Calling it directly

import threading
import time

def sayhi(num):  # the function each thread will run
    print("running on number:%s" % num)
    time.sleep(3)

if __name__ == '__main__':
    t1 = threading.Thread(target=sayhi, args=(1,))  # create a thread instance
    t2 = threading.Thread(target=sayhi, args=(2,))  # create another thread instance

    t1.start()  # start the thread
    t2.start()  # start the other thread

    print(t1.name)  # get the thread's name
    print(t2.name)

Calling it through inheritance

import threading
import time

class MyThread(threading.Thread):
    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num

    def run(self):  # the function each thread will run
        print("running on number:%s" % self.num)
        time.sleep(3)

if __name__ == '__main__':
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()

Join & Daemon

Some threads do background tasks, like sending keepalive packets, or performing periodic garbage collection, or whatever. These are only useful when the main program is running, and it's okay to kill them off once the other, non-daemon, threads have exited.

Without daemon threads, you'd have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.

 
__author__ = 'Alex Li'

import time
import threading

def run(n):
    print('[%s]------running----\n' % n)
    time.sleep(2)
    print('--done--')

def main():
    for i in range(5):
        t = threading.Thread(target=run, args=[i,])
        t.start()
        t.join(1)  # wait at most 1 second for this thread
        print('starting thread', t.name)

m = threading.Thread(target=main, args=[])
# Make m a daemon thread. When the program's main thread exits, m exits too.
# The threads m starts inherit its daemon flag, so they are also killed,
# whether or not they have finished their work.
m.daemon = True
m.start()
m.join(timeout=2)
print("---main thread done----")

Note: Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
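One way the note's advice might look in practice is sketched below. The names stop_requested, background_task, and ticks are made up for the example, and the "cleaned up" marker stands in for whatever real cleanup a thread needs.

```python
import threading
import time

stop_requested = threading.Event()
ticks = []

def background_task():
    # Loop until the main thread asks us to stop
    while not stop_requested.is_set():
        ticks.append(time.monotonic())
        # wait() doubles as a sleep that ends early when the flag is set
        stop_requested.wait(timeout=0.01)
    ticks.append("cleaned up")   # stand-in for closing files, committing, etc.

worker = threading.Thread(target=background_task)   # non-daemon on purpose
worker.start()
time.sleep(0.05)
stop_requested.set()    # signal the worker
worker.join()           # returns once the worker has finished its cleanup
print(ticks[-1])        # cleaned up
```

Unlike a daemon thread, this worker always gets to run its cleanup code before the program exits.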

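Thread locks (mutex)

A process can start many threads, and those threads share the parent process's memory space, which means every thread can access the same data. So what happens if two threads try to modify the same piece of data at the same time?

import time
import threading

def addNum():
    global num  # every thread works on this global variable
    print('--get num:', num)
    time.sleep(1)
    num -= 1  # decrement the shared variable by 1

num = 100  # a shared variable
thread_list = []
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)

for t in thread_list:  # wait for all threads to finish
    t.join()

print('final num:', num)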

正常来说，这个num结果应该是0，但在python 2.7上多运行几回，会发现，最后打印出来的num结果不老是0，为何每次运行的结果不同呢？哈，很简单，假设你有A,B两个线程，此时都要对num 进行减1操做，因为2个线程是并发同时运行的，因此2个线程颇有可能同时拿走了num=100这个初始变量交给cpu去运算，当A线程去处完的结果是99，但此时B线程运算完的结果也是99，两个线程同时CPU运算的结果再赋值给num变量后，结果就都是99。那怎么办呢？很简单，每一个线程在要修改公共数据时，为了不本身在还没改完的时候别人也来修改此数据，能够给这个数据加一把锁，这样其它线程想修改此数据时就必须等待你修改完毕并把锁释放掉后才能再访问此数据。

*注：不要在3.x上运行，不知为何，3.x上的结果老是正确的，多是自动加了锁

Version with a lock

import time
import threading

def addNum():
    global num  # every thread works on this global variable
    print('--get num:', num)
    time.sleep(1)
    lock.acquire()  # take the lock before modifying the data
    num -= 1  # decrement the shared variable by 1
    lock.release()  # release it after the change

num = 100  # a shared variable
thread_list = []
lock = threading.Lock()  # create a global lock
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)

for t in thread_list:  # wait for all threads to finish
    t.join()

print('final num:', num)
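A lock can also be used as a context manager, which releases it even if the protected code raises an exception. The sketch below is a variation on the example above, not part of the original; the names counter and decrement_many are made up, and each thread decrements many times to give a race more chances to show up if the lock were missing.

```python
import threading

counter = 40_000
lock = threading.Lock()

def decrement_many(times):
    global counter
    for _ in range(times):
        with lock:          # acquired here, released when the block exits
            counter -= 1

threads = [threading.Thread(target=decrement_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 0
```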

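GIL vs. Lock

A sharp reader may ask: you said earlier that Python already has a GIL that lets only one thread execute at a time, so why do we still need a lock here? Note that the lock here is a user-level lock and has nothing to do with the GIL. The GIL only guarantees that one thread executes bytecode at a time. The interpreter can still switch threads between the read and the write of num -= 1, so your own data needs its own lock.

Then you ask: if user programs have their own locks, why does CPython still need the GIL? The main reason is to keep the interpreter simple. When you write Python you don't worry about freeing memory, because the interpreter reclaims objects for you, mostly by reference counting. Every thread updates reference counts all the time, and those updates are not thread-safe: if two threads changed the count of the same object at the same moment, the count could end up wrong, and an object could be freed while still in use or never freed at all. Rather than protect every object with its own lock, CPython took the blunt approach of a single global lock: while one thread is running bytecode, nobody else may. You could call this a legacy of Python's early versions.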

RLock (re-entrant lock)

Put simply, an RLock lets a thread that already holds the lock acquire it again, so code inside a big lock can call functions that take the same lock.

import threading, time

def run1():
    global num
    print("grab the first part data")
    lock.acquire()
    num += 1
    lock.release()
    return num

def run2():
    global num2
    print("grab the second part data")
    lock.acquire()
    num2 += 1
    lock.release()
    return num2

def run3():
    lock.acquire()  # outer lock; run1 and run2 acquire it again
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res, res2)

if __name__ == '__main__':
    num, num2 = 0, 0
    lock = threading.RLock()
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()

    while threading.active_count() != 1:  # loop until only the main thread is left
        print(threading.active_count())
    else:
        print('----all threads done---')
        print(num, num2)
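To see why the example needs an RLock and not a plain Lock, compare what happens when one thread tries to take each kind of lock twice. This is a minimal sketch with made-up variable names; it uses non-blocking acquires so that the plain Lock reports failure instead of hanging forever.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

plain.acquire()
# Second acquire by the same thread fails; a blocking acquire would deadlock
second_plain = plain.acquire(blocking=False)
plain.release()

reentrant.acquire()
second_reentrant = reentrant.acquire(blocking=False)  # same thread: succeeds
reentrant.release()   # one release per acquire
reentrant.release()

print(second_plain, second_reentrant)   # False True
```

With a plain Lock, run3 above would block forever inside run1, because run3 already holds the lock that run1 tries to acquire.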

Semaphore

A mutex lets only one thread change the data at a time, while a semaphore lets a fixed number of threads in at once. Think of a restroom with 3 stalls: at most 3 people can use it at a time, and everyone else waits until someone comes out.

import threading, time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" % n)
    semaphore.release()

if __name__ == '__main__':
    num = 0
    semaphore = threading.BoundedSemaphore(5)  # at most 5 threads run at the same time
    for i in range(20):
        t = threading.Thread(target=run, args=(i,))
        t.start()

    while threading.active_count() != 1:  # busy-wait until only the main thread is left
        pass
    else:
        print('----all threads done---')
        print(num)
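To check that the semaphore really caps concurrency, you can count how many threads are inside the guarded section at once. The sketch below is illustrative; LIMIT, slots, active, and peak are names made up for the example.

```python
import threading
import time

LIMIT = 3
slots = threading.BoundedSemaphore(LIMIT)
state_lock = threading.Lock()
active = 0   # threads currently inside the guarded section
peak = 0     # highest value `active` ever reached

def use_resource():
    global active, peak
    with slots:                 # blocks while LIMIT threads are already inside
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)        # pretend to work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=use_resource) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)   # never more than LIMIT
```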

Timer

This class represents an action that should be run only after a certain amount of time has passed.

Timers are started, as with threads, by calling their start() method. The timer can be stopped (before its action has begun) by calling the cancel() method. The interval the timer will wait before executing its action may not be exactly the same as the interval specified by the user.

from threading import Timer

def hello():
    print("hello, world")

t = Timer(30.0, hello)
t.start()  # after 30 seconds, "hello, world" will be printed
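The cancel() behavior can be sketched with two short timers, one left alone and one cancelled right after it starts. The names fired, record, kept, and dropped are made up for the example, and the 0.1 second interval is chosen only to keep the demo quick.

```python
import threading

fired = []

def record(label):
    fired.append(label)

kept = threading.Timer(0.1, record, args=("kept",))
dropped = threading.Timer(0.1, record, args=("dropped",))
kept.start()
dropped.start()
dropped.cancel()   # cancelled before its interval elapsed, so it never runs

kept.join()        # Timer is a Thread subclass, so join() waits for it
dropped.join()
print(fired)       # ['kept']
```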

Events

An event is a simple synchronization object; the event represents an internal flag, and threads can wait for the flag to be set, or set or clear the flag themselves.

event = threading.Event()

# a client thread can wait for the flag to be set
event.wait()

# a server thread can set or reset it
event.set()
event.clear()

If the flag is set, the wait method doesn't do anything.
If the flag is cleared, wait will block until it becomes set again.
Any number of threads may wait for the same event.
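wait() also accepts a timeout and returns whether the flag was set, which is handy when a thread should not block forever. This is a small sketch under made-up names (ready, setter), with a Timer standing in for "some other thread" that sets the flag.

```python
import threading

ready = threading.Event()

timed_out = ready.wait(timeout=0.01)       # flag is clear: returns False after 10 ms

setter = threading.Timer(0.01, ready.set)  # set the flag from another thread
setter.start()
got_it = ready.wait(timeout=5)             # returns True as soon as the flag is set
setter.join()

ready.clear()                              # back to the "blocked" state
print(timed_out, got_it, ready.is_set())   # False True False
```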

Events let two or more threads coordinate. Below is a traffic-light example: one thread acts as the traffic light and several threads act as cars, and the cars follow the rule of stopping on red and going on green.

import threading, time
import random

def light():
    if not event.is_set():
        event.set()  # flag set: wait() does not block, the light is green
    count = 0
    while True:
        if count < 10:
            print('\033[42;1m--green light on---\033[0m')
        elif count < 13:
            print('\033[43;1m--yellow light on---\033[0m')
        elif count < 20:
            if event.is_set():
                event.clear()
            print('\033[41;1m--red light on---\033[0m')
        else:
            count = 0
            event.set()  # turn the light green again
        time.sleep(1)
        count += 1

def car(n):
    while 1:
        time.sleep(random.randrange(10))
        if event.is_set():  # green light
            print("car [%s] is running.." % n)
        else:
            print("car [%s] is waiting for the red light.." % n)

if __name__ == '__main__':
    event = threading.Event()
    Light = threading.Thread(target=light)
    Light.start()
    for i in range(3):
        t = threading.Thread(target=car, args=(i,))
        t.start()

Here is another Event example. Staff have to swipe a card to get through the office door. One thread plays the door and several threads play staff members. A staff member who finds the door closed swipes a card; the door opens and the staff member walks through.

__author__ = 'Alex Li'
import threading
import time
import random

def door():
    door_open_time_counter = 0
    while True:
        if door_swiping_event.is_set():
            print("\033[32;1mdoor opening....\033[0m")
            door_open_time_counter += 1
        else:
            print("\033[31;1mdoor closed...., swipe to open.\033[0m")
            door_open_time_counter = 0  # reset the timer
            door_swiping_event.wait()

        if door_open_time_counter > 3:  # the door has been open long enough, close it
            door_swiping_event.clear()

        time.sleep(0.5)

def staff(n):
    print("staff [%s] is coming..." % n)
    while True:
        if door_swiping_event.is_set():
            print("\033[34;1mdoor is opened, passing.....\033[0m")
            break
        else:
            print("staff [%s] sees door got closed, swiping the card....." % n)
            door_swiping_event.set()
        time.sleep(0.5)

door_swiping_event = threading.Event()  # create the event

door_thread = threading.Thread(target=door)
door_thread.start()

for i in range(5):
    p = threading.Thread(target=staff, args=(i,))
    time.sleep(random.randrange(3))
    p.start()

The queue module

queue is especially useful in threaded programming when information must be exchanged safely between multiple threads.

class queue.Queue(maxsize=0)  # first in, first out

class queue.LifoQueue(maxsize=0)  # last in, first out
class queue.PriorityQueue(maxsize=0)  # a queue whose items are retrieved in priority order

Constructor for a priority queue. maxsize is an integer that sets the upperbound limit on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite.

The lowest valued entries are retrieved first (the lowest valued entry is the one returned by sorted(list(entries))[0]). A typical pattern for entries is a tuple in the form: (priority_number, data).
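The retrieval order of the two non-FIFO queues can be sketched in a few lines. The task names and priority numbers below are made up for the example.

```python
import queue

# PriorityQueue: entries are (priority_number, data); lowest number comes out first
pq = queue.PriorityQueue()
pq.put((2, "write report"))
pq.put((1, "fix outage"))
pq.put((3, "refill coffee"))
by_priority = [pq.get()[1] for _ in range(3)]

# LifoQueue: behaves like a stack, newest item comes out first
lifo = queue.LifoQueue()
for item in ("a", "b", "c"):
    lifo.put(item)
newest_first = [lifo.get() for _ in range(3)]

print(by_priority)    # ['fix outage', 'write report', 'refill coffee']
print(newest_first)   # ['c', 'b', 'a']
```

If two entries can share a priority number, the data items themselves get compared, so keep them comparable or add a tie-breaking counter to the tuple.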

exception queue.Empty: Exception raised when non-blocking get() (or get_nowait()) is called on a Queue object which is empty.

exception queue.Full: Exception raised when non-blocking put() (or put_nowait()) is called on a Queue object which is full.

Queue.qsize()

Queue.empty()  # return True if empty

Queue.full()  # return True if full

Queue.put(item, block=True, timeout=None): Put item into the queue. If optional args block is true and timeout is None (the default), block if necessary until a free slot is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Full exception if no free slot was available within that time. Otherwise (block is false), put an item on the queue if a free slot is immediately available, else raise the Full exception (timeout is ignored in that case).

Queue.put_nowait(item): Equivalent to put(item, False).

Queue.get(block=True, timeout=None): Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).

Queue.get_nowait(): Equivalent to get(False).
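The non-blocking and timed variants can be tried on a queue with a single slot. This is a small sketch; the variable names and the strings stored in outcome_put and outcome_get are made up for the example.

```python
import queue

q = queue.Queue(maxsize=1)
q.put_nowait("first")            # fits: the queue has one free slot

try:
    q.put_nowait("second")       # no free slot and we refuse to wait
    outcome_put = "stored"
except queue.Full:
    outcome_put = "full"

item = q.get_nowait()            # "first" comes back immediately

try:
    q.get(timeout=0.01)          # queue is now empty: wait 10 ms, then give up
    outcome_get = "got item"
except queue.Empty:
    outcome_get = "empty"

print(outcome_put, item, outcome_get)   # full first empty
```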

Two methods are offered to support tracking whether enqueued tasks have been fully processed by daemon consumer threads.

Queue.task_done()

Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.

If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).

Raises a ValueError if called more times than there were items placed in the queue.

Queue.join(): blocks until every item in the queue has been retrieved and processed.
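Here is one way task_done() and join() might be used together, with a few daemon worker threads squaring numbers. The names tasks, results, and worker are made up for the example.

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        n = tasks.get()             # blocks until a task is available
        with results_lock:
            results.append(n * n)
        tasks.task_done()           # exactly one call per get()

# Daemon consumers: they sit in get() forever and die with the program
for _ in range(3):
    threading.Thread(target=worker, daemon=True).start()

for n in range(10):
    tasks.put(n)

tasks.join()    # returns once task_done() has been called for all 10 items
print(sorted(results))
```

join() waits for the work to be done, not just for the queue to be empty, so results is complete when it returns.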

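The producer-consumer model

The producer-consumer pattern solves many concurrency problems. It raises a program's overall data-processing speed by balancing the working capacity of the producing threads and the consuming threads.

Why use the producer-consumer pattern

In the world of threads, a producer is a thread that produces data and a consumer is a thread that consumes it. In multithreaded development, if the producer is fast and the consumer is slow, the producer has to wait for the consumer to finish before it can produce more. Likewise, if the consumer is faster than the producer, the consumer has to wait for the producer. The producer-consumer pattern was introduced to solve this problem.

What is the producer-consumer pattern

The producer-consumer pattern uses a container to remove the tight coupling between producers and consumers. Producers and consumers do not talk to each other directly; they communicate through a blocking queue. After producing data, the producer does not wait for the consumer to process it and simply drops it into the queue. The consumer does not ask the producer for data and simply takes it from the queue. The blocking queue acts as a buffer that balances the processing capacity of producers and consumers.

Below is a very basic producer-consumer example.

import threading
import queue

def producer():
    for i in range(10):
        q.put("bone %s" % i)

    print("waiting for all the bones to be taken...")
    q.join()
    print("all the bones have been taken...")

def consumer(n):
    while True:
        print("%s got" % n, q.get())  # get() blocks until a bone is available
        q.task_done()  # tell the queue this item has been processed

q = queue.Queue()

p = threading.Thread(target=producer)
p.start()

# run the consumer as a daemon thread so the program can exit once the producer is done
c1 = threading.Thread(target=consumer, args=("Li Chuang",), daemon=True)
c1.start()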

 
import time, random
import queue, threading

q = queue.Queue()

def Producer(name):
    count = 0
    while count < 20:
        time.sleep(random.randrange(3))
        q.put(count)
        print('Producer %s has produced %s baozi..' % (name, count))
        count += 1

def Consumer(name):
    count = 0
    while count < 20:
        time.sleep(random.randrange(4))
        if not q.empty():
            data = q.get()
            print(data)
            print('\033[32;1mConsumer %s has eaten %s baozi...\033[0m' % (name, data))
        else:
            print("-----no baozi anymore----")
        count += 1

p1 = threading.Thread(target=Producer, args=('A',))
c1 = threading.Thread(target=Consumer, args=('B',))
p1.start()
c1.start()
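The examples above poll the queue with empty() or run a fixed number of rounds. A common alternative, sketched here under made-up names (STOP, buffer, eaten), is to let get() block and have the producer send a sentinel object when it is done. The small maxsize also shows the buffering idea: a fast producer is held back once the queue is full.

```python
import queue
import threading

STOP = object()                   # sentinel meaning "no more items"
buffer = queue.Queue(maxsize=3)   # small buffer, so a fast producer has to wait
eaten = []

def producer(count):
    for i in range(count):
        buffer.put(i)             # blocks while the buffer is full
    buffer.put(STOP)

def consumer():
    while True:
        item = buffer.get()       # blocks while the buffer is empty
        if item is STOP:
            break
        eaten.append(item)

p = threading.Thread(target=producer, args=(10,))
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()

print(eaten)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With several consumers you would put one sentinel per consumer, so that each of them sees its own stop signal.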

Multiple processes: multiprocessing

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

 
from multiprocessing import Process
import time

def f(name):
    time.sleep(2)
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

To show the individual process IDs involved, here is an expanded example:

 
    from multiprocessing import Process
    import os


    def info(title):
        print(title)
        print('module name:', __name__)
        print('parent process:', os.getppid())
        print('process id:', os.getpid())
        print("\n\n")


    def f(name):
        info('\033[31;1mfunction f\033[0m')
        print('hello', name)


    if __name__ == '__main__':
        info('\033[32;1mmain process line\033[0m')
        p = Process(target=f, args=('bob',))
        p.start()
        p.join()
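
Separate process IDs also mean separate memory. As a minimal sketch (the `items` list and the `add_item` and `child` helpers are hypothetical names for illustration), the child below appends to a module-level list, and the parent's copy is unaffected:

```python
from multiprocessing import Process

items = []  # module-level state; every process works on its own copy


def add_item(value):
    """Append value to this process's copy of items and return a snapshot."""
    items.append(value)
    return list(items)


def child():
    # Runs in the child process, so only the child's copy of items changes.
    # What the child starts with depends on the start method (fork or spawn).
    print('child sees:', add_item('from child'))


if __name__ == '__main__':
    add_item('from parent')
    p = Process(target=child)
    p.start()
    p.join()
    # The child's append never reaches the parent's list.
    print('parent sees:', items)
```

This is why data has to be passed between processes explicitly.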

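Inter-process communication

Memory is not shared between processes. To exchange data between two processes, use one of the following mechanisms: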

Queues

Usage is much the same as the queue module used with threading.

 
    from multiprocessing import Process, Queue


    def f(q):
        q.put([42, None, 'hello'])


    if __name__ == '__main__':
        q = Queue()
        p = Process(target=f, args=(q,))
        p.start()
        print(q.get())    # prints "[42, None, 'hello']"
        p.join()
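
When the child sends more than one item, the reader needs a way to know the stream has ended. One common convention, shown here as a sketch, is to put a sentinel value last. `SENTINEL`, `square_all`, `producer` and `drain` are hypothetical names, and the sentinel is a convention of this example rather than a feature of Queue.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed end-of-stream marker, not part of the Queue API


def square_all(numbers):
    return [n * n for n in numbers]


def producer(q, numbers):
    for value in square_all(numbers):
        q.put(value)
    q.put(SENTINEL)  # tell the reader that nothing more will arrive


def drain(q):
    """Read items from q until the sentinel arrives."""
    results = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(item)
    return results


if __name__ == '__main__':
    q = Queue()
    p = Process(target=producer, args=(q, range(5)))
    p.start()
    print(drain(q))   # prints "[0, 1, 4, 9, 16]"
    p.join()
```

Draining the queue before calling join() follows the same order as the example above.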

Pipes

The Pipe() function returns a pair of connection objects connected by a pipe which by default is duplex (two-way). For example:

 
    from multiprocessing import Process, Pipe


    def f(conn):
        conn.send([42, None, 'hello'])
        conn.close()


    if __name__ == '__main__':
        parent_conn, child_conn = Pipe()
        p = Process(target=f, args=(child_conn,))
        p.start()
        print(parent_conn.recv())   # prints "[42, None, 'hello']"
        p.join()

The two connection objects returned by Pipe() represent the two ends of the pipe. Each connection object has send() and recv() methods (among others). Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. Of course there is no risk of corruption from processes using different ends of the pipe at the same time.
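
Because the pipe is duplex by default, each end can both send and receive. The following sketch uses that for a small request/reply exchange. The `serve` function and the `'quit'` message are hypothetical choices for this example.

```python
from multiprocessing import Process, Pipe


def serve(conn):
    """Answer each request received on conn until 'quit' arrives."""
    while True:
        request = conn.recv()
        if request == 'quit':
            break
        conn.send(request.upper())  # reply over the same connection object
    conn.close()


if __name__ == '__main__':
    parent_conn, child_conn = Pipe()  # duplex by default
    p = Process(target=serve, args=(child_conn,))
    p.start()
    for word in ['hello', 'pipe']:
        parent_conn.send(word)
        print(parent_conn.recv())   # prints "HELLO", then "PIPE"
    parent_conn.send('quit')
    p.join()
```

Each end is used by exactly one process here, which avoids the corruption risk described above.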

Managers

A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.

A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value and Array. For example,

 
    from multiprocessing import Process, Manager


    def f(d, l):
        d[1] = '1'
        d['2'] = 2
        d[0.25] = None
        l.append(1)
        print(l)


    if __name__ == '__main__':
        with Manager() as manager:
            d = manager.dict()
            l = manager.list(range(5))
            p_list = []
            for i in range(10):
                p = Process(target=f, args=(d, l))
                p.start()
                p_list.append(p)
            for res in p_list:
                res.join()

            print(d)
            print(l)
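
Each single operation on a proxy (such as `l.append(1)` above) is carried out by the manager's server process, but a read followed by a write is two separate operations. As a sketch, assuming a counter stored under the hypothetical key `'hits'`, a manager Lock can guard such updates. `add_hits` is an illustrative helper, not part of the original example.

```python
from multiprocessing import Process, Manager


def add_hits(counter, lock, times):
    """Increment counter['hits'] `times` times, holding lock for each update."""
    for _ in range(times):
        with lock:
            # Read-modify-write takes two proxy calls, so it is done under the lock.
            counter['hits'] = counter['hits'] + 1


if __name__ == '__main__':
    with Manager() as manager:
        counter = manager.dict(hits=0)
        lock = manager.Lock()
        procs = [Process(target=add_hits, args=(counter, lock, 100))
                 for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(counter['hits'])   # 400 when every update holds the lock
```

Without the lock, two processes could read the same old value and one increment would be lost.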

Process synchronization

Without using the lock, output from the different processes is liable to get all mixed up.

 
    from multiprocessing import Process, Lock


    def f(l, i):
        l.acquire()
        try:
            print('hello world', i)
        finally:
            l.release()


    if __name__ == '__main__':
        lock = Lock()

        for num in range(10):
            Process(target=f, args=(lock, num)).start()
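
The acquire/try/finally pattern can also be written with a `with` statement, since a multiprocessing Lock is a context manager. A minimal sketch of the same example follows. The function name `greet`, its return value, and the explicit join() calls are additions for illustration.

```python
from multiprocessing import Process, Lock


def greet(lock, i):
    # `with lock:` acquires on entry and releases on exit, even if the body
    # raises, which is what acquire/try/finally does by hand.
    with lock:
        line = 'hello world %d' % i
        print(line)
    return line


if __name__ == '__main__':
    lock = Lock()
    procs = [Process(target=greet, args=(lock, num)) for num in range(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```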

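Process pools

A process pool maintains a set of worker processes internally. When work needs to run, a process is taken from the pool; if no process in the pool is free, the request waits until one becomes available.

Two pool methods are used to submit work:

apply (synchronous: blocks until the call has finished)
apply_async (asynchronous: returns immediately and can pass the result to a callback)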

 
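    from multiprocessing import Pool
    import time


    def Foo(i):
        time.sleep(2)
        return i + 100


    def Bar(arg):
        # Callback: receives Foo's return value in the parent process.
        print('-->exec done:', arg)


    if __name__ == '__main__':
        # The guard keeps worker processes from re-running this block when the
        # module is re-imported (spawn start method, e.g. on Windows).
        pool = Pool(5)

        for i in range(10):
            pool.apply_async(func=Foo, args=(i,), callback=Bar)
            # pool.apply(func=Foo, args=(i,))

        print('end')
        pool.close()
        pool.join()  # wait for the pool's processes to finish before exiting; if this line is commented out, the program exits right away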
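
Instead of a callback, the return values can also be collected from the AsyncResult objects that apply_async returns. This is a minimal sketch: `add_100` is a hypothetical stand-in for `Foo` without the sleep, and the 10-second timeout is an arbitrary choice.

```python
from multiprocessing import Pool


def add_100(i):
    return i + 100


if __name__ == '__main__':
    with Pool(5) as pool:
        # apply_async returns an AsyncResult at once; get() blocks until that
        # task's return value is available (or the timeout expires).
        pending = [pool.apply_async(add_100, args=(i,)) for i in range(10)]
        results = [r.get(timeout=10) for r in pending]
    print(results)   # prints "[100, 101, 102, 103, 104, 105, 106, 107, 108, 109]"
```

The results come back in submission order because they are read from the list of AsyncResult objects in that order, whichever worker finishes first.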

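Assignment:

Task: a simple batch host management tool

Requirements:

Hosts are organized into groups
Host information is kept in a configuration file parsed with configparser
Commands can be executed and files sent in batches, with results returned in real time; the invocation format is as follows
1. batch_run -h h1,h2,h3 -g web_clusters,db_servers -cmd "df -h"
2. batch_scp -h h1,h2,h3 -g web_clusters,db_servers -action put -local test.py -remote /tmp/
Usernames, passwords and ports may differ between hosts
Remote commands are executed with the paramiko module
Batch commands must run concurrently using multiprocessing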
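
As a possible starting point for the configuration part only, the sketch below parses host definitions with configparser and resolves the -h and -g arguments into a list of hosts. The file layout (one section per host with a `groups` key), the sample addresses and credentials, and the `load_hosts` and `select_hosts` helpers are all assumptions made for illustration. The assignment does not prescribe them, and the paramiko and multiprocessing parts are left out.

```python
import configparser

# Hypothetical layout: one section per host; `groups` lists its group names.
# Addresses and credentials below are placeholder values.
SAMPLE_CONFIG = """
[h1]
address = 192.0.2.11
port = 22
username = root
password = secret1
groups = web_clusters

[h2]
address = 192.0.2.12
port = 2222
username = admin
password = secret2
groups = web_clusters, db_servers
"""


def load_hosts(text):
    """Parse the config text into {host_name: settings_dict}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    hosts = {}
    for name in parser.sections():
        section = parser[name]
        raw_groups = section.get('groups', fallback='')
        hosts[name] = {
            'address': section['address'],
            'port': section.getint('port', fallback=22),
            'username': section['username'],
            'password': section['password'],
            'groups': [g.strip() for g in raw_groups.split(',') if g.strip()],
        }
    return hosts


def select_hosts(hosts, names=(), groups=()):
    """Return the sorted host names matched by -h names or -g groups."""
    chosen = {n for n in names if n in hosts}
    for name, info in hosts.items():
        if set(info['groups']) & set(groups):
            chosen.add(name)
    return sorted(chosen)


if __name__ == '__main__':
    hosts = load_hosts(SAMPLE_CONFIG)
    print(select_hosts(hosts, names=['h1'], groups=['db_servers']))   # prints "['h1', 'h2']"
```

Each selected host's settings could then be handed to a worker function through a process pool, one task per host.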