An executing instance of a program is called a process.
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
在多道编程中,我们允许多个程序同时加载到内存中,在操作系统的调度下,可以实现并发地执行。这是这样的设计,大大提高了CPU的利用率。进程的出现让每个用户感觉到自己独享CPU,因此,进程就是为了在CPU上实现多道编程而提出的。
进程有很多优点,它提供了多道编程,让我们感觉我们每个人都拥有自己的CPU和其他资源,可以提高计算机的利用率。很多人就不理解了,既然进程这么优秀,为什么还要线程呢?其实,仔细观察就会发现进程还是有很多缺陷的,主要体现在两点上:
进程只能在一个时间干一件事,如果想同时干两件事或多件事,进程就无能为力了。
进程在执行的过程中如果阻塞,
例如等待输入,整个进程就会挂起,即使进程中有些工作不依赖于输入的数据,也将无法执行。
例如,我们在使用qq聊天, qq做为一个独立进程如果同一时间只能干一件事,那他如何实现在同一时刻 既能监听键盘输入、又能监听其它人给你发的消息、同时还能把别人发的消息显示在屏幕上呢?你会说,操作系统不是有分时么?但我的亲,分时是指在不同进程间的分时呀, 即操作系统处理一会你的qq任务,又切换到word文档任务上了,每个cpu时间片分给你的qq程序时,你的qq还是只能同时干一件事呀。
再直白一点, 一个操作系统就像是一个工厂,工厂里面有很多个生产车间,不同的车间生产不同的产品,每个车间就相当于一个进程,且你的工厂又穷,供电不足,同一时间只能给一个车间供电,为了能让所有车间都能同时生产,你的工厂的电工只能给不同的车间分时供电,但是轮到你的qq车间时,发现只有一个干活的工人,结果生产效率极低,为了解决这个问题,应该怎么办呢?。。。。没错,你肯定想到了,就是多加几个工人,让几个人工人并行工作,这每个工人,就是线程!
线程是操作系统能够进行运算调度的最小单位。它被包含在进程之中,是进程中的实际运作单位。一条线程指的是进程中一个单一顺序的控制流,一个进程中可以并发多个线程,每条线程并行执行不同的任务
A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.
Suppose you're reading a book, and you want to take a break right now, but you want to be able to come back and resume reading from the exact point where you stopped. One way to achieve that is by jotting down the page number, line number, and word number. So your execution context for reading a book is these 3 numbers.
If you have a roommate, and she's using the same technique, she can take the book while you're not using it, and resume reading from where she stopped. Then you can take it back, and resume it from where you were.
Threads work in the same way. A CPU is giving you the illusion that it's doing multiple computations at the same time. It does that by spending a bit of time on each computation. It can do that because it has an execution context for each computation. Just like you can share a book with your friend, many tasks can share a CPU.
On a more technical level, an execution context (therefore a thread) consists of the values of the CPU's registers.
Last: threads are different from processes. A thread is a context of execution, while a process is a bunch of resources associated with a computation. A process can have one or many threads.
Clarification: the resources associated with a process include memory pages (all the threads in a process have the same view of the memory), file descriptors (e.g., open sockets), and security credentials (e.g., the ID of the user who started the process).
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)
上面的核心意思就是,无论你启多少个线程,你有多少个cpu, Python在执行的时候会淡定的在同一时刻只允许一个线程运行,擦。。。,那这还叫什么多线程呀?莫如此早的下结结论,听我现场讲。
首先需要明确的一点是GIL
并不是Python的特性,它是在实现Python解析器(CPython)时所引入的一个概念。就好比C++是一套语言(语法)标准,但是可以用不同的编译器来编译成可执行代码。有名的编译器例如GCC,INTEL C++,Visual C++等。Python也一样,同样一段代码可以通过CPython,PyPy,Psyco等不同的Python执行环境来执行。像其中的JPython就没有GIL。然而因为CPython是大部分环境下默认的Python执行环境。所以在很多人的概念里CPython就是Python,也就想当然的把GIL
归结为Python语言的缺陷。所以这里要先明确一点:GIL并不是Python的特性,Python完全可以不依赖于GIL
这篇文章透彻的剖析了GIL对python多线程的影响,强烈推荐看一下:http://www.dabeaz.com/python/UnderstandingGIL.pdf
线程有2种调用方式,如下:
直接调用
import threading
import time
def sayhi(num): # 定义每个线程要运行的函数
print("running on number:%s" % num)
time.sleep(3)
if __name__ == '__main__':
t1 = threading.Thread(target=sayhi, args=(1,)) # 生成一个线程实例
t2 = threading.Thread(target=sayhi, args=(2,)) # 生成另一个线程实例
t1.start() # 启动线程
t2.start() # 启动另一个线程
print(t1.getName()) # 获取线程名
print(t2.getName())
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# running on number:1
# running on number:2
# Thread-1
# Thread-2
继承式调用
import threading
import time
class MyThread(threading.Thread):
def __init__(self, num):
threading.Thread.__init__(self)
self.num = num
def run(self): # 定义每个线程要运行的函数
print("running on number:%s" % self.num)
time.sleep(3)
if __name__ == '__main__':
t1 = MyThread(1)
t2 = MyThread(2)
t1.start()
t2.start()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# running on number:1
# running on number:2
Join & Daemon
Some threads do background tasks, like sending keepalive packets, or performing periodic garbage collection, or whatever. These are only useful when the main program is running, and it's okay to kill them off once the other, non-daemon, threads have exited.
Without daemon threads, you'd have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.
1 Python 默认参数创建线程后,不管主线程是否执行完毕,都会等待子线程执行完毕才一起退出,有无join结果一样
2 如果创建线程,并且设置了daemon为true,即thread.setDaemon(True), 则主线程执行完毕后自动退出,
不会等待子线程的执行结果。而且随着主线程退出,子线程也消亡。
3 join方法的作用是阻塞,等待子线程结束,join方法有一个参数是timeout,
即如果主线程等待timeout,子线程还没有结束,则主线程强制结束子线程。
4 如果线程daemon属性为False, 则join里的timeout参数无效。主线程会一直等待子线程结束。
5 如果线程daemon属性为True, 则join里的timeout参数是有效的, 主线程会等待timeout时间后,结束子线程。
此处有一个坑,即如果同时有N个子线程join(timeout),那么实际上主线程会等待的超时时间最长为 N * timeout,
因为每个子线程的超时开始时刻是上一个子线程超时结束的时刻。
Event
.
import threading
import time
class MyThread(threading.Thread):
def __init__(self,id):
threading.Thread.__init__(self)
self.id = id
def run(self):
x = 0
time.sleep(10)
print(self.id)
if __name__ == "__main__":
t1=MyThread(999)
t1.start()
for i in range(5):
print(i)
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 0
# 1
# 2
# 3
# 4
import threading
import time
class MyThread(threading.Thread):
def __init__(self, id):
threading.Thread.__init__(self)
self.id = id
def run(self):
x = 0
time.sleep(10)
print(self.id)
if __name__ == "__main__":
t1 = MyThread(999)
t1.start()
t1.join()
for i in range(5):
print(i)
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 999
# 0
# 1
# 2
# 3
# 4
import threading
import time
class MyThread(threading.Thread):
def __init__(self, id):
threading.Thread.__init__(self)
def run(self):
time.sleep(5)
print("This is " + self.getName())
if __name__ == "__main__":
t1 = MyThread(999)
t1.setDaemon(True)
t1.start()
print("I am the father thread.")
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# I am the father thread.
线程锁(互斥锁Mutex)
一个进程下可以启动多个线程,多个线程共享父进程的内存空间,也就意味着每个线程可以访问同一份数据,此时,如果2个线程同时要修改同一份数据,会出现什么状况?
# -*- coding:UTF-8 -*-
import time
import threading
def addNum():
global num # 在每个线程中都获取这个全局变量
print('--get num:', num)
time.sleep(1)
num -= 1 # 对此公共变量进行-1操作
num = 100 # 设定一个共享变量
thread_list = []
for i in range(100):
t = threading.Thread(target=addNum)
t.start()
thread_list.append(t)
for t in thread_list: # 等待所有线程执行完毕
t.join()
print('final num:', num)
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# --get num: 100
# --get num: 100
# --get num: 100
# --get num: 100
# ...
# --get num: 100
# --get num: 100
# final num: 0
# D:\Python\python\python-2.7.13\Python27\python2.exe
# ('--get num:', 100)
# ('--get num:', 100)
# ('--get num:', 100)
# ('--get num:', 100)
# ('--get num:', (100'--get num:'),
# 100)
#
# ('--get num:', 100)
# ('--get num:', 100)
# ('--get num:', 100)
# ('final num:', 1)
正常来讲,这个num结果应该是0, 但在python 2.7上多运行几次,会发现,最后打印出来的num结果不总是0,为什么每次运行的结果不一样呢? 哈,很简单,假设你有A,B两个线程,此时都 要对num 进行减1操作, 由于2个线程是并发同时运行的,所以2个线程很有可能同时拿走了num=100这个初始变量交给cpu去运算,当A线程去处完的结果是99,但此时B线程运算完的结果也是99,两个线程同时CPU运算的结果再赋值给num变量后,结果就都是99。那怎么办呢? 很简单,每个线程在要修改公共数据时,为了避免自己在还没改完的时候别人也来修改此数据,可以给这个数据加一把锁, 这样其它线程想修改此数据时就必须等待你修改完毕并把锁释放掉后才能再访问此数据。
*注:不要在3.x上运行,不知为什么,3.x上的结果总是正确的,可能是自动加了锁
# -*- coding:UTF-8 -*-
import time
import threading
def addNum():
global num # 在每个线程中都获取这个全局变量
print('--get num:', num)
time.sleep(1)
lock.acquire() # 修改数据前加锁
num -= 1 # 对此公共变量进行-1操作
lock.release() # 修改后释放
num = 100 # 设定一个共享变量
thread_list = []
lock = threading.Lock() # 生成全局锁
for i in range(100):
t = threading.Thread(target=addNum)
t.start()
thread_list.append(t)
for t in thread_list: # 等待所有线程执行完毕
t.join()
print('final num:', num)
# D:\Python\python\python-2.7.13\Python27\python2.exe
# ('--get num:', 100)
# ('--get num:', 100)
# ...
# ('--get num:', 100)
# ('--get num:', 100)
# ('final num:', 0)
GIL VS Lock
那你又问了, 既然用户程序已经自己有锁了,那为什么C python还需要GIL呢?
加入GIL主要的原因是为了降低程序的开发的复杂度,比如现在的你写python不需要关心内存回收的问题,因为Python解释器帮你自动定期进行内存回收,你可以理解为python解释器里有一个独立的线程,每过一段时间它起wake up做一次全局轮询看看哪些内存数据是可以被清空的,此时你自己的程序 里的线程和 py解释器自己的线程是并发运行的,假设你的线程删除了一个变量,py解释器的垃圾回收线程在清空这个变量的过程中的clearing时刻,可能一个其它线程正好又重新给这个还没来及得清空的内存空间赋值了,结果就有可能新赋值的数据被删除了,为了解决类似的问题,python解释器简单粗暴的加了锁,即当一个线程运行时,其它人都不能动,这样就解决了上述的问题, 这可以说是Python早期版本的遗留问题。
RLock(递归锁)
import threading, time
def run1():
print("grab the first part data")
lock.acquire()
global num
num += 1
lock.release()
return num
def run2():
print("grab the second part data")
lock.acquire()
global num2
num2 += 1
lock.release()
return num2
def run3():
lock.acquire()
res = run1()
print('--------between run1 and run2-----')
res2 = run2()
lock.release()
print(res, res2)
if __name__ == '__main__':
num, num2 = 0, 0
lock = threading.RLock()
for i in range(10):
t = threading.Thread(target=run3)
t.start()
while threading.active_count() != 1:
print(threading.active_count())
else:
print('----all threads done---')
print(num, num2)
# D:\Python\python\python-2.7.13\Python27\python2.exe
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (grab the first part data1
# , --------between run1 and run2-----1
# )grab the second part data
#
# (2, 2)
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (3, 3)
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (4, 4)
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (5, 5)
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (6, 6grab the first part data)
#
# --------between run1 and run2-----
# grab the second part data
# (grab the first part data7
# , --------between run1 and run2-----7
# )grab the second part data
#
# (8, 8)grab the first part data
#
# --------between run1 and run2-----
# grab the second part data
# (9, 9)
# grab the first part data
# --------between run1 and run2-----
# grab the second part data
# (10, 10)
# 1
# ----all threads done---
# (10, 10)
#
Semaphore(信号量)
import threading, time
def run(n):
global num
semaphore.acquire()
time.sleep(1)
print("run the thread: %s\n" % n)
num += n;
semaphore.release()
if __name__ == '__main__':
num = 0
semaphore = threading.BoundedSemaphore(5) # 最多允许5个线程同时运行
for i in range(20):
t = threading.Thread(target=run, args=(i,))
t.start()
while threading.active_count() != 1:
# print ("threading.active_count():",threading.active_count())
pass
else:
print('----all threads done---')
print(num)
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# run the thread: 2
# run the thread: 0
# run the thread: 4
# run the thread: 1
#
# run the thread: 3
#
#
#
#
# run the thread: 6
# run the thread: 5
#
# run the thread: 7
#
# run the thread: 9
# run the thread: 8
#
#
#
# run the thread: 14
# run the thread: 12
#
# run the thread: 13
# run the thread: 11
#
#
#
# run the thread: 10
#
# run the thread: 19
# run the thread: 16
# run the thread: 17
#
#
# run the thread: 15
#
#
# run the thread: 18
#
# ----all threads done---
# 190
#
start()
method. The timer can be stopped (before its action has begun) by calling the cancel()
method. The interval the timer will wait before executing its action may not be exactly the same as the interval specified by the user.from threading import Timer
def hello():
print("hello, world")
t = Timer(3.0, hello)
t.start()
# after 3 seconds, "hello, world" will be printed
Events
event = threading.Event()
event.wait()
event.set()
event.clear()
通过Event来实现两个或多个线程间的交互:
import random
import threading
import time
def light():
if not event.isSet():
event.set() # wait就不阻塞 #绿灯状态
count = 0
while True:
print("count:",count)
if count < 10:
print('\033[42;1m--green light on---\033[0m')
elif count < 13:
print('\033[43;1m--yellow light on---\033[0m')
elif count < 20:
if event.isSet():
event.clear()
print('\033[41;1m--red light on---\033[0m')
else:
count = 0
event.set() # 打开绿灯
time.sleep(1)
count += 1
def car(n):
while 1:
time.sleep(random.randrange(10))
if event.isSet(): # 绿灯
print("car [%s] is running.." % n)
else:
print("car [%s] is waiting for the red light.." % n)
if __name__ == '__main__':
event = threading.Event()
Light = threading.Thread(target=light)
Light.start()
for i in range(3):
t = threading.Thread(target=car, args=(i,))
t.start()
#_*_coding:utf-8_*_
__author__ = 'Alex Li'
import threading
import time
import random
def door():
door_open_time_counter = 0
while True:
if door_swiping_event.is_set():
print("\033[32;1mdoor opening....\033[0m")
door_open_time_counter +=1
else:
print("\033[31;1mdoor closed...., swipe to open.\033[0m")
door_open_time_counter = 0 #清空计时器
door_swiping_event.wait()
if door_open_time_counter > 3:#门开了已经3s了,该关了
door_swiping_event.clear()
time.sleep(0.5)
def staff(n):
print("staff [%s] is comming..." % n )
while True:
if door_swiping_event.is_set():
print("\033[34;1mdoor is opened, passing.....\033[0m")
break
else:
print("staff [%s] sees door got closed, swipping the card....." % n)
print(door_swiping_event.set())
door_swiping_event.set()
print("after set ",door_swiping_event.set())
time.sleep(0.5)
door_swiping_event = threading.Event() #设置事件
door_thread = threading.Thread(target=door)
door_thread.start()
for i in range(5):
p = threading.Thread(target=staff,args=(i,))
time.sleep(random.randrange(3))
p.start()
Queue是python标准库中的线程安全的队列(FIFO)实现,提供了一个适用于多线程编程的先进先出的数据结构,即队列,用来在生产者和消费者线程之间的信息传递
class Queue.Queue(maxsize=0)
FIFO即First in First Out,先进先出。Queue提供了一个基本的FIFO容器,使用方法很简单,maxsize是个整数,指明了队列中能存放的数据个数的上限。一旦达到上限,插入会导致阻塞,直到队列中的数据被消费掉。如果maxsize小于或者等于0,队列大小没有限制。
举个栗子:
import queue
q = queue.Queue()
for i in range(5):
q.put(i)
while not q.empty():
print(q.get())
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 0
# 1
# 2
# 3
# 4
class Queue.LifoQueue(maxsize=0)
LIFO即Last in First Out,后进先出。与栈的类似,使用也很简单,maxsize用法同上
再举个栗子:
import queue
q = queue.LifoQueue()
for i in range(5):
q.put(i)
while not q.empty():
print(q.get())
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 4
# 3
# 2
# 1
# 0
可以看到仅仅是将Queue.Quenu类
替换为Queue.LifiQueue类
class Queue.PriorityQueue(maxsize=0)
构造一个优先队列。maxsize用法同上。
import Queue
import threading
class Job(object):
def __init__(self, priority, description):
self.priority = priority
self.description = description
print 'Job:',description
return
def __cmp__(self, other):
return cmp(self.priority, other.priority)
q = Queue.PriorityQueue()
q.put(Job(3, 'level 3 job'))
q.put(Job(10, 'level 10 job'))
q.put(Job(1, 'level 1 job'))
def process_job(q):
while True:
next_job = q.get()
print 'for:', next_job.description
q.task_done()
workers = [threading.Thread(target=process_job, args=(q,)),
threading.Thread(target=process_job, args=(q,))
]
for w in workers:
w.setDaemon(True)
w.start()
q.join()
# D:\Python\python\python-2.7.13\Python27\python2.exe
# Job: level 3 job
# Job: level 10 job
# Job: level 1 job
# for: level 1 job
# for: level 3 job
# for: level 10 job
意味着之前入队的一个任务已经完成。由队列的消费者线程调用。
每一个get()调用得到一个任务,接下来的task_done()调用告诉队列该任务已经处理完毕。
如果当前一个join()正在阻塞,
它将在队列中的所有任务都处理完时恢复执行(即每一个由put()调用入队的任务都有一个对应的task_done()调用)。
阻塞调用线程,直到队列中的所有任务被处理掉。
只要有数据被加入队列,未完成的任务数就会增加。
当消费者线程调用task_done()(意味着有消费者取得任务并完成任务),未完成的任务数就会减少。
当未完成的任务数降到0,join()解除阻塞。
将item放入队列中。
如果可选的参数block为True且timeout为空对象(默认的情况,阻塞调用,无超时)。
如果timeout是个正整数,阻塞调用进程最多timeout秒,如果一直无空空间可用,抛出Full异常(带超时的阻塞调用)。
如果block为False,如果有空闲空间可用将数据放入队列,否则立即抛出Full异常
其非阻塞版本为put_nowait等同于put(item, False)
从队列中移除并返回一个数据。block跟timeout参数同put方法
其非阻塞方法为`get_nowait()`相当与get(False)
如果队列为空,返回True,反之返回False
在并发编程中使用生产者和消费者模式能够解决绝大多数并发问题。该模式通过平衡生产线程和消费线程的工作能力来提高程序的整体处理数据的速度。
为什么要使用生产者和消费者模式
在线程世界里,生产者就是生产数据的线程,消费者就是消费数据的线程。在多线程开发当中,如果生产者处理速度很快,而消费者处理速度很慢,那么生产者就必须等待消费者处理完,才能继续生产数据。同样的道理,如果消费者的处理能力大于生产者,那么消费者就必须等待生产者。为了解决这个问题于是引入了生产者和消费者模式。
什么是生产者消费者模式
生产者消费者模式是通过一个容器来解决生产者和消费者的强耦合问题。生产者和消费者彼此之间不直接通讯,而通过阻塞队列来进行通讯,所以生产者生产完数据之后不用等待消费者处理,直接扔给阻塞队列,消费者不找生产者要数据,而是直接从阻塞队列里取,阻塞队列就相当于一个缓冲区,平衡了生产者和消费者的处理能力。
下面来学习一个最基本的生产者消费者模型的例子
import threading
import queue
def producer():
for i in range(10):
q.put("骨头 %s" % i)
print("开始等待所有的骨头被取走...")
q.join()
print("所有的骨头被取完了...")
def consumer(n):
while q.qsize() > 0:
print("%s 取到" % n, q.get())
q.task_done() # 告知这个任务执行完了
q = queue.Queue()
p = threading.Thread(target=producer, )
p.start()
c1 = consumer("李闯")
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 开始等待所有的骨头被取走...
# 李闯 取到 骨头 0
# 李闯 取到 骨头 1
# 李闯 取到 骨头 2
# 李闯 取到 骨头 3
# 李闯 取到 骨头 4
# 李闯 取到 骨头 5
# 李闯 取到 骨头 6
# 李闯 取到 骨头 7
# 李闯 取到 骨头 8
# 李闯 取到 骨头 9
# 所有的骨头被取完了...
multiprocessing
is a package that supports spawning processes using an API similar to the threading
module. The multiprocessing
package offers both local and remote concurrency, effectively side-stepping(回避) the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing
module allows the programmer to fully leverage( 杠杆作用; 优势,力量) multiple processors on a given machine. It runs on both Unix and Windows.
from multiprocessing import Process
import time
def f(name):
time.sleep(2)
print('hello', name)
if __name__ == '__main__':
p = Process(target=f, args=('bob',))
p.start()
p.join()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# hello bob
To show the individual process IDs involved, here is an expanded example:
from multiprocessing import Process
import os
def info(title):
print(title)
print('module name:', __name__)
print('parent process:', os.getppid())
print('process id:', os.getpid())
print("\n\n")
def f(name):
info('\033[31;1mfunction f\033[0m')
print('hello', name)
if __name__ == '__main__':
info('\033[32;1mmain process line\033[0m')
p = Process(target=f, args=('bob',))
p.start()
p.join()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# main process line
# module name: __main__
# parent process: 7652
# process id: 5576
#
#
#
# function f
# module name: __mp_main__
# parent process: 5576
# process id: 7888
#
#
#
# hello bob
#
Queues
from multiprocessing import Process, Queue
def f(q):
q.put([42, None, 'hello'])
if __name__ == '__main__':
q = Queue()
p = Process(target=f, args=(q,))
p.start()
print(q.get())
p.join()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# [42, None, 'hello']
Pipes
Pipe()
function returns a pair of connection objects connected by a pipe which by default is duplex (two-way)(有两部分的). For example:from multiprocessing import Process, Pipe
def f(conn):
conn.send([42, None, 'hello'])
conn.close()
if __name__ == '__main__':
parent_conn, child_conn = Pipe()
p = Process(target=f, args=(child_conn,))
p.start()
print(parent_conn.recv())
p.join()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# [42, None, 'hello']
Pipe()
represent the two ends of the pipe. send()
and recv()
methods (among others).
Manager()
controls a server process which holds Python objects and allows other processes to manipulate them using proxies.Manager()
will support types list
, dict
, Namespace
, Lock
, RLock
, Semaphore
, BoundedSemaphore
, Condition
, Event
, Barrier
, Queue
, Value
and Array
. For example,from multiprocessing import Process, Manager
def f(d, l):
d[1] = '1'
d['2'] = 2
d[0.25] = None
l.append("A")
print(l)
if __name__ == '__main__':
with Manager() as manager:
d = manager.dict()
l = manager.list(range(5))
p_list = []
for i in range(10):
p = Process(target=f, args=(d, l))
p.start()
p_list.append(p)
for res in p_list:
res.join()
print(d)
print(l)
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# [0, 1, 2, 3, 4, 'A']
# [0, 1, 2, 3, 4, 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']
# {1: '1', '2': 2, 0.25: None}
# [0, 1, 2, 3, 4, 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']
进程同步
from multiprocessing import Process, Lock
def f(l, i):
l.acquire()
try:
print('hello world', i)
finally:
l.release()
if __name__ == '__main__':
lock = Lock()
for num in range(10):
Process(target=f, args=(lock, num)).start()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# hello world 9
# hello world 7
# hello world 8
# hello world 6
# hello world 3
# hello world 1
# hello world 2
# hello world 5
# hello world 4
# hello world 0
import multiprocessing
import os
import time
from datetime import datetime
def subprocess(number):
# 子进程
print('这是第%d个子进程' % number)
pid = os.getpid() # 得到当前进程号
print('当前进程号:%s,开始时间:%s' % (pid, datetime.now().isoformat()))
time.sleep(30) # 当前进程休眠30秒
print('当前进程号:%s,结束时间:%s' % (pid, datetime.now().isoformat()))
def Bar(arg):
print('-->exec done:', arg)
def mainprocess():
# 主进程
print('这是主进程,进程编号:%d' % os.getpid())
t_start = datetime.now()
pool = multiprocessing.Pool()
for i in range(8):
pool.apply_async(subprocess, args=(i,), callback=Bar)
pool.close()
pool.join()
t_end = datetime.now()
print('主进程用时:%d毫秒' % (t_end - t_start).microseconds)
if __name__ == '__main__':
# 主测试函数
mainprocess()
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# 这是主进程,进程编号:11224
# 这是第0个子进程
# 当前进程号:10640,开始时间:2017-08-10T08:34:36.821712
# 这是第1个子进程
# 当前进程号:10076,开始时间:2017-08-10T08:34:36.850713
# 这是第2个子进程
# 当前进程号:10996,开始时间:2017-08-10T08:34:36.859714
# 这是第3个子进程
# 当前进程号:10720,开始时间:2017-08-10T08:34:36.904716
# 当前进程号:10640,结束时间:2017-08-10T08:35:06.822428
# 这是第4个子进程
# 当前进程号:10640,开始时间:2017-08-10T08:35:06.822428
# -->exec done: None
# 当前进程号:10076,结束时间:2017-08-10T08:35:06.851429
# 这是第5个子进程
# 当前进程号:10076,开始时间:2017-08-10T08:35:06.851429
# -->exec done: None
# 当前进程号:10996,结束时间:2017-08-10T08:35:06.860430
# 这是第6个子进程
# 当前进程号:10996,开始时间:2017-08-10T08:35:06.860430
# -->exec done: None
# 当前进程号:10720,结束时间:2017-08-10T08:35:06.905432
# -->exec done: None
# 这是第7个子进程
# 当前进程号:10720,开始时间:2017-08-10T08:35:06.905432
# 当前进程号:10640,结束时间:2017-08-10T08:35:36.823144
# -->exec done: None
# 当前进程号:10076,结束时间:2017-08-10T08:35:36.852145
# -->exec done: None
# 当前进程号:10996,结束时间:2017-08-10T08:35:36.861146
# -->exec done: None
# 当前进程号:10720,结束时间:2017-08-10T08:35:36.906148
# -->exec done: None
# 主进程用时:417456毫秒
#
1、新建单一进程
如果我们新建少量进程,可以如下:
import multiprocessing
import time
def func(msg):
for i in range(3):
print(msg)
time.sleep(1)
if __name__ == "__main__":
p = multiprocessing.Process(target=func, args=("hello",))
p.start()
p.join()
print("Sub-process done.")
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# hello
# hello
# hello
# Sub-process done.
2、使用进程池
是的,你没有看错,不是线程池。它可以让你跑满多核CPU,而且使用方法非常简单。
import multiprocessing
import time
def func(msg):
for i in range(3):
print(msg)
time.sleep(1)
print("++++++++++")
if __name__ == "__main__":
pool = multiprocessing.Pool(processes=4)
for i in range(10):
msg = "hello %d" % (i)
pool.apply_async(func, (msg,))
pool.close()
pool.join()
print("Sub-process(es) done.")
# D:\Python\python\python-3.6.1\Python36-64\python.exe
# hello 0
# hello 1