Python memory leaks

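1、Python has automatic garbage collection (the interpreter frees an object's memory as soon as its reference count drops to zero), so a memory leak usually comes from a leaking extension library or from reference cycles (a third case is objects that are never removed from a global container).

The first needs no discussion; an example of the second follows (the memory of Obj('B') and Obj('c') is never reclaimed). (It seems the Python interpreter does reclaim memory caught in reference cycles on its own (mark-and-sweep garbage collection); it is only a matter of when, which means we need not spend effort deliberately avoiding reference cycles in our code. The exception, on the Python 2 interpreter used here, is a cycle whose objects define __del__: the collector will not free those, which is why B and c are never deleted in the run below. I will read up on the details over the next couple of days (http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works ; not having finished the garbage-collection chapter of 源码剖析 really nags at me) ---2013.10.20)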

[dongsong@localhost python_study]$ cat leak_test2.py 
#encoding=utf-8

class Obj:
    def __init__(self,name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name

if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')

    # b and c now reference each other, forming a reference cycle
    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py 
A inited
B inited
c inited
A deleted
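
The run above is Python 2. As a minimal sketch of the same situation on Python 3 (an assumption; the original script is not Python 3), the snippet below uses weak references to watch a cycle survive until the cycle detector runs. The names Node and make_cycle are made up for this illustration.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    """Build two nodes that reference each other; return weak refs to both."""
    b, c = Node("B"), Node("C")
    b.other, c.other = c, b
    # Weak references do not keep the nodes alive; they only let us observe them.
    return weakref.ref(b), weakref.ref(c)


gc.disable()                      # keep the automatic collector out of the way
try:
    ref_b, ref_c = make_cycle()
    # The local names b and c are gone, but each node still holds the other,
    # so reference counting alone cannot free them.
    alive_before = ref_b() is not None and ref_c() is not None
    found = gc.collect()          # run the cycle detector by hand
    alive_after = ref_b() is not None or ref_c() is not None
finally:
    gc.enable()

print("alive before collect:", alive_before)
print("alive after collect: ", alive_after)
print("unreachable objects found:", found)
```

The exact number returned by gc.collect() depends on the interpreter version, so the sketch only prints it.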


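2、The objgraph module

This module can find the object types that are growing fastest and the types with the most live instances. It can draw the graph of everything an object refers to and the graph of everything that refers to an object, and it can fetch an object by its address.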

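Still, hunting a memory leak with it feels a bit like looking for a needle in a haystack: you have to pick out the suspects yourself from the logs of the fastest-growing and most numerous types (usually these are common types such as list/dict/tuple, which are hard to trace; if the top entries are unusual user-defined types, the cause is much easier to pin down).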
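
To make the "fastest-growing types" idea concrete, here is a rough standard-library sketch of the kind of comparison show_growth() reports. It is not objgraph's implementation; count_by_type, growth, Session and leaked are hypothetical names, and it assumes Python 3.

```python
import gc
from collections import Counter


def count_by_type():
    """Count every object the collector tracks, keyed by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=3):
    """Return (type name, current count, increase) for the fastest growers."""
    deltas = [(name, after[name], after[name] - before[name])
              for name in after if after[name] > before[name]]
    deltas.sort(key=lambda row: row[2], reverse=True)
    return deltas[:limit]


class Session:
    """Hypothetical application object that we suspect is leaking."""


leaked = []                          # stand-in for a forgotten global container
before = count_by_type()
leaked.extend(Session() for _ in range(1000))
after = count_by_type()

for name, total, delta in growth(before, after):
    print("%-12s %8d %+8d" % (name, total, delta))
```

A user-defined type such as Session stands out immediately in this listing, whereas growth in list, dict or tuple says little about who is responsible.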

1.show_refs() show_backrefs() show_most_common_types() show_growth()

[dongsong@localhost python_study]$ !cat
cat objgraph1.py 
#encoding=utf-8
import objgraph

if __name__ == '__main__':
        x = []
        y = [x, [x], dict(x=x)]
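        objgraph.show_refs([y], filename='/tmp/sample-graph.png') # draw everything the objects in [y] refer to
        objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png') # draw everything that refers to x
        #objgraph.show_most_common_types() # counts for the most common types; too much data to be of much use
        objgraph.show_growth(limit=4) # print the types whose counts grew since program start or since the previous show_growth call (sorted by increase)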
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py 
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple                          3393     +3393
wrapper_descriptor              945      +945
function                        830      +830
builtin_function_or_method      622      +622

sample-graph.png

sample-backref-graph.png

2.show_chain()

[dongsong@localhost python_study]$ cat objgraph2.py 
#encoding=utf-8
import objgraph, inspect, random

class MyBigFatObject(object):
        pass

def computate_something(_cache = {}):
        _cache[42] = dict(foo=MyBigFatObject(),bar=MyBigFatObject())
        x = MyBigFatObject()

if __name__ == '__main__':
        objgraph.show_growth(limit=3)
        computate_something()
        objgraph.show_growth(limit=3)
        objgraph.show_chain(
                objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                        inspect.ismodule),
                filename = '/tmp/chain.png')
        #roots = objgraph.get_leaking_objects()
        #print 'len(roots)=%d' % len(roots)
        #objgraph.show_most_common_types(objects = roots)
        #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py 
tuple                  3400     +3400
wrapper_descriptor      945      +945
function                831      +831
wrapper_descriptor      956       +11
tuple                  3406        +6
member_descriptor       165        +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png

chain.png
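
The chain drawn above ends at the mutable default argument _cache. As a small sketch of how one link of such a chain can be checked by hand with the standard library alone (assuming Python 3; BigObject and compute are stand-ins for the names in objgraph2.py):

```python
import gc


class BigObject:
    """Stand-in for an expensive object we expected to be freed."""


def compute(_cache={}):
    # The default dict is created once, when the function is defined, and
    # lives as long as the function does, so anything stored in it stays alive.
    _cache[42] = BigObject()


compute()

# Pick up the surviving instance, the way objgraph.by_type() would.
suspect = next(o for o in gc.get_objects() if isinstance(o, BigObject))

# One step back along the reference chain: which containers hold the suspect?
cache = compute.__defaults__[0]
held_by_cache = any(r is cache for r in gc.get_referrers(suspect))
print("held by the default-argument dict:", held_by_cache)
```

gc.get_referrers() only returns direct referrers, so following a full chain back to a module means repeating this step, which is the work find_backref_chain() does for you.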


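3、The gc module

This module can identify the objects the garbage collector finds unreachable and the ones it cannot free (uncollectable), which gives it strengths that objgraph does not have.

gc.collect() forces a garbage collection and returns the number of unreachable objects it found.

gc.garbage is the list of uncollectable objects among the unreachable ones (all of them objects that have a __del__() finalizer and are caught in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed.

Warning: if you turn off automatic collection with gc.disable() and then never call gc.collect() yourself, you will watch memory being eaten up rapidly.... (Only cyclic garbage piles up this way; reference counting still frees everything else.)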

[dongsong@bogon python_study]$ cat gc_test.py 
#encoding=utf-8

import gc

class MyObj:
        def __init__(self, name):
                self.name = name
                print "%s inited" % self.name
        def __del__(self):
                print "%s deleted" % self.name


if __name__ == '__main__':
        gc.disable()
        gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)

        a = MyObj('a')
        b = MyObj('b')
        c = MyObj('c')
        a.attr = b
        b.attr = a
        a = None
        b = None
        c = None

        if gc.isenabled():
                print 'automatic collection is enabled'
        else:
                print 'automatic collection is disabled'

        rt = gc.collect()
        print "%d unreachable" % rt

        garbages = gc.garbage
        print "\n%d garbages:" % len(garbages)
        for garbage in garbages:
                if isinstance(garbage, MyObj):
                        print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
                else:
                        print str(garbage)


[dongsong@bogon python_study]$ vpython gc_test.py 
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable

4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}
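
The output above comes from Python 2. On Python 3.4 and later (PEP 442), cycles whose objects define __del__ are finalized and freed like any other cycle, so gc.garbage normally stays empty unless DEBUG_SAVEALL is set. The following is a sketch of that behaviour under the assumption of a Python 3.4+ interpreter; the class attribute deleted and the helper make_cycle are additions for this illustration only.

```python
import gc


class MyObj:
    deleted = []                       # names recorded by __del__, for inspection

    def __init__(self, name):
        self.name = name

    def __del__(self):
        MyObj.deleted.append(self.name)


def make_cycle(x, y):
    """Create two objects that reference each other, then drop both names."""
    a, b = MyObj(x), MyObj(y)
    a.attr, b.attr = b, a


gc.disable()
gc.collect()                           # start from a clean slate

make_cycle("a", "b")
gc.collect()
print("finalized:", sorted(MyObj.deleted))
print("gc.garbage entries:", len(gc.garbage))

gc.set_debug(gc.DEBUG_SAVEALL)         # keep unreachable objects instead of freeing them
make_cycle("c", "d")
gc.collect()
saved = [g.name for g in gc.garbage if isinstance(g, MyObj)]
print("saved by DEBUG_SAVEALL:", sorted(saved))

gc.set_debug(0)
gc.garbage.clear()                     # drop the saved references again
gc.enable()
```

Remember to clear gc.garbage after inspecting it with DEBUG_SAVEALL, otherwise the list itself keeps the objects alive.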

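4、The pdb module

Detailed manual: http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/

The commands are much like gdb's (except that you do not have to prefix an expression with p to print it, and the debugging prompt behaves much like the Python interactive shell)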

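h(elp)  help

c(ontinue)  continue execution

n(ext)  execute the next statement

s(tep)  step (follows into function calls)

b(reak)  set a breakpoint

l(ist)  show source code

bt  show the call stack

Enter  repeat the previous command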

....

I like to drop pdb.set_trace() in wherever I need to debug and go from there.... (there are plenty of other ways to start it as well)
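
A minimal sketch of that habit, assuming Python 3. The function average and the environment variable DEBUG_AVERAGE are hypothetical; the switch is only there so the breakpoint does not fire in normal runs.

```python
import os
import pdb


def average(values):
    """Return the arithmetic mean of a non-empty sequence."""
    total = sum(values)
    if os.environ.get("DEBUG_AVERAGE"):    # hypothetical opt-in switch
        # Execution stops here and the (Pdb) prompt appears; inspect total,
        # values and so on, then type c to continue.
        pdb.set_trace()
    return total / len(values)


print(average([1, 2, 3]))
```

Run it as DEBUG_AVERAGE=1 python script.py to land at the prompt; without the variable the function behaves normally.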


5、Django memory leaks

Why is Django leaking memory?

Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.

(The queries are saved in django.db.connection.queries. See How can I see the raw SQL queries Django is running?.)

To fix the problem, set DEBUG to False.

If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:

from django import db
db.reset_queries()
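
In a long-running script or loop, one way to apply this is to reset the query list every so often. The helper below is only a sketch: process_in_batches, handle and reset are hypothetical names, and the reset callable is passed in so the example runs without Django installed. In a real project you would pass django.db.reset_queries.

```python
def process_in_batches(items, handle, reset, every=1000):
    """Call handle(item) for each item and reset() after every `every` items.

    Returns the number of items processed.
    """
    done = 0
    for done, item in enumerate(items, start=1):
        handle(item)
        if done % every == 0:
            reset()            # e.g. django.db.reset_queries in a Django project
    return done


# Stand-ins for demonstration: a no-op handler and a reset that records calls.
resets = []
processed = process_in_batches(range(2500),
                               handle=lambda row: None,
                               reset=lambda: resets.append(1),
                               every=1000)
print(processed, "rows,", len(resets), "resets")
```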