背景:又要作狗血的数据迁移、数据清洗,每次面对此类需求,内心都会把pm祖宗老老少小都问候个遍。解决办法,要么用java写一大堆支撑代码,要么在vm上写蹩脚的shell,都很不爽。最近尝试了下python解决此类问题,有点小爽,心中安喜 : "BB,我不再怕pm这些脑残需求了"。php
环境准备: Ubuntu 13.04。java
python #
建议至少安装Python2.7/3.2版本,毕竟同Python 2.X/3.x仍是有区别的
sudo apt-get install python2.7 python2.7-dev
mysql
#安装libssl和libevent编译环境
sudo apt-get install build-essential libssl-dev libevent-dev libjpeg-dev libxml2-dev libxslt-dev
c++
#安装mysqldbsql
sudo easy_install mysql-pythonshell
#测试数据库
whereis python | python -V数组
Python开始: 有了上面的环境准备,就能够书写pthon了。建立python文件,touch firstPython.py。文件名是firstPython,扩展名是py。编辑此文件,相似php、java,python也有本身的函数库。bash
主方法,python文件被执行的入口,
读取主方法传入的参数,sys.argv返回的是一个参数数组,sys.argv[index]。
定义成员方法,在主方法中调用执行。
profile是成员方法的参数,因为python是弱语言类型,因此变量不须要声明类型,这点有别于强语言类型c++、java。
因为个人需求背景是数据迁移,因此在python中有效的访问数据库很重要。
看到这个数据库握手链接、fetch数据,是否是感到很方便、简洁,和php同样的类库风格。
python数据结构,内置类型
一、list:列表(动态数组, c++标准库的vector,能够在一个列表中包含不一样类型的元素)
列表下标从0开始,-1是最后一个元素。取list的元素数量:len(list)。
建立连续的list
L.append(var) #追加元素
L.insert(index,var)
L.pop(var) #返回最后一个元素,并从list中删除
L.remove(var) #删除第一次出现的该元素
L.count(var) #该元素在列表中出现的个数
L.index(var) #该元素的位置,无则抛异常
L.extend(list) #追加list,即合并list到L上
L.sort() #排序
L.reverse() #倒序
二、dictionary 字典(c++标准库的map)
每个元素是一个pair键值对,key是Integer或String类型,value是任意类型。
dictionary的方法:
D.get(key, 0) #同dict[key],多了个没有则返回缺省值,0。[]没有则抛异常
D.has_key(key) #有该键返回TRUE,不然FALSE
D.keys() #返回字典键的列表
D.values() #以列表的形式返回字典中的值,返回值的列表中可包含重复元素
D.items() #将全部的字典项以列表方式返回,这些列表中的每一项都来自于(键,值),可是项在返回时并无特殊的顺序
D.update(dict2) #增长合并字典
D.popitem() #获得一个pair,并从字典中删除它。已空则抛异常
D.clear() #清空字典,同del dict
D.copy() #拷贝字典
D.cmp(dict1,dict2) #比较字典,(优先级为元素个数、键大小、键值大小) 第一个大返回1,小返回-1,同样返回0
dictionary的复制
dict1 = dict #别名
dict2=dict.copy() #克隆,即另外一个拷贝。
示例代码:
import MySQLdbimport timeimport datetimeimport sys,oshms_connections = {}transfer_connections = {}totalResult = []def queryFromHms(): print "query from hms beginning..." db=MySQLdb.connect(host=hms_connections.get('host'),user=hms_connections.get('user'),passwd=hms_connections.get('passwd'),db=hms_connections.get('db'),port=hms_connections.get('port')) try: cursor = db.cursor() resultPerDay = {} sql = "select a.user_id,a.hotel_id,a.parent_group_id from hotel_sub_account a inner join lm_transfer_hotel b on a.hotel_id = b.hotel_id and b.QTA_STATUS=1" print sql cursor.execute(sql) results = cursor.fetchall() for row in results: print row user_id = row[0] hotel_id = row[1] parent_group_id = row[2] totalResult.append({"user_id":user_id, "hotel_id" : hotel_id, "parent_group_id" : parent_group_id}) cursor.close(); finally: db.close(); print "function queryFromHms to close db connection...";def queryFromTransfer(): print "query from transfer beginning..." db=MySQLdb.connect(host=transfer_connections.get('host'),user=transfer_connections.get('user'),passwd=transfer_connections.get('passwd'),db=transfer_connections.get('db'),port=transfer_connections.get('port')) try: for row in totalResult: cursor = db.cursor() sql = "select qta_id,hms_id from mapping_hms_qta_price where hms_level=1 and qta_level=1 and hms_id = %s" %(row.get('parent_group_id')) print sql cursor.execute(sql) results = cursor.fetchall() for subrow in results: print subrow row["qta_id"]= subrow[0] cursor.close(); finally: db.close(); print "function queryFromTransfer to close db connection...";def outputSupplierAccount(): print "output sql to supplier_account..." upgradeSql = "insert into supplier_account (`supplier_id`, `account`, `create_time`, `is_delete`) values(%(qta_id)s, '%(user_id)s', now(), 0); \n" callbackSql = "delete from supplier_account where supplier_id = %(qta_id)s and account = '%(user_id)s'; \n" upgradeFile = open("qta_upgrade.sql", "w") callbackFile = open("qta_callback.sql", "w") for row in totalResult: upgradeFile.write(upgradeSql%row) callbackFile.write(callbackSql%row) upgradeFile.close() upgradeFile.close()def outputUserHotelMapping(): print "output sql to eb_auth_user_hotel_mapping..." upgradeSql = "insert into eb_auth_user_hotel_mapping (`user_name`, `hotel_id`, `create_time`, `hotel_seq`, `supplier_id`, `group_id`) values('%(user_id)s', '', now(), '', %(qta_id)s, %(parent_group_id)s); \n" callbackSql = "delete from eb_auth_user_hotel_mapping where `user_name`='%(user_id)s' and `supplier_id`=%(qta_id)s and `group_id`=%(parent_group_id)s; \n" upgradeFile = open("hms_upgrade.sql", "w") callbackFile = open("hms_callback.sql", "w") for row in totalResult: upgradeFile.write(upgradeSql%row) callbackFile.write(callbackSql%row) upgradeFile.close() upgradeFile.close()def outputUserUriMapping(): print "output sql to eb_auth_user_uri_mapping..." upgradeFile = open("hms_upgrade.sql", "a") callbackFile = open("hms_callback.sql", "a") uris = [1,2,3,5,6,7,8,9,10,21,22,24,34,35,36,37,40,41,42,43,44,46,47,49,50,54,55,56,57,58,59,60,61,62,63,76,77,78,79] for row in totalResult: for uri in uris: upgradeSql = "insert into eb_auth_user_uri_mapping(`user_name`, `uri_id`, `create_time`) values('%s', %s, now()); \n" %(row['user_id'], uri) callbackSql = "delete from eb_auth_user_uri_mapping where user_name='%s' and uri_id=%s; \n" %(row['user_id'], uri) upgradeFile.write(upgradeSql%row) callbackFile.write(callbackSql%row) upgradeFile.close() upgradeFile.close()def configDbProfile(profile): print "current DB profile is %s" %(profile) if profile == "beta": hms_connections['host'] = "" hms_connections['user'] = "" hms_connections['passwd'] = "" hms_connections['db'] = "" hms_connections['port'] = 3306 transfer_connections['host'] = "" transfer_connections['user'] = "" transfer_connections['passwd'] = "" transfer_connections['db'] = "data_transfer" transfer_connections['port'] = 3306 elif profile == "product": hms_connections['host'] = "" hms_connections['user'] = "" hms_connections['passwd'] = "" hms_connections['db'] = "hms" hms_connections['port'] = 3307 transfer_connections['host'] = "" transfer_connections['user'] = "" transfer_connections['passwd'] = "" transfer_connections['db'] = "" transfer_connections['port'] = 3307 elif profile == "productb": hms_connections['host'] = "" hms_connections['user'] = "" hms_connections['passwd'] = "" hms_connections['db'] = "hms" hms_connections['port'] = 3307 transfer_connections['host'] = "" transfer_connections['user'] = "" transfer_connections['passwd'] = "" transfer_connections['db'] = "data_transfer" transfer_connections['port'] = 3308 else: print "input parameter invalid, choose (beta | product | productb)" sys.exit(0)if __name__ == '__main__': if len(sys.argv) != 2: print "please input parameter : (beta | product | productb)" sys.exit(0) profile = sys.argv[1] configDbProfile(profile) queryFromHms(); queryFromTransfer(); outputSupplierAccount(); outputUserHotelMapping(); outputUserUriMapping();