Last week, during a stress test, a function that calls the hiredis library crashed with a core dump. The call stack was as follows:
    Program terminated with signal 11, Segmentation fault.
    #0  0x000000000052c497 in wh::common::redis::RedisConn::HashMultiGet(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >&) ()
    (gdb) bt
    #0  0x000000000052c497 in wh::common::redis::RedisConn::HashMultiGet(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >&) ()
    #1  0x00000000004cc418 in wh::server::user_activityHandler::getUserRating(wh::server::GetUserRatingResult&, std::vector<int, std::allocator<int> > const&) ()
    #2  0x00000000004e54cf in wh::server::user_activityProcessor::process_getUserRating(int, apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, void*) ()
    #3  0x00000000004e3ad3 in wh::server::user_activityProcessor::dispatchCall(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
The code of HashMultiGet in RedisConn is as follows:
    int RedisConn::HashMultiGet( const string& key,
                                 const vector<string>& fields,
                                 map<string, string>& fvs)
    {
        if(key.empty() || fields.empty()) return 0;
        if ( !conn_ )
        {
            LOG(LOG_ERR, "ERROR!!! conn is NULL!!!");
            return kErrConnBroken;
        }

        size_t argc = fields.size() + 2;
        const char* argv[argc];   // allocated directly on the stack
        size_t argvlen[argc];

        std::string cmd = "HMGET";
        argv[0] = cmd.data();
        argvlen[0] = cmd.length();
        argv[1] = key.data();
        argvlen[1] = key.length();

        size_t i = 2;
        for(vector< string >::const_iterator cit = fields.begin(); cit != fields.end(); ++cit )
        {
            // put value into arg list
            argv[i] = cit->data();
            argvlen[i] = cit->length();
            ++i;
        }

        redisReply* reply = static_cast<redisReply*>( redisCommandArgv( conn_, argc, argv, argvlen ) );
        if ( !reply )
        {
            this->Release();
            LOG(LOG_ERR, "ERROR!!! Redis connection broken!!!");
            return kErrConnBroken;
        }

        int32_t ret = kErrOk;
        if ( reply->type != REDIS_REPLY_ARRAY )
        {
            this->CheckReply( reply );
            LOG(LOG_ERR, "RedisReply ERROR: %d %s", reply->type, reply->str);
            ret = kErrUnknown;
        }
        ...
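The problem is in how the redisCommandArgv request for hiredis is built: both argument arrays are allocated directly on the stack, and their size depends on the number of fields.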
    const char* argv[argc];   // allocated directly on the stack
    size_t argvlen[argc];
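During the stress test, fields in HashMultiGet(key, fields, fvs) held more than 100,000 elements, so the memory allocated on the stack was 100,000 * (8 + 8) = 1,600,000 bytes = 1.6 MB (on a 64-bit system). Added to the stack space already in use, this overflowed the stack and caused the core dump.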
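The arithmetic above can be checked with a small sketch. The helper name ArgvStackBytes is hypothetical and not part of the original code; it assumes only that each field costs one pointer plus one size_t, and that two extra slots are used for the command name and the key.

```cpp
#include <cstddef>

// Hypothetical helper (not in the original code): bytes consumed by the two
// stack arrays `argv` and `argvlen` for a given number of fields.
// The +2 accounts for the "HMGET" command name and the key.
constexpr std::size_t ArgvStackBytes(std::size_t field_count) {
    const std::size_t argc = field_count + 2;
    return argc * (sizeof(const char*) + sizeof(std::size_t));
}
```

On a platform with 8-byte pointers and 8-byte size_t, ArgvStackBytes(100000) is slightly above the 1.6 MB estimated above, because of the two extra slots. Whether that overflows depends on the stack limit of the calling thread, which the post does not state.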
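Why were the arguments allocated on the stack in the first place? One possible reason: with heap allocation, you have to take care of freeing the memory.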
Solution:
Allocate argv and argvlen on the heap; after all, the heap is far larger than the stack.
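One way to do this without having to worry about free is to let std::vector own the two arrays. The following is only a minimal sketch under that assumption, not the fix actually applied to RedisConn; the names HmgetArgs and BuildHmgetArgs are hypothetical and introduced purely for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container (not in the original code): both argument arrays
// live on the heap and are released automatically by the vector destructors.
struct HmgetArgs {
    std::vector<const char*> argv;
    std::vector<std::size_t> argvlen;
};

// Builds the HMGET argument arrays. The stored pointers refer to the storage
// of `key` and `fields`, so both must outlive the returned HmgetArgs.
inline HmgetArgs BuildHmgetArgs(const std::string& key,
                                const std::vector<std::string>& fields) {
    static const std::string kCmd = "HMGET";

    HmgetArgs args;
    args.argv.reserve(fields.size() + 2);
    args.argvlen.reserve(fields.size() + 2);

    args.argv.push_back(kCmd.data());
    args.argvlen.push_back(kCmd.length());
    args.argv.push_back(key.data());
    args.argvlen.push_back(key.length());

    for (const std::string& field : fields) {
        args.argv.push_back(field.data());
        args.argvlen.push_back(field.length());
    }
    return args;
}
```

Inside HashMultiGet, the arrays could then be passed on as redisCommandArgv(conn_, static_cast<int>(args.argv.size()), args.argv.data(), args.argvlen.data()). The stack frame now holds only the two vector objects, regardless of how many fields are requested.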