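Projects with networking features often use gethostbyname to resolve a domain name to an IP address. The function blocks while it performs the DNS lookup, so when the network is unreachable the caller stalls until the resolver gives up. That may be tolerable in a simple single-process program, but in a multithreaded program it can end up stalling the whole process, which is rarely the desired behavior. I ran into exactly this problem in a recent project and worked out a timeout for the blocking call. It may not be perfect, but it works in testing.
The implementation uses the timer signal raised by alarm together with siglongjmp to break out of the blocked gethostbyname call. Because threads and signals interact in complicated ways, the implementation is a little involved, and a few points need attention; they come up as the code is walked through.
Now for the code. First, some static variables and the signal handler: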
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <netdb.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <resolv.h>
    #include <arpa/nameser.h>
    #include <errno.h>
    #include <setjmp.h>
    #include <time.h>
    #include <sys/time.h>
    #include <signal.h>
    #include <pthread.h>

    #define RET_FAILURE (-1)
    #define RET_SUCCESS 0

    #define PLOG(level,format,args...) \
        do{printf("[%s]",#level);printf(format,##args);}while(0)

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static sigjmp_buf jmpbuf;
    static volatile sig_atomic_t canjump;

    static void alarm_handle(int signo)
    {
        /* jmpbuf has not been set up yet: do nothing */
        if (canjump == 0)
            return;
        canjump = 0;
        siglongjmp(jmpbuf, 1);
    }
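The mutex ensures that only one thread calls gethostbyname at a time. The atomic variable canjump ensures that sigsetjmp has already filled in the jump buffer jmpbuf before siglongjmp is allowed to jump.
Below is the wrapper around gethostbyname: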
    int gethostbyname_proc2(char *name, char *ip)
    {
        int ret = RET_SUCCESS;
        struct hostent *host = NULL;
        int timeout = 5;

        if (name == NULL || ip == NULL) {
            PLOG(ERR, "invalid params!\n");
            return RET_FAILURE;
        }

        pthread_mutex_lock(&lock);

        /* let this thread receive SIGALRM for the duration of the call */
        sigset_t mask, oldmask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGALRM);
        pthread_sigmask(SIG_UNBLOCK, &mask, &oldmask);

    #if 1
        signal(SIGALRM, alarm_handle);
        alarm(timeout);
        if (sigsetjmp(jmpbuf, 1) != 0) {
            /* reached via siglongjmp() from alarm_handle: timed out */
            PLOG(ERR, "gethostbyname timeout\n");
            alarm(0);
            signal(SIGALRM, SIG_IGN);
            pthread_mutex_unlock(&lock);
            pthread_sigmask(SIG_SETMASK, &oldmask, NULL);
            return RET_FAILURE;
        }
        canjump = 1; /* sigsetjmp() is ok */
    #endif

        res_init(); /* clear dns_cache */
        host = gethostbyname(name);

        /* test only: simulate a gethostbyname call that never returns */
        int i = 0;
        while (1) {
            printf(">>>i=%d\n", i++); //host = NULL;
            sleep(1);
        }

        /* cancel signal handle if return */
        alarm(0); // cancel timer
        canjump = 0; // jmpbuf must not be used after this call returns
        signal(SIGALRM, SIG_IGN);

        if (host == NULL) { // use h_errno not errno variable
            PLOG(ERR, "get host %s err:%s!\n", name, hstrerror(h_errno));
            ret = RET_FAILURE;
        } else { // only get the first ipv4 addr if host has many ipv4 addrs
            inet_ntop(AF_INET, (struct in_addr *)host->h_addr, ip, INET_ADDRSTRLEN);
            PLOG(DBG, "gethostbyname %s success,ip:%s!\n", name, ip);
            ret = RET_SUCCESS;
        }

        pthread_sigmask(SIG_SETMASK, &oldmask, NULL);
        pthread_mutex_unlock(&lock);

        return ret;
    }
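The function first unblocks SIGALRM in the calling thread so that the thread can receive it, then sets up the jump buffer and the handling that runs after a timeout. The while(1) block only simulates a gethostbyname call that blocks past the timeout (as if the network were down) and is there purely for testing. Once gethostbyname succeeds, the timer is cancelled and the address is converted to a string. The reentrant inet_ntop is used here in place of inet_ntoa.
Test thread and main program: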
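To illustrate why inet_ntop suits threaded code, here is a small sketch that formats an IPv4 address into a buffer owned by the caller instead of the static buffer that inet_ntoa returns. The helper name ipv4_to_string is hypothetical and not part of the article's code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: format 4 raw address bytes (network byte order)
 * into `out`. Returns 0 on success, -1 if the buffer is too small. */
int ipv4_to_string(const unsigned char raw[4], char *out, size_t outlen)
{
    struct in_addr addr;

    memcpy(&addr, raw, sizeof(addr));
    /* inet_ntop writes only into the caller's buffer, so two threads
     * formatting addresses at the same time do not overwrite each other */
    if (inet_ntop(AF_INET, &addr, out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for an IPv4 address in dotted form, which is why the wrapper above passes that constant.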
    void *get_host_addr(void *arg)
    {
        int ret = 0;
        char name[32] = "www.baidu.com";
        char ip[16] = {0};

        while (1) {
            printf("++++++++++++++[%s]time1 = %lu +++++++++++\n",
                   (char *)arg, (unsigned long)time(NULL));
            ret = gethostbyname_proc2(name, ip);
            printf("++++++++++++++[%s]time2 = %lu +++++++++++\n",
                   (char *)arg, (unsigned long)time(NULL));
            usleep(100000);
        }

        return (void *)(long)ret;
    }

    int main(int argc, char *argv[])
    {
        int ret = 0;
        pthread_t tid1, tid2;

        /* block SIGALRM before creating threads so they inherit the mask */
        sigset_t mask, oldmask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGALRM);
        pthread_sigmask(SIG_BLOCK, &mask, &oldmask);

        pthread_create(&tid1, NULL, get_host_addr, "T_11");
        pthread_create(&tid2, NULL, get_host_addr, "T_22");

        pthread_join(tid1, NULL);
        pthread_join(tid2, NULL);

        pthread_sigmask(SIG_SETMASK, &oldmask, NULL);

        return ret;
    }

Two threads are created that keep resolving Baidu's IP address. The main thread first blocks SIGALRM, and a thread created with pthread_create inherits the signal mask of its creator. As a result, a thread can receive SIGALRM only while it is inside gethostbyname_proc2, where the signal is explicitly unblocked around the gethostbyname call.
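While the signal handler is running, the system blocks further delivery of SIGALRM. If setjmp/longjmp were used, SIGALRM could still be blocked after jumping back, which is clearly not acceptable, so sigsetjmp/siglongjmp must be used to guarantee that the signal mask is restored.
The test run produces the following output: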
    hong@ubuntu:~/test/test-example$ ./gethostbyname_proc
    ++++++++++++++[T_11]time1 = 1384779930 +++++++++++
    ++++++++++++++[T_22]time1 = 1384779930 +++++++++++
    >>>i=0
    >>>i=1
    >>>i=2
    >>>i=3
    >>>i=4
    [ERR]gethostbyname timeout
    ++++++++++++++[T_11]time2 = 1384779935 +++++++++++
    >>>i=0
    ++++++++++++++[T_11]time1 = 1384779935 +++++++++++
    >>>i=1
    >>>i=2
    >>>i=3
    >>>i=4
    [ERR]gethostbyname timeout
    ++++++++++++++[T_22]time2 = 1384779940 +++++++++++
    >>>i=0
    ++++++++++++++[T_22]time1 = 1384779940 +++++++++++
    >>>i=1
    >>>i=2
    >>>i=3
    >>>i=4
    [ERR]gethostbyname timeout
    ++++++++++++++[T_11]time2 = 1384779945 +++++++++++
    >>>i=0
    ++++++++++++++[T_11]time1 = 1384779945 +++++++++++
    >>>i=1
    >>>i=2
    ^C

If SIGALRM is not masked per thread, the program hits a segmentation fault after a single call to gethostbyname_proc2. The immediate cause is that siglongjmp jumps into uninitialized stack memory. The deeper cause is most likely that SIGALRM is delivered to an arbitrary thread, one that has not executed sigsetjmp because it is not the thread currently calling gethostbyname_proc2.