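Abstract: This article explains the principles of name mangling in C++ and the corresponding implementation in GCC, and verifies those principles with sample programs and tools such as nm and c++filt. It should be helpful for understanding the linking process in detail.
Overview of Name Mangling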
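C++ has far richer language features than C, and function overloading is the most direct example of why name mangling is needed. Overloaded functions cannot be told apart by name alone, because C++ distinguishes them by their parameter lists (the number, types, and order of the parameters) in addition to the name.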
Of course, many other C++ features also require name mangling, such as namespaces, classes, and templates.
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return 3.14 * a * b;
- }
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000028 T _Z11elipse_areaii
- 0000000000000000 T _Z9rect_areaiiii
- U __gxx_personality_v0
- 0000000000000052 T main
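The C++ language specifies that identifiers beginning with an underscore immediately followed by an uppercase letter, or beginning with two underscores, are reserved. _Z9rect_areaiiii is therefore a reserved identifier, and the symbols in object files compiled by g++ begin with _Z (the C99 standard reserves the same identifiers).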
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- #ifdef __cplusplus
- extern "C" {
- #endif
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return (int)(3.14 * a * b);
- }
- #ifdef __cplusplus
- }
- #endif
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- U __gxx_personality_v0
- 0000000000000028 T elipse_area
- 0000000000000052 T main
- 0000000000000000 T rect_area
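In fact, the C standard library headers make heavy use of extern "C" linkage specifications. The C standard library can also be used with a C++ compiler, but its functions must keep a C interface rather than a C++ interface after compilation (it is, after all, the C standard library), so extern "C" is required.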
- /*
- * libc_test.c
- * a demo program to show that how the standard C
- * library are compiled when encountering a C++ compiler
- */
- #include<stdio.h>
- int main(int argc,char * argv[])
- {
- puts("hello world.\n");
- return 0;
- }
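If we search the preprocessed output for puts, we do not see extern "C" on the matching lines. Strange? The declarations are not marked one by one; they sit inside extern "C" { ... } blocks, which is why the second grep below finds the opening braces separately.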
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'puts'
- extern int fputs (__const char *__restrict __s, FILE *__restrict __stream);
- extern int puts (__const char *__s);
- extern int fputs_unlocked (__const char *__restrict __s,
- puts("hello world.\n");
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'extern "C"'
- extern "C" {
- extern "C" {
Different compilers perform name mangling in different ways. You might ask why C++ name mangling is not standardized, so that code built by different compilers could interoperate. The C++ FAQ in fact answers this question:
"Compilers differ as to how objects are laid out, how multiple inheritance is implemented, how virtual function calls are handled, and so on, so if the name mangling were made the same, your programs would link against libraries provided from other compilers but then crash when run. For this reason, the ARM (Annotated C++ Reference Manual) encourages compiler writers to make their name mangling different from that of other compilers for the same platform. Incompatible libraries are then detected at link time, rather than at run time."
GCC uses the IA-64 name mangling scheme, which is defined in the Intel IA-64 standard ABI. The g++ FAQ contains the following passage:
"GNU C++ does not do name mangling in the same way as other C++ compilers.
This means that object files compiled with one compiler cannot be used with
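The name mangling scheme of GNU C++ differs from those of other C++ compilers, so object files produced by one compiler cannot be used together with object files produced by another.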
- Builtin types encoding
- <builtin-type> ::= v # void
- ::= w # wchar_t
- ::= b # bool
- ::= c # char
- ::= a # signed char
- ::= h # unsigned char
- ::= s # short
- ::= t # unsigned short
- ::= i # int
- ::= j # unsigned int
- ::= l # long
- ::= m # unsigned long
- ::= x # long long, __int64
- ::= y # unsigned long long, __int64
- ::= n # __int128
- ::= o # unsigned __int128
- ::= f # float
- ::= d # double
- ::= e # long double, __float80
- ::= g # __float128
- ::= z # ellipsis
- ::= u <source-name> # vendor extended type
Operator encoding
- <operator-name> ::= nw # new
- ::= na # new[]
- ::= dl # delete
- ::= da # delete[]
- ::= ps # + (unary)
- ::= ng # - (unary)
- ::= ad # & (unary)
- ::= de # * (unary)
- ::= co # ~
- ::= pl # +
- ::= mi # -
- ::= ml # *
- ::= dv # /
- ::= rm # %
- ::= an # &
- ::= or # |
- ::= eo # ^
- ::= aS # =
- ::= pL # +=
- ::= mI # -=
- ::= mL # *=
- ::= dV # /=
- ::= rM # %=
- ::= aN # &=
- ::= oR # |=
- ::= eO # ^=
- ::= ls # <<
- ::= rs # >>
- ::= lS # <<=
- ::= rS # >>=
- ::= eq # ==
- ::= ne # !=
- ::= lt # <
- ::= gt # >
- ::= le # <=
- ::= ge # >=
- ::= nt # !
- ::= aa # &&
- ::= oo # ||
- ::= pp # ++
- ::= mm # --
- ::= cm # ,
- ::= pm # ->*
- ::= pt # ->
- ::= cl # ()
- ::= ix # []
- ::= qu # ?
- ::= st # sizeof (a type)
- ::= sz # sizeof (an expression)
- ::= cv <type> # (cast)
- ::= v <digit> <source-name> # vendor extended operator
- <type> ::= <CV-qualifiers> <type>
- ::= P <type> # pointer-to
- ::= R <type> # reference-to
- ::= O <type> # rvalue reference-to (C++0x)
- ::= C <type> # complex pair (C 2000)
- ::= G <type> # imaginary (C 2000)
- ::= U <source-name> <type> # vendor extended type qualifier
- /*
- * Author: Chaos Lee
- * Description: A simple demo to show how the rules used to mangle functions' names work
- * Date:2012/05/06
- */
- #include<iostream>
- #include<string>
- using namespace std;
- int test_func(int & tmpInt,const char * ptr,double dou,string str,float f)
- {
- return 0;
- }
- int main(int argc,char * argv[])
- {
- const char * test="test";
- int intNum = 10;
- double dou = 10.012;
- string str="str";
- float f = 1.2;
- test_func(intNum,test,dou,str,f);
- return 0;
- }
- [lichao@sg01 name_mangling]$ g++ -c func.cpp
- [lichao@sg01 name_mangling]$ nm func.cpp
- nm: func.cpp: File format not recognized
- [lichao@sg01 name_mangling]$ nm func.o
- 0000000000000060 t _GLOBAL__I__Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t _Z41__static_initialization_and_destruction_0ii
- 0000000000000000 T _Z9test_funcRiPKcdSsf
- U _ZNSaIcEC1Ev
- U _ZNSaIcED1Ev
- U _ZNSsC1EPKcRKSaIcE
- U _ZNSsC1ERKSs
- U _ZNSsD1Ev
- U _ZNSt8ios_base4InitC1Ev
- U _ZNSt8ios_base4InitD1Ev
- 0000000000000000 b _ZSt8__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
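The line 0000000000000000 T _Z9test_funcRiPKcdSsf is the result of name mangling the function test_func, where: _Z is the prefix of every mangled name; 9test_func is the length of the function name followed by the name itself; Ri is a reference to int; PKc is a pointer to const char; d is double; Ss is the abbreviation for std::string; and f is float. The return type of an ordinary function is not encoded.
Name mangling usually makes function names unrecognizable. In many cases, however, we do not need to see the mangled form when inspecting symbols; we only want to know whether a certain function is defined or referenced, which is very helpful when debugging.
We therefore need a way to convert a mangled symbol back into its original form, a process called name demangling. Many tools provide this; the most commonly used is the c++filt command, which takes a mangled symbol as input and prints the demangled symbol. For example: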
- [lichao@sg01 name_mangling]$ c++filt _Z9test_funcRiPKcdSsf
- test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- [lichao@sg01 name_mangling]$ nm func.o | c++filt
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
- [lichao@sg01 name_mangling]$ nm -C func.o
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::string, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
Last but not least, there is one more especially important interface function, __cxa_demangle(), whose prototype is:
- namespace abi {
- extern "C" char* __cxa_demangle (const char* mangled_name,
- char* buf,
- size_t* n,
- int* status);
- }
- /*
- * Author: Chaos Lee
- * Description: Employ __cxa_demangle to demangle a mangling function name.
- * Date:2012/05/06
- *
- */
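- #include<iostream>
- #include<cstdlib>
- #include<cxxabi.h>
- using namespace std;
- using namespace abi;
- int main(int argc,char *argv[])
- {
- const char * mangled_string = "_Z9test_funcRiPKcdSsf";
- int status = 0;
- /* Pass NULL so that __cxa_demangle allocates the result with malloc.
- * A caller-supplied buffer must itself come from malloc, because the
- * function may realloc it, so a stack array is not safe here. */
- char * demangled = __cxa_demangle(mangled_string,NULL,NULL,&status);
- if(status == 0)
- cout<<demangled<<endl;
- cout<<status<<endl;
- free(demangled);
- return 0;
- }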
- [lichao@sg01 name_mangling]$ g++ cxa_demangle.cpp -o cxa_demangle
- [lichao@sg01 name_mangling]$ ./cxa_demangle
- test_func(int&, char const*, double, std::string, float)
- 0
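By writing an interface function whose name is a mangled name and turning on the build option that allows duplicate symbols, you can change which function the original call is linked to, and thereby change the program's behavior.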