What changes when you specify "-arch sm_35" while compiling a cubin file

Date: 2019-12-11

This continues from the previous blog post.

nvcc -cubin -m64 -arch sm_35   *.cu --use_fast_math  --maxrregcount=32  --ptxas-options=-v -O3 -o *.cubin

Compiling with the command above has two effects:

1. The ptxas output (enabled by --ptxas-options=-v) shows that local memory is used.

2. The cubin runs on a Tesla K40 without problems.

nvcc -cubin -m64 *.cu --use_fast_math  --maxrregcount=32  --ptxas-options=-v -O3 -o *.cubin

If -arch sm_35 is removed, nvcc compiles for sm_20 by default, and ptxas reports "Compiling entry function '*' for 'sm_20'". This also has two effects:

1. With --maxrregcount=32, local memory is not used.

2. The cubin cannot run on a Tesla K40.
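
The second point follows from how cubin files work: a cubin holds machine code for one specific architecture, and the Tesla K40 is a compute capability 3.5 device, so it needs an sm_35 build. As a minimal sketch of how to avoid a mismatch, the helper below turns a compute capability such as 3.5 into the matching -arch value. The function name cc_to_arch is hypothetical and not part of nvcc; the capability has to be supplied by hand (for example from the GPU's spec sheet).

```shell
# Hypothetical helper (not part of the CUDA toolkit):
# convert a compute capability like "3.5" into an nvcc -arch value like "sm_35".
cc_to_arch() {
    # Strip the dot from the capability and prepend "sm_".
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example use with the command from this post (the Tesla K40 is capability 3.5):
#   nvcc -cubin -m64 -arch "$(cc_to_arch 3.5)" *.cu --use_fast_math --maxrregcount=32 --ptxas-options=-v -O3 -o *.cubin
```

Passing the architecture explicitly this way keeps the build tied to the GPU that will load the cubin, instead of relying on whatever default the installed nvcc happens to use.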
