Question

I have installed CUDA 4.0 and am using a compute capability 2.x device (a GTX 460 card).

What is the difference between a cubin file and a PTX file?

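My understanding is that a cubin is native code for the GPU, so it is specific to a micro-architecture, whereas PTX is an intermediate language that runs on Fermi devices (for example the GeForce GTX 460) through JIT compilation. When I compile a .cu source file, I can choose either PTX or cubin as the target. If I want a cubin file, I choose one value for the code option; if I need a PTX file, I use another.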

Is that correct?

Answer 1

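You have mixed up the options for selecting a compilation phase (-ptx and -cubin) with the options for controlling which devices to target (-code), so you should revisit the documentation.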

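nvcc is the NVIDIA compiler driver. The -ptx and -cubin options select a specific phase of compilation; by default, without any phase-specific option, nvcc attempts to produce an executable from its inputs. Most people use the -c option so that nvcc produces an object file, which the default platform linker later links into an executable. The -ptx and -cubin options are only really useful if you are using the driver API. For more information on the intermediate stages, check the nvcc manual that is installed with the CUDA Toolkit.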
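As a rough sketch of the three phase options mentioned above, the script below only builds and prints the nvcc command lines, so it can be read without a toolkit installed. The file name kernel.cu and the sm_20 target are assumptions chosen to match a Fermi-era setup; recent toolkits no longer accept sm_20.

```shell
#!/bin/sh
# Sketch only: kernel.cu is a hypothetical source file, sm_20 an assumed target.
SRC=kernel.cu
ARCH=sm_20

# -ptx: stop after generating the plain-text PTX file.
PTX_CMD="nvcc -ptx -arch=$ARCH $SRC -o kernel.ptx"

# -cubin: stop after generating the device binary.
CUBIN_CMD="nvcc -cubin -arch=$ARCH $SRC -o kernel.cubin"

# -c: produce a host object file for the platform linker.
OBJ_CMD="nvcc -c -arch=$ARCH $SRC -o kernel.o"

# Print the commands instead of running them; paste one into a shell
# with the CUDA toolkit on PATH to actually build.
printf '%s\n' "$PTX_CMD" "$CUBIN_CMD" "$OBJ_CMD"
```

The first two outputs are mainly of interest when loading modules yourself through the driver API; the third is the usual route for runtime API programs.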

The output from -ptx is a plain-text PTX file. PTX is an intermediate assembly language for NVIDIA GPUs which has not yet been fully optimised and will later be assembled to the device-specific code (different devices have different register counts for example, hence fully optimising PTX would be wrong).
The output from -cubin is a fat binary which may contain one or more device-specific binary images as well as (optionally) PTX.

The -code argument you refer to has a different purpose entirely. I'd encourage you to check out the nvcc documentation, which contains several examples. In general I would advise using the -gencode option instead, since it gives more control and lets you target multiple devices in one binary. As a quick example:

-gencode arch=compute_xx,code=\"compute_xx,sm_yy,sm_zz\" causes nvcc to target all devices with compute capability xx (that's the arch= bit) and to embed PTX (code=compute_xx) as well as device-specific binaries for sm_yy and sm_zz into the final fat binary.
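To make the placeholders concrete, here is a minimal sketch that assumes compute_20 for xx and sm_20 and sm_21 for yy and zz, with a hypothetical kernel.cu. It repeats -gencode once per embedded image rather than using the comma-separated list, which sidesteps shell quoting, and it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: kernel.cu is hypothetical; compute_20, sm_20 and sm_21 are
# assumed Fermi-era targets that recent toolkits no longer accept.
SRC=kernel.cu

# One -gencode per image, all compiled from the same virtual architecture.
GENCODE="-gencode arch=compute_20,code=compute_20"      # embed PTX for JIT
GENCODE="$GENCODE -gencode arch=compute_20,code=sm_20"  # binary for sm_20
GENCODE="$GENCODE -gencode arch=compute_20,code=sm_21"  # binary for sm_21

CMD="nvcc $GENCODE -c $SRC -o kernel.o"

# Print rather than execute, so the sketch does not require nvcc.
echo "$CMD"
```

Embedding the PTX alongside the device binaries lets the driver JIT-compile for a device that none of the prebuilt images match.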
