Compiling Cuda code in Qt Creator on Windows
Source: www.it1352.com/4555...html Published: 2021-03-25

I have been trying for days to get a Qt project file running on a 32-bit Windows 7 system, in which I want/need to include Cuda code. This combination is either so simple that no one ever bothered to put an example online, or so difficult that nobody ever succeeded, it seems. Either way, the only helpful forum threads I found dealt with the same issue on Linux or Mac, or with Visual Studio on Windows. All of these give all sorts of different errors, however, whether due to linking or clashing libraries, spaces in file names, or non-existing folders in the Windows version of the Cuda SDK. Does anyone have a clear .pro file to offer that does the trick?

I am aiming to compile a simple programme with ordinary C++ code in Qt style, using the Qt 4.8 libraries, which references several Cuda modules in .cu files. Something of the form:

    TestCUDA \
        TestCUDA.pro
        main.cpp
        test.cu

Solution

So I finally managed to assemble a .pro file that works on my system, and probably on other Windows systems too. The following is a small project file plus a simple test programme.

The file system looks as follows:

    TestCUDA \
        TestCUDA.pro
        main.cpp
        vectorAddition.cu

The project file reads:

    TARGET = TestCUDA

    # Define output directories
    DESTDIR = release
    OBJECTS_DIR = release/obj
    CUDA_OBJECTS_DIR = release/cuda

    # Source files (paths are relative to this .pro file and match the layout above)
    SOURCES += main.cpp

    # This makes the .cu files appear in your project
    OTHER_FILES += vectorAddition.cu

    # CUDA settings -- may change depending on your system
    CUDA_SOURCES += vectorAddition.cu
    CUDA_SDK = "C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.2/C"   # Path to cuda SDK install
    CUDA_DIR = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.2"            # Path to cuda toolkit install
    SYSTEM_NAME = Win32         # Depending on your system either 'Win32', 'x64', or 'Win64'
    SYSTEM_TYPE = 32            # '32' or '64', depending on your system
    CUDA_ARCH = sm_11           # Type of CUDA architecture, for example 'compute_10', 'compute_11', 'sm_10'
    NVCC_OPTIONS = --use_fast_math

    # include paths
    INCLUDEPATH += $$CUDA_DIR/include \
                   $$CUDA_SDK/common/inc/ \
                   $$CUDA_SDK/../shared/inc/

    # library directories
    QMAKE_LIBDIR += $$CUDA_DIR/lib/$$SYSTEM_NAME \
                    $$CUDA_SDK/common/lib/$$SYSTEM_NAME \
                    $$CUDA_SDK/../shared/lib/$$SYSTEM_NAME

    # Add the necessary libraries
    LIBS += -lcuda -lcudart

    # The following library conflicts with something in Cuda
    QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib
    QMAKE_LFLAGS_DEBUG   = /NODEFAULTLIB:msvcrtd.lib

    # The following makes sure all path names (which often include spaces) are put between quotation marks
    CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')

    # Configuration of the Cuda compiler
    CONFIG(debug, debug|release) {
        # Debug mode
        cuda_d.input = CUDA_SOURCES
        cuda_d.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
        cuda_d.commands = $$CUDA_DIR/bin/nvcc.exe -D_DEBUG $$NVCC_OPTIONS $$CUDA_INC $$LIBS --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
        cuda_d.dependency_type = TYPE_C
        QMAKE_EXTRA_COMPILERS += cuda_d
    } else {
        # Release mode
        cuda.input = CUDA_SOURCES
        cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
        cuda.commands = $$CUDA_DIR/bin/nvcc.exe $$NVCC_OPTIONS $$CUDA_INC $$LIBS --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
        cuda.dependency_type = TYPE_C
        QMAKE_EXTRA_COMPILERS += cuda
    }

Note the QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib: it took me a long time to figure out, but this library seems to clash with other things in Cuda, which produces strange linking warnings and errors. If someone has an explanation for this, and potentially a prettier way to get around this, I'd like to hear it.

Also, since Windows file paths often include spaces (and NVIDIA's SDK by default does so too), it is necessary to artificially add quotation marks around the include paths. Again, if someone knows a more elegant way of solving this problem, I'd be interested to know.
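To make the quoting trick more concrete, here is a minimal C++ sketch of what the $$join(INCLUDEPATH, '" -I"', '-I"', '"') call in the project file is meant to produce. The helper name joinIncludePaths is hypothetical and not part of qmake or Cuda; it only mimics the glue/before/after behaviour of $$join, assuming the include paths arrive as plain strings without quotes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a qmake or Cuda API): mimics
//   $$join(INCLUDEPATH, '" -I"', '-I"', '"')
// where glue = `" -I"`, before = `-I"` and after = `"`.
std::string joinIncludePaths(const std::vector<std::string>& paths)
{
    // Like $$join, add the prefix and suffix only when there is something to join.
    if (paths.empty())
        return std::string();

    const std::string glue = "\" -I\"";
    std::string result = "-I\"";              // the "before" argument
    for (std::size_t i = 0; i < paths.size(); ++i) {
        if (i > 0)
            result += glue;                   // closes one quoted path, opens the next
        result += paths[i];
    }
    result += "\"";                           // the "after" argument
    return result;
}
```

Called with the two paths C:/Program Files/CUDA/include and C:/SDK/common/inc (made-up example paths), this yields -I"C:/Program Files/CUDA/include" -I"C:/SDK/common/inc", so every path reaches nvcc as a single quoted argument even when it contains spaces.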

The main.cpp file looks like this:

    #include <cuda.h>
    #include <builtin_types.h>
    #include <drvapi_error_string.h>

    #include <QtCore/QCoreApplication>
    #include <QDebug>

    // Forward declare the function in the .cu file
    void vectorAddition(const float* a, const float* b, float* c, int n);

    // Print an array as "(a0, a1, ..., an-1)"; assumes n >= 1
    void printArray(const float* a, const unsigned int n) {
        QString s = "(";
        unsigned int ii;
        for (ii = 0; ii < n - 1; ++ii)
            s.append(QString::number(a[ii])).append(", ");
        s.append(QString::number(a[ii])).append(")");

        qDebug() << s;
    }

    int main(int argc, char* argv [])
    {
        QCoreApplication app(argc, argv);

        int deviceCount = 0;
        int cudaDevice = 0;
        char cudaDeviceName [100];

        unsigned int N = 50;
        float *a, *b, *c;

        // Query the first Cuda device through the driver API
        cuInit(0);
        cuDeviceGetCount(&deviceCount);
        cuDeviceGet(&cudaDevice, 0);
        cuDeviceGetName(cudaDeviceName, 100, cudaDevice);
        qDebug() << "Number of devices: " << deviceCount;
        qDebug() << "Device name:" << cudaDeviceName;

        a = new float [N];
        b = new float [N];
        c = new float [N];
        for (unsigned int ii = 0; ii < N; ++ii) {
            a[ii] = qrand();
            b[ii] = qrand();
        }

        // This is the function call in which the kernel is called
        vectorAddition(a, b, c, N);

        qDebug() << "input a:"; printArray(a, N);
        qDebug() << "input b:"; printArray(b, N);
        qDebug() << "output c:"; printArray(c, N);

        // Arrays allocated with new[] must be released with delete[]
        delete[] a;
        delete[] b;
        delete[] c;

        return 0;
    }

The Cuda file vectorAddition.cu, which describes a simple vector addition, looks like this:

    #include <cuda.h>
    #include <builtin_types.h>

    // Kernel: each thread adds one pair of elements
    extern "C"
    __global__ void vectorAdditionCUDA(const float* a, const float* b, float* c, int n)
    {
        int ii = blockDim.x * blockIdx.x + threadIdx.x;
        if (ii < n)
            c[ii] = a[ii] + b[ii];
    }

    // Host wrapper called from main.cpp
    void vectorAddition(const float* a, const float* b, float* c, int n) {
        float *a_cuda, *b_cuda, *c_cuda;
        unsigned int nBytes = sizeof(float) * n;
        int threadsPerBlock = 256;
        int blocksPerGrid = (n + threadsPerBlock - 1) / threadsPerBlock;

        // allocate and copy memory into the device
        cudaMalloc((void **)&a_cuda, nBytes);
        cudaMalloc((void **)&b_cuda, nBytes);
        cudaMalloc((void **)&c_cuda, nBytes);
        cudaMemcpy(a_cuda, a, nBytes, cudaMemcpyHostToDevice);
        cudaMemcpy(b_cuda, b, nBytes, cudaMemcpyHostToDevice);

        vectorAdditionCUDA<<<blocksPerGrid, threadsPerBlock>>>(a_cuda, b_cuda, c_cuda, n);

        // load the answer back into the host
        cudaMemcpy(c, c_cuda, nBytes, cudaMemcpyDeviceToHost);

        cudaFree(a_cuda);
        cudaFree(b_cuda);
        cudaFree(c_cuda);
    }
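When checking whether the build really works, it can help to compare the GPU output against a plain CPU computation. The following is a minimal sketch in standard C++, not part of the original answer: gridSizeFor, vectorAdditionReference and matchesReference are hypothetical helper names, and the tolerance is an arbitrary assumption. gridSizeFor repeats the rounding-up division used for blocksPerGrid in vectorAddition.cu.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of blocks needed to cover n elements,
// i.e. n / threadsPerBlock rounded up (same formula as in vectorAddition.cu).
int gridSizeFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical CPU reference: c[i] = a[i] + b[i] over the common length.
std::vector<float> vectorAdditionReference(const std::vector<float>& a,
                                           const std::vector<float>& b)
{
    const std::size_t n = std::min(a.size(), b.size());
    std::vector<float> c(n);
    for (std::size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
    return c;
}

// Returns true if every element of 'gpu' agrees with 'ref' within a
// relative tolerance (the default value is an assumption, not from the source).
bool matchesReference(const float* gpu, const std::vector<float>& ref,
                      float tol = 1e-5f)
{
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float scale = std::max(1.0f, std::fabs(ref[i]));
        if (std::fabs(gpu[i] - ref[i]) > tol * scale)
            return false;
    }
    return true;
}
```

In main.cpp one could, for instance, build the reference from a and b and pass the output array c to matchesReference after the call to vectorAddition. For the N = 50 elements used there, gridSizeFor(50, 256) gives a single block of 256 threads, of which only the first 50 pass the ii < n check in the kernel.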

If you get this to work, then more complicated examples are self-evident, I think.

Edit (24-1-2013): I added the QMAKE_LFLAGS_DEBUG = /NODEFAULTLIB:msvcrtd.lib line and the CONFIG(debug, debug|release) block with the extra -D_DEBUG flag, so that it also compiles in debug mode.
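As a small illustration of what the extra flag does: -D_DEBUG defines the _DEBUG preprocessor macro for the code that nvcc compiles, so that code can tell the debug build from the release build. The sketch below assumes nothing beyond that macro; the function name buildMode is hypothetical.

```cpp
#include <string>

// Hypothetical helper: reports which configuration this translation unit
// was compiled in, based only on the _DEBUG macro.
std::string buildMode()
{
#ifdef _DEBUG
    return "debug";     // compiled with -D_DEBUG (debug branch of the .pro file)
#else
    return "release";   // _DEBUG not defined (release branch)
#endif
}
```

Printing buildMode() from the host wrapper in the .cu file would be one quick way to see which of the two configurations was actually used.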
