Confusion about gridDim.x and gridDim.y #4

@sonald

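Hi, I'm new to CUDA and one thing in the implementation confuses me. In the kernel, does the x dimension represent rows or columns? From the code it looks like columns (columns 0 to blockDim.x-1 within a row)?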

__global__ __launch_bounds__(1024) void
mysgemm_v1(int M, int N, int K, float alpha, float *A, float *B, float beta, float *C) {

    int gx = blockIdx.x * blockDim.x + threadIdx.x; // global x
    int gy = blockIdx.y * blockDim.y + threadIdx.y; // global y

    float tmp = 0.;
    for (int i = 0; i < K; i++) {
        tmp += A[gy * K + i] * B[i * N + gx]; // two global memory accesses and one FMA (fused multiply-add)
    }
    C[gy * N + gx] = alpha * tmp + beta * C[gy * N + gx];
}
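
To check the reading above, here is a minimal host-side sketch of the same index math, assuming C is stored row-major as M x N (which is what `C[gy * N + gx]` implies). The names `Idx` and `c_offset` are hypothetical helpers made up for illustration, not part of the repository.

```cpp
#include <cstddef>

// Hypothetical stand-in for the x/y parts of blockIdx, blockDim and threadIdx.
struct Idx { int x; int y; };

// Flat offset into row-major C (M x N) touched by one thread,
// using the same formulas as mysgemm_v1.
std::size_t c_offset(Idx block_idx, Idx block_dim, Idx thread_idx, int N) {
    int gx = block_idx.x * block_dim.x + thread_idx.x; // multiplied by 1: column index
    int gy = block_idx.y * block_dim.y + thread_idx.y; // multiplied by N: row index
    return static_cast<std::size_t>(gy) * N + gx;
}
```

With N = 100 and a 32x32 block, stepping threadIdx.x by one moves the offset by 1 (next column in the same row), while stepping threadIdx.y by one moves it by N (next row). So x does appear to index columns here, and gx should range over N while gy ranges over M.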

Then why, when gridDim is set up, does the x dimension use CEIL_DIV(M, 32) rather than CEIL_DIV(N, 32)?

void test_mysgemm_v1(int M, int N, int K, float alpha, float *A, float *B, float beta, float *C) {
    dim3 blockDim(32, 32);
    dim3 gridDim(CEIL_DIV(M, 32), CEIL_DIV(N, 32));
    mysgemm_v1<<<gridDim, blockDim>>>(M, N, K, alpha, A, B, beta, C);
}
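
One way to see whether the order matters is to count how many elements of C each launch shape can reach. The following is only a host-side sketch under the assumption that gx indexes columns (bounded by N) and gy indexes rows (bounded by M), as the kernel's indexing suggests. `ceil_div`, `Grid`, `grid_as_posted`, `grid_swapped` and `covered_elements` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>

constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical stand-in for the x/y parts of a dim3 grid.
struct Grid { int x; int y; };

// Shape as written in test_mysgemm_v1: x sized from M, y sized from N.
Grid grid_as_posted(int M, int N, int tile) {
    return {ceil_div(M, tile), ceil_div(N, tile)};
}

// Alternative shape: x sized from N (columns), y sized from M (rows).
Grid grid_swapped(int M, int N, int tile) {
    return {ceil_div(N, tile), ceil_div(M, tile)};
}

// Number of elements of the M x N matrix C that some thread maps to,
// i.e. pairs (gx, gy) with gx < N and gy < M.
long covered_elements(Grid g, int M, int N, int tile) {
    long max_gx = static_cast<long>(g.x) * tile; // threads span columns [0, max_gx)
    long max_gy = static_cast<long>(g.y) * tile; // threads span rows [0, max_gy)
    long cols = std::min<long>(max_gx, N);
    long rows = std::min<long>(max_gy, M);
    return cols * rows;
}
```

For M = 64, N = 128 and a tile of 32, the posted shape reaches 64 * 64 = 4096 of the 8192 elements, while the swapped shape reaches all 8192. For square matrices both shapes are identical, which may be why the posted order goes unnoticed if the tests only use M == N (that is a guess, not something the source states). Note also that the kernel as posted has no `gx < N && gy < M` guard, so in the non-square case the posted shape would additionally launch threads with gy >= M that read and write out of bounds.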
