CUDA - Grid of thread blocks and multiprocessors
The CUDA Programming Guide states:
The CUDA architecture is built around a scalable array of multithreaded Streaming Multiprocessors (SMs). When a CUDA program on the host CPU invokes a kernel grid, the blocks of the grid are enumerated and distributed to multiprocessors with available execution capacity. The threads of a thread block execute concurrently on one multiprocessor, and multiple thread blocks can execute concurrently on one multiprocessor. As thread blocks terminate, new blocks are launched on the vacated multiprocessors.
Does this mean that if I have a video card with 2 multiprocessors x n CUDA cores, and I launch a kernel like
mykernel<<<1,n>>>(sth);
one of the multiprocessors will be idle, since I'm launching a single block of n threads?
You are correct.
In current CUDA architectures, a block is only ever scheduled and run on a single multiprocessor. If you run one block on a device with more than one multiprocessor, all but one of the multiprocessors will be idle.
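To make this concrete, here is a minimal sketch that queries how many multiprocessors the device has and contrasts a single-block launch with a multi-block launch. The kernel `scale`, the helper `blocks_needed`, and the sizes used are made up for illustration and are not from the question. The sketch assumes device 0 and a block size of 256 threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: each thread scales one element.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Host helper (assumed name): blocks needed to cover n elements,
// computed with ceiling division.
int blocks_needed(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "could not query device 0\n");
        return 1;
    }
    printf("device 0 has %d multiprocessors\n", prop.multiProcessorCount);

    const int n = 1 << 20;      // assumed problem size
    const int threads = 256;    // assumed block size
    const int blocks = blocks_needed(n, threads);

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // One block: it runs on a single multiprocessor, so the others stay idle.
    scale<<<1, threads>>>(d_data, threads, 2.0f);

    // Many blocks: the hardware can distribute them across all multiprocessors.
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

As a rough rule of thumb, launching at least as many blocks as `multiProcessorCount` gives every multiprocessor a chance to receive work. How many blocks actually run at once on each multiprocessor depends on the kernel's resource usage.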