Computers and Technology, 25.02.2020 21:45, wreckem

The following scalar product code tests your understanding of the basic CUDA model. The code computes 1024 dot products, each of which is calculated from a pair of 256-element vectors. Assume that the code is executed on G80. Use the code to answer the questions that follow.

 1  #define VECTOR_N 1024
 2  #define ELEMENT_N 256
 3  const int DATA_N = VECTOR_N * ELEMENT_N;
 4  const int DATA_SZ = DATA_N * sizeof(float);
 5  const int RESULT_SZ = VECTOR_N * sizeof(float);
    . . .
 6  float *d_A, *d_B, *d_C;
    . . .
 7  cudaMalloc((void **)&d_A, DATA_SZ);
 8  cudaMalloc((void **)&d_B, DATA_SZ);
 9  cudaMalloc((void **)&d_C, RESULT_SZ);
    . . .
10  scalarProd<<<VECTOR_N, ELEMENT_N>>>(d_C, d_A, d_B, ELEMENT_N);
11
12  __global__ void
13  scalarProd(float *d_C, float *d_A, float *d_B, int ElementN)
14  {
15      __shared__ float accumResult[ELEMENT_N];
16      //Current vectors bases
17      float *A = d_A + ElementN * blockIdx.x;
18      float *B = d_B + ElementN * blockIdx.x;
19      int tx = threadIdx.x;
20
21      accumResult[tx] = A[tx] * B[tx];
22
23      for(int stride = ElementN / 2; stride > 0; stride >>= 1)
24      {
25          __syncthreads();
26          if(tx < stride)
27              accumResult[tx] += accumResult[stride + tx];
28      }
29
30      d_C[blockIdx.x] = accumResult[0];
31  }

How many threads are there in total?
How many threads are there in a warp?
How many threads are there in a block?
How many global memory loads and stores are done for each thread?
How many accesses to shared memory are done for each block? (4 pts.)
List the source code lines, if any, that cause shared memory bank conflicts. (2 pts.)
How many iterations of the for loop (Line 23) will have branch divergence? Show your derivation.
Identify an opportunity to significantly reduce the bandwidth requirement on the global memory. How would you achieve this? How many accesses can you eliminate?

Answers: 3

Other questions on the subject: Computers and Technology

Computers and Technology, 22.06.2019 06:30, miguel3maroghi
This technology is used to produce high-quality documents that look good on the computer screen and in print.
Options: wiki, presentation, paint, desktop publishing
Answers: 3
Computers and Technology, 22.06.2019 08:30, gg68814
1. The index finger on your right hand types the: f r v 4 j u m 7 h y 6 n lo.9 j u 7 m g t 5 b
2. If you need to multiply 400, 2, and 1 ½, what would you type on the numeric keypad? Options: 400*2*1.5, 400/2*1.5, 400/2/1.5, 400*2*1½
3. Select all examples of proper keyboarding technique. Options: Rest your fingers gently on the home row or home keys. Slouch in your chair. Rest your palms on the keyboard. Relax your fingers. Keep your hands lower than your elbows.
Answers: 1
Computers and Technology, 22.06.2019 23:00, brooklynmikestovgphx
Suppose s, t, and w are strings that have already been created inside main. Write a statement or statements, to be added to main, that will determine if the lengths of the three strings are in order by length, smallest to largest. That is, your code should determine if s is strictly shorter than t, and if t is strictly shorter than w. If these conditions hold, your code should print (the boolean value) true. If not, your code should print false. (Strictly means: no ties.) Example: if s, t, and w are "cat", "hats", and "skies", your code should print true, since their lengths are 3-4-5; but if s, t, and w are "cats", "shirt", and "trust", then print false, since their lengths are 4-5-5. Enter your code in the box below.
Answers: 2
Computers and Technology, 23.06.2019 21:40, gaby06
Simon Says is a memory game where "Simon" outputs a sequence of 10 characters (r, g, b, y) and the user must repeat the sequence. Create a for loop that compares the two strings. For each match, add one point to user_score. Upon a mismatch, end the game. Sample output with inputs: 'rrgbryybgy' 'rrgbbrybgy'
Answers: 3