Computers and Technology

HW6-1 (43 points) Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14 to 18 with x86-64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:

    void inner4(vec_ptr u, vec_ptr v, data_t *dest)
    {
        long i;
        long length = vec_length(u);
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum = (data_t) 0;

        for (i = 0; i < length; i++) {
            sum = sum + udata[i] * vdata[i];
        }
        *dest = sum;
    }

Our measurements show that this function has a CPE of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:

    # Inner loop of inner4. data_t = double. OP = *.
    # udata in %rbp, vdata in %rax, sum in %xmm0, i in %rcx, limit in %rbx
    .L15:                                    # loop:
        vmovsd 0(%rbp,%rcx,8), %xmm1         # Get udata[i]
        vmulsd (%rax,%rcx,8), %xmm1, %xmm1   # Multiply by vdata[i]
        vaddsd %xmm1, %xmm0, %xmm0           # Add to sum
        addq   $1, %rcx                      # Increment i
        cmpq   %rbx, %rcx                    # Compare i:limit
        jl     .L15                          # If <, goto loop

Assume that the functional units have the latencies and issue times given in Figure 5.12 (and in the course notes).

A. Diagram how this instruction sequence would be decoded into operations, and show how the data dependencies between them would create a critical path of operations in the style of Figures 5.13 (figure: opt/dpb-sequential) and 5.14 (figure: opt/dpb-flow and figure: opt/dpb-flow-abstract). (25 points.)

B. For data type double, what lower bound on the CPE is determined by the critical path? Give a numerical value and an explanation. (6 points.)

C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data? Give a numerical value and an explanation. (6 points.)

D. Explain how the floating-point version can have a CPE of 3.00 even though the multiplication operation requires 5 cycles. (6 points.)

HW6-2 (27 points) Write a version of the inner product procedure described in the previous problem that uses six-way loop unrolling (6×1; no parallelism). (11 points.)
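For HW6-2, one possible shape of a 6×1 unrolled version is sketched below. It is an illustration under stated assumptions, not the course's reference solution: the `vec_rec` struct, the bodies of `vec_length` and `get_vec_start`, the name `inner_unroll6`, and the length check via `assert` are all stand-ins introduced here so the block compiles on its own. The six products are still added into a single accumulator one after another, so there is no added parallelism.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for the course's vector type and helpers. */
typedef double data_t;

typedef struct {
    long len;       /* number of elements */
    data_t *data;   /* pointer to the first element */
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product with 6x1 unrolling: six elements per iteration,
   one accumulator, additions performed strictly in sequence. */
void inner_unroll6(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;   /* last index where a full group of six fits */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    /* Added precondition (an assumption of this sketch): v is at least as long as u. */
    assert(vec_length(v) >= length);

    /* Main loop: process six elements at a time. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + udata[i]     * vdata[i];
        sum = sum + udata[i + 1] * vdata[i + 1];
        sum = sum + udata[i + 2] * vdata[i + 2];
        sum = sum + udata[i + 3] * vdata[i + 3];
        sum = sum + udata[i + 4] * vdata[i + 4];
        sum = sum + udata[i + 5] * vdata[i + 5];
    }

    /* Finish any remaining elements one at a time. */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Because every addition still depends on the previous value of `sum`, the chain of additions remains the critical path; unrolling here only reduces loop overhead (the increment, compare, and branch) per element.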

