2. In explicitly parallel instruction computing (EPIC), the compiler encodes multiple operations into a long instruction word so that the hardware can schedule them at run time on multiple functional units without performing dependency analysis. Why might the compiler be better able than the run-time hardware of a superscalar computer to find instructions that have no dependencies?

3. For a typical program on a traditional computer, more time is spent on procedure calls than on anything else. Why are procedure calls so time-consuming?

4. In what ways can parameters be passed in a procedure call?

5. Translate the following code into Itanium assembly language, using one or more predicate registers to eliminate the branch instructions.

if (R2 < R3) then
    R1 = R1 + 1
else
    R1 = R1 - R2
end if
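
As a hint on what predication means, the following is a minimal Python sketch of the idea, not the Itanium assembly the exercise asks for. It assumes a compare sets two complementary predicates and that each operation's result is committed only when its predicate is true. The names `guarded_write`, `branchy`, and `predicated` are hypothetical helpers introduced for illustration only.

```python
def guarded_write(pred, new_value, old_value):
    """Model a predicated register write: new_value commits only if pred is true."""
    p = int(pred)  # True -> 1, False -> 0
    return p * new_value + (1 - p) * old_value


def branchy(r1, r2, r3):
    """Reference version with an ordinary conditional branch."""
    if r2 < r3:
        r1 = r1 + 1
    else:
        r1 = r1 - r2
    return r1


def predicated(r1, r2, r3):
    """Branch-free version: both operations are issued, one result commits."""
    # One compare produces a pair of complementary predicates.
    p1 = r2 < r3
    p2 = not p1

    # Both candidate results are computed from the original R1.
    then_result = r1 + 1
    else_result = r1 - r2

    # Each write is guarded by its predicate, so exactly one takes effect.
    r1 = guarded_write(p1, then_result, r1)
    r1 = guarded_write(p2, else_result, r1)
    return r1


if __name__ == "__main__":
    for args in [(5, 1, 2), (5, 3, 2), (0, 4, 4)]:
        print(args, branchy(*args), predicated(*args))
```

The point of the sketch is that no control flow depends on the comparison: the comparison result is turned into data (the predicates) that gates which write commits.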

1. Tomasulo's Algorithm: A loop-based example

Loop: LD    F0, 0(R1)
      MULTD F4, F0, F2
      SD    F4, 0(R1)
      ADDI  R1, R1, 8
      BNE   R1, R2, Loop   ; Branch if R1 is Not Equal to R2

Assume that the instructions for two successive iterations of the loop are issued before either of the load operations completes. Assuming that MULTD takes 4 clock cycles to execute, how long would it take to complete these two loop iterations?