Computer Systems Sample Final Questions

Question 1. Characteristics of CISC (complex instruction set computer) machines are:

Why do these characteristics make CISC hard to pipeline?

Question 2. Explain why a bus with dedicated (separate) address and data lines can have a faster WRITE operation than a bus with time-multiplexed address and data lines (a time-multiplexed bus means that the same lines are used to send the address and data).

Question 3. Explain why a bus with dedicated (separate) address and data lines would NOT have a faster READ operation than a bus with time-multiplexed address and data lines (a time-multiplexed bus means that the same lines are used to send the address and data).

Question 4.

a) In the asynchronous READ timing diagram shown, how does the Slave know that the Master has read the data off the Data lines?

b) In the asynchronous READ timing diagram shown, how is bus skew handled?

Question 5.

a) For the five-stage pipeline discussed in class (see above), complete the following timing diagram, assuming NO by-pass signal paths (i.e., no forwarding). Note that:

Instructions       Time
                   1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
ADD R2, R6, R7     IF ID EX ME WB
LOAD R1, 16(R2)
SUB R4, R5, R1
ADD R2, R3, R4
STORE R2, 8(R6)
ADD R1, R2, R3

b) Complete the following timing diagram assuming by-pass signal paths (i.e., there is forwarding).

Instructions       Time
                   1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
ADD R2, R6, R7     IF ID EX ME WB
LOAD R1, 16(R2)
SUB R4, R5, R1
ADD R2, R3, R4
STORE R2, 8(R6)
ADD R1, R2, R3

c) In the diagram at the top of the page add all by-pass signal paths used in part (b).

Question 6. Consider the following sequential search algorithm that searches an array for a specified "target" value. The index of where the "target" value is found is returned. If the "target" value is not in the array, then -1 is returned.

SequentialSearch (integer numberOfElements, integer target, integer array numbers[]) returns an integer
    integer test;
    for test = 1 to numberOfElements do
        if numbers[test] = target then
            return test;
        end if
    end for
    return -1;
end SequentialSearch
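For reference, the pseudocode above could be sketched in Python roughly as follows. This is only an illustration, not part of the question: the function name sequential_search and the sample list are made up here, and the sketch assumes the pseudocode's 1-based positions are kept in the return value even though Python lists are 0-based.

```python
def sequential_search(number_of_elements, target, numbers):
    """Return the 1-based position of target in numbers, or -1 if absent."""
    # The pseudocode counts test from 1 to numberOfElements. Python lists
    # are 0-based, so position `test` is stored at numbers[test - 1].
    for test in range(1, number_of_elements + 1):
        if numbers[test - 1] == target:
            return test  # early exit, like "return test" in the pseudocode
    return -1  # loop finished without a match


data = [7, 3, 9, 4]  # hypothetical sample data
print(sequential_search(len(data), 9, data))  # prints 3
print(sequential_search(len(data), 5, data))  # prints -1
```

The loop test and the equality test are the two places where a compiled version would need to decide between continuing and leaving the loop.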

a) Where in the code would unconditional branches be used and where would conditional branches be used?

b) If the compiler could statically predict by opcode for the conditional branches (i.e., select whether to use machine language statements like: "BRANCH_LE_PREDICT_NOT_TAKEN" or "BRANCH_LE_PREDICT_TAKEN"), then which conditional branches would be "PREDICT_NOT_TAKEN" and which would be "PREDICT_TAKEN"?

c) Assumptions:

Under the above assumptions, answer the following questions:

i) If static predict-never-taken is used by the hardware, then what will be the total branch penalty (# cycles wasted) for the algorithm? (Here, assume NO branch-history table.) For partial credit, explain your answer.

ii) If a branch-history table with one history bit per entry is used, then what will be the total branch penalty (# cycles wasted) for the algorithm? (Assume predict-not-taken is used if there is no match in the branch-history table.) For partial credit, explain your answer.

Question 7. Explain how a branch-history table with two history bits per entry (i.e., two wrong predictions are needed before the prediction changes) is useful for reducing the total branch penalty of a program that has nested loops. (Assume predict-not-taken is used if there is no match in the branch-history table.)
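One way to experiment with this question is a small simulation of the branch that closes an inner loop. The sketch below is an assumption-laden illustration, not the scheme from class: it models each history entry as a saturating counter (one common variant of one-bit and two-bit prediction), starts the counter at 0 to approximate the predict-not-taken default, and uses made-up loop counts (5 outer iterations, 10 inner iterations). The names inner_loop_outcomes and count_mispredictions are hypothetical.

```python
def inner_loop_outcomes(outer_iterations, inner_iterations):
    """Outcomes of the branch that closes the inner loop (True = taken)."""
    # On each pass the backward branch is taken until the last iteration,
    # where it falls through and the inner loop exits.
    one_pass = [True] * (inner_iterations - 1) + [False]
    return one_pass * outer_iterations


def count_mispredictions(outcomes, history_bits):
    """Count wrong predictions for one branch using a saturating counter."""
    max_state = (1 << history_bits) - 1
    threshold = (max_state + 1) // 2  # states >= threshold predict taken
    state = 0  # assumption: start in the "not taken" state
    wrong = 0
    for taken in outcomes:
        predicted_taken = state >= threshold
        if predicted_taken != taken:
            wrong += 1
        # Move the counter toward the actual outcome, saturating at the ends.
        if taken:
            state = min(state + 1, max_state)
        else:
            state = max(state - 1, 0)
    return wrong


outcomes = inner_loop_outcomes(outer_iterations=5, inner_iterations=10)
print(count_mispredictions(outcomes, history_bits=1))  # prints 10
print(count_mispredictions(outcomes, history_bits=2))  # prints 7
```

Under these assumptions the one-bit counter mispredicts twice per pass of the inner loop (once on exit and once on re-entry), while the two-bit counter mispredicts only on exit once it has warmed up. The cycle cost of each misprediction depends on the pipeline and is not modeled here.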

Question 8. How do superpipelined computers try to improve performance over traditional pipelined computers?

Question 9. How do superscalar computers try to improve performance over traditional pipelined computers?

Question 10. Processors based on the VLIW architecture rely on the compiler to encode multiple operations into a long instruction word so that the hardware can safely schedule these operations at run-time on multiple functional units without dependency analysis. Why might the compiler be better able to find instructions that do not have dependencies than the run-time hardware of a superscalar computer?