Computer Architecture Homework #3

Due: 10/21/05 (Friday by 5 PM)

1. Consider the following insertion sort algorithm, which sorts the integer array "numbers":

InsertionSort(numbers - address to integer array, length - integer)

    integer firstUnsortedIndex, testIndex, elementToInsert;

    for firstUnsortedIndex = 1 to (length-1) do

        testIndex = firstUnsortedIndex - 1;
        elementToInsert = numbers[ firstUnsortedIndex ];

        while (testIndex >= 0) AND (numbers[ testIndex ] > elementToInsert) do
            numbers[ testIndex + 1 ] = numbers[ testIndex ];
            testIndex = testIndex - 1;
        end while

        numbers[ testIndex + 1 ] = elementToInsert;

    end for

end InsertionSort
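
For reference, the pseudocode above can be sketched as runnable Python. This is only a direct transcription for checking the algorithm's behavior by hand; the function name insertion_sort and the snake_case variable names are illustrative and not part of the assignment, and the length parameter is taken from the list itself.

```python
def insertion_sort(numbers):
    """Sort the list `numbers` in place, mirroring the pseudocode line by line."""
    # Outer "for" loop: elements to the left of first_unsorted_index are sorted.
    for first_unsorted_index in range(1, len(numbers)):
        test_index = first_unsorted_index - 1
        element_to_insert = numbers[first_unsorted_index]

        # Inner "while" loop: two conditions, so a compiler would typically
        # emit two conditional tests here (index bound, then comparison).
        while test_index >= 0 and numbers[test_index] > element_to_insert:
            numbers[test_index + 1] = numbers[test_index]  # shift right
            test_index -= 1

        # Drop the saved element into the gap that the shifts opened up.
        numbers[test_index + 1] = element_to_insert
    return numbers


if __name__ == "__main__":
    print(insertion_sort([5, 2, 4, 6, 1, 3]))
```

Tracing a short input through this version shows how often each loop condition succeeds or fails, which is the information parts (a) through (c) ask about.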

a) Where in the code would unconditional branches be used and where would conditional branches be used?

b) If the compiler could predict by opcode for the conditional branches (i.e., select whether to use machine language statements like: "BRANCH_LE_PREDICT_NOT_TAKEN" or "BRANCH_LE_PREDICT_TAKEN"), then which conditional branches would be "PREDICT_NOT_TAKEN" and which would be "PREDICT_TAKEN"?

c) Assumptions:

Under the above assumptions, answer the following questions:

i) If fixed predict-never-taken is used by the hardware, then what will be the total branch penalty (number of cycles wasted) for the algorithm? (Assume there is NO branch-history table.) For partial credit, explain your answer.

ii) If a branch-history table with one history bit per entry is used, then what will be the total branch penalty (number of cycles wasted) for the algorithm? (Assume predict-not-taken is used if there is no match in the branch-history table.) For partial credit, explain your answer.

iii) If a branch-history table with two history bits per entry is used as in Figure 8.12, then what will be the total branch penalty (number of cycles wasted) for the algorithm? (Assume predict-not-taken is used if there is no match in the branch-history table.) For partial credit, explain your answer.
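
The three prediction schemes in part (c) can be compared with a small sketch that counts mispredictions for a single branch, given its sequence of outcomes (True = taken). Everything here is an assumption made for illustration: the function names are hypothetical, the one-bit scheme is taken to predict the most recent outcome, and the two-bit scheme is modeled as a common saturating counter that starts in the strongest not-taken state, which may not match the state diagram in Figure 8.12. Converting a misprediction count to cycles requires the per-branch penalty from the assumptions in part (c), which this sketch does not supply.

```python
def mispredictions_never_taken(outcomes):
    """Fixed predict-never-taken: every taken branch is a misprediction."""
    return sum(1 for taken in outcomes if taken)


def mispredictions_one_bit(outcomes):
    """One history bit: predict whatever the branch did last time.

    Assumption: before the first execution there is no table entry,
    so the first prediction is not-taken.
    """
    predict_taken = False
    misses = 0
    for taken in outcomes:
        if taken != predict_taken:
            misses += 1
        predict_taken = taken  # the single bit records the last outcome
    return misses


def mispredictions_two_bit(outcomes, initial_state=0):
    """Two history bits, modeled as a saturating counter (assumed scheme).

    States 0 and 1 predict not-taken, states 2 and 3 predict taken.
    This may differ from the textbook's Figure 8.12.
    """
    state = initial_state
    misses = 0
    for taken in outcomes:
        if taken != (state >= 2):
            misses += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


if __name__ == "__main__":
    # Hypothetical loop-closing branch: taken four times, then falls through.
    loop_branch = [True, True, True, True, False]
    print(mispredictions_never_taken(loop_branch))  # 4
    print(mispredictions_one_bit(loop_branch))      # 2
    print(mispredictions_two_bit(loop_branch))      # 3
```

To apply this to the algorithm, one outcome sequence would be needed per branch instruction in the compiled loop structure, since each branch has its own history-table entry.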

2. After reading the article about hyperthreading, write a reflection (a page or two) on how well the Pentium 4 pipeline is designed when hyperthreading is not used.