HW #4 Computer Systems

Due: 4/9/02 (Th)

1. Draw the circuit for a 4-to-16 decoder.
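
To check a drawn circuit against the expected behavior, a small behavioral model can help. The sketch below is not the circuit itself. It assumes active-high outputs and no enable input, and the function name decoder_4to16 is made up for illustration.

```python
def decoder_4to16(a3, a2, a1, a0):
    """Return the 16 outputs (0 or 1) of a 4-to-16 decoder.

    Assumptions: active-high outputs, no enable line, a3 is the
    most significant address bit.
    """
    inputs = (a3, a2, a1, a0)
    outputs = []
    for k in range(16):
        # Bits of k, most significant first, say which inputs must be 1.
        wanted = [(k >> shift) & 1 for shift in (3, 2, 1, 0)]
        term = 1
        for value, want in zip(inputs, wanted):
            # Use the input directly if the bit of k is 1, inverted otherwise.
            literal = value if want else 1 - value
            term &= literal  # AND the four literals together
        outputs.append(term)
    return outputs


if __name__ == "__main__":
    # Print the full truth table: one line per input combination.
    for n in range(16):
        bits = [(n >> shift) & 1 for shift in (3, 2, 1, 0)]
        print(bits, decoder_4to16(*bits))
```

Each output is the AND of four literals, so for every input combination exactly one output should be 1.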

2. Draw a diagram of a register file with the following specifications:

Don't just draw a one-bit slice, but instead the whole register file. You do not need to show the internal implementation of D flip-flops, MUXs, or decoders.

3. How well does this register-file design scale? Suppose that we are implementing a 16 M x 8 (16M registers, each with 8 bits) register file with one write-port and one read-port.

a) How many and what type of decoder(s) would be needed?

b) How many total gates (assume only 1-input (NOTs) and 2-input (AND & OR) gates are used) would be needed to implement this (these) decoder(s)?

c) How many and what type of MUX(s) would be needed?

d) How many total gates (assume only 1-input (NOTs) and 2-input (AND & OR) gates are used) would be needed to implement this (these) MUX(s)?

e) Assume that a D flip-flop stores each bit (5 gates per flip-flop). What percentage of the total gates is used to implement the D flip-flops?

4. Redo the previous question using the 16 M x 8 square-memory implementation similar to Figure 4.4 and the class handout.

5. Advanced DRAM designs (such as SDRAM and EDRAM described in the text) are typically accessed in burst mode with 4 (or more) data transfers from consecutive memory addresses. For example, the Intel Pentium (circa 1996) predominantly used the Intel 430HX (supports only EDRAM) or 430VX (supports SDRAM) chipsets.

The ideal timings for these bursts were 5-2-2-2 for the EDRAM and 7-1-1-1 for the SDRAM (5-2-2-2 means that the first data transfer required 5 bus cycles and each of the next three data transfers required only 2 bus cycles).
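
As a quick illustration of how to read this notation, the sketch below adds up the bus cycles in a burst. The helper name burst_cycles is made up for illustration. It only sums the numbers and says nothing about why the timings differ.

```python
def burst_cycles(timing):
    """Return the total bus cycles for a burst written like '5-2-2-2'."""
    # Each dash-separated number is the cycle count for one data transfer.
    return sum(int(part) for part in timing.split("-"))


if __name__ == "__main__":
    for name, timing in (("EDRAM", "5-2-2-2"), ("SDRAM", "7-1-1-1")):
        print(name, timing, "total cycles:", burst_cycles(timing))
```

For the two timings above this gives 11 bus cycles for the EDRAM burst and 10 for the SDRAM burst.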

Rough timing diagrams are:

EDRAM: (time flows from left to right)

RAS

CAS CAS CAS CAS (Can jump around within same Row)

Data 1 Data 2 Data 3 Data 4

SDRAM:

RAS

CAS

Data 1 Data 2 Data 3 Data 4 (adjacent data addresses read in some order)

From looking at Figures 4.26 and 4.27 answer the following questions:

a) Why do the second through fourth data transfers take less time than the first for EDRAM?

b) Why do the second through fourth data transfers take less time than the first for SDRAM?

c) Why does SDRAM require only one cycle for each of the second through fourth data transfers, while EDRAM requires two cycles for each of these data transfers?

6. Suppose we have a 64 MB memory that is byte addressable, and a 32 KB cache with 16 bytes per block.

a) How many total lines are in the cache?

b) If the cache is direct-mapped, how many cache lines could a specific memory block be mapped to?

c) If the cache is direct-mapped, what would be the format (tag bits, cache line bits, block offset bits) of the address? (Clearly indicate the number of bits in each)

d) If the cache is fully-associative, how many cache lines could a specific memory block be mapped to?

e) If the cache is fully-associative, what would be the format of the address?

f) If the cache is 4-way set associative, how many cache lines could a specific memory block be mapped to?

g) If the cache is 4-way set associative, how many sets would there be?

h) If the cache is 4-way set associative, what would be the format of the address?
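
To check address formats like those asked for above, a helper along the following lines may be useful. It is a minimal sketch that assumes all sizes are powers of two and that memory is byte addressable. The function name address_fields and the example sizes (1 MB memory, 4 KB cache, 32-byte blocks) are made up for illustration and deliberately differ from the sizes in this problem.

```python
def address_fields(memory_bytes, cache_bytes, block_bytes, ways):
    """Return the tag, index, and block-offset widths in bits.

    Assumes every size is a power of two and memory is byte addressable.
    ways=1 is direct-mapped; ways equal to the number of cache lines
    is fully associative.
    """
    def log2_exact(n):
        # For a power of two, bit_length() - 1 is its base-2 logarithm.
        assert n > 0 and n & (n - 1) == 0, "size must be a power of two"
        return n.bit_length() - 1

    address_bits = log2_exact(memory_bytes)
    offset_bits = log2_exact(block_bytes)
    lines = cache_bytes // block_bytes   # total cache lines
    sets = lines // ways                 # lines are grouped into sets
    index_bits = log2_exact(sets)
    tag_bits = address_bits - index_bits - offset_bits
    return {"tag": tag_bits, "index": index_bits, "offset": offset_bits}


if __name__ == "__main__":
    # Illustrative sizes only: 1 MB memory, 4 KB cache, 32-byte blocks.
    memory, cache, block = 1 << 20, 4 * 1024, 32
    lines = cache // block
    for label, ways in (("direct-mapped", 1), ("2-way", 2),
                        ("fully associative", lines)):
        print(label, address_fields(memory, cache, block, ways))
```

The three field widths always sum to the number of address bits, which makes a quick sanity check on a hand-derived format.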