Computer Organization (CS 1410) Fall 2017

Time and Place: 2 - 3:15 PM Tuesday and Thursday in ITT 322

eLearning Site: CS 1410 Computer Organization Section 01 Fall 2017
Course website: www.cs.uni.edu/~fienup/cs1410f17/

Class Email List: Send messages to CS-1410-01-FALL@uni.edu from your UNI account

Instructor: Mark Fienup (fienup@cs.uni.edu)
Office: ITTC 313
Phone: 273-5918 (Home 266-5379)
Office Hours: M: 8-11:45, 1:10-3, T: 1:10-2, W: 1:10-2, Th: 1:10-2, F: 8-10

Pre- or Co-requisite: Intro. to Computing (CS 1510) or any programming course

Goals: After this course, you should understand: (1) how data is represented and manipulated on the computer, (2) simple combinational and memory circuits used to build computer components, (3) how these circuits are organized to build a computer, (4) how to program in assembly language, (5) how high-level programming languages are implemented with respect to the run-time stack and built-in data structures such as arrays and records, and (6) general concepts of hardware support necessary for an operating system.


Assignments: Assignments will be both "pencil-and-paper" exercises and assembly-language programming.

Pedagogic Approach: This is a "flipped" class! Before coming to each class, you will be asked to do the assigned reading, watch mini-lecture video(s), and take eLearning quiz(zes). The pre-class activities free up class time to focus on the more challenging content of the course. In class, we’ll dive into active and group learning worksheets and activities. As necessary, I’ll do mini-lectures with the whole class when extra explanation is needed. While the in-class worksheets (etc.) are not formally graded, part (5%) of your grade will be based on your participation in these in-class activities.

Grading policy: There will be three tests (including the final). I’ll announce tests at least one week in advance to allow you time to prepare. Tentative schedule and weighting of course components are:

<table>
<thead>
<tr>
<th>Component</th>
<th>Weight</th>
</tr>
</thead>
<tbody>
<tr>
<td>Pre-class Work</td>
<td>15%</td>
</tr>
<tr>
<td>In-class Work</td>
<td>5%</td>
</tr>
<tr>
<td>Assignments</td>
<td>20%</td>
</tr>
<tr>
<td>In-class Test 1</td>
<td>20%    (September 28)</td>
</tr>
<tr>
<td>In-class Test 2</td>
<td>20%    (November 2)</td>
</tr>
<tr>
<td>Final</td>
<td>20%    (Wednesday, December 13 from 1:00 to 2:50 PM in ITT 322)</td>
</tr>
</tbody>
</table>

Grades will be assigned based on straight percentages off the top student score. For example, if the top student's score is 92%, then the grading scale will be 100-82 A, 81.9-72 B, 71.9-62 C, 61.9-52 D, and below 52 F. Plus and minus grades will be assigned for students near cutoff points.

Scholastic Conduct: You are responsible for being familiar with the University's Academic Ethics Policies (http://www.uni.edu/pres/policies/301.shtml). Copying from other students is expressly forbidden. Doing so on exams or assignments will be penalized every time it is discovered. The penalty can vary from zero credit for the copied items (first offense) up to a failing grade for the course. If an assignment makes you realize you don't understand the material, ask questions designed to improve your understanding, not ones designed to discover how another student solved the assignment. The solutions to assignments should be individual, original work unless otherwise specified. Remember: discussing assignments is good. Copying code or test-question answers is cheating.

Any substantive contribution to your assignment solution by another person or taken from a publication (or the web) should be properly acknowledged in writing. Failure to do so is plagiarism and will necessitate disciplinary action. In addition to the activities we can all agree are cheating (plagiarism, bringing notes to a closed book exam, etc.), assisting or collaborating on cheating is cheating. Cheating can result in failing the course and/or more severe disciplinary actions.

Special Notices:
• In compliance with the University of Northern Iowa policy and equal access laws, I am available to discuss appropriate academic accommodations that may be required for students with disabilities. Requests for academic accommodations are to be made during the first three weeks of the semester, except for unusual circumstances, so arrangements can be made. Students are encouraged to register with Student Disability Services, 103 Student Health Center, to verify their eligibility for appropriate accommodations.

• I encourage you to utilize the Academic Learning Center's free assistance with writing, math, science, reading, and learning strategies. UNI's Academic Learning Center is located in 008 ITTC. Visit the website at http://www.uni.edu/unialc/ or phone 319-273-2361 for more information.

Computer Organization Fall 2017

<table>
<thead>
<tr>
<th>Lect #</th>
<th>Tuesday</th>
<th>Topic</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>8/22</td>
<td>Sect 1.1-1.8: Introduction, Terminology, and Computer History</td>
</tr>
<tr>
<td>3</td>
<td>8/29</td>
<td>Sect 2.4: Signed integers</td>
</tr>
<tr>
<td>5</td>
<td>9/5</td>
<td>Sect 2.6-2.7: Character representation and Error Detection and Correction</td>
</tr>
<tr>
<td>7</td>
<td>9/12</td>
<td>Sect 3.4-3.5: Digital Components: decoders, multiplexers, adders</td>
</tr>
<tr>
<td>9</td>
<td>9/19</td>
<td>Register file vs. Square-Memory RAM</td>
</tr>
<tr>
<td>11</td>
<td>9/26</td>
<td>Review for Test 1</td>
</tr>
<tr>
<td>13</td>
<td>10/3</td>
<td>4.1-4.9: MARIE CPU, Bus, Clock, I/O, Memory, and RTL/RTN</td>
</tr>
<tr>
<td>15</td>
<td>10/10</td>
<td>MARIE Microprogrammed Control Unit</td>
</tr>
<tr>
<td>17</td>
<td>10/17</td>
<td>Sect 4.14: MIPS Assembly Language 1D-Array Examples</td>
</tr>
<tr>
<td>19</td>
<td>10/24</td>
<td>Run-time stack in HLL; CalculatePowers MIPS calling convention example</td>
</tr>
<tr>
<td>21</td>
<td>10/31</td>
<td>Review for Test 2</td>
</tr>
<tr>
<td>25</td>
<td>11/14</td>
<td>Sect 8.1-8.2: OS queues and process mgt; Instr Pipelining; Data and control hazards</td>
</tr>
<tr>
<td>27</td>
<td>11/28</td>
<td>Superscalar, WAR &amp; WAW dependencies; Ch 5: Memory Hierarchy and cache</td>
</tr>
<tr>
<td>29</td>
<td>12/5</td>
<td>Virtual Memory Examples</td>
</tr>
</tbody>
</table>

Final: Wednesday, December 13 from 1:00 to 2:50 PM in ITT 322
Below is the description of the hardware features of a desktop PC:

- Intel® 2nd Generation Core™ i3 (3.4GHz, L2 Cache Memory 3MB)
- System Memory: (RAM) 8GB DDR3 SDRAM
- Hard Drive: 1.5TB SATA (7200 rpm)
- Intel® HD Graphics 2000 (Video Memory Up to 4GB shared)
- Network Card: Built-in 10/100/1000Base-T Ethernet LAN
- Wireless Networking: Built-in 802.11b/g/n wireless LAN
- Recordable DVD Drive: 8x DVD+R DL; 8x DVD-R DL; 16x8x16 DVD+RW; 16x6x16 DVD-RW; 5x DVD-RAM; 40x24x40 CD-RW
- Digital Media Card Reader
- Available Expansion Bays: External: 1 (3.5"), 1 (5.25"); Internal: 1 (3.5")
- Available Expansion Slots: 1 PCI Express x1, 1 PCI Express x16
- USB 2.0 Ports: 2 USB 3.0 (front); 4 USB 2.0 (rear)
- Wireless Keyboard with volume control
- Wireless optical mouse
- 20" widescreen flat panel monitor
- Operating System: Windows 8

1) What does the processor do?

2) What is stored in main memory (RAM)?

3) What is stored on the hard disk?

4) What is the purpose of cache memory?

5) What terms relate to interconnection of internal PC components?

6) What terms relate to interconnection of external PC components?

7) What is a KB, MB, GB, Gigabit, MHz, GHz?

8) What is the role of the operating system?

[Figure: a stored-program computer executing SUM = X + Y]

- CPU/Processor: Control Unit; ALU (+, -, *, etc.); PSW (process status word); PC (program counter); IR (instruction register); MAR (memory address register); MBR (memory buffer register); Register File (R0, R1, R2, R3, ..., R31)
- System Bus: joins the CPU/Processor, Memory (addresses), and the I/O Controller (Disk)
- Memory, labeled "A Program's Address Space": Operating System Area, Unused, Stack, Heap, Global Data, Program Area
- Program Area contents for SUM = X + Y:

Load R3, Y
Load R2, X
Add R1, R2, R3
Store R1, SUM

Processing (Instruction/Machine) Cycle of stored-program computer - repeat all day
1. Fetch Instruction - read instruction pointed at by the program counter (PC) from memory into Instruction Reg. (IR)
2. Decode Instruction - figure out what kind of instruction was read
3. Fetch Operands - get operand values from the memory or registers
4. Execute Instruction - do some operation with the operands to get some result
5. Write Result - put the result into a register or in a memory location
(Note: Sometime during the above steps, the PC is updated to point to the next instruction.)
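
The five steps above can be sketched as a tiny simulator. This is only an illustration under simplifying assumptions: the tuple-based instruction format, the `run` function, and the dictionary standing in for memory are hypothetical and do not model any real machine.

```python
def run(program, memory):
    """Run a toy program of (op, ...) tuples and return the final memory."""
    registers = {}              # register file, e.g. {"R1": 5}
    pc = 0                      # program counter: index of the next instruction
    while pc < len(program):
        ir = program[pc]        # 1. fetch the instruction into the "IR"
        pc += 1                 #    update the PC to point to the next instruction
        op = ir[0]              # 2. decode: what kind of instruction is it?
        if op == "Load":        # Load Rd, var
            _, rd, var = ir
            registers[rd] = memory[var]                    # fetch operand, write result
        elif op == "Add":       # Add Rd, Rs, Rt
            _, rd, rs, rt = ir
            registers[rd] = registers[rs] + registers[rt]  # fetch, execute, write
        elif op == "Store":     # Store Rs, var
            _, rs, var = ir
            memory[var] = registers[rs]                    # write result to memory
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [
    ("Load", "R3", "Y"),
    ("Load", "R2", "X"),
    ("Add", "R1", "R2", "R3"),
    ("Store", "R1", "SUM"),
]
final_memory = run(program, {"X": 2, "Y": 3, "SUM": 0})
print(final_memory["SUM"])
```

The sample values X = 2 and Y = 3 are arbitrary; any integers would do.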
9) What is the role of a compiler?

10) What advantages do high-level languages (Ada, C, C++, Java, Python, etc.) have over assembly language?

11) Why do people program in assembly language (AL)?
<table>
<thead>
<tr>
<th>Type of Instruction</th>
<th>MIPS Assembly Language</th>
<th>Register Transfer Language Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Memory Access (Load and Store)</td>
<td>lw $4, Mem</td>
<td>$4 ← [Mem]</td>
</tr>
<tr>
<td></td>
<td>sw $4, Mem</td>
<td>Mem ← $4</td>
</tr>
<tr>
<td></td>
<td>lw $4, 16($3)</td>
<td>$4 ← [Mem at address in $3 + 16]</td>
</tr>
<tr>
<td></td>
<td>sw $4, 16($3)</td>
<td>[Mem at address in $3 + 16] ← $4</td>
</tr>
<tr>
<td>Move</td>
<td>move $4, $2</td>
<td>$4 ← $2</td>
</tr>
<tr>
<td></td>
<td>li $4, 100</td>
<td>$4 ← 100</td>
</tr>
<tr>
<td>Load Address</td>
<td>la $5, mem</td>
<td>$5 ← address of mem</td>
</tr>
<tr>
<td>Arithmetic Instruction (reg. operands only)</td>
<td>add $4, $2, $3</td>
<td>$4 ← $2 + $3</td>
</tr>
<tr>
<td></td>
<td>mul $10, $12, $8</td>
<td>$10 ← $12 * $8 (32-bit product)</td>
</tr>
<tr>
<td></td>
<td>sub $4, $2, $3</td>
<td>$4 ← $2 - $3</td>
</tr>
<tr>
<td>Arithmetic with Immediates (last operand must be an integer)</td>
<td>addi $4, $2, 100</td>
<td>$4 ← $2 + 100</td>
</tr>
<tr>
<td></td>
<td>mul $4, $2, 100</td>
<td>$4 ← $2 * 100 (32-bit product)</td>
</tr>
<tr>
<td>Conditional Branch</td>
<td>bgt $4, $2, LABEL</td>
<td>Branch to LABEL if $4 &gt; $2</td>
</tr>
<tr>
<td></td>
<td>(bge, bgt, ble, blt, beq, bne)</td>
<td></td>
</tr>
<tr>
<td>Unconditional Branch</td>
<td>j LABEL</td>
<td>Always Branch to LABEL</td>
</tr>
</tbody>
</table>

Fibonacci Sequence:          0  1  1  2  3  5  8  13  21
Position in Sequence:        0  1  2  3  4  5  6  7  8

A high-level language program to calculate the $n$th Fibonacci number would be:

```
temp2 = 0
temp3 = 1
for i = 2 to n do
    temp4 = temp2 + temp3
    temp2 = temp3
    temp3 = temp4
end for
result = temp4
```

A complete MIPS assembly language program to calculate the $n$th Fibonacci number:

```
.data
n: .word 8  # variable in memory
result: .word 0  # variable in memory

.text
.globl main
main:  li $2, 0  # $2 holds temp2
       li $3, 1  # $3 holds temp3
for_init:  li $6, 2  # initialize i ($6) to 2
           lw $5, n  # load "n" into $5
for_loop:  bgt $6, $5, end_for  # if $6 > $5, then branch to end_for label
           add $4, $2, $3  # $4 holds temp4
           move $2, $3  # shift temp3 to temp2
           move $3, $4  # shift temp4 to temp3
           addi $6, $6, 1  # increment i ($6)
           j for_loop  # unconditionally jump to for_loop label
end_for:  sw $4, result  # store the result to memory
          li $v0, 10  # system code for exit
          syscall  # call the operating system
```
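
For comparison, here is a sketch of the same loop in Python, keeping the variable names of the pseudocode. The function name `fibonacci` and the guard for `n < 2` are additions for illustration; the pseudocode and the MIPS program above do not handle that case because their loop body never runs.

```python
def fibonacci(n):
    """Return the nth Fibonacci number (position 0 holds 0, position 1 holds 1)."""
    if n < 2:
        return n                  # added guard: the loop below never runs for n < 2
    temp2, temp3 = 0, 1           # held in $2 and $3 in the MIPS version
    for i in range(2, n + 1):     # i is held in $6, n in $5
        temp4 = temp2 + temp3     # held in $4
        temp2 = temp3
        temp3 = temp4
    return temp4


print(fibonacci(8))
```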
Computer Technology

Built from two types of components plus interconnection:

Gates:

Binary Inputs → Binary Function → Output

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>OR</th>
</tr>
</thead>
<tbody>
<tr>
<td>F</td>
<td>F</td>
<td>F</td>
</tr>
<tr>
<td>F</td>
<td>T</td>
<td>T</td>
</tr>
<tr>
<td>T</td>
<td>F</td>
<td>T</td>
</tr>
<tr>
<td>T</td>
<td>T</td>
<td>T</td>
</tr>
</tbody>
</table>

Memory Cells:

Input → One Bit Storage → Output

Read/Write control signal

Circuit - collection of gates and/or memory cells to perform some function
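
As a rough illustration of gates being wired into a circuit, the sketch below models three gates as Python functions and combines them into an exclusive-OR circuit. The function names and the choice of exclusive-OR as the example circuit are assumptions made for this sketch, not part of the notes.

```python
def OR(a, b):
    return a or b


def AND(a, b):
    return a and b


def NOT(a):
    return not a


def xor_circuit(a, b):
    """A circuit built from gates: true when exactly one input is true."""
    return AND(OR(a, b), NOT(AND(a, b)))


# Print the truth tables, using T/F as in the OR table above.
for a in (False, True):
    for b in (False, True):
        row = [a, b, OR(a, b), xor_circuit(a, b)]
        print(" ".join("T" if value else "F" for value in row))
```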
Computer Generations

1st Generation - vacuum tubes with wires for interconnection (one gate per vacuum tube)

Examples: ENIAC - 1943-46 Army's Ballistics Lab Mauchly/Eckert (U. of Penn.) 1st general-purpose electronic digital computer
30 tons, 15,000 sq. ft., 18,000 vacuum tubes, 140 KW, 5000 adds/sec
Programmed with wires and switches

1945-52 - IAS John von Neumann (Princeton) used stored-program concept

2nd Generation - transistors - solid state device made from silicon, but single gate each
10,000 to 100,000 transistors soldered to circuit board
Advantages: improved speed, reliability, size, power consumption
Some system software: early OS and High-level programming languages
3rd Generation - Small-scale integrated circuits (ICs)
Many gates on same wafer of silicon (chip) and connected to form circuits.
Advantages:
1) cheaper since the cost per chip is the same, but fewer are needed since each is more powerful
2) denser chips → shorter connections → faster
3) smaller computers → used in more places
4) reduced power consumption and cooling needs
5) interconnections on ICs are more reliable than interconnections between ICs

4th Generation - Large-scale ICs - microprocessor
Enough gates per chip to implement whole CPU
Intel 4004 (1971) - 1st microprocessor
Intel 8080 (1974) - 1st general-purpose microprocessor
Moore's Law

Gordon Moore - Intel cofounder (1965)

Predicted that gate density would double every year into the near future.

This held true until the early 1970s, when the rate slowed to doubling every 18 months.
Today's stored-program computers have the following characteristics:
- Three hardware systems:
  - A central processing unit (CPU)
  - A main memory system
  - An I/O system
- The capacity to carry out sequential instruction processing.
- A single data path between the CPU and main memory.
  - This single path is known as the *von Neumann bottleneck*.
  - Register file: stores a small amount of data in the CPU, close to the computation circuits, so fewer accesses must cross this path
  - Level 1 (L1) and Level 2 (L2) caches: store larger amounts of data and instructions on the CPU chip
Programming Languages

<table>
<thead>
<tr>
<th>Machine Language</th>
<th>Assembly Language</th>
<th>High-Level Languages</th>
</tr>
</thead>
<tbody>
<tr>
<td>10100100</td>
<td>Load R3, Y</td>
<td>Compiler</td>
</tr>
<tr>
<td>10011011</td>
<td>Load R2, X</td>
<td>SUM = X+Y</td>
</tr>
<tr>
<td>00110011</td>
<td>Add R1,R2,R3</td>
<td></td>
</tr>
<tr>
<td>10011100</td>
<td>Store R1, SUM</td>
<td></td>
</tr>
</tbody>
</table>

Instruction Set Architecture (ISA)/Family of Computers
Several implementations of a computer can run the same assembly/machine language at different cost/performance ratios.
Examples:
IBM 700/7000 Series (overhead)
Intel 486, Pentium I, II, III, IV (overhead)

Advantages of High-Level Programming Languages:
1) Portability - computer independent
2) Productivity - more and better software in less time
3) Application specific languages
Computer Networking

1962 - RAND (Research and Development, a nonprofit institution) and Paul Baran begin research into packet switching.

1965 - ARPA (Advanced Research Projects Agency) sponsors networking research.

1969 - ARPA commissioned a Cambridge, Mass. company (Bolt, Beranek, and Newman) to build the first packet switches. Later, four computers were connected.

.  
.  
. 

Soon networking bandwidth will be unlimited???

Computer Applications

Where are applications going???
The Computer Level Hierarchy

- Computers consist of many things besides chips.

- Before a computer can do anything worthwhile, it must also use software.

- Writing complex programs requires a "divide and conquer" approach, where each program module solves a smaller problem.

- Complex computer systems employ a similar technique through a series of virtual machine layers.
The machines at each level execute their own particular instructions, calling upon machines at lower levels to perform tasks as required.
• **Level 6: The User Level**
  – Program execution and user interface level.
  – The level with which we are most familiar.

• **Level 5: High-Level Language Level**
  – The level with which we interact when we write programs in languages such as C, Pascal, Lisp, and Java.

• **Level 4: Assembly Language Level**
  – Acts upon assembly language produced from Level 5, as well as instructions programmed directly at this level.

• **Level 3: System Software Level**
  – Controls executing processes on the system.
  – Protects system resources.
  – Assembly language instructions often pass through Level 3 without modification.
• **Level 2: Machine Level**
  – Also known as the Instruction Set Architecture (ISA) Level.
  – Consists of instructions that are particular to the architecture of the machine.
  – Programs written in machine language need no compilers, interpreters, or assemblers.

• **Level 1: Control Level**
  – A control unit decodes and executes instructions and moves data through the system.
  – Control units can be microprogrammed or hardwired.
  – A microprogram is a program written in a low-level language that is implemented by the hardware.
  – Hardwired control units consist of hardware that directly executes machine instructions.
• **Level 0: Digital Logic Level**
  – This level is where we find digital circuits (the chips).
  – Digital circuits consist of gates and wires.
  – These components implement the mathematical logic of all other levels.
1) Complete the following table.

<table>
<thead>
<tr>
<th></th>
<th>Decimal (Base 10)</th>
<th>Binary (Base 2)</th>
<th>Hexadecimal (Base 16)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of digits:</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Digits:</td>
<td>0, 1, 2, 3, 4, 5, 6, 7, 8, 9</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Counting:</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>3</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>5</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>6</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>7</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>8</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>9</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>12</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>13</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>14</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>15</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>16</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>17</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

3. Convert $375_{10}$ to a binary (base 2) value.

4. Convert $375_{10}$ to a hexadecimal (base 16) value.

5. Convert $2BA_{16}$ to a decimal (base 10) value.

6. Perform the following arithmetic operations:
   
   $1001010_2 + 1101110_2 = 10100110_2$  
   $1100010_2 - 1001011_2 = 0010100_2$  
   $CB31A_{16} + 73A18_{16} = A19D1_{16}$  
   $- 4A73_{16}$
Options for representing signed integers: 8-bit example for -19₁₀

a) signed magnitude: a sign bit (positive is 0, negative is 1) followed by the magnitude

\[
-19 = 1 0 0 1 0 0 1 1 \quad \text{sign bit 1, then the 7-bit magnitude of 19}
\]

b) one's complement: positive values are their binary #. For negative values, invert all the bits of binary # of the absolute value

\[
\text{abs}(-19) = +19 = 0 0 0 1 0 0 1 1 \\
-19 = 1 1 1 0 1 1 0 0 \quad \text{Invert bits to get one's complement}
\]

c) two's complement: positive values are their binary #. For negative values, invert all the bits of binary # of the absolute value, then add 1

\[
\text{abs}(-19) = +19 = 0 0 0 1 0 0 1 1 \\
1 1 1 0 1 1 0 0 \quad \text{Invert bits to get one's complement, then} \\
+ 1 \quad \text{Add 1 to get two's complement} \\
-19 = 1 1 1 0 1 1 0 1
\]
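
The three encodings above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes the value fits in the chosen width (it performs no range checking).

```python
def signed_magnitude(value, bits=8):
    """Sign bit (0 for positive, 1 for negative) followed by the magnitude."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def ones_complement(value, bits=8):
    """For negative values, invert every bit of the absolute value."""
    mask = (1 << bits) - 1
    pattern = abs(value) ^ mask if value < 0 else value
    return format(pattern, f"0{bits}b")


def twos_complement(value, bits=8):
    """For negative values, invert every bit of the absolute value, then add 1."""
    mask = (1 << bits) - 1
    pattern = ((abs(value) ^ mask) + 1) & mask if value < 0 else value
    return format(pattern, f"0{bits}b")


for encode in (signed_magnitude, ones_complement, twos_complement):
    print(encode.__name__, encode(-19))
```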

7. Represent the following decimal numbers in binary using 8-bit signed magnitude, one's complement, and two's complement:

<table>
<thead>
<tr>
<th>decimal number</th>
<th>signed magnitude 8-bits</th>
<th>one's complement 8-bits</th>
<th>two's complement 8-bits</th>
</tr>
</thead>
<tbody>
<tr>
<td>97₁₀</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>-45₁₀</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

8. Using 8-bits what is the range of values for each of the following representations:

a) unsigned integers:

b) signed integers using two's complement:

Counting (lecture video notes): in decimal, a digit rolls over after 9 (..., 0009, 0010, 0011, ..., 0019, 0020); in binary, a digit rolls over after 1 (0000, 0001, 0010, 0011, 0100, ..., 1001, 1010).

Binary place values are powers of 2:

\[ 2^4 = 16, \quad 2^3 = 8, \quad 2^2 = 4, \quad 2^1 = 2, \quad 2^0 = 1 \]

\[ 01101_2 = 8 + 4 + 1 = 13_{10} \]

\[ \text{Converting: } \quad 101001_2 = ?_{10} \]

\[ 32 + 8 + 1 = 41_{10} \]

\[ 86_{10} = ?_2 \]

\[ \begin{array}{ccccccccc}
256 & 128 & 64 & 32 & 16 & 8 & 4 & 2 & 1 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0
\end{array} \]

\[ 86 - 64 = 22, \quad 22 - 16 = 6, \quad 6 - 4 = 2, \quad 2 - 2 = 0 \quad \Rightarrow \quad 86_{10} = 001010110_2 \]
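
The place-value method used in these notes (subtract the largest power of 2 that still fits, working from left to right) can be sketched as follows. The function names and the default width of 9 bits are assumptions chosen to match the 256-to-1 place values; the sketch assumes a non-negative value smaller than 2 to the power `bits`.

```python
def to_binary(value, bits=9):
    """Convert a non-negative integer to binary by subtracting place values."""
    digits = []
    for position in range(bits - 1, -1, -1):
        weight = 2 ** position        # 256, 128, 64, ..., 2, 1 when bits is 9
        if value >= weight:
            digits.append("1")
            value -= weight           # e.g. 86 - 64 = 22, 22 - 16 = 6, ...
        else:
            digits.append("0")
    return "".join(digits)


def to_decimal(binary):
    """Add up the place values of the 1 bits."""
    return sum(2 ** i for i, bit in enumerate(reversed(binary)) if bit == "1")


print(to_binary(86))
print(to_decimal("101001"))
```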
Hexadecimal

Digits: 0, 1, 2, ..., 9, A, B, C, D, E, F

Counting:

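\[
\begin{aligned}
10 & = \text{A}_{16} = 1010_2 \\
11 & = \text{B}_{16} = 1011_2 \\
12 & = \text{C}_{16} = 1100_2 \\
13 & = \text{D}_{16} = 1101_2 \\
14 & = \text{E}_{16} = 1110_2 \\
15 & = \text{F}_{16} = 1111_2 \\
16 & = 10_{16} = 10000_2 \\
17 & = 11_{16} = 10001_2 \\
18 & = 12_{16} = 10010_2 \\
19 & = 13_{16} = 10011_2 \\
20 & = 14_{16} = 10100_2 \\
21 & = 15_{16} = 10101_2 \\
22 & = 16_{16} = 10110_2 \\
23 & = 17_{16} = 10111_2 \\
24 & = 18_{16} = 11000_2 \\
25 & = 19_{16} = 11001_2 \\
26 & = 1\text{A}_{16} = 11010_2 \\
27 & = 1\text{B}_{16} = 11011_2 \\
28 & = 1\text{C}_{16} = 11100_2 \\
29 & = 1\text{D}_{16} = 11101_2 \\
30 & = 1\text{E}_{16} = 11110_2
\end{aligned}
\]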
Why hex?

Base 2 ↔ Base 16

\[ 16 = 2^4 \quad \text{so each hex digit stands for four bits (place values 8 4 2 1)} \]

\[ 86_{10} = 0101\ 0110_2 = 56_{16} \]

Hex / Dec

<table>
<thead>
<tr>
<th>Hex</th>
<th>Dec</th>
</tr>
</thead>
<tbody>
<tr>
<td>A</td>
<td>10</td>
</tr>
<tr>
<td>B</td>
<td>11</td>
</tr>
<tr>
<td>C</td>
<td>12</td>
</tr>
<tr>
<td>D</td>
<td>13</td>
</tr>
<tr>
<td>E</td>
<td>14</td>
</tr>
<tr>
<td>F</td>
<td>15</td>
</tr>
</tbody>
</table>

$3EA_{16} = ?_2$

8421 8421 8421
0011 1110 1010$_2$

Base 10 ↔ Base 2 ↔ Base 16
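
Because each hex digit corresponds to exactly four bits, converting between base 2 and base 16 is a digit-by-digit lookup. A minimal sketch, with hypothetical function names, that maps the hex digits 3, E, A to the groups 0011 1110 1010 and back:

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_binary(hex_string):
    """Replace each hex digit with its 4-bit group (place values 8 4 2 1)."""
    return " ".join(format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper())


def binary_to_hex(bits):
    """Group bits in fours from the right and replace each group with a hex digit."""
    bits = bits.replace(" ", "")
    padded_length = -(-len(bits) // 4) * 4      # round up to a multiple of 4
    bits = bits.zfill(padded_length)            # pad on the left with 0s
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


print(hex_to_binary("3EA"))
print(binary_to_hex("1010110"))
```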

Binary:

\[ \begin{array}{r}
0101001_2 \\
+\ 0100101_2 \\
\hline
1001110_2
\end{array} \]

Decimal:

\[ \begin{array}{r}
234_{10} \\
+\ 186_{10} \\
\hline
420_{10}
\end{array} \]

\[ \begin{array}{r}
234_{10} \\
-\ 186_{10} \\
\hline
048_{10}
\end{array} \]

Hexadecimal (a column sum of 16 or more carries a 1 into the next column):

\[ \begin{array}{r}
3B8_{16} \\
+\ 1A9_{16} \\
\hline
561_{16}
\end{array} \qquad \text{middle column: } 1 + 11 + 10 = 22_{10} - 16_{10} = 6_{10} \]

Hexadecimal (a borrow adds 16 to the column):

\[ \begin{array}{r}
3B8_{16} \\
-\ 1A9_{16} \\
\hline
20F_{16}
\end{array} \qquad \text{right column: } 16 + 8 = 24_{10} - 9_{10} = 15_{10} = F_{16} \]

\[ A_{16} = 10_{10}, \quad B_{16} = 11_{10}, \quad C_{16} = 12_{10}, \quad D_{16} = 13_{10}, \quad E_{16} = 14_{10}, \quad F_{16} = 15_{10} \]
1. Consider 4-bit binary numbers. Because of “roll-over,” the number line wraps around.
   a) List unsigned decimal values on the outside
   b) List signed (two’s complement) decimal values on the inside
   c) Mark the point of unsigned overflow
   d) Mark the point of signed overflow

Perform the following additions:
   e) for unsigned numbers (two separate additions):
      \[ 0100_2 \ (4_{10}) + 0110_2 \ (6_{10}) \qquad\qquad 1001_2 \ (9_{10}) + 1010_2 \ (10_{10}) \]
   f) for signed numbers (two’s complement; two separate additions):
      \[ 0100_2 \ (4_{10}) + 0110_2 \ (6_{10}) \qquad\qquad 1100_2 \ (-4_{10}) + 1010_2 \ (-6_{10}) \]

2. For 4-bit unsigned numbers, when do we have overflow and get the wrong result during addition? (Hint: think about the carry bits into and/or out of the most-significant bit)

3. a) For 4-bit signed numbers, complete the following table about signed overflow:

<table>
<thead>
<tr>
<th>Sign of Operand 1</th>
<th>Sign of Operand 2</th>
<th>Expected Sign of Result</th>
<th>Wrong Sign of Result (indicates overflow)</th>
</tr>
</thead>
<tbody>
<tr>
<td>+</td>
<td>+</td>
<td></td>
<td></td>
</tr>
<tr>
<td>+</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>-</td>
<td>+</td>
<td></td>
<td></td>
</tr>
<tr>
<td>-</td>
<td>-</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

b) For 4-bit signed numbers, when do we have overflow and get the wrong result during addition? (Hint: think about the carry bits into and/or out of the most-significant bit)
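
To check worksheet answers, the sketch below adds two 4-bit patterns and reports the carry into and the carry out of the most-significant bit. It applies one commonly used rule (unsigned overflow when there is a carry out; signed overflow when the carry in and carry out differ); the function name `add_bits` and the returned dictionary layout are assumptions made for this sketch.

```python
def add_bits(a, b, bits=4):
    """Add two bit patterns given as integers and report the carries at the MSB."""
    mask = (1 << bits) - 1
    low_mask = mask >> 1                      # every bit except the MSB
    total = a + b
    carry_out = (total >> bits) & 1           # carry out of the MSB
    carry_in = (((a & low_mask) + (b & low_mask)) >> (bits - 1)) & 1  # carry into the MSB
    return {
        "result": format(total & mask, f"0{bits}b"),
        "carry_in": carry_in,
        "carry_out": carry_out,
        "unsigned_overflow": carry_out == 1,
        "signed_overflow": carry_in != carry_out,
    }


print(add_bits(0b0100, 0b0110))
print(add_bits(0b1001, 0b1010))
print(add_bits(0b1100, 0b1010))
```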
4. Use Booth's algorithm to calculate the 8-bit product of $0110_2 \times 1101_2$.

\[
\begin{array}{cccc}
\text{Multiplicand} & \text{"Initial Product"} & \text{"Multiplier"} & \text{"Previous bit"} \\
0110 & 0000 & 1101 & 0 \\
1010 & & & \\
\end{array}
\]
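
A minimal sketch of Booth's algorithm for checking a hand calculation follows. It assumes the textbook formulation with an accumulator A, the multiplier Q, and a "previous bit", all treated as two's complement values of the given width; the function name is hypothetical, and the sketch does not handle the corner case where the multiplicand is the most negative value for that width.

```python
def booth_multiply(multiplicand, multiplier, bits=4):
    """Return the 2*bits-wide product bit pattern using Booth's algorithm."""
    mask = (1 << bits) - 1
    m = multiplicand & mask
    a = 0                       # accumulator: the "initial product" half
    q = multiplier & mask
    q_prev = 0                  # the "previous bit"
    for _ in range(bits):
        q0 = q & 1
        if (q0, q_prev) == (1, 0):
            a = (a - m) & mask  # start of a run of 1s: subtract the multiplicand
        elif (q0, q_prev) == (0, 1):
            a = (a + m) & mask  # end of a run of 1s: add the multiplicand
        # Arithmetic shift right of the combined A, Q, previous-bit register.
        q_prev = q0
        q = (q >> 1) | ((a & 1) << (bits - 1))
        sign = a >> (bits - 1)
        a = (a >> 1) | (sign << (bits - 1))
    return (a << bits) | q


product = booth_multiply(0b0110, 0b1101)
print(format(product, "08b"))
```

Interpreted as 4-bit two's complement values, the operands are 6 and -3, so the 8-bit pattern printed should encode -18.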
Lecture 3 Video

Sign-Magnitude

$-19_{10}$ in 8 bits (sign bit, then place values 64 32 16 8 4 2 1):

1 0 0 1 0 0 1 1 (19 - 16 = 3, so the magnitude is 16 + 2 + 1)

$+23_{10}$:

0 0 0 1 0 1 1 1

Range with 8 bits:

0 1 1 1 1 1 1 1 = $+127_{10}$

1 1 1 1 1 1 1 1 = $-127_{10}$

8-bit unsigned:

0 0 0 0 0 0 0 0$_2$ ⇒ $0_{10}$

1 0 0 0 0 0 0 0$_2$ ⇒ $128_{10}$

1 1 1 1 1 1 1 1$_2$ ⇒ $+255_{10}$

One's complement:

0 or +: Normal binary #
- ① Start with + value
  ② Flip all bits (0 ↔ 1)

8-bit -19₁₀:
① +19 = 0  0  0  1  0  0  1  1
②       1  1  1  0  1  1  0  0   (leftmost bit is the "sign bit")

Two's complement:

0 or +: Normal binary #
- ① Start with + value
  ② Flip bits
  ③ Add 1

8-bit -19₁₀:
① +19 = 0  0  0  1  0  0  1  1
②       1  1  1  0  1  1  0  0
③ Add 1
        1  1  1  0  1  1  0  1   (leftmost bit is the "sign bit")

Reading a negative two's complement value, $-19_{10}$ = 11101101:

1. Flip bits: 00010010

2. Add 1: 00010011 = 16 + 2 + 1 = +19, so the original value is -19
IEEE 754 Standard Floating Point Representation

<table>
<thead>
<tr>
<th></th>
<th>Sign bit</th>
<th>8-bit Exponent (bias 127)</th>
<th>23-bit Mantissa (for normalized values, leading 1 not stored)</th>
<th>Single Precision 32-bit</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>+</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th></th>
<th>Sign bit</th>
<th>11-bit Exponent (bias 1023)</th>
<th>52-bit Mantissa (for normalized values, leading 1 not stored)</th>
<th>Double Precision 64-bit</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>+</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th></th>
<th>Sign bit</th>
<th>15-bit Exponent (bias 16,383)</th>
<th>112-bit Mantissa (for normalized values, leading 1 not stored)</th>
<th>Quad Precision 128-bit</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>+</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

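<table>
<thead>
<tr>
<th>Single Precision Exponent</th>
<th>Single Precision Mantissa</th>
<th>Double Precision Exponent</th>
<th>Double Precision Mantissa</th>
<th>Object Represented</th>
</tr>
</thead>
<tbody>
<tr>
<td>1-254</td>
<td>any value</td>
<td>1-2046</td>
<td>any value</td>
<td>normalized floating point number</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>zero</td>
</tr>
<tr>
<td>0</td>
<td>nonzero</td>
<td>0</td>
<td>nonzero</td>
<td>denormalized number</td>
</tr>
<tr>
<td>255</td>
<td>0</td>
<td>2,047</td>
<td>0</td>
<td>infinity</td>
</tr>
<tr>
<td>255</td>
<td>nonzero</td>
<td>2,047</td>
<td>nonzero</td>
<td>NaN (not a number)</td>
</tr>
</tbody>
</table>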

1) Convert the value $23.625_{10}$ to its binary representation.

```
64 32 16 8 4 2 1
```

2) Normalize the above value so that the most significant 1 is immediately to the left of the radix point. Include the corresponding exponent value to indicate the motion of the radix point.

```
1.
```

3) Write the corresponding 32-bit IEEE 754 floating point representation for $23.625_{10}$.

4) Write the corresponding 128-bit IEEE 754 floating point representation for $23.625_{10}$. 
5) How would you add two normalized IEEE 754 floating point numbers?

6) Consider adding $1.011 \times 2^{40}$ and $1.01 \times 2^{5}$.
   a) How many places does the second number's mantissa get shifted?
   b) After we add these two numbers and store the results back into a 32-bit IEEE 754 value, what would be the result?

7) How would you multiply two normalized IEEE 754 floating point numbers?

8) What would be the smallest positive normalized 32-bit IEEE 754 floating point value?

9) What would be the largest positive denormalized 32-bit IEEE 754 floating point value?
   (denormalized: exponent is still bias 127, but no implied "1." (i.e., "leading 1 not stored"), but "0." instead)

10) What would be the smallest positive denormalized 32-bit IEEE 754 floating point value?
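
To check a hand-worked single precision answer, the fields of a 32-bit pattern can be pulled apart with Python's standard `struct` module. This is a sketch for verification only; the function name `float32_fields` is hypothetical.

```python
import struct


def float32_fields(value):
    """Return (sign, stored exponent, mantissa) of value as a 32-bit IEEE 754 float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))  # reinterpret the 4 bytes as an integer
    sign = raw >> 31                  # 1 bit
    exponent = (raw >> 23) & 0xFF     # 8 bits, stored with bias 127
    mantissa = raw & 0x7FFFFF         # 23 bits, leading 1 not stored
    return sign, exponent, mantissa


sign, exponent, mantissa = float32_fields(23.625)
print(sign, format(exponent, "08b"), format(mantissa, "023b"))
print("true exponent:", exponent - 127)
```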
Lecture 4 Floating Point

Decimal:

\[ 2.3456_{10} = 2.3456 \times 10^0 = 234.56 \times 10^{-2} \]

8-bit exponent: \( 2^8 = 256 \) values, stored with bias 127

Example: $73.3125_{10}$

Place values to the left of the radix point: 64, 32, 16, 8, 4, 2, 1; to the right: 0.5, 0.25, 0.125, 0.0625

\[ 73.3125_{10} = 1001001.0101_2 = 1.0010010101_2 \times 2^{6} \]

32-bit: stored exponent \( 6 + 127 = 133 = 10000101_2 \)

0 10000101 00100101010000000000000

64-bit: stored exponent \( 6 + 1023 = 1029 = 10000000101_2 \)

128-bit: stored exponent \( 6 + 16383 = 16389 = 100000000000101_2 \)

Denormalized

[Figure: number line marking 0.0, the denormalized values, the smallest positive normalized value, and the largest normalized value]
## ASCII Character Representation

<table>
<thead>
<tr>
<th>Code</th>
<th>Character</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>NUL</td>
</tr>
<tr>
<td>1</td>
<td>SOH</td>
</tr>
<tr>
<td>2</td>
<td>STX</td>
</tr>
<tr>
<td>3</td>
<td>ETX</td>
</tr>
<tr>
<td>4</td>
<td>EOT</td>
</tr>
<tr>
<td>5</td>
<td>ENQ</td>
</tr>
<tr>
<td>6</td>
<td>ACK</td>
</tr>
<tr>
<td>7</td>
<td>BEL</td>
</tr>
<tr>
<td>8</td>
<td>BS</td>
</tr>
<tr>
<td>9</td>
<td>HT</td>
</tr>
<tr>
<td>10</td>
<td>LF</td>
</tr>
<tr>
<td>11</td>
<td>VT</td>
</tr>
<tr>
<td>12</td>
<td>FF</td>
</tr>
<tr>
<td>13</td>
<td>CR</td>
</tr>
<tr>
<td>14</td>
<td>SO</td>
</tr>
<tr>
<td>15</td>
<td>SI</td>
</tr>
<tr>
<td>16</td>
<td>DLE</td>
</tr>
<tr>
<td>17</td>
<td>DC1</td>
</tr>
<tr>
<td>18</td>
<td>DC2</td>
</tr>
<tr>
<td>19</td>
<td>DC3</td>
</tr>
<tr>
<td>20</td>
<td>DC4</td>
</tr>
<tr>
<td>21</td>
<td>NAK</td>
</tr>
<tr>
<td>22</td>
<td>SYN</td>
</tr>
<tr>
<td>23</td>
<td>ETB</td>
</tr>
<tr>
<td>24</td>
<td>CAN</td>
</tr>
<tr>
<td>25</td>
<td>EM</td>
</tr>
<tr>
<td>26</td>
<td>SUB</td>
</tr>
<tr>
<td>27</td>
<td>ESC</td>
</tr>
<tr>
<td>28</td>
<td>FS</td>
</tr>
<tr>
<td>29</td>
<td>GS</td>
</tr>
<tr>
<td>30</td>
<td>RS</td>
</tr>
<tr>
<td>31</td>
<td>US</td>
</tr>
</tbody>
</table>

### Abbreviations

<table>
<thead>
<tr>
<th>Abbreviation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>NUL</td>
<td>Null</td>
</tr>
<tr>
<td>SOH</td>
<td>Start of heading</td>
</tr>
<tr>
<td>STX</td>
<td>Start of text</td>
</tr>
<tr>
<td>ETX</td>
<td>End of text</td>
</tr>
<tr>
<td>EOT</td>
<td>End of transmission</td>
</tr>
<tr>
<td>ENQ</td>
<td>Enquiry</td>
</tr>
<tr>
<td>ACK</td>
<td>Acknowledge</td>
</tr>
<tr>
<td>BEL</td>
<td>Bell (beep)</td>
</tr>
<tr>
<td>BS</td>
<td>Backspace</td>
</tr>
<tr>
<td>HT</td>
<td>Horizontal tab</td>
</tr>
<tr>
<td>LF</td>
<td>Line feed, new line</td>
</tr>
<tr>
<td>VT</td>
<td>Vertical tab</td>
</tr>
<tr>
<td>FF</td>
<td>Form feed, new page</td>
</tr>
<tr>
<td>CR</td>
<td>Carriage return</td>
</tr>
<tr>
<td>SO</td>
<td>Shift out</td>
</tr>
<tr>
<td>SI</td>
<td>Shift in</td>
</tr>
<tr>
<td>DLE</td>
<td>Data link escape</td>
</tr>
<tr>
<td>DC1</td>
<td>Device control 1</td>
</tr>
<tr>
<td>DC2</td>
<td>Device control 2</td>
</tr>
<tr>
<td>DC3</td>
<td>Device control 3</td>
</tr>
<tr>
<td>DC4</td>
<td>Device control 4</td>
</tr>
<tr>
<td>NAK</td>
<td>Negative acknowledge</td>
</tr>
<tr>
<td>SYN</td>
<td>Synchronous Idle</td>
</tr>
<tr>
<td>ETB</td>
<td>End of transmission block</td>
</tr>
<tr>
<td>CAN</td>
<td>Cancel</td>
</tr>
<tr>
<td>EM</td>
<td>End of medium</td>
</tr>
<tr>
<td>SUB</td>
<td>Substitute</td>
</tr>
<tr>
<td>ESC</td>
<td>Escape</td>
</tr>
<tr>
<td>FS</td>
<td>File separator</td>
</tr>
<tr>
<td>GS</td>
<td>Group separator</td>
</tr>
<tr>
<td>RS</td>
<td>Record separator</td>
</tr>
<tr>
<td>US</td>
<td>Unit separator</td>
</tr>
<tr>
<td>DEL</td>
<td>Delete/Idle</td>
</tr>
</tbody>
</table>
1) The ASCII code for character ‘A’ is 65₁₀, ‘B’ is 66₁₀, ... and ‘a’ is 97₁₀, ‘b’ is 98₁₀, ... .
   a) What would be the 7-bit binary value used to represent ‘A’?

   b) What would be the 7-bit binary value used to represent ‘a’?

   c) How does an upper-case letter differ from its corresponding lower-case letter?

   d) *Even parity* prepends a 0 or 1 so as to make the total number of 1’s be even. What is the 8-bit ASCII value for
      ‘A’:

      ‘a’:

   e) What error(s) cannot be detected by even parity?
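
As a rough illustration of the even-parity rule in part d, the sketch below prepends a parity bit to a 7-bit ASCII code. The function name `with_even_parity` is hypothetical, chosen only for this example.

```python
def with_even_parity(ch: str) -> str:
    """Return 8 bits: an even-parity bit followed by the 7-bit ASCII code."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    bits7 = format(code, "07b")
    # The parity bit is 1 only when the 7 data bits hold an odd number of 1's,
    # so the total number of 1's in all 8 bits comes out even.
    parity = bits7.count("1") % 2
    return str(parity) + bits7


for ch in "Aa":
    print(ch, format(ord(ch), "07b"), with_even_parity(ch))
```

Flipping any single bit of the 8-bit result makes the count of 1's odd, which is how the receiver notices the error.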

2 a) For the 8-bit data 01001011₂ develop the Hamming codeword for one-bit error detection and correction:

   \[
   \begin{array}{cccccccccccc}
   12 & 11 & 10 & 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \\
   D_7 & D_6 & D_5 & D_4 & P_8 & D_3 & D_2 & D_1 & P_4 & D_0 & P_2 & P_1 \\
   0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 0 \\
   4+8 & 1+2+8 & 2+8 & 1+8 & 8 & 1+2+4 & 2+4 & 1+4 & 4 & 1+2 & 2 & 1 \\
   \end{array}
   \]

   Check bit P₁ looks at bit positions 1, 3, 5, 7, 9, and 11
   Check bit P₂ looks at bit positions 2, 3, 6, 7, 10, and 11
   Check bit P₄ looks at bit positions 4, 5, 6, 7, and 12
   Check bit P₈ looks at bit positions 8, 9, 10, 11, and 12

   b) If bit D₅ gets flipped (an error), then how would we be able to detect an error?

   c) If bit D₃ gets flipped (an error), then how would we be able to know which bit to correct?
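
The layout above can be sketched in Python, assuming even parity for every check bit. The helper names `hamming_encode`, `syndrome`, and `as_string` are illustrative only.

```python
CHECK_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = (12, 11, 10, 9, 7, 6, 5, 3)  # where D7 .. D0 are placed


def hamming_encode(data_bits: str) -> dict:
    """Map 8 data bits (D7 first) to a 12-bit codeword {position: bit}."""
    word = {pos: int(b) for pos, b in zip(DATA_POSITIONS, data_bits)}
    for p in CHECK_POSITIONS:
        # Check bit p covers every position whose binary number includes p.
        covered = [pos for pos in range(1, 13) if pos & p and pos != p]
        word[p] = sum(word[pos] for pos in covered) % 2  # even parity
    return word


def syndrome(word: dict) -> int:
    """Add up the check-bit positions whose group has an odd number of 1's."""
    return sum(p for p in CHECK_POSITIONS
               if sum(word[pos] for pos in range(1, 13) if pos & p) % 2)


def as_string(word: dict) -> str:
    return "".join(str(word[pos]) for pos in range(12, 0, -1))


word = hamming_encode("01001011")
print(as_string(word), syndrome(word))  # a clean codeword has syndrome 0
word[10] ^= 1                           # flip D5, which sits at position 10
print(syndrome(word))                   # failing groups 2 and 8 sum to 10
```

A nonzero syndrome both detects the error and names the position to flip back.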

2 d) For the 8-bit data 11001001₂ develop the Hamming codeword for one-bit error detection and correction:

   \[
   \begin{array}{cccccccccccc}
   12 & 11 & 10 & 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \\
   D_7 & D_6 & D_5 & D_4 & P_8 & D_3 & D_2 & D_1 & P_4 & D_0 & P_2 & P_1 \\
   1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
   4+8 & 1+2+8 & 2+8 & 1+8 & 8 & 1+2+4 & 2+4 & 1+4 & 4 & 1+2 & 2 & 1 \\
   \end{array}
   \]

3. The CRC (Cyclic Redundancy Check) is used to detect errors in long data transmissions/blocks that are subject to burst errors (several bits in sequence that are corrupted). The main idea is to append to the data bits a small amount of information (16 or 32 bits) to help detect errors. The data is retransmitted if an error is detected.

The basic calculation of the CRC is integer division.

\[
\frac{\text{dividend } D}{\text{divisor } G} = \text{quotient, with remainder } R
\qquad \text{e.g., } \frac{134}{5} = 26 \text{ remainder } 4
\]

a) What is the quotient of \( (D - R) / G \)?

b) What is the remainder of \( (D - R) / G \)?

c) What is the range of the remainders of \( D / G \)?

Assume the sender and receiver agree on \( G \) and the sender sends:

\[
\begin{array}{c|c}
D & R
\end{array}
\]

If the receiver performs the calculation \( (D - R)/G \), what would the remainder be if no transmission error occurred?

To simplify calculations all CRC calculations are done in modulo-2 arithmetic without carries in addition or borrows in subtraction. Therefore, both addition and subtraction are identical and equivalent to bitwise XOR (exclusive OR). For example,

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>XOR</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

\[
\begin{array}{rrr}
1100 & 1100 & 1100 \\
-\,1010 & +\,1010 & \oplus\,1010 \\
\hline
0110 & 0110 & 0110
\end{array}
\]
Let
\[ D = d\text{-bit data} \]
\[ R = n\text{-bit remainder} \]
\[ G = \text{degree } n \text{ polynomial generator, e.g., } G = x^5 + x^2 + 1 \text{ is 100101}_2 \]
\[ C = (d+n)\text{-bit codeword to be transmitted} \]

The goal is to generate \( C \) such that \( C \div G \) would have no remainder. To do this, we generate \( R \) as
\[ \frac{D \times 2^n}{G} = Q \oplus \frac{R}{G} \]
where \( Q \) is the quotient and \( R \) is the remainder.

The sender sends the codeword \( C = D \times 2^n \oplus R \). When the receiver receives \( C \), it divides by \( G \),
\[ \frac{C}{G} = \frac{D \times 2^n \oplus R}{G} = \frac{D \times 2^n}{G} \oplus \frac{R}{G} = Q \oplus \frac{R \oplus R}{G} \]

e) What is \( R \oplus R \)?

Let
\[ D = 10100101_2 \text{ (8-bit data)}, \]
\[ G = x^5 + x^2 + 1 \text{ is 100101}_2 \text{ (degree 5 polynomial)} \]

f) Determine the remainder of \( \frac{D \times 2^n}{G} \).

\[ 100101 \;\big)\; \overline{10100101\,00000} \]

Append \( n = 5 \) 0's to the data since we start with \( D \times 2^n \).

g) The codeword sent is the data appended with the remainder, so what codeword is sent by the sender in part f?

h) Divide the codeword by the generator $G = x^5 + x^2 + 1 \ (100101_2)$ to check for an error. Remainder should be zero if no errors.

i) Introduce some random error into the codeword and check for an error by dividing by the generator $G = x^5 + x^2 + 1 \ (100101_2)$
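
The modulo-2 long division used in parts f through i can be sketched as follows. This is a minimal illustration that works on bit strings; the function name `mod2_remainder` is hypothetical, and real CRC implementations typically use shift registers or lookup tables instead.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of modulo-2 long division, returned as len(generator)-1 bits."""
    n = len(generator) - 1
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - n):
        if work[i]:
            # A leading 1 means the generator "goes into" this position;
            # subtracting it is a bitwise XOR (no borrows).
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-n:])


D, G = "10100101", "100101"
n = len(G) - 1
R = mod2_remainder(D + "0" * n, G)   # divide D x 2^n by G
codeword = D + R                     # D x 2^n XOR R
print(R)                             # 01110
print(mod2_remainder(codeword, G))   # 00000 when no error occurred
```

Because the low n bits of \( D \times 2^n \) are all 0, XORing in \( R \) is the same as appending it, which is why `D + R` builds the codeword.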
Lecture 5

Characters - ASCII 7-bits $0_{10} - 127_{10}$

A-Z contiguous, 'A' $= 65_{10}$

Unicode: 16-bit, 'A' $= 65_{10}$

parity bit + 7-bit ASCII for 'A'

<table>
<thead>
<tr>
<th>0</th>
<th>1</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th>1</th>
</tr>
</thead>
<tbody>
<tr>
<td>parity</td>
<td>64</td>
<td>32</td>
<td>16</td>
<td>8</td>
<td>4</td>
<td>2</td>
<td>1</td>
</tr>
</tbody>
</table>

even parity - total # of 1's is even
odd parity - total # of 1's is odd

Hamming code 8-bit data $10101101_2$

even parity

<table>
<thead>
<tr>
<th>1</th>
<th>0</th>
<th>1</th>
<th>0</th>
<th>0</th>
<th>1</th>
<th>1</th>
<th>0</th>
<th>1</th>
<th>1</th>
<th>0</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>12</td>
<td>11</td>
<td>10</td>
<td>9</td>
<td>8</td>
<td>7</td>
<td>6</td>
<td>5</td>
<td>4</td>
<td>3</td>
<td>2</td>
<td>1</td>
</tr>
<tr>
<td>8+4</td>
<td>8+2+1</td>
<td>8+2</td>
<td>8+1</td>
<td>8</td>
<td>4+2+1</td>
<td>4+2</td>
<td>4+1</td>
<td>4</td>
<td>2+1</td>
<td>2</td>
<td>1</td>
</tr>
</tbody>
</table>

Hamming Code Words:

12 11 10 9 8 7 6 5 4 3 2 1

1 0 1 0 0 1 1 0 1 1 0 0

If check bits 4 and 2 fail: 4 + 2 = 6, so bit 6 is incorrect

8-bit data: 1 0 1 0 1 1 0 1
1) Complete the truth tables for the logical gates and draw the gates.

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>A AND B</th>
<th>A OR B</th>
<th>A XOR B</th>
<th>A NAND B</th>
<th>A NOR B</th>
<th>NOT A</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

2) Draw the logic circuit using ANDs, ORs, and NOT gates for $F = A'B'C + ABC + AB'C + BC$.

a) What is the complexity (sum of # inputs and # gates)?

b) How many gate delays for your circuit?
## Computer Organization

### Lecture 6

<table>
<thead>
<tr>
<th>Identity Name</th>
<th>AND Form</th>
<th>OR Form</th>
</tr>
</thead>
<tbody>
<tr>
<td>Identity Law</td>
<td>$1x = x$</td>
<td>$0 + x = x$</td>
</tr>
<tr>
<td>Null (or Dominance) Law</td>
<td>$0x = 0$</td>
<td>$1 + x = 1$</td>
</tr>
<tr>
<td>Idempotent Law</td>
<td>$xx = x$</td>
<td>$x + x = x$</td>
</tr>
<tr>
<td>Inverse Law</td>
<td>$x \bar{x} = 0$</td>
<td>$x + \bar{x} = 1$</td>
</tr>
<tr>
<td>Commutative Law</td>
<td>$xy = yx$</td>
<td>$x + y = y + x$</td>
</tr>
<tr>
<td>Associative Law</td>
<td>$(xy)z = x(yz)$</td>
<td>$(x+y)+z = x+(y+z)$</td>
</tr>
<tr>
<td>Distributive Law</td>
<td>$x + yz = (x+y)(x+z)$</td>
<td>$x(y+z) = xy + xz$</td>
</tr>
<tr>
<td>Absorption Law</td>
<td>$x(x+y) = x$</td>
<td>$x + xy = x$</td>
</tr>
<tr>
<td>DeMorgan's Law</td>
<td>$\overline{xy} = \bar{x} + \bar{y}$</td>
<td>$\overline{x+y} = \bar{x} \bar{y}$</td>
</tr>
<tr>
<td>Double Complement Law</td>
<td>$\bar{\bar{x}} = x$</td>
<td></td>
</tr>
</tbody>
</table>

### General Simplification Technique

1) Look for two products that differ only in one variable being negated
   \[ ... + \bar{x}\bar{y}z + ... + \bar{x}yz + ... \]

2) “Factor out” common terms (distr. law)
   \[ ... + \bar{x}z(\bar{y} + y) + ... \]

3) Apply inverse law
   \[ ... + \bar{x}z \cdot 1 + ... \]

4) Apply identity law
   \[ ... + \bar{x}z + ... \]
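
One way to sanity-check a simplification like this is to compare truth tables by brute force. The sketch below assumes two products that differ only in y and confirms that they collapse to a single product without y; the helper `equivalent` is a hypothetical name used only for this example.

```python
from itertools import product


def equivalent(f, g, num_vars: int) -> bool:
    """True when f and g agree on every row of the truth table."""
    return all(bool(f(*row)) == bool(g(*row))
               for row in product([False, True], repeat=num_vars))


def before(x, y, z):
    # x'y'z + x'yz : two products that differ only in y
    return (not x and not y and z) or (not x and y and z)


def after(x, y, z):
    # x'z : y has been eliminated
    return not x and z


print(equivalent(before, after, 3))
```

The same check works for any hand simplification with a handful of variables, since the table has only \( 2^n \) rows.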

3) Using Boolean Algebra simplify \( F = \overline{ABC} + \overline{AB}C + \overline{ABC} + BC \).

4) Draw the simplified logic circuit using ANDs, ORs, and NOT. What is the complexity (sum of # inputs and # gates)?
Lecture 6: Boolean Logic and Gates

Digital/Binary: Boolean T, F
Binary 1, 0

[Figure: voltage ranges that represent binary 0 and binary 1]

Boolean algebra to describe functions for logical circuit

Gates: OR, AND, NOT, XOR, NOR, NAND

[Figure: gate symbols and truth tables]

\[ (\overline{x} \cdot y) + (\overline{z} \cdot \overline{x} \cdot y) \]

### NOR

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>\( \overline{A+B} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

### NAND

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>\( \overline{A \cdot B} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

\[ F = \overline{A\overline{B}\overline{C}} + \overline{A}BC + ABC + AB\overline{C} \]

\text{"Sum-of-Products" SOP}

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>C</th>
<th>F</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>
\[ F = \overline{A} \overline{B} \overline{C} + \overline{A} \overline{B} C + \overline{A} B \overline{C} + A \overline{B} C + A B C \]

\# of gate delays = 2

circuit complexity = 5 + (3 \times 4) + 4 = 21

(\# gates + \# inputs)
1. A *decoder* takes an n-bit binary number (range of values 0<sub>10</sub> to 2<sup>n</sup>-1) and determines ("decodes") its value by *asserting* (i.e., outputting a 1 on) only the output corresponding to the binary number's value.

   Consider a 2-to-4 decoder:

   ![2-to-4 decoder diagram](image)

   - Input 10<sub>2</sub>
   - Decodes to 2<sub>10</sub>

   a) Connect input of AND-gates to generate the correct values.

b) Draw a 3-to-8 decoder circuit below:
2. A multiplexer (MUX) allows one of $2^n$ inputs to be switched to a single output based on an n-bit binary number (range of values $0_{10}$ to $2^n-1$) on the control lines.

Consider a 4-to-1 multiplexer:

- a) Connect input of AND-gates and decide on how to merge AND-gate outputs into a single Output.

- b) Draw an 8-to-1 multiplexer circuit below:
3. Sum the following binary (base 2) numbers

\[10011_2 + 10110_2\]

\[101101_2 + 110111_2\]


<table>
<thead>
<tr>
<th>\(x_0\)</th>
<th>\(y_0\)</th>
<th>sum \(s_0\)</th>
<th>carry-out \(c_1\)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

5. a) Complete the Full-Adder truth table for the sum \((s_i)\) and carry-out \((c_{i+1})\) functions.

<table>
<thead>
<tr>
<th>\(x_i\)</th>
<th>\(y_i\)</th>
<th>carry-in \(c_i\)</th>
<th>sum \(s_i\)</th>
<th>carry-out \(c_{i+1}\)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

b) Use k-maps to minimize the sum \((s_i)\) and carry-out \((c_{i+1})\) functions of the Full-Adder:

c) For the one-bit Full-Adder, how many gate delays are needed before the carry-out \((c_{i+1})\) wire is correct?

d) A 32-bit, ripple-adder is made up of a collection of single-bit Full-Adders connected together as shown below

e) How many gate delays are needed before \(c_{32}\) is correct?
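
A behavioural sketch of the ripple-carry idea, assuming the usual sum-of-products carry-out for a full adder. The names `full_adder` and `ripple_add` are illustrative, and the model says nothing about gate delays; it only shows how each stage waits on the carry from the stage to its right.

```python
def full_adder(x: int, y: int, c_in: int):
    """One-bit full adder: returns (sum bit, carry-out)."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out


def ripple_add(a: str, b: str):
    """Add two binary strings bit by bit, least-significant bit first."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0  # c_0
    sum_bits = []
    for x, y in zip(reversed(a), reversed(b)):
        s, carry = full_adder(int(x), int(y), carry)  # carry ripples leftward
        sum_bits.append(str(s))
    return "".join(reversed(sum_bits)), carry


print(ripple_add("10011", "10110"))
print(ripple_add("101101", "110111"))
```

Each loop iteration needs the carry produced by the previous one, which is the reason the last carry-out of a long ripple adder is the slowest signal to settle.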
6. To speed up the calculation of the carry-out \((C_{i+1})\) signals, consider constructing a 32-bit adder using two-bit adders as shown in:

[Figure: 32-bit adder built from two-bit adder blocks. Each block takes inputs \(x_i, y_i, x_{i-1}, y_{i-1}\) and carry-in \(C_{i-1}\), and produces \(\text{Sum}_i\), \(\text{Sum}_{i-1}\), and carry-out \(C_{i+1}\). Carries shown: \(C_{32}, C_{30}, \ldots, C_{i+1}, C_{i-1}, \ldots, C_2, C_0\). Outputs: \(\text{Sum}_{31}, \text{Sum}_{30}, \ldots, \text{Sum}_1, \text{Sum}_0\).]

a) If \(c_{i+1}\) is calculated directly from the inputs as 
\[
c_{i+1} = x_i y_i + x_i x_{i-1} y_{i-1} + x_i x_{i-1} c_{i-1} + x_i y_{i-1} c_{i-1} + y_i x_{i-1} y_{i-1} + y_i x_{i-1} c_{i-1} + y_i y_{i-1} c_{i-1},
\]
then how many gate delays would be needed to calculate the \(c_{i+1}\) signal in a two-bit adder?

b) What would be the total number of gate delays in a 32-bit adder before the \(c_{32}\) signal is generated correctly if two-bit adders were used?

7. What would be the total number of gate delays in a 32-bit adder before the \(c_{32}\) signal is generated correctly if three-bit adders were used (10 three-bit adders and a 2-bit adder)?

8. What would be the total number of gate delays in a 32-bit adder before the \(c_{32}\) signal is generated correctly if four-bit adders were used?
Lecture 7 Common Circuits for Computers

Decoder $n$-inputs (n-bit binary #)

2-to-4 decoder

Exactly one output is 1: the one corresponding to the binary # $X_1 X_0$
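
A decoder's behaviour can be modelled in a few lines. This is a sketch, assuming one AND gate per output with each input wired in either directly or inverted; the function name `decoder` is illustrative.

```python
def decoder(bits: str) -> list:
    """n-to-2^n decoder: outputs[k] is 1 only when the input equals k."""
    n = len(bits)
    inputs = [int(b) for b in bits]  # most-significant bit first
    outputs = []
    for k in range(2 ** n):
        pattern = [int(b) for b in format(k, f"0{n}b")]
        # AND gate for output k: an input is used as-is where the pattern
        # has a 1 and inverted where the pattern has a 0.
        and_gate = all(x == p for x, p in zip(inputs, pattern))
        outputs.append(int(and_gate))
    return outputs


print(decoder("10"))   # [0, 0, 1, 0]: only output 2 is asserted
print(decoder("101"))  # 3-to-8 decoder: only output 5 is asserted
```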
Multiplexer, MUX - switch one of the inputs to a single output based on a binary # on the control wires

4-to-1 MUX

[Figure: 4-to-1 MUX with data inputs I₀, I₁, I₂, I₃, control wires C₁ C₀, and a single output]
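
A multiplexer can be sketched the same way, assuming one AND gate per data input (enabled by its control pattern) and a single OR gate that merges the AND outputs. The function name `mux` is hypothetical.

```python
def mux(inputs: list, control: str) -> int:
    """2^n-to-1 MUX: route inputs[k] to the output, where k is the control #."""
    n = len(control)
    if len(inputs) != 2 ** n:
        raise ValueError("need 2^n data inputs for n control wires")
    ctrl = [int(b) for b in control]  # most-significant control bit first
    and_outputs = []
    for k, data in enumerate(inputs):
        pattern = [int(b) for b in format(k, f"0{n}b")]
        selected = all(c == p for c, p in zip(ctrl, pattern))
        and_outputs.append(int(selected) & data)  # AND gate for input k
    return int(any(and_outputs))                  # OR gate merges the ANDs


print(mux([0, 1, 1, 0], "01"))  # control 01 selects input 1
```

At most one AND gate can output a 1 at a time, so the OR gate simply passes the selected input through.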

Adder 32-bit #s
\[ X_{31} X_{30} \ldots X_0 \]
\[ + \; Y_{31} Y_{30} \ldots Y_0 \]
\[ S_{31} S_{30} \ldots S_0 \]

1-bit adder
half-adder (least-significant bit only)

<table>
<thead>
<tr>
<th>\( X_0 \)</th>
<th>\( Y_0 \)</th>
<th>Sum \( S_0 \)</th>
<th>Carry \( C_1 \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

[Figure: half-adder block with inputs \( X_0 \), \( Y_0 \) and outputs \( C_1 \), \( S_0 \)]
1. a) If \( R = 0 \) and \( S = 1 \), then what will be the output on \( Q \) and \( \overline{Q} \)?

b) Now, if \( S \) goes to a 0 value, what happens to the output on \( Q \) and \( \overline{Q} \)?

c) Complete the following timing diagram for the SR latch:

```
\[
S _______
\]

\[
R _______
\]

one gate delay

\[
Q _______
\]

\[
\overline{Q} _______
\]
```

2. Complete the following timing diagram for the Master-Slave D flip-flop:

```
\[
D \quad Q \quad Q_m \quad Q \quad Q_s \quad Q \quad \overline{Q}
\]

Clock ________

D ________

Q_m ________

Q=Q_s ________
```

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>\( \overline{A + B} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>
3. Complete the below diagram of a 4-bit register so that it is able to perform the following operations:
   • parallel read/output of all bits (just look at the Q values)
   • parallel write/input of all bits
   • logical shift left one bit position (value shifted out of most-significant bit is lost and a “0” is shifted into the least-significant bit)
   • circular shift right one bit position (value shifted out of least-significant bit is shifted into the most-significant bit)
   • arithmetic shift right (sign-extend the most-significant bit)

4. Suppose we have a register file with the following specifications:
   • 8 registers numbered from 0 to 7
   • each register has 16-bits
   • one write port
   • two read ports

   a) How many bits (“wires”) would be needed for each of the following?
      • data to be written for a write port
      • specifying the register number of a read port
      • specifying the register number of a write port
      • output read from a read port

   b) How many write enable wires would be needed for the whole register file?

   c) How many decoders would be needed in the implementation of the whole register file (like Figure 3.32)? Explain how you arrived at that number and specify the type of decoders (i.e., number of inputs and number of outputs for each decoder)

   d) How many multiplexers would be needed in the implementation of the whole register file (like Figure 3.32)? Explain how you arrived at that number and specify the type of multiplexers.
Lecture 8 - 1-bit SR-latch/memory

Decoder, MUX, adder

SR latch 1-bit memory NOR gate

\[
\begin{array}{c|c|c}
A & B & \overline{A + B} \\
0 & 0 & 1 \\
0 & 1 & 0 \\
1 & 0 & 0 \\
1 & 1 & 0 \\
\end{array}
\]

If R changes from 1 to 0, then keep remembering a 0.

R (Reset) to 0
S (Set) to 1
SR-latch remembers whether R or S had a "1" most recently.

Timing diagram

Time
SR latch  Characteristic Table

<table>
<thead>
<tr>
<th>S</th>
<th>R</th>
<th>Q_{n+1}</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>Q_n</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>don't do</td>
</tr>
</tbody>
</table>

D latch

When CLK=1, remember D's value; otherwise CLK=0 - ignore D but keep remembering

Flip-Flops: 1-bit memories that change state only on the falling or rising edge of the clock pulse (not based on gate delays).
Clock - wire with a regular changing signal

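[Figure: clock waveform with one clock cycle marked]

Master-Slave D Flip-Flop

[Figure: two D latches in series. The "master" D latch takes input D and CLK and outputs Q_m; the "slave" D latch takes Q_m and the negated CLK and outputs Q_s = Q.]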
The most basic 1-bit memory is the SR-latch, which consists of two cross-coupled NOR gates.

Recall the NOR gate truth table:

<table>
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>\( \overline{A + B} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

The "S" stands for Set to remember 1, and the R for Reset to remember 0. For example, if the input values of R = 0 and S = 1 are held constant, then the bottom NOR gate will start outputting 0. This 0 is fed as an input to the upper NOR with the R = 0, so both of the upper NOR gate's inputs are 0. Thus, the upper NOR gate will start outputting a 1 which is fed back to the bottom NOR as an input. Now, the bottom NOR has an input of two 1's, but it continues to output a 0 because that's what NORs do (see truth table).

If the S input's value is now changed back to 0 (both R = 0 and S = 0), then bottom NOR will continue to output 0 (i.e., \( \bar{Q} = 0 \)) since the top NOR is still outputting a 1. The top NOR has two 0 inputs, so it continues to output 1 (i.e., \( Q = 1 \)). This is a stable state where the SR-latch is remembering a 1.

A timing diagram for the above scenario would look like:

1. Initial state of R = 0 and S = 1
2. After a gate delay, the bottom NOR outputs a 0 which is fed to the upper NOR (the upper NOR has two 0s for inputs)
3. After a gate delay, the top NOR outputs a 1 which is fed to the bottom NOR (the bottom NOR has two 1s for inputs)
4. When S returns to 0, no change in outputs

The SR-latch is actually symmetric from top-to-bottom, so a similar argument can be made for the inputs of R = 1 and S = 0 causing a stable state of it remembering a 0.

The main problem with the SR-latch is that its output changes are determined by some number of gate delays after its inputs are changed. This makes it difficult for a collection of one-bit memories to work in a lock-step fashion to hold a multiple bit value. It also has the problem that inputs of R = 1 and S = 1 lead to an undefined state.
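
The gate-delay behaviour described above can be traced with a small simulation. This is a sketch that assumes every NOR gate takes exactly one time step and that the top NOR (inputs R and \( \bar{Q} \)) drives Q while the bottom NOR (inputs S and Q) drives \( \bar{Q} \); the function names are illustrative.

```python
def nor(a: int, b: int) -> int:
    return int(not (a or b))


def sr_latch_step(s: int, r: int, q: int, q_bar: int):
    """Advance one gate delay: both NOR gates see the previous outputs."""
    return nor(r, q_bar), nor(s, q)


def settle(s: int, r: int, q: int, q_bar: int, delays: int = 4) -> list:
    """Return the (Q, Q-bar) pairs seen over a few gate delays."""
    trace = [(q, q_bar)]
    for _ in range(delays):
        q, q_bar = sr_latch_step(s, r, q, q_bar)
        trace.append((q, q_bar))
    return trace


# Latch starts out remembering 0; hold R = 0 and S = 1.
print(settle(s=1, r=0, q=0, q_bar=1))
# Then return S to 0: the stored 1 is kept.
print(settle(s=0, r=0, q=1, q_bar=0))
```

The first trace passes through an intermediate step where both outputs are 0 before settling, which is the gate-delay dependence that the clocked designs are meant to hide.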
"Memory" Supplement for Section 3.6 of the textbook

We can solve these problems in a series of steps. First we’ll add a Clock input to prevent the memory from changing states if the Clock = 0. This leads us to a **clocked-SR latch**:

![Clocked-SR latch diagram]

When the Clock = 0, both AND gates will output 0s. This guarantees that the original SR-latch has inputs of R' = 0 and S' = 0, so it cannot change its memory state.

When the Clock = 1, the AND gates allow the R and S inputs to be passed through without change. This does not exactly solve either of the previously mentioned problems, but we can at least prevent the **clocked-SR latch** from changing by having the Clock = 0.

To solve the problem of inputs of R = 1 and S = 1 leading to an undefined state, we’ll consider a slightly different clocked latch called a **clocked D-latch**. The clocked D-latch remembers the value on the D input when the Clock = 1. If the Clock = 0, then the D input is ignored and the one-bit memory is not changed.

The clocked D-latch circuit is very similar to the clocked SR latch.

![Clocked D-latch diagram]

To fix the problem of the one-bit memory's output changing some number of gate delays after its inputs change, we can use two clocked D-latches to build a **Master-Slave D flip-flop**, which changes state only on the falling edge of the clock pulse. The timing diagram below shows how the internal clocked D-latches change within the master-slave flip-flop to achieve this.

When the Clock = 1, the master D-latch can change to D's value, but the slave D-latch cannot change since it receives the negated Clock signal of 0. When the Clock = 0, the master D-latch cannot change, but the slave D-latch can change to remember the value being remembered in the master D-latch (Q_m).
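
A behavioural sketch of this arrangement, ignoring gate delays inside each latch. The class names `DLatch` and `MasterSlaveDFlipFlop` are hypothetical and exist only for this illustration.

```python
class DLatch:
    """Clocked D-latch: follows D while the clock is 1, otherwise holds."""

    def __init__(self):
        self.q = 0

    def update(self, d: int, clock: int) -> int:
        if clock == 1:
            self.q = d
        return self.q


class MasterSlaveDFlipFlop:
    """Two D-latches in series; the slave receives the negated clock."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d: int, clock: int) -> int:
        q_m = self.master.update(d, clock)
        return self.slave.update(q_m, 1 - clock)  # Q = Q_s


ff = MasterSlaveDFlipFlop()
# (D, Clock) pairs: D = 1 is captured by the master while Clock = 1 and
# appears on Q only after the clock falls to 0.
for d, clock in [(1, 1), (0, 0), (0, 0), (0, 1), (0, 0)]:
    print(d, clock, ff.update(d, clock))
```

Changes on D while the Clock = 0 never reach Q, because the master is closed during that half of the cycle.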
This master-slave D flip-flop allows a collection of one-bit memories to work in a lock-step fashion to hold a multiple-bit value. The value currently contained in a D flip-flop can be read (from the slave’s Q) in the same clock pulse as a new value is written to it (into the master’s D input). The following block diagram shows how four D flip-flops can be used to load and read a 4-bit value in parallel. The “triangle” input of each flip-flop connected to the Load wire is the “Clock” input in the above diagram. This triangle input indicates that a flip-flop is being used instead of a latch.
Another more complex example would be a *shift-register* that allows the value stored in the register to also be shifted by one bit position several ways. The below diagram is a 4-bit shift-register that is able to perform the following operations:

- parallel read/output of all bits (just look at the Q values)
- (Control of 00_2) parallel write/input of all bits
- (Control of 01_2) logical shift left one bit position (value shifted out of most-significant bit is lost and a “0” is shifted into the least-significant bit)
- (Control of 10_2) circular shift right one bit position (value shifted out of least-significant bit is shifted into the most-significant bit)
- (Control of 11_2) arithmetic shift right (sign-extend the most-significant bit)

For each D-flip flop, the output of a MUX is used as the D-input. The Control wires control the MUXs to vary the value routed to the D inputs.
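
The four operations can be modelled as a next-state function, where the Control value plays the role of the MUX select lines. This is a sketch on bit strings (most-significant bit first); the function name `next_state` and the `parallel_in` parameter are illustrative.

```python
def next_state(q: str, control: str, parallel_in: str = "0000") -> str:
    """Value loaded into the register on the next clock edge."""
    if control == "00":   # parallel write of all bits
        return parallel_in
    if control == "01":   # logical shift left: MSB is lost, 0 enters the LSB
        return q[1:] + "0"
    if control == "10":   # circular shift right: LSB wraps around to the MSB
        return q[-1] + q[:-1]
    if control == "11":   # arithmetic shift right: MSB is copied (sign-extend)
        return q[0] + q[:-1]
    raise ValueError(f"unknown control value {control!r}")


q = "1010"
for control in ("01", "10", "11"):
    print(control, next_state(q, control))
```

Each branch corresponds to one input of the MUX in front of every flip-flop: the MUX for bit i chooses among the parallel input, the neighbour on the right, and the neighbour on the left.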

**Register File:**

Computers typically have a collection of registers, called a *register file*, to store data values currently being calculated with by the CPU. Individual registers are numbered (register 0 or R0, register 1 or R1, etc.) and assigned by the compiler or the assembly-language programmer to hold variable values.

Since most arithmetic operations need two operands (e.g., addition: Y + Z), most register files have at least two *read/output ports* and one *write/input port* to accommodate sending two values to the ALU and receiving one result.
To control each read port we need to be able to specify a register number for the register to be read. For example, 32 registers (R0 to R31) need 5 bits to represent the range of register numbers 00000₂ to 11111₂, i.e., 0₁₀ to 31₁₀. The width/number of bits read equals the number of bits per register.

To control a write port we need to be able to specify a register number for the register to be written, the data to be written (equal to the number of bits in a register), and a write-enable signal. The write-enable signal indicates whether we are writing a register; when it is 0, the write register # and data are ignored.

A register-file block-diagram assuming:
- 32 registers numbered R0 to R31 (00000₂ to 11111₂ in binary)
- 64-bit registers
- two read ports
- one write port
One-bit (i\textsuperscript{th} bit) Slice of a Register File Implementation Using D-latches:
The diagram below shows only a one-bit slice for the i\textsuperscript{th} bit of the register file to simplify the diagram.

The write-port register number (5-bit number from R0 to R31) is input to a decoder to "wake-up" the corresponding register to be written by sending a 1 into the clock signal for register to be written. The decoder sends 0s to all the other registers' clock signals. Notice that the selected register only receives the 1 from the decoder if the write-enable signal is also a 1.

When a register is read at a read-port, the whole register is read in parallel (i.e., all 64 bits at the same time). For example, 64-bit registers require 64 data wires of output. Each output wire is fed by a multiplexer which uses the register number to be read as its Control to select the bit from the appropriate register. In this example 64 MUXs, each with 32 inputs (one for each register), are needed per read port. Since we have two read ports, a total of 128 MUXs would be needed.
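
The counting argument above can be written out as a small calculation. This sketch assumes the design described here (one decoder per write port, one MUX per bit per read port); the function name `register_file_wiring` and the dictionary keys are hypothetical.

```python
def register_file_wiring(num_registers: int, bits_per_register: int,
                         read_ports: int = 2, write_ports: int = 1) -> dict:
    """Count wires, decoders, and MUXs for a MUX-based register file."""
    # Bits needed to name registers 0 .. num_registers - 1.
    reg_number_bits = (num_registers - 1).bit_length()
    return {
        "register_number_bits": reg_number_bits,
        "data_bits_per_port": bits_per_register,
        "write_enable_wires": write_ports,
        "decoders": write_ports,
        "decoder_type": f"{reg_number_bits}-to-{num_registers}",
        "muxes": read_ports * bits_per_register,
        "mux_type": f"{num_registers}-to-1",
    }


for name, value in register_file_wiring(32, 64).items():
    print(name, value)
```

Calling the function with other sizes shows how the MUX count grows with the register width while the decoder size grows with the number of registers.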
[Figure: one-bit slice (the i-th bit of 64 bits) of the register file. Labels: Reg. # for Read Port 0 (5 bits); Reg. # for Write Port (5 bits); 5-to-32 Decoder; D flip-flops (D, Clock, Q) for R0, R1, ..., R31; 32-to-1 MUX with inputs 0, 1, ..., 31; Data for Write Port (64 bits); Write Enable (1 bit)]

Note: To write a D flip-flop, we want its reg. # to be specified and the "Write Enable" to be asserted.

Note: 64 MUXs are needed per Read Port, one per bit.

To help clarify the above discussion, consider a complete (not just a one-bit slice) register file that has:
- 4 registers (numbered R0 to R3, i.e., register numbers 00₂ to 11₂)
- 2-bits per register
- one write-port
- two read-ports

Figure 3.32 on page 143 is similar to the above picture, except it is for a 4 word x 3 bit memory. Memories, unlike register files, typically have only a single combined read/write port, so only a read or a write can be performed at one time. That’s why Figure 3.32 has only one set of address lines (S₀ and S₁) that act like the register # in the above diagram. Notice that Figure 3.32 shows the AND gates and OR gates needed to implement the decoder (along the bottom) and the MUXs (along the right and above the D flip-flops).
Implementation of Large Memory Chips:

The above register-file design does not scale well for large (realistic) memories for several reasons:
- the number of gates in the address decoder (and MUXs) grows exponentially with the number of bits in the address.
- lots of wires into/out of the memory chip for address, data, and control

These problems are solved by
- using square-array of bits and decoding the address in two parts (row then column number)
- eliminating MUXs by using tri-state buffers
- single-port RAM memory - data wires shared for reading and writing, and one address specified

Consider a larger memory chip with 4M words x 4-bit words, i.e., a 4M x 4 memory chip. This chip would have 22-bit addresses since $4M = 2^{22}$.

Conceptually (logically/externally) we view a 4M x 4-bit memory as pictured below with:
- each memory word made up of four bits: $b_3, b_2, b_1, b_0$

<table>
<thead>
<tr>
<th>Decimal Address</th>
<th>$b_3$</th>
<th>$b_2$</th>
<th>$b_1$</th>
<th>$b_0$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>2</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>...</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>$2^{22} - 1$</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

When we want to write a word, we supply a 22-bit address and the corresponding new 4-bit word for that row. If we actually implemented this memory using a register-file design, we'd need a 22-to-$2^{22}$ decoder for the write-port. If we think about how a decoder is implemented, each decoder output is fed by an AND gate looking for its corresponding address. Thus, we'd need $2^{22}$ (or 4 M) 22-input AND gates. Since 22-input AND gates exceed the limit of 9-inputs or less for an AND, we would need to build each 22-input AND gate using 3 smaller ANDs as:
A total of $3 \times 2^{22}$ AND gates are needed to implement just the write-port decoder.

As in Figure 3.32, the read port could use the same decoder as the write port. Still, each bit in the memory requires a 2-input AND gate. Additionally, each of the four output wires is fed by a really BIG $2^{22}$-input OR gate. Because of the 9-input limit, each $2^{22}$-input OR must be implemented by multiple levels of smaller OR gates. The number of actual OR gates needed to implement each $2^{22}$-input OR would be about:

$$\left[ \frac{2^{22}}{9} \right] + \left[ \frac{2^{22}}{9^2} \right] + \left[ \frac{2^{22}}{9^3} \right] + \ldots + 1 = 524,289$$
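
As a rough check of this estimate, the sketch below counts OR gates level by level, rounding up at every level. The function name is hypothetical, and the exact total depends on how partially filled gates are counted, which is why the text's figure is stated as "about"; this particular count lands within a few gates of it.

```python
def or_tree_gates(n_inputs, max_fan_in=9):
    """Count the gates in a multi-level OR tree built from gates with at most max_fan_in inputs."""
    total = 0
    signals = n_inputs
    while signals > 1:
        signals = -(-signals // max_fan_in)  # gates needed on this level (ceiling division)
        total += signals                     # each gate's output feeds the next level
    return total


print(or_tree_gates(2 ** 22))  # 524291
```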

The total number of gates for the read port would be $4 \times 2^{22}$ ANDs + $4 \times 524,289$ ORs.

Assuming D-latches to store each bit of the memory, the total number of gates needed for storing all $2^{22} \times 4$ bits of memory is $2^{22} \times 4 \times 4 = 2^{26}$ gates.

Overall, the number of gates in the whole memory chip is about $(3 \times 2^{22}) + (4 \times 2^{22} + 4 \times 524,289) + (2^{26})$.

Since the D-latch gates are the only ones that are used for actual storage, the gates associated with the read and write ports can be considered overhead. The percentage of gates used for actual storage (the D-latch gates) of bits is:

$$\frac{2^{26}}{(3 \times 2^{22}) + (4 \times 2^{22} + 4 \times 524,289) + (2^{26})} \times 100 = 68.1\%$$

In practice, memory chips do not use SR-latch-based 1-bit memories, called static memory, because each bit of storage requires multiple gates (e.g., 4 gates per bit for a D-latch). To increase the amount of memory on a chip, dynamic memory is used, which requires only one gate per bit of storage. The one gate used in one bit of dynamic memory acts like a capacitor. You might recall from Physics that a capacitor is a device for storing a charge across two nearby "plates." Electrons are forced from one plate to the other plate. Because the plates are insulated from each other, the charge is stored. The magnitude of the charge difference between the plates determines whether a "0" is being stored (say a voltage difference of 0 to 1 volts) or a "1" is being stored (say a voltage difference of 2 to 5 volts) in the memory.

Dynamic memory (the "D" in DRAM) can store about four times more bits on a chip than static memory, but it also has some problems:

- reading the value being stored is relatively slow since the voltage difference must be sensed to determine the binary value being stored.
- chips are made from semiconductor material, so a dynamic-memory gate storing a "1" with a high voltage difference (say 5 volts) will have electrons leak across the "plates" and will eventually be storing a "0" (voltage difference of 0). To solve this problem, dynamic memory chips must be refreshed, which means that periodically the whole memory must be read and rewritten to boost the "1" values back up to a high voltage difference. Usually, memory refreshing takes only a fraction of a percent of the memory’s overall operating time.
"Memory" Supplement for Section 3.6 of the textbook

If we recalculate the percentage of gates used for actual storage assuming dynamic memory (one gate per bit), we get:

\[
\frac{2^{24}}{(3 \times 2^{22} + 4 \times 2^{22} + 4 \times 524,289) + (2^{24})} \times 100 = 34.8\% 
\]

Thus, 65.2% of the gates on the chip are being "wasted" in overhead for decoding, MUXs, etc.
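
The two register-file percentages (static and dynamic storage) can be reproduced with the sketch below. The constant and function names are hypothetical; the gate counts are the ones quoted in the text, including its approximate figure of 524,289 OR gates per output bit.

```python
ADDRESS_BITS = 22               # 4M words
WORD_BITS = 4                   # 4-bit words
OR_GATES_PER_OUTPUT = 524_289   # approximate figure quoted in the text


def storage_percentage(gates_per_bit):
    """Percent of all gates that hold data in the register-file style design."""
    words = 2 ** ADDRESS_BITS
    decoder_ands = 3 * words                       # three ANDs per decoder output
    read_ands = WORD_BITS * words                  # one 2-input AND per stored bit
    read_ors = WORD_BITS * OR_GATES_PER_OUTPUT     # one big OR tree per output bit
    storage = words * WORD_BITS * gates_per_bit
    return 100 * storage / (decoder_ands + read_ands + read_ors + storage)


print(f"{storage_percentage(4):.1f}%")  # 68.1% with D-latches (4 gates per bit)
print(f"{storage_percentage(1):.1f}%")  # 34.8% with dynamic memory (1 gate per bit)
```

Cutting the storage cost per bit leaves the decoder and read-port overhead unchanged, which is why the storage share drops.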

To help us see how the 4M x 4-bit memory gets mapped to the 2048 x 2048 x 4 memory array on the next page, consider splitting memory into 2048-word blocks as shown below. Notice that the most-significant 11 bits of the binary address correspond to the row number, and the least-significant 11 bits of the address correspond to the offset within the row.

Each bit of a word is split into a separate 2048 x 2048 square memory "array". Each of these square memory arrays is $2048 \times 2048 = 2^{11} \times 2^{11} = 2^{22} = 4M$ bits. The 22-bit address of the 4M memory is split into two 11-bit parts. The upper 11 bits are first used to activate the correct row within the square memory arrays. Of the 2048-bit row that is read from each memory array, we are interested in only one bit. The lower 11 bits of the address specify the location of the desired bit within the 2048-bit row.
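
The address split described above amounts to a shift and a mask. The following is a minimal sketch with hypothetical names, assuming the 11-bit/11-bit split given in the text:

```python
ROW_BITS = 11
COLUMN_BITS = 11


def split_address(address):
    """Split a 22-bit word address into (row, column) for a 2048 x 2048 array."""
    if not 0 <= address < 2 ** (ROW_BITS + COLUMN_BITS):
        raise ValueError("address must fit in 22 bits")
    row = address >> COLUMN_BITS                  # upper 11 bits select the row
    column = address & ((1 << COLUMN_BITS) - 1)   # lower 11 bits select the bit within the row
    return row, column


print(split_address(0))            # (0, 0)
print(split_address(2048))         # (1, 0)
print(split_address(2 ** 22 - 1))  # (2047, 2047)
```

Consecutive addresses share a row until the lower 11 bits wrap around, so a block of contiguous words usually sits in one row.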

Note: This is a tri-state (three-state) buffer. It acts as a switch. When the "Control" is a 1, the "In" signal is passed to the "Out" wire. When the "Control" is a 0, the "In" is disconnected from the "Out" wire.

For this “square-memory” design of memory, we can recalculate the percentage of gates used for actual storage (the dynamic-memory “capacitor” gates) of bits. We now have two smaller 11-to-$2^{11}$ decoders. Each decoder output is fed by an AND gate looking for its corresponding address. Thus, each decoder needs $2^{11}$ (or 2 K) 11-input AND gates. Since 11-input AND gates exceed the 9-input limit on an AND gate, we would need to build each 11-input AND gate from 2 smaller ANDs (for example, a 9-input AND whose output feeds a second AND along with the remaining 2 address inputs).

A total of $2 \times 2 \times 2^{11} = 2^{13}$ AND gates are needed to implement both the row and column decoders.

No MUXs are needed in the square-memory design, but we do need $4 \times 2^{11} = 2^{13}$ tri-state buffers.

Assuming dynamic memory to store each bit of the memory, the total number of gates needed for storing all $2^{22} \times 4$ bits of memory is $2^{22} \times 4 = 2^{24}$ gates.

Overall, the number of gates in the whole memory chip is about $2^{13} + 2^{13} + 2^{24}$.

For the square-memory design, the percentage of gates used for actual storage of bits is:

\[ \frac{2^{24}}{2^{13} + 2^{13} + 2^{24}} \times 100 = 99.9\% \]

Thus, only about 0.1% of the memory chip's capacity is wasted on overhead for decoding, tri-state buffers, etc.
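
The square-memory count can be reproduced with the sketch below. The function name and parameters are hypothetical; it assumes an even number of address bits split equally between the row and column decoders, one tri-state buffer per column per data bit, and one gate per stored bit, as in the text.

```python
def square_memory_storage_percentage(address_bits=22, word_bits=4, max_fan_in=9):
    """Percent of gates that hold data in the square-memory design (dynamic memory)."""
    half = address_bits // 2                                # bits per decoder (row or column)
    ands_per_output = -(-(half - 1) // (max_fan_in - 1))    # 2 for an 11-input AND
    decoder_gates = 2 * (2 ** half) * ands_per_output       # row decoder + column decoder
    buffers = word_bits * (2 ** half)                       # tri-state buffers
    storage = (2 ** address_bits) * word_bits               # 1 gate per bit
    return 100 * storage / (decoder_gates + buffers + storage)


print(f"{square_memory_storage_percentage():.1f}%")  # 99.9%
```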

The square-memory design also offers a couple of other important benefits:

- when refreshing the memory, whole rows can be read and rewritten at a time so refreshing is relatively quick
- since whole rows are read from the slow dynamic memory at a time, a block of memory from contiguous memory locations can be read about as fast as a single memory access. Thus, many real DRAM memory chips provide a fast-page mode or burst feature that allows accessing a block of memory efficiently.

Implementing Large Memory with Smaller Chips

Consider, for example, implementing a $4M \times 32$-bit memory with $256K \times 1$-bit chips. The $256K \times 1$ chips are implemented as square arrays of $512 \times 512$.

We will use a two-dimensional array of the $256K \times 1$ chips to implement the larger memory.

The number of chips per row would be $32$ bits / $1$ bit = $32$ chips.
The number of chips per column would be $4M / 256K = 2^{22} / 2^{18} = 2^4 = 16$ chips per column.

The 22-bit address of the $4M \times 32$ bit memory would be split up as follows:

22-bit Address

<table>
<thead>
<tr>
<th>chip row</th>
<th>row # within chip</th>
<th>column # within chip</th>
</tr>
</thead>
<tbody>
<tr>
<td>4 bits</td>
<td colspan="2">18 bits</td>
</tr>
</tbody>
</table>

[Figure: the 4-bit chip-row field feeds a 4-to-16 decoder whose outputs 0, 1, ..., 15 drive the CS inputs of Row 0, Row 1, ..., Row 15 of chips. Each row is made up of 512 x 512 chips, and the chips' Out lines supply data bits $b_{31}, b_{30}, \ldots, b_0$.]

"CS" means Chip Select

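
The chip counts and the address split for this example can be sketched as follows. All names are hypothetical. The source gives the 18 low-order bits as a single field; dividing them 9 + 9 between row and column is an assumption based on the $512 \times 512$ chip layout ($512 = 2^9$).

```python
TOTAL_WORDS = 4 * 2 ** 20      # 4M words in the large memory
WORD_BITS = 32                 # bits per word in the large memory
CHIP_WORDS = 256 * 2 ** 10     # 256K one-bit words per chip
CHIP_DATA_BITS = 1             # each chip supplies one bit
CHIP_SIDE_BITS = 9             # assumption: 512 x 512 array, 512 = 2**9

chips_per_row = WORD_BITS // CHIP_DATA_BITS   # 32 chips side by side form one word
chip_rows = TOTAL_WORDS // CHIP_WORDS         # 16 rows of chips


def split_address(address):
    """Split a 22-bit address into (chip row, row within chip, column within chip)."""
    mask = (1 << CHIP_SIDE_BITS) - 1
    column_in_chip = address & mask
    row_in_chip = (address >> CHIP_SIDE_BITS) & mask
    chip_row = address >> (2 * CHIP_SIDE_BITS)    # top 4 bits drive the 4-to-16 decoder
    return chip_row, row_in_chip, column_in_chip


print(chips_per_row, chip_rows, chips_per_row * chip_rows)  # 32 16 512
print(split_address(2 ** 22 - 1))                           # (15, 511, 511)
```
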
1) Summarize in your own words why a “square-memory” implementation of main memory is used instead of a “register-file” implementation.

2) Why do RAM memory chips use dynamic memory (capacitor based) instead of static memory (SR-latch based) even though dynamic memory has the drawback that it needs to be refreshed and is slower?

3) To illustrate how poorly a register-file design scales, consider implementing a 1 G x 8 (1 G registers, each with 8 bits) register file with one write-port and one read-port.  
   a) How many and what type of decoder(s) would be needed?
   
b) How many total gates (assume 9-input limit on AND & OR gates) would be needed to implement this (these) decoder(s)?
   
c) How many and what type of MUX(s) would be needed?
   
d) How many total gates (assume 9-input limit on AND & OR gates) would be needed to implement this (these) MUX(s)?
   
e) Assuming D flip-flops to store each bit (4 gates/flip-flop), what % of the total gates is used to implement the D flip-flops? The formula for this would be:

\[
\frac{\text{Total number of gates for the D flip-flops}}{\text{Total number of gates for D flip-flops + Total for the MUX(s) + Total for the Decoder(s)}} \times 100
\]
4) Redo the previous question using the 1 G x 8 square-memory implementation similar to the Memory supplement. Assume a single shared read/write port and assume dynamic memory is used to store each bit (1 gate/bit) for part (e).

a) How many and what type of decoder(s) would be needed?

b) How many total gates (assume 9-input limit on AND & OR gates) would be needed to implement this (these) decoder(s)?

c) How many and what type of MUX(s) would be needed?

d) How many total gates (assume 9-input limit on AND & OR gates) would be needed to implement this (these) MUX(s)?

e) Assuming dynamic memory is used to store each bit (1 gate/bit), what % of the total gates is used to implement the dynamic memory? The formula for this would be:

\[
\frac{\text{Total number of gates for the dynamic memory}}{\text{Total number of gates for dynamic memory} + \text{Total for the MUX(s)} + \text{Total for the Decoder(s)}} \times 100
\]
Lecture 9: Reg. File vs. square-memory RAM

Reg. File design does not scale well to RAM (main) memory

[Handwritten diagram: register-file design of a 4M words × 4-bit RAM ($2^{22} = 2^2 \times 2^{20}$). Labels: Data; 22-bit addr; 22-input ANDs in the decoder (9-input limit, so 3 ANDs each); a $2^{22}$-to-1 MUX (ANDs feeding an OR) for each bit of the 4-bit read port, MSB to LSB.]

"Square-Memory" RAM - Single port (only read or write a single value)

[Handwritten diagram: 4M x 4-bit square memory. The 22-bit addr is split into two 11-bit parts that feed an 11-to-$2^{11}$ row decoder and an 11-to-$2^{11}$ column decoder. There is one $2^{11} \times 2^{11}$-bit array per data bit, MSb ($b_3$) to LSB. Tri-state buffers (Control: 0 = L, 1 = H) connect the selected bit of each array to its wire of the 4-bit data output.]


Lecture 9 (2)

Static Memory: D-latch, 4 gates/bit

Synchronous RAM

Dynamic Memory: 1 gate/bit, capacitor based (value stored as a voltage difference $V$)

Adv.
+ 4 times more storage

Disadv.
- periodically read and rewrite whole memory ("refresh" memory), $\ll 1\%$ of the time
- slower R/W: must sense whether a 1 or 0 is stored

Test 1 will be Thursday Sept. 28, in class. It will be closed book and notes, except for one 8.5" x 11" sheet of paper (front and back) with notes. Review topics for Test 1 are:

Chapter 1.
Basic structure and functional units of a computer
Functional units: Input, CPU, Memory, Output, System bus
CPU components: control unit, ALU, regs., internal CPU interconnection
History: Computer Generations: 1st (vacuum tubes), 2nd (transistors, early operating systems, and high-level programming languages), 3rd (small and medium scale integrated circuits), 4th (large and very large scale integrated circuits and microprocessors)
Moore's law, von Neumann model, Instruction/Machine cycle
Programming Languages: High-level language, Assembly Language, and Machine Language
Computer Level Hierarchy

Chapter 2.
Unsigned binary numbers
Conversion between base 10, base 2, and base 16
Signed number representation: sign bit and magnitude, one's complement, two's complement
Addition and subtraction of signed and unsigned numbers
Overflow in integer arithmetic
Booth's Algorithm for integer multiplication
Floating-point Representation: 32-bit, 64-bit, and 128-bit IEEE 754 standard and special values
Addition and multiplication of floating-point numbers
Character Representations: Binary-Coded Decimal (BCD), EBCDIC, ASCII, Unicode
Error Detection and Correction: Hamming Code and CRC

Chapter 3. (SKIP SUBSECTION 3.6.6)
Truth tables for the gates
Boolean algebra notations: sum-of-products
Combinational circuit design (no memory): 1) determine truth table for function, 2) using Boolean identities or K-maps to get minimized sum-of-products function, 3) draw implementation of minimized function (using gates)
Common combinational circuits: decoder, multiplexer (MUX), 1-bit adders (half and full), ripple-carry adder, faster (2-bit) carry-lookahead adders
Number of gate delays for a circuit.
Complexity number of a circuit (# of gates + # inputs to those gates)
Sequential Circuits (1-bit memory): SR latch - know how it remembers (two stable states, etc.), know how it changes states;
Gated/Clocked D latches. Master-slave D Flip Flop; their characteristic tables
Timing diagrams for latches and Flip Flops
Register: shifting & rotating
Register file - design and usage
Square-memory implementation of large memories
Computer Organization Test 1

Question 1. (15 points)
a) Convert $450_{10}$ to a binary (base 2) value.

b) Convert $450_{10}$ to a hexadecimal (base 16) value.

c) Convert $-450_{10}$ to a two's complement value.

d) Perform the following arithmetic operations:
\[
\begin{array}{r}
1100110_2 \\
+\ 101111_2 \\
\hline
\end{array}
\qquad
\begin{array}{r}
1100110_2 \\
-\ 101111_2 \\
\hline
\end{array}
\qquad
\begin{array}{r}
\text{976C3}_{16} \\
+\ \text{49EA8}_{16} \\
\hline
\end{array}
\qquad
\begin{array}{r}
\text{976C3}_{16} \\
-\ \text{49EA8}_{16} \\
\hline
\end{array}
\]

Question 2. (8 points)
a) Convert the value $+108.5625_{10}$ to its binary representation.

\[
\begin{array}{ccccccccccc}
64 & 32 & 16 & 8 & 4 & 2 & 1 & .5 & .25 & .125 & .0625 \\
\hline
\text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }
\end{array}
\]

b) Normalize the above value so that the most significant 1 is immediately to the left of the radix point. Include the corresponding exponent value to indicate the motion of the radix point.

\[
1.\boxed{\text{ }} \times 2^n
\]

c) Write the corresponding normalized 32-bit IEEE 754 floating point representation for $+108.5625_{10}$.

<table>
<thead>
<tr>
<th>Sign bit</th>
<th>8-bit (bias 127) Exponent</th>
<th>23-bit Mantissa</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>(for normalized values, leading 1 not stored)</td>
</tr>
</tbody>
</table>

Question 3. (7 points) Consider adding the values $1.101_2 \times 2^{40}$ and $1.011_2 \times 2^{35}$.

a) How many places does the second number's mantissa get shifted before the addition?

b) After we add these two numbers and store the results back into a 32-bit IEEE 754 variable, what would be the result stored as a 32-bit IEEE 754 variable?

<table>
<thead>
<tr>
<th>Sign bit</th>
<th>8-bit (bias 127) Exponent</th>
<th>23-bit Mantissa</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>(for normalized values, leading 1 not stored)</td>
</tr>
</tbody>
</table>
Fall 2016

Question 4. (8 points) Determine the Hamming codeword if the 8 bits of data ($D_7$ to $D_0$) are 0110 1100, i.e., fill in the data bits and determine the values of the four even-parity bits ($P_8$, $P_4$, $P_2$, and $P_1$) to allow for one-bit error detection and correction.

<table>
<thead>
<tr>
<th>12</th>
<th>11</th>
<th>10</th>
<th>9</th>
<th>8</th>
<th>7</th>
<th>6</th>
<th>5</th>
<th>4</th>
<th>3</th>
<th>2</th>
<th>1</th>
</tr>
</thead>
<tbody>
<tr>
<td>$D_7$</td>
<td>$D_6$</td>
<td>$D_5$</td>
<td>$D_4$</td>
<td>$P_8$</td>
<td>$D_3$</td>
<td>$D_2$</td>
<td>$D_1$</td>
<td>$P_4$</td>
<td>$D_0$</td>
<td>$P_2$</td>
<td>$P_1$</td>
</tr>
<tr>
<td>4+8</td>
<td>1+2+8</td>
<td>2+8</td>
<td>1+8</td>
<td>8</td>
<td>1+2+4</td>
<td>2+4</td>
<td>1+4</td>
<td>4</td>
<td>1+2</td>
<td>2</td>
<td>1</td>
</tr>
</tbody>
</table>

Question 5. (12 points) Consider the 12-bit data 10111001110012 and generator polynomial $G = x^5 + x^2 + 1$ ($100101_2$). Using the Cyclic Redundancy Check (CRC) method:

a) what code word (data and remainder) would be sent to the receiver?

```
1 0 0 1 0 1 | 1 0 1 1 1 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0
```

b) Upon receiving a codeword, describe how the receiving computer checks for an error in transmission?

Question 6. (20 points) For the Boolean function represented by the unsimplified sum-of-products (SOP) Boolean expression $F = \overline{ABD} + \overline{ABCD} + \overline{ABCD} + ABCD + \overline{ABCD} + \overline{ABCD}$

a) Using a K-map or the identities of Boolean algebra simplify this function $F$ as much as possible.

<table>
<thead>
<tr>
<th>Identity Name</th>
<th>AND Form</th>
<th>OR Form</th>
</tr>
</thead>
<tbody>
<tr>
<td>Identity Law</td>
<td>$1x = x$</td>
<td>$0 + x = x$</td>
</tr>
<tr>
<td>Null (or Dominance) Law</td>
<td>$0x = 0$</td>
<td>$1 + x = 1$</td>
</tr>
<tr>
<td>Idempotent Law</td>
<td>$xx = x$</td>
<td>$x + x = x$</td>
</tr>
<tr>
<td>Inverse Law</td>
<td>$x\overline{x} = 0$</td>
<td>$x + \overline{x} = 1$</td>
</tr>
<tr>
<td>Commutative Law</td>
<td>$xy = yx$</td>
<td>$x + y = y + x$</td>
</tr>
<tr>
<td>Associative Law</td>
<td>$(xy)z = x(yz)$</td>
<td>$(x + y) + z = x + (y + z)$</td>
</tr>
<tr>
<td>Distributive Law</td>
<td>$x + yz = (x + y)(x + z)$</td>
<td>$x(y + z) = xy + xz$</td>
</tr>
<tr>
<td>Absorption Law</td>
<td>$x(x + y) = x$</td>
<td>$x + xy = x$</td>
</tr>
<tr>
<td>DeMorgan's Law</td>
<td>$\overline{(xy)} = \overline{x} + \overline{y}$</td>
<td>$\overline{(x + y)} = \overline{x}\,\overline{y}$</td>
</tr>
<tr>
<td>Double Complement Law</td>
<td>$\overline{\overline{x}} = x$</td>
<td>$\overline{\overline{x}} = x$</td>
</tr>
</tbody>
</table>

b) Draw the simplified circuit for your answer in part (a) using AND, OR, and NOT gates. (NOTE: If you were unable to simplify the circuit, you may alternatively draw the unsimplified circuit instead.)

c) Ignoring NOT gates, how many gate delays are in your above circuit?

d) What is the complexity (# of non-NOT gates + # of inputs to those gates) of your circuit?

Question 7. (5 points) Why do RAM memory chips use dynamic memory (capacitor based) instead of static memory (SR-latch based) even though dynamic memory has the drawbacks of needing to be refreshed and being slower?

Question 8. (10 points)

a) If \( R = 1 \) and \( S = 0 \) for a while (> 2 gate delays), then what will be the output on \( Q \) and \( \overline{Q} \)?

\[
\begin{array}{c|c|c}
A & B & A \text{ NOR } B \\
0 & 0 & 1 \\
0 & 1 & 0 \\
1 & 0 & 0 \\
1 & 1 & 0 \\
\end{array}
\]

b) Now, if \( R \) goes to a 0 value while keeping \( S = 0 \), explain how the SR-latch continues to remember.

Question 9. Recall that the sum-of-products (SOP) Boolean formulas for the carry-out \( (c_{out}) \) of an \( n \)-bit adder were:

1-bit adder: \( c_{i+1} = x_i y_i + x_i c_i + y_i c_i \)

2-bit adder: \( c_{i+2} = x_{i+1} y_{i+1} + x_{i+1} (x_i y_i + x_i c_i + y_i c_i) + y_{i+1} (x_i y_i + x_i c_i + y_i c_i) \)
\[
= x_{i+1} y_{i+1} + x_{i+1} x_i y_i + x_{i+1} x_i c_i + x_{i+1} y_i c_i + y_{i+1} x_i y_i + y_{i+1} x_i c_i + y_{i+1} y_i c_i
\]

3-bit adder: \( c_{i+3} = x_{i+2} y_{i+2} + x_{i+2} (7 \text{ product terms of 2-bit adder}) + y_{i+2} (7 \text{ product terms of 2-bit adder}) \)
\[
= x_{i+2} y_{i+2} + x_{i+2} x_{i+1} y_{i+1} + \ldots \quad (12 \text{ product terms omitted}) + y_{i+2} y_{i+1} y_i c_i
\]

The table below summarizes the gate delays for different types of adders **assuming a 9-input limit into any gate**.

<table>
<thead>
<tr>
<th>Type of Adder</th>
<th># of product terms in SOP expression to be OR’ed</th>
<th>Most # of inputs in any of the product terms</th>
<th># gate delays due to product/AND terms</th>
<th># gate delays due to sum/OR</th>
<th>Gate Delay per Adder</th>
<th>Gate Delays for 32-bit ripple adder using adders of this type</th>
</tr>
</thead>
<tbody>
<tr>
<td>1-bit</td>
<td>3</td>
<td>2</td>
<td>1</td>
<td>1</td>
<td>2</td>
<td>$32 \times 2 = 64$</td>
</tr>
<tr>
<td>2-bit</td>
<td>7</td>
<td>3</td>
<td>1</td>
<td>1</td>
<td>2</td>
<td>$16 \times 2 = 32$</td>
</tr>
<tr>
<td>3-bit</td>
<td>15</td>
<td>4</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>$10 \times 3 + 2 = 32$</td>
</tr>
</tbody>
</table>

a) (5 points) Explain why each 3-bit adder requires 3 gate delays while each 2-bit adder requires only 2 gate delays.

Question 10. (10 points) Suppose we have a register file with the following specifications:

- 8 registers (numbered 0 to 7) each with 32 bits
- one write port
- two read ports

a) How many "wires" would be needed to transmit data read from a read port?

b) How many "wires" would be needed to specify the register number of a read port?

c) How many decoders would be needed in the implementation of the whole register file?

d) What type of decoder(s) (i.e., # of inputs-to-# of outputs) are needed in part (c)?

e) How many multiplexers (MUXs) would be needed in the implementation of the whole register file?

f) What type of multiplexer(s) (i.e., # of inputs-to-# of outputs) are needed in part (e)?