cs470 - Computer Architecture 1
Spring 2000

Final Exam
open books, open notes

Starts: 7:30 pm          Ends:  9:30 pm

Name:________________________(please print)
ID:________________________

<table>
<thead>
<tr>
<th>Problem</th>
<th>Max points</th>
<th>Your mark</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>20</td>
<td></td>
<td>5+5+10</td>
</tr>
<tr>
<td>2</td>
<td>55</td>
<td></td>
<td>10+10+5+5+5+5+5+5+5+5+5+5</td>
</tr>
<tr>
<td>3</td>
<td>15</td>
<td></td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Total</td>
<td>100</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

What happens with this paper (mark one): **Discard**  **Mail** at the following address:

1. Some computers have explicit instructions to extract an arbitrary field from a 32-bit register and place it in the least significant bits of a register. The operation is depicted below:

a) Find the shortest sequence of native MIPS instructions that extracts a field, assuming that $k=8$, $j=17$, the field is extracted from register `$t1`, and the result is stored in register `$t0`.

b) Assume that we want to introduce such an instruction in the MIPS instruction set. Indicate a possible format for this instruction.

c) What extra hardware resources would be needed to support this instruction on the single-cycle MIPS datapath (Figure 5.19 on page 360)? Use the attached sheet to show your solution.

2. Consider the following piece of MIPS assembly:

```
.data 0x10000000
var1: .word 0x8192a3b4
var2: .word 0

.text 0x400040
main: subu $sp, $sp, 4
sw $ra, 0($sp)
jal Misery
lw $ra, 0($sp)
addu $sp, $sp, 4
jr $ra

.text 0x400100
Misery:
lui $t0, 4096
lb $t1, X($t0) # X is the first digit of your SSN modulo 4
sw $t1, 4($t0)
jr $ra
```

a) Show the sequence of addresses issued by the CPU to execute this code. You may assume that the initial value of the stack pointer is 0x7fffffe0.

<table>
<thead>
<tr>
<th>Address (in hexadecimal)</th>
<th>Cache Hit or Miss</th>
<th>TLB Hit or Miss</th>
<th>Page Faults</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

---

† Virgil Barciceanu, 1997

b) Assume a direct-mapped cache with a size of 32 bytes, 4 bytes per block, and a write-back policy. Assume that all the blocks in the cache are invalid when the program starts executing. Show the contents of the cache after the last instruction has been executed. Make sure to label each reference as a hit or a miss in the ‘Cache Hit or Miss’ column of the table above. Use the ruled page attached to this exam to show the cache. Clearly mark the limits of your cache and label its fields.

c) What is the miss rate?

d) The system on which you run the program has virtual memory, with a page size of 2 KB and a main memory of 32 MB. Show the size of each field in the virtual address.

e) What is the size of the page table? Assume that the size of each entry in the table is a whole number of bytes, and that each entry uses four bits for various purposes (‘Valid’, ‘Dirty’, etc.).

f) What are the contents of the variable var2 after the program executes? Assume a Big Endian memory model.

3. Consider a program consisting of 100

   `lw $8, 0($8)`

   instructions (don’t worry about whether this code is any good). What would the actual CPI be if the program were run on the pipelined datapath in Figure 6.45 on page 491?

4. You have a register-register architecture that has two addressing modes, base-displacement and memory indirect, in addition to register and immediate. You want to improve this architecture by eliminating the memory indirect addressing mode. This will decrease the clock cycle time by 10% but will somewhat increase the instruction count, because you will have to replace instructions like

   `lw R1, @(R2)`

   with a sequence of instructions. Assume that the frequency of memory indirect addressing is 5% and that the overall CPI does not change.

a) Show the sequence of instructions that will replace each use of memory indirect addressing.

b) How does the performance of the new architecture compare with the performance of the original?