Quiz #3: April 15, 2016

<table>
<thead>
<tr>
<th>Name</th>
<th>Athena login name</th>
<th>Score</th>
</tr>
</thead>
<tbody>
<tr>
<td>Jonathan</td>
<td>Rakesh</td>
<td>Lucy</td>
</tr>
<tr>
<td>☐ WF 10, 34-302</td>
<td>☐ WF 2, 34-302</td>
<td>☐ WF 12, 35-308</td>
</tr>
<tr>
<td>☐ WF 11, 34-302</td>
<td>☐ WF 3, 34-302</td>
<td>☐ WF 1, 35-308</td>
</tr>
<tr>
<td>Robert</td>
<td>Chris</td>
<td>Gabriel</td>
</tr>
<tr>
<td>☐ WF 12, 34-302</td>
<td>☐ WF 10, 34-301</td>
<td>☐ WF 2, 4-153</td>
</tr>
<tr>
<td>☐ WF 1, 34-302</td>
<td>☐ WF 11, 34-301</td>
<td>☐ WF 3, 4-153</td>
</tr>
</tbody>
</table>

Please enter your name and Athena login name in the spaces above. Enter your answers in the spaces provided after each question. You can use the extra white space and the backs of the pages for scratch work. We can give partial credit only if your work is shown on the exam.

Problem 1. Building the Beta (10 points)

For this problem assume that each register has been initialized to the value 0x0000??00 where “??” is the register number as a two-digit hex number. So R0 is initialized to 0x00000000, R1 to 0x00000100, …, and R30 to 0x000001E00. R31 of course always reads as 0.

For the instruction below, please indicate the values that will be found in the unpipelined Beta datapath just before the end of the clock cycle in which the instruction is executed. If the value doesn’t matter since it’s not used during the execution of the instruction or can’t be determined, write “—”. Otherwise enter the value in decimal, hex (use a “0x” prefix), or binary (use a “0b” prefix). You may wish to refer to the Unpipelined Beta diagram included in the reference material.

```
. = 0x100
LD(R3, -0x200, R7)
// hex for instruction
0x60E3FE00
```

RA1 input to register file: 3
RA2 input to register file: —
RD1 output from register file (JT): 0x300
RD2 output from register file (MWD):
A input to ALU: 0xfffe
B input to ALU: 0x0l
Output from ALU (MA): 0x00
Output from Data Memory (MRD): 0x60E3
WD input to Register File: 0x60E3
Input to the PC register (output of RESET mux): 0x104
Problem 2. Caches (11 points)

(A) (1 point) The timing for a particular cache is as follows: checking the cache takes 1 cycle. If there's a hit the data is returned to the CPU at the end of the first cycle. If there's a miss, it takes 10 additional cycles to retrieve the word from main memory, store it in the cache, and return it to the CPU. If we want an average memory access time of 1.4 cycles, what is the minimum possible value for the cache's hit ratio?

\[ 1.4 = 1 + (1 - \alpha) 10 \]

\[ \Rightarrow 0.4 = 1 - \alpha \]

Minimum possible value of hit ratio: \( \alpha_{min} \)

(B) (3 points) If the cache block size, i.e., words/cache line, is doubled but the total number of data words in the cache is unchanged, how will the following cache parameters change? Please circle the best answer.

- \# of offset bits: \( \text{UNCHANGED} \) ... +1 ... -1 ... 2x ... 0.5x ... CAN'T TELL
- \# of tag bits: \( \text{UNCHANGED} \) ... +1 ... -1 ... 2x ... 0.5x ... CAN'T TELL
- \# of cache lines: \( \text{UNCHANGED} \) ... +1 ... -1 ... 2x ... 0.5x ... CAN'T TELL

Consider a direct-mapped cache with 64 total data words with 1 word/cache line, which uses a LRU replacement strategy and a write-back write strategy. This cache architecture is used for parts (C) through (F).

(C) (1 point) If cache line number 5 is valid and its tag field has the value 0x1234, what is the address in main memory of the data word currently residing in cache line 5?

Main memory address of data word in cache line 5: \( 0x1234 \)

The program shown on the right repeatedly executes an inner loop that sums the 16 elements of an array that is stored starting in location 0x310.

The program is executed for many iterations, then a measurement of the cache statistics is made during one iteration through all the code, i.e., starting with the execution of the instruction labeled outer_loop: until just before the next time that instruction is executed.

\[
. = 0 \\
\text{outer_loop:} \\
\text{CMOVE}(16,R0) \quad // \text{initialize loop index } J \\
\text{CMOVE}(0,R1) \\
\text{loop:} \quad // \text{add up elements in array} \\
\text{SUBC}(R0,1,R0) \quad // \text{decrement index} \\
\text{MULC}(R0,4,R2) \quad // \text{convert to byte offset} \\
\text{LD}(R2,0x310,R3) // \text{load value from A[J]} \\
\text{ADD}(R3,R1,R1) \quad // \text{add to sum} \\
\text{BNE}(R0,loop) \quad // \text{loop until all words are summed} \\
\text{BR(outer_loop)} \quad // \text{perform test again!} \\
\]

6.004 Spring 2016

Quiz #3
(D) (2 points) In total, how many instruction fetches occur during one complete iteration of the outer loop? How many data reads?

\[ \text{fetch} = 2 + (5 \times 16) + 1 \]
\[ \text{read} = 1 \times 16 \]

Number of instruction fetches: 83
Number of data reads: 16

(E) (2 points) How many instruction fetch misses occur during one complete iteration of the outer loop? How many data read misses? Hint: remember that the array starts at address 0x310.

Number of instruction fetch misses: 4
Number of data read misses: 4

(F) (2 points) What is the hit ratio measured after one complete iteration of the outer loop?

\[ \text{total memory} = 83 + 16 = 99 \]
\[ \text{total hits} = 99 - 3 \]

Hit ratio: \[ \frac{99 - 3}{99} = \frac{96}{99} \]
Problem 3. Pipelined Beta (9 points)

The program shown on the right is executed on a 5-stage pipelined Beta with full bypassing and annulment of instructions following taken branches.

The program has been running for a while and execution is halted at the end of cycle 108.

The pipeline diagram shown below shows the history of execution at the time the program was halted.

<table>
<thead>
<tr>
<th>cycle</th>
<th>100</th>
<th>101</th>
<th>102</th>
<th>103</th>
<th>104</th>
<th>105</th>
<th>106</th>
<th>107</th>
<th>108</th>
</tr>
</thead>
<tbody>
<tr>
<td>IF</td>
<td>MULC</td>
<td>LD</td>
<td>ADD</td>
<td>BNE</td>
<td>BNE</td>
<td>BNE</td>
<td>BR</td>
<td>SUBC</td>
<td>MULC</td>
</tr>
<tr>
<td>RF</td>
<td>SUBC</td>
<td>MULC</td>
<td>LD</td>
<td>ADD</td>
<td>ADD</td>
<td>ADD</td>
<td>BNE</td>
<td>NOP</td>
<td>SUBC</td>
</tr>
<tr>
<td>ALU</td>
<td>NOP</td>
<td>SUBC</td>
<td>MULC</td>
<td>LD</td>
<td>NOP</td>
<td>NOP</td>
<td>ADD</td>
<td>BNE</td>
<td>NOP</td>
</tr>
<tr>
<td>MEM</td>
<td>BNE</td>
<td>NOP</td>
<td>SUBC</td>
<td>MULC</td>
<td>LD</td>
<td>NOP</td>
<td>NOP</td>
<td>ADD</td>
<td>BNE</td>
</tr>
<tr>
<td>WB</td>
<td>ADDC</td>
<td>BNE</td>
<td>NOP</td>
<td>SUBC</td>
<td>MULC</td>
<td>LD</td>
<td>NOP</td>
<td>NOP</td>
<td>ADD</td>
</tr>
</tbody>
</table>

Please indicate on which cycle(s), 100 through 108, each of the following actions occurred. If the action did not occur in any cycle, write “NONE”. You may wish to refer to the signal names in the 5-stage Pipelined Beta Diagram included in the reference material.

Register value used from Register File: \[100, 105, 106, 108\]

Register value bypassed from ALU stage to RF stage: \[101, 102\]

Register value bypassed from MEM stage to RF stage: \[NONE\]

Register value bypassed from WB stage to RF stage: \[105\]

IRSrc\(^{IF}\) was 1: \[106\]

IRSrc\(^{IF}\) was 2: \[NONE\]

STALL was 1: \[103, 104\]

PCSEL was 1: \[106\]

WDSEL was 2: \[105\]

END OF QUIZ 3!