CIS 351 Cache Homework
Due: Wednesday, 25 November, 2020. (Only problems 1-6. We may not get to 7 and 8 this semester.)
All Patterson and Hennessey problems are from the 3rd edition.
- (Harris and Harris, problem 8.1) In less than one page,
describe four everyday activities (other than using a library) that
exhibit temporal or spatial locality. List two activities for each
type of locality, and be specific.
- For each cache configuration below, fill in the corresponding table column
to show whether each reference is a hit or a miss.
- 1MB, direct-mapped cache with 4 byte blocks.
- 1MB, direct-mapped cache with 8 byte blocks.
|
Part (a) |
Part (b) |
|
|
References |
Index |
H/M |
Index |
H/M |
|
|
0x00000004 |
1 |
Miss |
0 |
Miss |
|
|
0x00800018 |
6 |
Miss |
3 |
Miss |
|
|
0x00D00010 |
|
|
|
|
|
|
0x00000004 |
|
|
|
|
|
|
0x0080001C |
|
|
|
|
|
|
0x00D00014 |
|
|
|
|
|
|
0x00A00008 |
|
|
|
|
|
|
0x00A00004 |
|
|
|
|
|
|
0x00000008 |
|
|
|
|
|
|
0x00200030 |
|
|
|
|
|
|
0x00200024 |
|
|
|
|
|
|
0x00200034 |
|
|
|
|
|
|
0x00D00034 |
|
|
|
|
|
|
0x00200030 |
|
|
|
|
|
|
- There should be four rows where columns (a) and (b)
differ. Explain precisely how the change in cache configuration
produces the observed behavior differences.
- For each cache configuration below, fill in the corresponding table column
to show whether each reference is a hit or a miss.
- 1MB, direct-mapped cache with 4 byte blocks.
- 1MB, direct-mapped cache with 8 byte blocks.
- 1MB, two-way set associative cache with 8 byte blocks.
- 1.5MB three-way set associative cache with 8 byte blocks
|
Part (a) |
Part (b) |
Part (c) |
Part (d) |
References |
Index |
H/M |
Index |
H/M |
Index |
H/M |
Index |
H/M |
0x00000018 |
6 |
Miss |
3 |
Miss |
3 |
Miss |
3 |
Miss |
0x00D00018 |
6 |
Miss |
3 |
Miss |
3 |
Miss |
3 |
Miss |
0x00A0003C |
|
|
|
|
|
|
|
|
0x00D00018 |
|
|
|
|
|
|
|
|
0x00000018 |
|
|
|
|
|
|
|
|
0x00500010 |
|
|
|
|
|
|
|
|
0x00400004 |
|
|
|
|
|
|
|
|
0x00A00038 |
|
|
|
|
|
|
|
|
0x00400000 |
|
|
|
|
|
|
|
|
0x00600024 |
|
|
|
|
|
|
|
|
0x00500014 |
|
|
|
|
|
|
|
|
0x00D0001C |
|
|
|
|
|
|
|
|
0x00E0001C |
|
|
|
|
|
|
|
|
0x00D0001C |
|
|
|
|
|
|
|
|
0x0000001C |
|
|
|
|
|
|
|
|
- Compute the total number of bits required to implement the
caches described below given a 32-bit address space.
Fill in the requested information in the table provided. List (1) the total number of bits per line, (2) the total
number of bytes of SRAM needed, and (3) the "overhead" as a
percentage of data. (For example, a 1MB cache that requires
1.25MB of data total would have a 25% overhead.)
Don't forget to include the valid bit and any LRU bits if necessary.
Assume the number of LRU bits is
n - 1
for an
n
-way set associative cache.
- 1MB direct-mapped cache with 1 byte blocks (Patterson and Hennessy problem 7.9)
- 1MB direct-mapped cache with 4 byte blocks (Patterson and Hennessy problem 7.10)
- 1MB 2-way set associative cache with 1 byte blocks and LRU
replacement (Patterson and Hennessy problem 7.25 -- almost)
- 1MB 2-way set associative cache with 4 byte blocks and LRU replacement
- 1MB fully associative cache with 1 byte blocks and LRU
replacement (Patterson and Hennessy problem 7.26)
- 1MB fully associative cache with 4 byte blocks and LRU
replacement (Patterson and Hennessy problem 7.27)
|
tag |
Line size |
Total size (with valid) |
%overhead |
1 |
|
21 bits/line |
2,752,512 bytes (2.625MB) |
162% |
2 |
|
___ bits/line |
|
|
3 |
|
___ bits/line |
|
|
4 |
|
___ bits/line |
|
|
5 |
|
___ bits/"way" |
|
|
6 |
|
___ bits/"way" |
|
|
- (Patterson and Hennessy 7.29) Suppose a byte-addressable
computer's address size is
k
bits, the cache size is
S
bytes, the block size is B
bytes, and
the cache is A-
way set associative. Give the formulas for the
following in terms of S
, B
,
A
, and k
:
- the number of sets (aka "lines") in the cache,
- the number of index bits in the address,
- the number of tag bits in the address,
- the number of bits needed to implement the cache.
(You can check your formula by applying it to the previous problem.)
- Consider two CPUs:
- P1 has a 1KB L1 cache with an 11.4% miss rate and a .62ns hit time.
- P2 has a 2KB L1 cache with an 8% miss rate and a .66ns hit time.
For simplicity, assume that instruction fetches don't cause cache misses.
If 36% of all instructions are memory accesses (e.g.,
lw
and sw
),
the memory access time is
70ns, and the CPUs clock period is set to the cache hit time, then
- What is the CPI for each machine?
- Which machine is faster? (Remember, they have different clock periods.)
- Suppose a 4KB L1 cache would have a .7ns hit time. What miss rate
would be necessary for this CPU to out-perform the other two?
- What speedup would result from doubling the speed of P1's L1
cache and clock frequency while leaving the main
memory speed at 70ns? (In other words, what happens
if the clock period is now .31ns, a cache access is still 1
cycle, and a cache miss is still 70ns?)
- What speedup would result from doubling P1's main memory
speed (to 35ns) and leaving the clock at .62ns?
-
Consider a CPU with a split L1 cache and a unified L2 cache.
- 90% of instruction access hit in the L1 instruction cache.
- 98% of the L1 instruction cache misses hit in the L2 cache.
- 85% of data accesses hit in the L1 data cache.
- 93% of L1 data cache misses hit in the L2 cache.
- 35% of instructions make a data access.
- The L1 caches have a 1 cycle access time.
- The L2 cache has a 6 cycle access time.
- Main Memory has a 90 cycle access time.
Compute the following:
- What is the average memory access time for instruction accesses?
- What is the average memory access time for data accesses?
- What is the overall CPI?
- We could double the size of each L1 cache if we made the clock
20% slower. What would the new CPI need to be in order for the new
CPI to be faster?
- By what percent would the L1 miss rates need to decrease
in order for the slower CPU to have better performance? (Assume that
the L2 miss rate does not change, but that the L2 and memory access
cycle times reflect the slower clock period.)
For more practice, try Harris and Harris problems 8.7, 8.8, 8.9, 8.10*, 8.11*, 8.12*