CIS 351 | Cache | Fall 2020
This lab will be due in two pieces. Part 1 is problems 1-5; it will be due next week. Part 2 is problems 6-11; it will be due the Monday after Thanksgiving. Officially, there is no lab Thanksgiving week, but I will be online to answer questions about Part 2. Anyone is welcome to attend (regardless of your scheduled lab time).
Simplescalar is a suite of programs that simulate the execution of programs compiled using a MIPS-like instruction set called PISA. You can simulate the execution of any program using Simplescalar by simply re-compiling it using a version of gcc that knows how to generate PISA instructions as well as x86 instructions.
For this lab, you will be using the tool sim-cache. This tool takes as input a description of a machine's memory hierarchy (i.e., cache levels) and reports on the number of hits and misses in each cache. Section 4.2 of The Simplescalar Tech Report explains how to describe the cache setup you want to simulate. When you read through this section, take note of three things:
1. When using sim-cache, you don't specify the size of the cache directly. Instead you specify (1) the number of lines, (2) the block size, and (3) the associativity of the cache. The size of the cache is the product of these three numbers. Thus, a 4-way set associative cache with 1024 lines of 16 bytes each is 4*1024*16 = 65536 bytes (64 kilobytes).
2. The name of the cache appears twice in the configuration. (For example, "dl1" will appear twice when configuring the L1 data cache.)
3. sim-cache frequently uses both "1" (the numeral "one") and "l" (a lower-case letter "L"). Watch carefully because the differences in print between the two can be subtle.
For example, to configure a machine with an 8KB, direct-mapped L1 data cache with 32 byte blocks, use this command: -cache:dl1 dl1:256:32:1:l. Notice that 256 blocks times 32 bytes per block equals 8192 bytes.
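The arithmetic above can be sketched as a small shell function. The name dl1_config is hypothetical (it is not part of Simplescalar); it simply divides the total cache size by the block size and associativity to get the number of lines, then prints the configuration string in the form shown above.

```shell
# Hypothetical helper (not part of Simplescalar): build the dl1 configuration
# string from a total cache size (bytes), a block size (bytes), and an
# associativity, using size = lines * block * associativity.
dl1_config() {
    size=$1
    block=$2
    assoc=$3
    lines=$(( size / (block * assoc) ))
    echo "dl1:${lines}:${block}:${assoc}:l"
}

# The 8KB, direct-mapped, 32-byte-block example from the text.
dl1_config 8192 32 1      # prints dl1:256:32:1:l
```

The printed string is what would follow -cache:dl1 on the sim-cache command line.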
When running sim-cache, I recommend sending the output directly to a file using the command line parameters -redir:sim file1 and -redir:prog file2. file1 will contain the results of the simulation (i.e., cache hit and miss rates). file2 will contain the output produced by the program simulated. This output is generally not interesting.
Simplescalar is designed to run on 32-bit machines. So, instead of using Arch/EOS directly, we will be using a Virtual Machine. Our sysadmin Tom has set up five virtual machines for us to use. They have these IP addresses:
192.168.216.22
192.168.216.26
192.168.216.27
192.168.216.28
192.168.216.29
To access these machines, you must first log into an EOS or Arch machine. Then, use ssh to log into one of the VMs: ssh simple@192.168.216.X (where X is the number of the particular machine you chose; just pick one at random). The username is simple and the password is SimpleSim2020.
You will all be sharing these machines; so, keep your files in your own directory.
Note: The VMs do not have a GUI.
Your first task is to examine the effects of block size on a "toy" program. Look at blockSize1.c. This small program iterates through each byte in a large array. First, you will need to copy this file to your space on a VM:
scp blockSize1.c simple@192.168.216.X:yourDirectory
Next, compile this C program for Simplescalar using the following command: ss_gcc blockSize1.c (Make sure you are in your directory so you don't end up overwriting someone else's file.) Running this command will produce a file named a.out. (As with the normal version of gcc, you can specify the name of the executable generated using the -o flag.)
This file will not run by itself. It will run only as input to one of the Simplescalar programs. If it does run by itself, you generated it using the wrong version of gcc.
Use sim-cache to determine the miss rates of an 8KB, direct-mapped cache with the following block sizes: 8 bytes, 16 bytes, 32 bytes, and 64 bytes. To do so, use commands that look like this:
sim-cache -cache:dl1 dl1:line:block:1:l -redir:prog /dev/null -redir:sim output_block a.out
where block ranges from 8 to 64, and line is set such that the product of block times line is 8192.
Hints for running sim-cache:
The cache configuration ends with the numeral 1 followed by the letter l (as in "lru").
sim-cache is in /Simplescalar/simplesim-3.0/sim-cache
Consider writing a bash script to run sim-cache with varying block sizes and present the results. This line provides an example of how to perform arithmetic in a bash script: let num_lines=8192/$i
After you have run sim-cache for each block size, grep each output file (output_8, output_16, etc.) for the line "dl1.miss_rate". List the miss rate for each block size tested.
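The loop described in the hints can be sketched as a dry run that only prints the commands. The function name print_block_size_commands is hypothetical, and the sketch uses $(( )) arithmetic, which computes the same value as the let line above. The commands it prints follow the sim-cache example given earlier; nothing is executed.

```shell
# Dry run: print (rather than execute) one sim-cache command per block size.
# print_block_size_commands is a hypothetical name used only for this sketch.
print_block_size_commands() {
    for i in 8 16 32 64; do
        # Keep the cache at 8192 bytes: lines * block size = 8192.
        num_lines=$(( 8192 / i ))
        echo "sim-cache -cache:dl1 dl1:${num_lines}:${i}:1:l -redir:prog /dev/null -redir:sim output_${i} a.out"
    done
}

print_block_size_commands
```

Once the printed commands look right, the echo can be dropped so that each command actually runs on the VM, and the resulting output files can then be searched with grep as described above.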
NUM_LOOPS
1000000.
Notice in blockSize1.c that array is an array of characters; therefore, each item in the cache is exactly 1 byte. As a result, it is easy to identify data items that will or will not conflict in the cache. For example, in an 8KB direct-mapped cache, array bytes 0 and 8192 will conflict. Your job is to find sets of array elements that conflict with a 16 byte block, but not an 8 byte block.
gcc is a C compiler. Your code must be straight C. No iostreams; no "//"-style comments; and, all variables must be declared at the beginning of each function.
array[0] should be mapped to cache slot 0. Sometimes, it gets mapped to array[8].
If you are confident your solution should work but it doesn't, try adding 8 to each array index.
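The example of array bytes 0 and 8192 conflicting can be checked with a little arithmetic. The sketch below uses a hypothetical helper, cache_slot, and assumes that array[0] maps to cache slot 0 (which, as the hint above notes, is not always the case on the VMs). In a direct-mapped cache, the slot is the block number modulo the number of lines.

```shell
# Hypothetical helper: the slot of a direct-mapped cache that a byte offset
# into the array maps to. Assumes array[0] itself maps to slot 0.
cache_slot() {
    offset=$1
    cache_size=$2
    block=$3
    num_lines=$(( cache_size / block ))
    echo $(( (offset / block) % num_lines ))
}

# Bytes 0 and 8192 in an 8KB direct-mapped cache with 16-byte blocks.
cache_slot 0 8192 8192 >/dev/null 2>&1   # (unused form; see calls below)
cache_slot 0 8192 16       # prints 0
cache_slot 8192 8192 16    # prints 0, the same slot, so the two bytes conflict
```

Two bytes conflict when they land in the same slot but belong to different blocks.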
qsort
.
Measure the miss rate of qsort for each block size, given a 1KB, 4KB, and 16KB cache. Present your results using a graph with block size on the x-axis and the miss rate on the y-axis. Please generate one graph with three lines: one each for 1KB, 4KB, and 16KB. Valid block sizes are 8, 16, 32, and 64. Your graph should have a form similar to Figure 8.18 in Harris and Harris (2nd edition).
Use input_1e4 for input. (It contains 50,000 randomly generated integers.)
The ss_qsort executable and sample inputs are found in ~/TestData.
sim-cache -cache:dl1 dl1:64:16:1:l -redir:prog opt -redir:sim output_dl1:64:16:1:l ~/TestData/ss_qsort ~/TestData/input_1e4
If the simulation does not behave as expected, the program output in opt may give you a clue. If not, ask the instructor for help.
In gnuplot, entering the line set style data linespoints will plot all data files using the "linespoints" style. Using this shortcut means that you won't have to type with linespoints after every file.
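Since gnuplot reads whitespace-separated columns, it may help to pull the miss rate out of each simulation output file before plotting. The sketch below uses a hypothetical helper, miss_rate, and assumes that the sim-cache statistics line begins with dl1.miss_rate followed by the value as the second field; check one of your own output files to confirm the layout before relying on it.

```shell
# Hypothetical helper: print the dl1 miss rate stored in a sim-cache output file.
# Assumes the line looks like "dl1.miss_rate  <value> # ...", i.e. the value is
# the second whitespace-separated field.
miss_rate() {
    awk '$1 == "dl1.miss_rate" { print $2 }' "$1"
}

# Example use (file name taken from the sample command above):
#   echo "16 $(miss_rate output_dl1:64:16:1:l)" >> cache_1KB.dat
# cache_1KB.dat is a hypothetical data file with block size in column 1 and
# miss rate in column 2.
```

One data file per cache size, each with one line per block size, gives the three lines requested for the graph.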
Choose qsort (or another interesting program of your choice) and a cache size. Produce a graph showing miss rates as associativity ranges over 1, 2, 4, 8, 16, and fully associative. Your graph should have associativity on the x-axis and miss rate on the y-axis. It should also contain four lines: one for each block size. Be sure to clearly label your graph with the cache size. Your graph should have a form similar to Figure 5.30 in Patterson and Hennessy (4th edition, revised).
Updated Monday, 16 November 2020, 2:05 PM