UP ONE LEVEL:
ENCM 369 Winter 2004 Course Handouts
ENCM 369: Computer Organization
Lab 10 - for the weeks of April 5 and 12, 2004
Author: Steve Norman
Paper copies handed out: Monday, April 5, 2004
Last modified: Sun Apr 4 14:12:35 MDT 2004
Contents
Updates / Corrections
If updates or corrections are needed they will appear in this space
in the online version of this handout.
[back to top of document]
This lab is important, but won't be marked
There isn't enough time to get this assignment collected,
marked, and returned before the final exam.
However, the exercises cover final exam material, so
you should make a serious effort to do all the exercises.
Solutions will be posted on the Web during the week
of April 12.
[back to top of document]
Exercise A: Tracing cache hits and misses
What to do
Do Exercises 7.7, 7.8, 7.20 and 7.21 from the textbook.
You will have to do a little reading to find out what a
fully-associative cache is, because
fully-associative caches were not covered in lectures.
Note that the problems deal with (somehwhat unusual)
computers in which word
addresses increase by one (not four) from each word
to the next.
So in these problems addresses don't contain byte offsets.
The largest address is 56, so you can assume that addresses are
six bits wide.
For example,
in Exercise 7.7 the tag will be the most significant
two bits of the address
and the cache index will be the least significant four bits.
[back to top of document]
Exercise B: How many bits of storage are needed for a cache?
What to do
Do Exercise 7.9 from the textbook.
[back to top of document]
Exercise C: Tag and index sizes
What to do
- Consider the cache of Figure 7.19 in the textbook,
which can hold 1024 32-bit words.
Consider modifying it so that it is still four-way
set-associative, but
holds 16384 32-bit words in four-word blocks.
How will 32-bit addresses be broken into
tag, index, block offset and byte offset?
(That is, how many bits will be used for
each of tag, index, block offset and byte offset?)
- Now consider a computer with 64-bit addresses
and 64-bit memory words.
(These sizes are what you would find in
today's big Unix servers and in the PCs of
the not-too-distant future.)
Consider a cache that is two-way set-associative,
stores words in four-word blocks,
and holds 256K (= 256 * 1024 = 262144) bytes.
How will 64-bit addresses be broken into
tag, index, block offset and byte offset?
[back to top of document]
Exercise D: Simulating a Cache
If you haven't already done them,
go back and do Exercises A, B, and C first.
Read This First
In this exercise you will write some C code to simulate some
aspects of the behaviour of data caches.
For two different sequences of memory accesses you will
determine which accesses are cache hits and which accesses
are cache misses.
(This is relatively easy to do;
a complete simulation that modeled write-through or write-back
to main memory would be a much more complex problem.)
The sequences of memory accesses were generated by determining
the data memory accesses that would be made in sorting an array
of 3000 integers using two different sort algorithms on
a machine similar to a 32-bit MIPS computer.
Memory is stored in 32-bit words,
and byte addresses are 32-bits in length.
All accesses to data memory by the sort routines are loads and stores
of words, at addresses that are multiples of four.
These sequences are availble in text files.
Here are the first ten lines of one of the files:
r 804dfe8
r 804f758
w 804dfe8
w 804f758
r 804dfe4
r 804f750
r 804f754
r 804f754
w 804dfe4
r 804dfe0
r means read (load)
and w means write (store).
The addresses are in hexadecimal notation,
even though there is no leading 0x.
What to do, Part I
Make a copy of the directory /local/courses/encm369/lab11/exD
Read the C source file read_trace.c.
The program doesn't do any cache simulation, but it does
demonstrate how to loop through the input files.
Build an executable and run it, following the instructions
in the comment at the top of the source file.
Suppose that the MIPS-like computer has a 1024-word
direct-mapped data cache with one-word blocks.
Suppose that this cache is empty (all V-bits are zero)
when the sort code starts executing.
Write a C program that can
process either one of the two data files
as inputs,
and counts the number of cache hits
on reads and the number of cache misses on reads.
(With a block size of one word, there is really no point
in distinguishing write hits from write misses.)
Hints.
This is a useful type:
struct cache_line {
int v_bit;
unsigned tag;
};
And this is a useful array:
struct cache_line the_cache[1024];
Processing each line of input will involve inspecting
and possibly updating an element in the_cache.
What to do, Part II
Modify your program of Part I for a 1024-word
direct-mapped cache with four-word blocks.
In addition to counting read hits and read misses,
your program should count write hits and write misses.
With multi-word blocks,
write misses are more costly than write hits because
on a write miss,
the neighbours of the destination word must be read from main memory
into the same cache block that will hold the destination word.
Final note
Cache-friendliness (a tendency towards low miss rates)
is important in performance of algorithms
on modern computer hardware.
But don't use this exercise to draw any firm conclusions about which of
the two sort algorithms is more cache-friendly.
(My own suspicion is that mergesort is in fact
more cache-friendly than heapsort.)
The caches being simulated are unrealistically small,
and using one run of each algorithm for a single array
hardly constitutes enough input data for a good
experiment.
[back to top of document]
Exercise E: An Exercise on Virtual Memory
Part I
Read the instructions for Exercise 7.32 in
Patterson and Hennessy.
Note that it mentions bits called
the valid, protection, dirty, and use bits.
For each of these four bits, write a short paragraph
explaining what its role is in helping a virtual
memory system work.
(The answers can be found in Section 7.4 of the text;
the point of this part of the exercise is to make you do
some careful reading.)
Part II
Do Exercise 7.32 in
Patterson and Hennessy.
[back to top of document]