UP ONE LEVEL: ENCM 369 Winter 2004 Course Handouts

ENCM 369: Computer Organization
Lab 10 - for the weeks of April 5 and 12, 2004

Author: Steve Norman
Paper copies handed out: Monday, April 5, 2004
Last modified: Sun Apr 4 14:12:35 MDT 2004

Contents


Updates / Corrections

If updates or corrections are needed they will appear in this space in the online version of this handout.

[back to top of document]


This lab is important, but won't be marked

There isn't enough time to get this assignment collected, marked, and returned before the final exam.

However, the exercises cover final exam material, so you should make a serious effort to do all the exercises.

Solutions will be posted on the Web during the week of April 12.

[back to top of document]


Exercise A: Tracing cache hits and misses

What to do

Do Exercises 7.7, 7.8, 7.20 and 7.21 from the textbook.

You will have to do a little reading to find out what a fully-associative cache is, because fully-associative caches were not covered in lectures.

Note that the problems deal with (somehwhat unusual) computers in which word addresses increase by one (not four) from each word to the next. So in these problems addresses don't contain byte offsets. The largest address is 56, so you can assume that addresses are six bits wide. For example, in Exercise 7.7 the tag will be the most significant two bits of the address and the cache index will be the least significant four bits.

[back to top of document]


Exercise B: How many bits of storage are needed for a cache?

What to do

Do Exercise 7.9 from the textbook.

[back to top of document]


Exercise C: Tag and index sizes

What to do

  1. Consider the cache of Figure 7.19 in the textbook, which can hold 1024 32-bit words. Consider modifying it so that it is still four-way set-associative, but holds 16384 32-bit words in four-word blocks. How will 32-bit addresses be broken into tag, index, block offset and byte offset? (That is, how many bits will be used for each of tag, index, block offset and byte offset?)
  2. Now consider a computer with 64-bit addresses and 64-bit memory words. (These sizes are what you would find in today's big Unix servers and in the PCs of the not-too-distant future.) Consider a cache that is two-way set-associative, stores words in four-word blocks, and holds 256K (= 256 * 1024 = 262144) bytes. How will 64-bit addresses be broken into tag, index, block offset and byte offset?

[back to top of document]


Exercise D: Simulating a Cache

If you haven't already done them, go back and do Exercises A, B, and C first.

Read This First

In this exercise you will write some C code to simulate some aspects of the behaviour of data caches. For two different sequences of memory accesses you will determine which accesses are cache hits and which accesses are cache misses. (This is relatively easy to do; a complete simulation that modeled write-through or write-back to main memory would be a much more complex problem.)

The sequences of memory accesses were generated by determining the data memory accesses that would be made in sorting an array of 3000 integers using two different sort algorithms on a machine similar to a 32-bit MIPS computer. Memory is stored in 32-bit words, and byte addresses are 32-bits in length. All accesses to data memory by the sort routines are loads and stores of words, at addresses that are multiples of four.

These sequences are availble in text files. Here are the first ten lines of one of the files:

r  804dfe8
r  804f758
w  804dfe8
w  804f758
r  804dfe4
r  804f750
r  804f754
r  804f754
w  804dfe4
r  804dfe0
r means read (load) and w means write (store). The addresses are in hexadecimal notation, even though there is no leading 0x.

What to do, Part I

Make a copy of the directory /local/courses/encm369/lab11/exD
Read the C source file read_trace.c. The program doesn't do any cache simulation, but it does demonstrate how to loop through the input files. Build an executable and run it, following the instructions in the comment at the top of the source file.

Suppose that the MIPS-like computer has a 1024-word direct-mapped data cache with one-word blocks. Suppose that this cache is empty (all V-bits are zero) when the sort code starts executing. Write a C program that can process either one of the two data files as inputs, and counts the number of cache hits on reads and the number of cache misses on reads. (With a block size of one word, there is really no point in distinguishing write hits from write misses.)

Hints. This is a useful type:

struct cache_line {
  int v_bit;
  unsigned tag;
};
And this is a useful array:
struct cache_line the_cache[1024];
Processing each line of input will involve inspecting and possibly updating an element in the_cache.

What to do, Part II

Modify your program of Part I for a 1024-word direct-mapped cache with four-word blocks. In addition to counting read hits and read misses, your program should count write hits and write misses. With multi-word blocks, write misses are more costly than write hits because on a write miss, the neighbours of the destination word must be read from main memory into the same cache block that will hold the destination word.

Final note

Cache-friendliness (a tendency towards low miss rates) is important in performance of algorithms on modern computer hardware.

But don't use this exercise to draw any firm conclusions about which of the two sort algorithms is more cache-friendly. (My own suspicion is that mergesort is in fact more cache-friendly than heapsort.) The caches being simulated are unrealistically small, and using one run of each algorithm for a single array hardly constitutes enough input data for a good experiment.

[back to top of document]


Exercise E: An Exercise on Virtual Memory

Part I

Read the instructions for Exercise 7.32 in Patterson and Hennessy. Note that it mentions bits called the valid, protection, dirty, and use bits. For each of these four bits, write a short paragraph explaining what its role is in helping a virtual memory system work. (The answers can be found in Section 7.4 of the text; the point of this part of the exercise is to make you do some careful reading.)

Part II

Do Exercise 7.32 in Patterson and Hennessy.

[back to top of document]