My last post was on simming an L1 cache. Being pleased with how it came out, I thought I’d say a little about its construction.
The cache implementation is:
interface CacheLine {
tag: number | null; // main-memory address
last_used: number;
}
type Program = () => Generator<number, undefined, undefined>;
export class Cache {
sets: number;
ways: number;
data: CacheLine[][];
hits: number;
misses: number;
evictions: number;
id: number;
generation: number;
last_address: number | null;
program: Program;
constructor(sets: number, ways: number, program: Program) {
this.sets = sets;
this.ways = ways;
this.data = [];
this.hits = 0;
this.misses = 0;
this.evictions = 0;
this.generation = 0;
this.last_address = null;
this.program = program;
this.id = 0;
for (let i = 0; i < sets; i++) {
if (this.data[i] === undefined) this.data[i] = [];
for (let j = 0; j < ways; j++) {
this.data[i][j] = {
tag: null,
last_used: 0,
};
}
}
}
reset(id: number) {
this.id = id;
this.hits = 0;
this.misses = 0;
this.evictions = 0;
this.generation = 0;
this.last_address = null;
for (let i = 0; i < this.sets; i++) {
if (this.data[i] === undefined) this.data[i] = [];
for (let j = 0; j < this.ways; j++) {
this.data[i][j] = {
tag: null,
last_used: 0,
};
}
}
}
access(byte_address: number) {
this.last_address = byte_address;
this.generation += 1;
const setIndex = this.getSetIndex(byte_address);
const set = this.data[setIndex];
const address = (byte_address >> 6) << 6;
for (const way of set) {
if (way.tag === address) {
this.hits += 1;
return;
}
}
this.misses += 1;
let least_recently_used: number = Number.MAX_SAFE_INTEGER;
let least_recently_used_index: number = 0;
for (const [i, way] of set.entries()) {
if (way.tag === null) {
this.data[setIndex][i] = {
tag: address,
last_used: this.generation,
};
return;
} else {
if (way.last_used < least_recently_used) {
least_recently_used = way.last_used;
least_recently_used_index = i;
}
}
}
this.evictions += 1;
this.data[setIndex][least_recently_used_index] = {
tag: address,
last_used: this.generation,
};
}
getSetIndex(address: number) {
const set = (address >> 6) & 0b111111;
return set;
}
}
Let’s break it down. The definition of a cache line is:
interface CacheLine {
tag: number | null; // main-memory address
last_used: number;
}
The tag
field tells us whether a cache line is unused (tag === null
)
or whether it is used in which case the tag
is the main-memory address of the
cache line.
In a given set the simulation implements a least-recently used (LRU) cache, so we record the generation the cache line / address was last accessed.
type Program = () => Generator<number, undefined, undefined>;
This is our “program” definition. Those familiar with javascript will recognise this as being the type of a javascript generator. I will come back to this later.
The cache is then defined with a javascript class:
export class Cache {
sets: number;
ways: number;
data: CacheLine[][];
hits: number;
misses: number;
evictions: number;
id: number;
generation: number;
last_address: number | null;
program: Program;
constructor(sets: number, ways: number, program: Program) {
this.sets = sets;
this.ways = ways;
this.data = [];
this.hits = 0;
this.misses = 0;
this.evictions = 0;
this.generation = 0;
this.last_address = null;
this.program = program;
this.id = 0;
for (let i = 0; i < sets; i++) {
if (this.data[i] === undefined) this.data[i] = [];
for (let j = 0; j < ways; j++) {
this.data[i][j] = {
tag: null,
last_used: 0,
};
}
}
}
reset(id: number) {
this.id = id;
this.hits = 0;
this.misses = 0;
this.evictions = 0;
this.generation = 0;
this.last_address = null;
for (let i = 0; i < this.sets; i++) {
if (this.data[i] === undefined) this.data[i] = [];
for (let j = 0; j < this.ways; j++) {
this.data[i][j] = {
tag: null,
last_used: 0,
};
}
}
}
access(byte_address: number) {
this.last_address = byte_address;
this.generation += 1;
const setIndex = this.getSetIndex(byte_address);
const set = this.data[setIndex];
const address = (byte_address >> 6) << 6;
for (const way of set) {
if (way.tag === address) {
this.hits += 1;
return;
}
}
this.misses += 1;
let least_recently_used: number = Number.MAX_SAFE_INTEGER;
let least_recently_used_index: number = 0;
for (const [i, way] of set.entries()) {
if (way.tag === null) {
this.data[setIndex][i] = {
tag: address,
last_used: this.generation,
};
return;
} else {
if (way.last_used < least_recently_used) {
least_recently_used = way.last_used;
least_recently_used_index = i;
}
}
}
this.evictions += 1;
this.data[setIndex][least_recently_used_index] = {
tag: address,
last_used: this.generation,
};
}
getSetIndex(address: number) {
const set = (address >> 6) & 0b111111;
return set;
}
}
This most important part of this is probably the access
method. Given a main-memory address this method looks
up the set, measures a hit / miss and evicts where necessary.
So what does a program look like? As I said before, the program is represented with a javascript generator. An example program might look like this:
export function linearAccess(
offset: number,
n: number,
stride: number,
loops: number
) {
return function* () {
let j = 0;
while (j < loops) {
let i = 0;
while (i < n) {
yield offset + i * stride;
i += 1;
}
j += 1;
}
};
}
And indeed that is the program used in the previous post: consecutive memory access modulo some stride. The “program” simply yields addresses.
This is basically equivalent to an actual program you might write with the
arr
living at some offset
in memory:
let arr = [];
let j = 0;
while (j < loops) {
let i = 0;
while (i < n) {
arr[i * stride];
i += 1;
}
j += 1;
}
Aside: I wonder if there is some clever trick I could play with javascript proxies to write effectively that but get addresses?
Okay, cool. What I’ve not shown yet is how the access
function of the
Cache
is invoked…how is our simulation actually run? I have wrapped
the simulation in a CacheView.svelte
component. I will only show
the <script>
part of the component…the rest is simply the HTML
to draw the table and styling:
<script lang="ts">
import { frame } from '$lib/frame';
import type { Cache } from './cache';
export let cache: Cache;
let scroll: boolean = true;
let run_id: number = 0;
async function run() {
const id = (run_id += 1);
console.log(`Thread ${id}: starting`);
cache.reset(id);
for (const address of cache.program()) {
await frame();
if (run_id !== id || cache.id !== id) {
console.log(`Thread ${id}: exiting`);
return;
}
cache.access(address);
cache.data = [...cache.data];
}
}
function reset() {
run_id = 0;
cache.reset(0);
cache.data = [...cache.data];
}
</script>
A couple of immediate things to note:
- We have a
run
function. Calling this starts the simulation. - The
run
function isasync
For the dual purposes of not blocking the main thread
and “animating” the simulation, the run
function is
async
. Within the for
loop the first thing we do
is await frame()
. What is frame
? It’s simply a
wrapper around requestAnimationFrame
in a a way that
is lovely to use from an async
function:
export function frame(): Promise<number> {
return new Promise((resolve) => requestAnimationFrame(resolve));
}
So the operation goes like this:
- The user clicks the
Run
button which kicks off therun
function. - The
run
function then iterates over thecache.program
(again a generator) whichyield
s main-memory addresses. - After waiting for an animation frame the address is passed to
cache.access
which invokes the logic of the cache. - The
CacheView
then redraws using the usualsvelte
reactivity:
...
<div class="buttons">
<button on:click={(ev) => run()}>Run</button>
<button on:click={(ev) => reset()}>Reset</button>
</div>
<div>
<div>Hits: {cache.hits}</div>
<div>Misses: {cache.misses}</div>
<div>
Miss rate: {cache.misses + cache.hits === 0
? '-'
: ((cache.misses / (cache.misses + cache.hits)) * 100).toFixed(2)}%
</div>
<div>Evictions: {cache.evictions}</div>
</div>
...
Pretty straightforward and very neat I think, hence I am writing about it.
There are a some changes I would like to make:
- Given the
await frame()
after every memory address is yielded the simulation is stuck at 60-addresses per second (assuming your browser is drawing at 60 FPS). For a larger example this is probably a bit slow, so I want to have some multiplier (i.e. N number of accesses per frame). - It would be nice to step through a simulation rather than the continuous run until completion.
- I think it could generally use some colour / animation.
With those changes in place I’ll be ready for part 2 of the series. Stay tuned.