Your goal for this assignment is to introduce support for Garbage Collection into your VM from assignment 3.

Deadline: April 13

A basic Mark and Sweep collector

The main starter code for this assignment is in the `gc.h` file. This file defines two key classes that you will be working with for this assignment. The first one is called 'Collectable'. All objects tracked by the Garbage collector must be 'Collectable' objects, and any fields in the 'Collectable' class can be thought of as part of the header that will be used by the garbage collector to store metadata relevant to this object. The second key class is called 'CollectedHeap'. This is the class that keeps track of all the objects managed by the garbage collector and implements the actual garbage collection algorithm. In addition to these classes, your skeleton code includes a 'main.cpp' method that you can use to test your garbage collector before you connect it to your VM, as well as a file 'timerclass.h' that you can use to time your implementation.

Minimal requirements

Your VM should support a command line argument of the form `"-mem N"`, where 'N' is the maximum amount of memory (in MB) that the VM can use. You can assume that 'N' will be at least twice as large as the maximum amount of memory that will ever be required by the program you are running. You can also assume that N will always be at least 4. Your garbage collector should make sure that the VM never uses more than this amount of memory. Credit for this assignment will depend exclussively on the ability of your implementation to stay within the memory limits given by "-mem N'.

Basic implementation strategy

For a very basic implementation strategy, you could do the following: * Make your stack frames and your value classes extend 'Collectable' * Implement the 'allocate' methods in 'CollectedHeap' to allocate objects using the 'new' operator, keeping track of the objects and their sizes (potentially by writing information into private fields of 'Collectable'. ) * Implement 'mark' and 'sweep' methods to be called from 'gc()'. If you follow this approach, be aware of the following challenges. * You will want to be careful when accounting for the memory used by different objects. In particular, remember that the 'sizeof' function can tell you the size of a struct or a class, but if your class is allocating additional memory, either directly or through STL objects, you will need to keep track of that too. With STL objects in particular, you won't be able to tell exactly how much memory they are allocating, so you will have to approximate. The factor of 2 between the maximum memory needed by the application and the memory limit gives you some wiggle room in terms of this approximation, but not too much. * You need to decide when to call gc(). Keep in mind that garbage collection is expensive, so you do not want to do it until you have to.


Below are some suggested enhancements on top of the basic implementation strategy. * Booleans and None can just be statically allocated * You may find it useful to replace any STL in your value classes with custom data-structures. This can help in several ways. * In the Record class, you can define your own map using only 'Collectable' objects. This will give you more precise estimates of how much memory your maps are consuming. * There is a version of the 'new' operator, 'new (ptr) A()' that instead of allocating its own memory, expects you to pass a pointer 'ptr' to some preallocated memory where the object should live. You can allocate memory with 'malloc' and then pass it to this version of 'new'. The main reason to do this is to allocate multiple pieces of data in a single buffer. For example, for 'String' values, you can use malloc to allocate a single buffer to contain both the 'Value' class as well as the characters that actually make up the string. You can also use this trick to allocate stack frames so that all the different tables that make up the stack frame are allocated in the same contiguous buffer. If you do this, keep in mind three things: * Memory allocated with 'malloc' should always be deallocated with 'free'; memory allocated with the standard 'new' should always be deallocated with 'delete'. * When you deallocate memory with 'delete', the compiler will automatically call the destructor, but if you create an object with 'new (ptr)' and then use 'free' to deallocate the underlying memory, the destructor will not be called automatically. You need to call the destructor explicitly before freeing the memory. * it is really really important that your buffer is big enough to hold everything you want to put in it. * For integers, you may find it advantageous to keep a cache of recently allocated integers so that if the program tries to allocate the same integer value multiple times in a row your allocator just returns a pointer to the same object instead of allocating a new one.


Each team must submit a working VM together with 5 stress tests that allocate enough objects to stress the garbage collector. None of your tests should require a memory limit greater than 20MB. Additionally, your documentation must explain at least the following: * How does the implementation keep track of allocated objects. * How does the implementation keep track of the memory usage, especially the memory used by 'Record' and 'String' values. * What triggers garbage collection.