Code Generation and Optimization

Deadlines

Description

In this final assignment, your goal is to reduce the execution time of a given application when executed on your virtual machine. This assignment is completely opened-ended; you are not required to implement any specific optimization except for simple code generation (see Milestone section). Your grade will be based on 1) your overall design (with heavy emphasis on the writeup) and 2) how well your virtual machine performs as compared to the submissions of the other groups. Namely, we will hold a Virtual Machine Derby where your virtual machine implementation will compete against the implementations of your fellow classmates on a hidden program (the derby program).

In the final month of class, we will cover various performance improving optimizations. A substantial portion of this project phase is to determine which optimizations will provide a benefit given the programs of the provided benchmark suite and the target architecture.

Some optimizations that we will cover in class include:

In order to identify and prioritize optimizations, we will provide you with a benchmark suite of programs. These programs are more complex than the code that has been provided during the previous phases, so your first priority is to produce correct unoptimized assembly code for each benchmark.

After your language implementation produces correct unoptimized code, you should begin to study the applications of the benchmark suite. You will use these programs (and any other programs provided during previous phases) to determine which optimizations you will implement. Specifically, you are expected to analyze the performance of your virtual machine to classify the effectiveness of each proposed optimization.

You are not limited to the optimizations covered in class. You are free to implement any optimization that you come across in your reading or any optimization that you devise. However, your writeup must demonstrate that each optimization is effective and semantics preserving. Furthermore, you must argue that each optimization is general, meaning it is not a special-case optimization that works only for the derby program or a specific application in the benchmark suite.

Milestone

For the milestone, you are expected to deliver 1) a virtual machine that implements code generation and 2) a preliminary writeup of the optimizations you plan to implement for your final virtual machine.

Code Generation: For the milestone, you are not expected to produce great (generated) code. In fact, even horrendous code is acceptable. When considering tradeoffs, always choose simplicity of implementation over performance. You are not constrained as to how you go about generating your assembly However, the general approach presented in lecture and recitation is a reasonable start.

Your generated code must include all runtime checks listed in the MITScript semantics given Assignment 2. These may be optimized (or removed in some cases) as long as the optimized programs reports a runtime error whenever the unoptimized program reports an error. Note that because we will only give you MITScript and because you are free to choose your intermediate representation, the runtime errors from Assignment 3 correspond to your own internal implementation errors (e.g., InsufficientStack) and have no relation to the MITScript source semantics. While you may consider removing these checks, it is good development practice to support runtime checking for such errors in, for example, a debug build.

For the milestone you should focus your creative energies on 1) designing your intermediate representation, 2) familiarizing yourself with our target ISA, 3) designing your machine-code representations of the runtime structures, and 4) generating correct assembly code.

The X86-64 Architecture Guide, provided as a supplement to this handout on the course reference materials page, presents a not-so-gentle introduction to the x86-64 ISA. It presents a subset of the x86-64 architecture that should be sufficient for the purposes of this project. You should read this handout before you start to write the code that traverses your IR and generates x86-64 instructions.

Requirements: For a given function to be executed, you should generate a sequence of assembly instructions that implements its semantics. Your generated code must use the machine code stack and/or registers to compute intermediate results (See Code Generation Lecture #2 slides for an illustration). Specifically, you must use machine code (and machine code only) to load constants (LoadConst), access local variables (LoadLocal, StoreLocal, LoadReference, StoreReference), compute arithmetic operations that are not overloaded by type (Sub, Mul, Div, Neg, Gt, Geq, And, Or, Not), and implement control flow (if and while). Exceptions to these requirements are as follows:

Note that while we are using MITScript bytecode instructions to refer to these operations, you need not necessarily translate MITScript source to bytecode to perform code generation. You may find, however, that it simplifies the task of naive code generation for the Milestone.

Writeup: As you are implementing code generation in your virtual machine, you will also need to begin to assess where you may be able to achieve speedup in your virtual machine implementation. For your writeup, you need to deliver a preliminary assessment of the optimizations that you plan to implement. See the following section on the final write up to understand how to proceed.

Writeup

The writeup for this project is extremely important. Although it explicitly accounts for only 20% of the grade, it will also be used to determine your score for the Implementation aspect of the project (50%).

Your written documentation must discuss each optimization that you considered. The thoroughness of your exploration of the optimization space will be an important aspect of your grade. For each optimization that you implement, your writeup must convince the reader that the optimization was beneficial, general, and correct.

For runtime representation and garbage collection optimizations, your write up should include diagrams that illustrate your represenation changes and algorithms. For code generation optimizations, you should include assembly or IR code examples for each optimization. Show how your generated code is transformed by the optimization (might be hand-applied for optimizations you did not implement). Highlight the benefit of the optimization on the assembly or IR code. Discuss exactly how the benefit was achieved given the characteristics of our target architecture. Include empirical evidence that proves your conclusion for the optimization.

Your writeup should include the following:

Command-Line Arguments: Your virtual machine should include a “full optimizations” command-line option (see below). Your written documentation should present a detailed discussion of this option including how you determined the order in which your optimizations are performed and how many times you apply the optimizations.

Division of Work: A brief description of how your group divided the work. This will not affect your grade; it will be used to alert the instructors to any possible problems.

Assumptions: A list of any clarifications, assumptions, or additions to the problem assigned. The project specifications are fairly broad and leave many of the decisions to you. This is an aspect of real software engineering. If you think major clarifications are necessary, feel free consult the instructors.

Overview: An overview of your design, an analysis of design alternatives you considered, and key design decisions. Be sure to document and justify all design decisions you make. Any decision accompanied by a convincing argument will be accepted. If you realize there are flaws or deficiencies in your design late in the implementation process, discuss those flaws and how you would have done things differently. Also include any changes you made to your implementations from previous assignments and explain why they were necessary.

Implementation: A brief description of interesting implementation issues. This should include any non-trivial algorithms, techniques, and data structures. It should also include any insights you discovered during this phase of the project.

Derby

We will hold a virtual machine derby. We will race your virtual machine against the submissions from the other groups in the class. The derby benchmark will be revealed one day prior to the derby. 30% of the grade will be determined by your ranking in the derby. At this time, we will not reveal the exact formula for determining this portion of the project grade.

On derby day, your group will also give a short presentation on the design of your virtual machine as well as the optimizations that you implemented.

Reference Materials

Beyond the normal course materials, we have provided links on the course website to useful and well-known papers on register allocation. Reading these before designing and writing your register allocator may be very useful. We will also post links to papers about other useful optimizations to the course website as we discover them. You can find these links under the “Reference Materials” section.

Known Issues

The x64asm library we have included for your use has a known issue where the assembler does not reserve enough memory for the code generation of long functions. Specifically, there is no logic to dynamically reserve more space in the instruction buffer as instructions are added. It is therefore your responsibility to call the reserve() method of each x64asm::Function as needed to ensure there is enough space in the buffer. We recommend doing this once per function, at the start. You can reserve irlist.size() * MAX * 15 bytes per function, where irlist is a vector containing your IR to be compiled, and MAX is a static upper bound on the number of assembly instructions that can result from a single IR instruction.

Tips

Deliverables

Your virtual machine must provide a command-line option for each code generation optimization and/or garbage collector algorithm. Your project writeup should include documentation for each command-line option. For example:

Required Command-line Options: You must provide a --opt=machine-code-only flag that only turns on machine code generation. We will use this for the milestone. You must provide a --opt=all flag to turn on all optimizations (“full optimizations”). We will use this command-line for the derby. For code generation optimizations, this option should consider how many times each code generation optimization is applied and the application order of the optimizations.

Grading

This assignment will be graded as follows: