Iván Burgert (iburgert)
ivanbur.github.io/proposal.html
Main PageMy project is to add auto-vectorization to the C0 compiler that I built for the 15-411 course here at CMU. This project involves doing analysis of a program (particularly loops within the program) and determining whether or not it is possible to vectorize parts of the program (replace normal instructions with SIMD instructions).
Most programmers, when dealing with arrays, write loops that can easily be done using vector instructions, however they may not be aware that these instructions exist, or they may not know how to properly utilize them. The goal of this project is to allow programmers to write loops and programs the way they normally would, but for the compiler to identify areas of the program that can be vectorized, and to perform that vectorization automatically.
This problem is challenging because it is very difficult to determine when a program or a loop can be transformed to use vector instructions. The analysis that is required is very complex, and involves doing data dependency and loop analysis, as well as possibly some alias analysis and purity analysis. It can be made even more effective when combined with other optimizations, such as loop invariant code motion, induction variable elimination, constant propagation, copy propagation, and constant folding. The actual transformation of the code into using vector instructions after the analysis has been performed is not incredibly difficult when the code is simple, but it can be difficult when control flow starts to get involved. The more complex the control flow, the more complex the transformation.
I will go off my existing code base from the compilers course, which includes a fully correct L4 compiler (an almost complete subset of C0) and implements some optimizations, including copy propagation, constant propagation, constant folding, common subexpression elimination, and dead code elimination. No other special resources are required. I am basing my work off of “Dynamic trace-based analysis of vectorization potential of applications” (https://dl.acm.org/doi/10.1145/2345156.2254108).
I plan to achieve auto-vectorization for simple loops with no control flow and with loop-independent data dependence. This will involve successfully being able to do analysis to determine whether or not a loop is vectorizable, which is perhaps the hardest part of this project. Successfully transforming the common case of no control flow inside of the loop should be doable once that is achieved. The next goal that I hope to achieve if I have time is being able to handle simple control flow inside of the loop, and following that, more complex control flow inside of the loop. This will be more difficult because the control flow graph inside of the loop will need to be analyzed and “flattened” and turned into masks, because vector instructions work with masks so that the operations only affect the elements that they are intended to.
My compiler outputs x86-64 machine code, so that is my target platform. This means that my machine code should be able to run on the AFS machines, and I also have a docker container that should be able to run my output assembly.