Wednesday, June 29, 2005

Progress so far

I began hacking on GCC-CIL shortly after the Summer of Code was announced. Enough has been implemented by now to compile some programs, but the backend is still far from complete. As an example, here's a C program I submitted with my Google Summer of Code application that solves the N-Queens problem for N=8 (a standard chessboard), along with the corresponding output from the June 14 snapshot of GCC-CIL at optimization levels 0 and 1: nqueens.c, nqueens.0.s, nqueens.1.s. You'll need ilasm from Mono, .NET, or DotGNU to assemble and run the program. It was extremely satisfying to see a "real" program actually working. :-)

Hello, World!

Hello, world. My name is Jeyasankar "Jey" Kottalam, and I'm working on a CIL backend for GCC for The Mono Project as part of Google's Summer of Code.

"What??"
GCC is a retargetable compiler (loosely, software that "translates" high-level computer programs into machine-level computer programs) developed by the GNU Project. GCC takes programs written in a variety of languages as input, and produces assembler output for any of dozens of target architectures. My task is to add support for generating code for the ECMA Common Language Infrastructure, more commonly known by the popular implementations of this standard, Microsoft's .NET or Ximian's Mono. The low-level IL used for expressing programs for the ECMA CLI is called "CIL", or the Common Intermediate Language. My modifications to GCC allow it to emit CIL.

This is somewhat unusual because the CLI is a stack architecture, while the target backend infrastructure in GCC is designed for register machines. My current approach is to generate CIL instructions directly from the optimized GIMPLE trees, one of the internal representations used by GCC. Traditionally, GCC expands these GIMPLE trees into RTL (Register Transfer Language) instructions, and further optimization, register allocation, and code generation are then performed on the RTL. In other words, the current approach bypasses nearly all RTL-related portions of GCC. Some GCC hackers suggested that it may be possible to use RTL up to the register allocation stage and emit CIL from the RTL instructions at that point... but that's for later.
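To give a feel for why a tree representation maps naturally onto a stack machine, here is a minimal sketch in C. It is not GCC-CIL's code and does not use GCC's GIMPLE data structures: the `struct node` type and the `emit_expr` and `emit_assign` helpers are hypothetical names made up for this illustration. The only real things in it are the CIL opcode names (`ldc.i4`, `ldloc`, `stloc`, `add`, `mul`). The idea is a post-order walk: emit code for both operands, which leaves their values on the evaluation stack, then emit the operator, which pops them and pushes the result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy expression tree; NOT GCC's GIMPLE representation. */
enum node_kind { NODE_CONST, NODE_LOCAL, NODE_ADD, NODE_MUL };

struct node {
    enum node_kind kind;
    int value;                /* constant value, or local variable slot */
    const struct node *lhs;   /* operands, used by NODE_ADD / NODE_MUL */
    const struct node *rhs;
};

/* Append CIL-style text for expression `n` to the NUL-terminated buffer
   `out` (capacity `cap`). Post-order traversal: push both operands,
   then emit the operator that consumes them. */
static void emit_expr(const struct node *n, char *out, size_t cap)
{
    assert(n != NULL);
    size_t len = strlen(out);

    switch (n->kind) {
    case NODE_CONST:
        snprintf(out + len, cap - len, "ldc.i4 %d\n", n->value);
        break;
    case NODE_LOCAL:
        snprintf(out + len, cap - len, "ldloc %d\n", n->value);
        break;
    case NODE_ADD:
    case NODE_MUL:
        emit_expr(n->lhs, out, cap);   /* left operand ends up on the stack */
        emit_expr(n->rhs, out, cap);   /* right operand ends up above it */
        len = strlen(out);
        snprintf(out + len, cap - len, "%s\n",
                 n->kind == NODE_ADD ? "add" : "mul");
        break;
    }
}

/* Emit code for "local[dest] = expr": evaluate, then pop into the slot. */
static void emit_assign(int dest, const struct node *expr,
                        char *out, size_t cap)
{
    emit_expr(expr, out, cap);
    size_t len = strlen(out);
    snprintf(out + len, cap - len, "stloc %d\n", dest);
}
```

For an assignment like x = a + 2 * b, with x, a, and b assumed to live in local slots 0, 1, and 2, calling `emit_assign` on the corresponding tree with an empty buffer produces the sequence ldloc 1, ldc.i4 2, ldloc 2, mul, add, stloc 0. No registers are ever named, which is exactly the part that doesn't fit GCC's register-oriented backend machinery.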

"OK, so what?"
This [theoretically] allows programmers to easily port existing code written in C, C++, Objective-C, FORTRAN, Ada, Pascal, D, or any other language with a GCC front end to the ECMA CLI platform.