The difference between compilers and interpreters

How exactly do machines execute our high-level programs? We all have a general sense of what compilers and interpreters are, but the distinction between the two is so blurred that it's worth cracking open some real language implementations to see what's going on. Along the way, we'll define some important terms and establish precise boundaries between compilers and interpreters. In order to explore these terms, this article retraces my learning path through the jumble of programming languages and their implementations.

My programming adventures began in 1980 with the BASIC language, using a teletype and a 110 baud modem. After BASIC I moved on to assembly code for the MOS Technology 6502 processor in the Apple II. Assembly code is a human-readable text format whose instructions map one-to-one to machine instructions. An assembler is a program that translates assembly code to machine code. Because of my 6502 coding, I eventually figured out that Apple's BASIC implementation was an interpreter. The interpreter was a kind of language or machine simulator that sat between my program and the underlying 6502 processor. Anything not running directly on the processor was interpreted. The most obvious distinction between the two was speed: BASIC was very slow, and my 6502 programs were very fast by comparison.

Apple eventually released a Pascal interpreter based upon UCSD Pascal (http://en.wikipedia.org/wiki/UCSD_Pascal), which was to directly influence James Gosling's Java interpreter. This Pascal was not compiled from source code down to 6502 machine code. Instead, the Pascal compiler generated instructions called p-codes for a theoretical machine called the p-machine. The p-machine was a virtual machine (VM), which is nothing more than a program that emulates a physical processor by executing low-level instructions. Today, we would call p-codes bytecodes (bytecode instructions begin with an 8-bit byte holding the operation code). Virtual machines can simulate imaginary machines for which there is no direct hardware implementation or even emulate real processors. For example, as an undergraduate, I had to emulate a DEC PDP-11 on a CDC Cyber 6000-series machine. (Those machines were ancient even in 1982.)
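
To make this concrete, here is a minimal sketch in Java of a stack-based bytecode interpreter in the spirit of the p-machine. The opcodes and their encoding are invented for illustration; they are not UCSD's actual p-codes, but the fetch-decode-execute loop is the essence of any virtual machine.

    // A minimal stack-based bytecode interpreter sketch (invented opcodes,
    // not real p-codes). Each instruction starts with a one-byte opcode.
    public class TinyVM {
        static final byte PUSH  = 0x01;  // push the next code byte as an operand
        static final byte ADD   = 0x02;  // pop two operands, push their sum
        static final byte PRINT = 0x03;  // pop and print the top operand
        static final byte HALT  = 0x00;  // stop the fetch-decode-execute loop

        public static void run(byte[] code) {
            int[] stack = new int[64];
            int sp = 0;                  // stack pointer
            int ip = 0;                  // instruction pointer
            while (true) {
                byte op = code[ip++];    // fetch and decode the next opcode
                switch (op) {
                    case PUSH:  stack[sp++] = code[ip++]; break;
                    case ADD:   stack[sp - 2] += stack[sp - 1]; sp--; break;
                    case PRINT: System.out.println(stack[--sp]); break;
                    case HALT:  return;
                    default: throw new IllegalStateException("bad opcode " + op);
                }
            }
        }

        public static void main(String[] args) {
            // "push 3, push 4, add, print" => prints 7
            run(new byte[] {PUSH, 3, PUSH, 4, ADD, PRINT, HALT});
        }
    }

The run() loop does in software exactly what a physical processor does in silicon: fetch an instruction, decode it, execute it, repeat.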

Based upon my experience with the BASIC interpreter on the Apple II, I thought that ``interpreted'' was synonymous with ``interactive.'' The p-machine, however, was an interpreter but not at all interactive. I couldn't type writeln('hi'); and have it execute immediately to give results like a calculator. Worse, I sometimes had to use Pascal on the CDC mainframe machines, which always operated in batch mode. We submitted programs to the mainframe and got the results later as a text file. In either case, the basic sequence was: edit the Pascal file, compile it to bytecodes, and then ask the virtual machine to execute the bytecodes. That is not to say we couldn't write an interactive Pascal shell for learning and exploration purposes. Interaction is a characteristic of a particular implementation. We could even build an interactive system that translated bytecodes to machine code before executing them, as we'll see later. For now, let's take a look at the batch translation of source code to machine code.

Native code execution

It wasn't until 1984 that I got to use a compiler (for C) that generated actual machine code instead of bytecodes. Well, to be precise, the compiler generated assembly code from C source code, which was then translated to machine code by the underlying UNIX assembler. C was great because it was much faster than Pascal and had a lot of useful features. Two of those features were macros and include files. After programming in C for a while, I realized that, technically speaking, the macro and include file mechanisms were not part of the compiler tool itself. Indeed, those features were not even part of the C language. Like the assembly phase after compilation, there was a preprocessing phase prior to compilation. A preprocessor expands macros and include files and passes the preprocessed output to the actual compiler. Such separation of concerns allows us to break complicated language implementations into smaller, focused components or tools.

The task of preprocessing before actual compilation can be extraordinarily complex. Very shortly after I learned C, I heard a lecture on a new language called C++. Everyone in the hall chuckled when the speaker revealed the name, which means ``increment C''. The first implementation of C++ was not a compiler. A tool called cfront translated C++ source code to C. The complete chain of tools was therefore: macro preprocessor, cfront, C compiler, assembler. The trick of reducing a new problem to a well-understood problem is one we'll see over and over again in the language implementation realm.

C++ was more expressive than C in that C++ programs could say more with fewer lines of code. C++ also promoted more code reuse than C. Unfortunately, C++'s libraries were so anemic in those days that I ended up being not that much more productive in C++. Object-oriented programming in itself promotes code reuse, but productivity is more about reusing somebody else's robust and efficient code. In the late 1980s, a friend introduced me to Smalltalk, another object-oriented language that had just that: extensive libraries.

A new level of programmer productivity

Smalltalk-80, the first public release, had a bytecode compiler and interpreter implementation. Unlike the stark C compiler that ran from the command line, Smalltalk came with a complete GUI development environment and rich libraries. The interactive interface let you evaluate Smalltalk fragments at any time or define classes and methods. When you saved a method, the environment invisibly compiled it to bytecodes and stitched it into the surrounding class definition.

Smalltalk was my first experience with automatic garbage collection. Garbage collection frees the programmer from having to write software to track and deallocate previously allocated objects. A few memory leaks in a C program back then were no problem because programs usually didn't run long enough to exhaust memory. Smalltalk's interactive and long-lived environment, however, could not tolerate memory leaks. The runtime cost of automatically finding and deallocating unused memory was worth the gain in robustness and reduced programming burden. Most languages in widespread use today make the same trade-off and provide automatic garbage collection; for example, Java, Ruby, Python, C#, and Visual Basic. At this point, I don't think I could go back to manual memory management, just as I no longer want to write assembly code.

Bootstrapping languages

An interesting thing about Smalltalk-80 was that it was written almost entirely in itself. The graphical development environment, the compiler, and even the interpreter were written in Smalltalk. The system needed only a tiny piece of machine code as an interface to the underlying machine and operating system. Many languages are bootstrapped like this until they are largely written in themselves. Bootstrapping a language is analogous to booting up a computer: the ROM contains a small chunk of code that is just smart enough to load a bigger chunk of the operating system from disk, which then loads more of the operating system, and so on. To bootstrap a compiler for language A, we start with a trivial subset of A implemented in another language (often assembly code). Development of the compiler can then proceed in that initial subset of A. As the compiler implements more and more of A, the compiler itself can use more of A.

Why didn't programmers flock to Smalltalk from C and C++ if they could be so much more productive? To answer that, let's start with the advantages of interpretation over native execution. Interpreters such as Smalltalk's have four really useful characteristics:

  • Interpreters make it easier to build interactive development environments. Interpreters easily execute code statement by statement, whereas the usual (batch) compilers chew on entire programs and translate them to machine code (see the read-eval loop sketch after this list).
  • Interpreters tend to be easier to port to different operating systems and computer architectures. Because the bytecode compiler and most of the interpreter are written in the language itself, a port requires only a small amount of native machine code to get the interpreter running.
  • The bytecode version of a source program is portable to any system with an interpreter port. In contrast, programs compiled to machine code run only on a specific operating system and processor.
  • Bytecodes are denser than machine code by about 5x (L. Peter Deutsch and Allan M. Schiffman. Efficient implementation of the Smalltalk-80 system. In Conference Record of the Eleventh Annual ACM Symposium on Principles of Programming Languages, pages 297-302, Salt Lake City, Utah, 1984). This was more important when machines were much smaller and we needed the compiler and interpreter to have really small memory footprints.
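
As a toy illustration of the first point, here is a minimal read-eval loop in Java. The one-statement ``language'' (print <integer>) is invented for this sketch; the point is only that each line is parsed and executed the moment it is typed, with no batch compile step.

    import java.util.Scanner;

    // A toy read-eval loop: each input line is parsed and executed
    // immediately, the way an interactive interpreter handles one
    // statement at a time. The only statement in this invented toy
    // language is "print <integer>"; type "quit" to exit.
    public class ToyREPL {
        public static void main(String[] args) {
            Scanner in = new Scanner(System.in);
            System.out.print("> ");
            while (in.hasNextLine()) {
                String line = in.nextLine().trim();
                if (line.equals("quit")) break;
                if (line.startsWith("print ")) {
                    System.out.println(Integer.parseInt(line.substring(6).trim()));
                } else if (!line.isEmpty()) {
                    System.out.println("syntax error: " + line);
                }
                System.out.print("> ");
            }
        }
    }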

An interpreter's only weakness is lackluster performance compared to natively compiled machine code. Execution speed is an ingrained concern for most programmers, and it was even more pressing 20 years ago when computers were drastically slower. Speed is not the only reason programmers did not adopt Smalltalk in large numbers, but Smalltalk was perceived as just too slow. As computers have become faster and acquired larger memories, programmers are adopting highly productive programming languages such as Ruby and Python despite their heavy runtime burdens. Wouldn't it be great, though, if we could keep the advantages of an interpreter but use compiler technology to make it go faster? Two researchers did just that for Smalltalk in 1984.

Hybrid compiler-interpreters

Peter Deutsch and Allan Schiffman pioneered dynamic compilation, a general term for shifting compilation to program run-time. The opposite of dynamic compilation is the usual static compilation or batch compilation that completes before program deployment.

Early dynamic compilers translated methods to machine code upon first invocation (just-in-time compilation, usually abbreviated as JIT). All executions of a method would therefore run natively. The mechanism is straightforward: assume all methods exist in native form. Calls to untranslated methods trap, failing over to the compiler, which translates the method and completes the call. The user, therefore, experiences the cost of compilation at runtime.
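
Here is a minimal Java sketch of that trap-and-translate mechanism. The ``native code'' is simulated by an ordinary lambda, and the doubling method is invented for illustration; a real JIT installs actual machine code, but the call-site swap is the same idea.

    import java.util.function.IntUnaryOperator;

    // Sketch of just-in-time translation. A method slot starts out pointing
    // at a stub. The first call "traps" into the compiler, which installs
    // the translated method (simulated here by swapping in a lambda) and
    // then completes the call. Later calls pay no compilation cost.
    public class JITSketch {
        private IntUnaryOperator doubler = this::compileThenRun;  // stub

        private int compileThenRun(int x) {
            System.out.println("first call: compiling...");  // one-time cost
            doubler = n -> n * 2;          // install the "compiled" method
            return doubler.applyAsInt(x);  // complete the original call
        }

        public static void main(String[] args) {
            JITSketch vm = new JITSketch();
            System.out.println(vm.doubler.applyAsInt(21));  // compiles, prints 42
            System.out.println(vm.doubler.applyAsInt(5));   // native path, prints 10
        }
    }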

The trick is to make the compile time plus the native execution time less than the time to simply interpret the method. To avoid actually increasing execution time, Deutsch's translation was ``more akin to macro-expansion than compilation.'' Craig Chambers and David Ungar's Self language system also compiled methods upon first invocation, but with much heavier optimizations (Craig Chambers and David Ungar. Customization: Optimizing compiler technology for Self, a dynamically-typed object-oriented programming language. In PLDI, pages 146-160, 1989). Their compiled code ran about twice as fast as the Smalltalk compiler's compiled code. For more details, you can watch Eliot Miranda's talk (http://video.google.com/videoplay?docid=-8988857822906068209) on Smalltalk and Self compilation. For an open source implementation of a dynamic Smalltalk compiler, see Squeak (http://www.squeak.org).
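
To state that break-even point explicitly (my notation, not Deutsch and Schiffman's): if a method runs n times, dynamic compilation pays off when t_compile + n * t_native < n * t_interp. Since t_native is typically far smaller than t_interp, even a modest n amortizes the one-time t_compile; for a method that runs only once, compilation is pure overhead unless t_compile is tiny.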

It's hard to bring really heavy optimizations to bear during dynamic compilation without causing execution delays. Even if you generate really fast code, you've wasted execution time if the method is not a hotspot. A hotspot is an area of code that is executed repeatedly. When manually optimizing programs, developers try to identify and optimize hotspots because they constitute a significant fraction of the overall execution time. A dynamic compiler can use the same strategy to save execution time, compiling only the hotspots in the bytecodes. The interpreter can count how often a method executes and then focus its attention on methods that execute frequently. It can even recompile the same code with heavier optimizations if a method gets hit very frequently.
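
A sketch of that counting strategy in Java might look like the following. The threshold value and the string-keyed method representation are invented for illustration; real VMs keep per-method counters in the method metadata rather than in a hash map.

    import java.util.HashMap;
    import java.util.Map;

    // Sketch of hotspot detection: interpret each method and count its
    // invocations; once a method crosses the threshold, hand it to the
    // dynamic compiler and run the compiled version from then on.
    public class HotspotVM {
        private static final int HOT_THRESHOLD = 1000;  // invented tuning knob
        private final Map<String, Integer> counts = new HashMap<>();

        void invoke(String method) {
            int n = counts.merge(method, 1, Integer::sum);
            if (n == HOT_THRESHOLD) {
                compile(method);     // one-time translation of a hot method
            }
            if (n >= HOT_THRESHOLD) {
                runCompiled(method); // native execution from now on
            } else {
                interpret(method);   // cheap bytecode walk for cold methods
            }
        }

        void interpret(String method)   { /* walk the method's bytecodes */ }
        void compile(String method)     { /* translate bytecodes to machine code */ }
        void runCompiled(String method) { /* jump to the machine code */ }
    }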

Dynamic compilers have an advantage over static compilers: they have information about the runtime types and values of actual objects, whereas a static compiler has information only about their potential types and values. The use of such runtime feedback to generate efficient code is called adaptive optimization. For example, if the interpreter observes that method size() always invokes ArrayList.size(), then the compiler can generate code that directly calls or inlines that method. In the general case, invoking method size() means doing an expensive, full polymorphic message send (virtual method call in C++ terminology). The target object might be a LinkedList, for example, instead of an ArrayList, so the general method call dispatch has to choose between size() implementations.
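
Written out as ordinary Java for illustration, the code shape such a compiler might emit for that example is a cheap type guard plus a direct call. This is a sketch of the idea, not HotSpot's actual output; a real JIT emits this shape in machine code and uses deoptimization rather than an else branch when the guard fails.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of adaptive optimization. Profiling showed the receiver is
    // always an ArrayList, so the compiler emits a type guard plus a
    // direct (inlinable) call, falling back to the full polymorphic
    // dispatch only when the guard fails (e.g., a LinkedList shows up).
    public class GuardedInline {
        static int callSize(List<?> list) {
            if (list instanceof ArrayList) {          // guard from profile data
                return ((ArrayList<?>) list).size();  // direct, inlinable call
            }
            return list.size();                       // general virtual dispatch
        }
    }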

The Self language relied heavily upon adaptive optimization for performance. Much of that technology eventually found its way into Sun's HotSpot Java virtual machine (http://java.sun.com/javase/technologies/hotspot/index.jsp). Note that, in common usage, the term ``Java VM'' is understood to include everything: the interpreter, optimizing compiler, and garbage collector. The HotSpot VM has been the focus of a massive research and development effort for the past decade. It is likely the most efficient VM in use today in terms of raw program statement execution (ignoring library implementations and other performance parameters). Consequently, many language implementations emit Java bytecodes to take advantage of the Java VM's efficiency (for example, Jython (http://www.jython.org), JRuby (http://jruby.codehaus.org/), and Groovy (http://groovy.codehaus.org)).

At this point, we've explored a number of languages and identified basic strategies behind the execution of their programs. We saw that a single language can be implemented via a static native code compiler, an interpreter, an interpreter plus dynamic compiler, or an interpreter plus a dynamic and adaptive hotspot compiler. Strictly speaking, it is not meaningful to ask ``is language X interpreted or compiled?''. It's only meaningful to ask that about a language's implementation. That said, programmers understand the question to refer to the standard implementation of that language. There is nothing wrong with asking whether or not Ruby is interpreted, for example. The following table classifies some well-known languages by the execution strategy of their reference implementation.

Execution strategy                Languages
-------------------------------   -------------------------------------------------
Bytecode interpreter              UCSD and Berkeley Pascal, JavaScript, Java ≤1.2, Python, Ruby 1.9, Smalltalk-80, Visual Basic ≤4
Tree-based interpreter            Perl ≤5, Ruby ≤1.8
Native code compilation           C, C++, Visual Basic 5 and 6
Bytecodes + dynamic compilation   Java ≥1.3, C#, Self, Smalltalk, Visual Basic .NET

To summarize this article, let's explicitly state the difference between an interpreter and a compiler with respect to programming languages. An interpreter is a program that executes other programs, whereas a compiler translates a program to an equivalent program in a machine language. The compiled machine code may or may not be directly executable on a physical processor. If no corresponding physical processor exists, we call the machine code bytecode. So both a traditional C compiler and Java's javac bytecode compiler qualify as compilers. To execute bytecode, we use a virtual machine to simulate the imaginary processor. A virtual machine is just another name for a specialized interpreter, a bytecode interpreter.