Learning Compiler 1 Overview
Note
Strategies
- Interpreters(slightly order)
- Compilers(slightly newer)
Difference Between Interpreters and Compilers
- Interpreters run programs “as is”: little or no preprocessing
- Compilers do extensive preprocessing
Language Implementations
- Low level: C/C++
- Higher level: Python
- Interpreter + “Just In Time”(JIT) compiler: Java
History
FORTRAN I
Struture of Compiler
- Lexical analysis
- Parsing
- Semantic analysis
- Optimization
- Code generation
Lexical analysis: First step: recognize words. Divides program text into “words” or “tokens”.
Parsing: Understand sentence structure. Parsing = Diagramming Sentences(a tree).
Semantic analysis: Understand “meaning”. Compilers perform limited analysis to catch inconsistencies, and many semantic checks besides variable bindings.
Optimization: Automatically modify programs to run faster, use less memory and in general, conserve some resource.
Code generation: Usually produces assembly code. A translation into another language.
Intermediate Languages(IL)
Many compilers perform translations between successive intermediate forms:
- All but first and last are intermediate languages internal to the compiler
- Typically there is 1 IL
IL’s generally ordered in descending level of abstraction
- Highest is source
- Lowest is assembly
IL’s are useful because lower levels expose features hidden by higher levels
- registers
- memory layout
- etc.
But lower levels obscure high-level meaning
Issues
Language design has big impact on compiler
- Determines what is easy and hard to compile
- Course theme: many trade-offs in language design
Today’s Compiler
The proportions have changed since FORTRAN
- Early: lexing, parsing most complex, expensive
- Today: optimization dominates all other phases, lexing and parsing are cheap
Reference
Slides: Overview
Video: Compiler Stanford 2014