Learning Compiler 1 Overview

Note

Strategies

  • Interpreters(slightly order)
  • Compilers(slightly newer)

Difference Between Interpreters and Compilers

  • Interpreters run programs “as is”: little or no preprocessing
  • Compilers do extensive preprocessing

Language Implementations

  • Low level: C/C++
  • Higher level: Python
  • Interpreter + “Just In Time”(JIT) compiler: Java

History

FORTRAN I

Struture of Compiler

  1. Lexical analysis
  2. Parsing
  3. Semantic analysis
  4. Optimization
  5. Code generation
  • Lexical analysis: First step: recognize words. Divides program text into “words” or “tokens”.

  • Parsing: Understand sentence structure. Parsing = Diagramming Sentences(a tree).

  • Semantic analysis: Understand “meaning”. Compilers perform limited analysis to catch inconsistencies, and many semantic checks besides variable bindings.

  • Optimization: Automatically modify programs to run faster, use less memory and in general, conserve some resource.

  • Code generation: Usually produces assembly code. A translation into another language.

Intermediate Languages(IL)

Many compilers perform translations between successive intermediate forms:

  • All but first and last are intermediate languages internal to the compiler
  • Typically there is 1 IL

IL’s generally ordered in descending level of abstraction

  • Highest is source
  • Lowest is assembly

IL’s are useful because lower levels expose features hidden by higher levels

  • registers
  • memory layout
  • etc.

But lower levels obscure high-level meaning

Issues

Language design has big impact on compiler

  • Determines what is easy and hard to compile
  • Course theme: many trade-offs in language design

Today’s Compiler

The proportions have changed since FORTRAN

  • Early: lexing, parsing most complex, expensive
  • Today: optimization dominates all other phases, lexing and parsing are cheap

Reference

Slides: Overview

Video: Compiler Stanford 2014