Phases of a Compiler



This content originally appeared on DEV Community and was authored by Mujahida Joynab

A compiler works in several stages to transform high-level source code into efficient machine code. Each stage has its own role:

1. Lexical Analyzer (Scanner)

  • Input: High-level source code, read as a stream of characters
  • Process: Breaks the input into meaningful sequences called tokens (e.g., keywords, identifiers, operators) and discards whitespace and comments.
  • Output: Stream (sequence) of tokens
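As a rough illustration of this phase, here is a minimal scanner sketch in Python. The token names (`NUMBER`, `IDENT`, `OP`) and the tiny expression language are assumptions made up for this example, not part of any real compiler.

```python
import re

# Hypothetical token categories for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (kind, text) pairs; raise on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is dropped, not tokenized
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = a + b * 2"))
```

For the input `x = a + b * 2` this yields seven tokens: an identifier, the `=` operator, and so on up to the number `2`. A real scanner would also recognize keywords, comments, and string literals.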

2. Syntax Analyzer (Parser)

  • Input: Stream of tokens from the lexical analyzer
  • Process: Checks whether the token sequence follows the grammar of the language and reports a syntax error if it does not.
  • Output: Parse tree (many compilers build a condensed form of it, the abstract syntax tree)
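To make the parsing step concrete, here is a small recursive-descent parser sketch. The grammar (one assignment with `+ - * /` and parentheses), the `(kind, text)` token format, and the nested-tuple tree shape are all assumptions chosen for this example.

```python
def parse(tokens):
    """Parse `IDENT = expr` from (kind, text) tokens into a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expect(kind, text=None):
        token = advance()
        if token[0] != kind or (text is not None and token[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {token[1]!r}")
        return token

    def factor():  # factor := NUMBER | IDENT | '(' expr ')'
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if (kind, text) == ("OP", "("):
            node = expr()
            expect("OP", ")")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def term():  # term := factor (('*' | '/') factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():  # expr := term (('+' | '-') term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    target = expect("IDENT")[1]
    expect("OP", "=")
    tree = ("assign", target, expr())
    expect("EOF")  # nothing may follow the statement
    return tree

tokens = [("IDENT", "x"), ("OP", "="), ("IDENT", "a"), ("OP", "+"),
          ("IDENT", "b"), ("OP", "*"), ("NUMBER", "2")]
print(parse(tokens))
```

Because `term` is called from inside `expr`, multiplication binds tighter than addition, so `b * 2` ends up as a subtree under `+`. A token sequence that breaks the grammar, such as `x = +`, raises `SyntaxError`.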

3. Semantic Analyzer

  • Input: Parse tree
  • Process: Ensures the program has semantic correctness (e.g., type checking, variable declarations, scope rules).
  • Output: Semantically verified (annotated) parse tree
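The checks in this phase can be sketched as a walk over the tree with a symbol table. In the sketch below, the nested-tuple tree shape, the `symbols` dictionary, and the type names are assumptions for illustration. A fuller analyzer would also attach the computed types to the tree nodes.

```python
def check(node, symbols):
    """Return the type of `node`; raise on undeclared names or type mismatches."""
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared variable {name!r}")
        return symbols[name]
    if kind == "assign":
        _, target, value = node
        if target not in symbols:
            raise NameError(f"undeclared variable {target!r}")
        value_type = check(value, symbols)
        if symbols[target] != value_type:
            raise TypeError(f"cannot assign {value_type} to {symbols[target]} {target!r}")
        return symbols[target]
    # Anything else is treated as a binary operator node: (op, left, right).
    op, left, right = node
    left_type = check(left, symbols)
    right_type = check(right, symbols)
    if left_type != right_type:
        raise TypeError(f"operands of {op!r} have types {left_type} and {right_type}")
    return left_type

# Hypothetical tree for `x = a + b * 2` and a symbol table of declared variables.
tree = ("assign", "x", ("+", ("var", "a"), ("*", ("var", "b"), ("num", 2))))
symbols = {"x": "int", "a": "int", "b": "int"}
print(check(tree, symbols))
```

If `b` were declared as `"string"` instead, the same call would raise `TypeError`, even though the statement is syntactically valid. That is the kind of error the parser cannot catch.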

4. Intermediate Code Generator

  • Input: Validated parse tree with semantic meaning
  • Process: Converts the tree into a machine-independent intermediate code that is easier to optimize and translate into machine code.
  • Output: Intermediate code, often Three-Address Code (TAC)
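Here is a minimal sketch of how a tree could be flattened into three-address code, where each instruction has at most one operator. The tree shape and the temporary names `t1`, `t2`, ... are assumptions for this example.

```python
def to_tac(tree):
    """Flatten an ("assign", name, expr) tree into a list of TAC strings."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        """Emit code for `node` and return the name or literal holding its value."""
        kind = node[0]
        if kind == "num":
            return str(node[1])
        if kind == "var":
            return node[1]
        op, left, right = node  # binary operator node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    result = emit(value)
    code.append(f"{target} = {result}")
    return code

# Hypothetical tree for `x = a + b * 2`.
tree = ("assign", "x", ("+", ("var", "a"), ("*", ("var", "b"), ("num", 2))))
for line in to_tac(tree):
    print(line)
```

For this tree the result is three instructions: `t1 = b * 2`, `t2 = a + t1`, and `x = t2`. The nested expression has become a flat sequence that later phases can process one instruction at a time.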

5. Code Optimizer

  • Input: Intermediate code
  • Process: Improves efficiency without changing meaning (e.g., removing redundant code, improving memory use).
  • Output: Optimized intermediate code
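Two classic optimizations, constant folding and dead-code elimination, can be sketched on straight-line code as follows. The quadruple format `(dest, op, left, right)`, with `op` set to `None` for a plain copy, and the `t<number>` naming of temporaries are assumptions for this example. Real optimizers work on control-flow graphs and apply many more passes.

```python
import operator

# Division is left out so the sketch never has to fold a division by zero.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def is_temp(name):
    # Assumed naming convention: compiler temporaries look like t1, t2, ...
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()

def optimize(code):
    """Fold constants, propagate them, then drop temporaries nothing reads."""
    known = {}  # names currently holding a compile-time constant
    folded = []
    for dest, op, left, right in code:
        # Constant propagation: replace names by their known values.
        left = known.get(left, left)
        right = known.get(right, right)
        # Constant folding: compute the result now instead of at run time.
        if op in OPS and isinstance(left, int) and isinstance(right, int):
            left, op, right = OPS[op](left, right), None, None
        if op is None and isinstance(left, int):
            known[dest] = left
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
        folded.append((dest, op, left, right))
    # Dead-code elimination: a temporary that is never read can be removed.
    used = {arg for _, _, left, right in folded for arg in (left, right)}
    return [ins for ins in folded if not (is_temp(ins[0]) and ins[0] not in used)]

# Hypothetical intermediate code for `x = a + 2 * 3`.
code = [
    ("t1", "*", 2, 3),
    ("t2", "+", "a", "t1"),
    ("x", None, "t2", None),
]
for instruction in optimize(code):
    print(instruction)
```

Here `2 * 3` is computed at compile time, the constant `6` is substituted into the addition, and the now-unused `t1` disappears. The program still assigns the same value to `x`, which is the "without changing meaning" requirement.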

6. Target Code Generator

  • Input: Optimized intermediate code
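  • Process: Translates the intermediate code into code for the target machine, selecting instructions and assigning registers.
  • Output: Target code (assembly or machine code), which an assembler and linker then turn into an executable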
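As a final sketch, intermediate code can be translated instruction by instruction into assembly-like text. The instruction set below (`LOAD`, `ADD`, `STORE`, a single register `R0`, `#` for immediate constants) is entirely hypothetical and does not correspond to a real processor. The quadruple input format is also an assumption.

```python
# Hypothetical mnemonics for an imaginary one-register machine.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(value):
    # Integer constants become immediates; names refer to memory locations.
    return f"#{value}" if isinstance(value, int) else value

def generate(code):
    """Translate (dest, op, left, right) quadruples into pseudo-assembly lines."""
    asm = []
    for dest, op, left, right in code:
        asm.append(f"LOAD R0, {operand(left)}")
        if op is not None:  # op is None for a plain copy
            asm.append(f"{MNEMONICS[op]} R0, {operand(right)}")
        asm.append(f"STORE {dest}, R0")
    return asm

# Hypothetical optimized intermediate code for `x = a + 6`.
code = [
    ("t2", "+", "a", 6),
    ("x", None, "t2", None),
]
for line in generate(code):
    print(line)
```

This naive translation stores `t2` to memory and immediately loads it back. A real code generator would keep such values in registers, which is why register allocation is a central part of this phase.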

In short:
High-Level Code → Tokens → Parse Tree → Validated Parse Tree → TAC → Optimized Code → Machine Code

