CSCI 310 Spring 2005, Day 2

  1. Syllabus issues.

  2. Getting into Real Work: Project 1
    1. The grammar for the language: (Remember EBNF form?)
      STM -> STM ; STM
      STM -> id := EXP
      STM -> print(EXPLIST)
      EXP -> id
      EXP -> num
      EXP -> EXP binop EXP
      EXP -> (STM,EXP)
      EXPLIST -> EXP,EXPLIST
      EXPLIST -> EXP
      
      id, num, binop, print, and the symbols := ; ( ) , are tokens, pulled out during lexical analysis.

    2. When we write a parser, we create code and data structures that determine whether the program is syntactically correct. (Essentially a data structure for each production.)

    3. An Abstract Syntax Tree (AST) drops the tokens in the parse tree that carry no meaning once the structure is known (such as :=, commas, and parentheses).

    4. Project 1 starts at the point of representing the program via an AST. The data structures you download represent these productions in abstract form. This is the point at which we can begin to do semantic analysis of the code. In Project 1 we're really just practicing working with the tree form and moving information around it.
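
A minimal sketch of what "a data structure for each production" might look like, written here in Python. The class names (CompoundStm, AssignStm, and so on) and the assigned_ids walk are assumptions made for illustration; the data structures you download for Project 1 may be named and organized differently. Note that the AST keeps no nodes for := , ( ) or ; and flattens EXPLIST into a plain list.

```python
from dataclasses import dataclass
from typing import List, Union


# One class per EXP production (names are hypothetical).
@dataclass
class IdExp:            # EXP -> id
    name: str


@dataclass
class NumExp:           # EXP -> num
    value: int


@dataclass
class OpExp:            # EXP -> EXP binop EXP
    left: "Exp"
    op: str
    right: "Exp"


@dataclass
class EseqExp:          # EXP -> (STM, EXP)
    stm: "Stm"
    exp: "Exp"


Exp = Union[IdExp, NumExp, OpExp, EseqExp]


# One class per STM production.
@dataclass
class CompoundStm:      # STM -> STM ; STM
    first: "Stm"
    second: "Stm"


@dataclass
class AssignStm:        # STM -> id := EXP
    name: str
    exp: Exp


@dataclass
class PrintStm:         # STM -> print(EXPLIST), with EXPLIST flattened to a list
    args: List[Exp]


Stm = Union[CompoundStm, AssignStm, PrintStm]


def assigned_ids(node) -> set:
    """Walk the tree and collect every identifier that is assigned to."""
    if isinstance(node, CompoundStm):
        return assigned_ids(node.first) | assigned_ids(node.second)
    if isinstance(node, AssignStm):
        return {node.name} | assigned_ids(node.exp)
    if isinstance(node, PrintStm):
        found = set()
        for arg in node.args:
            found |= assigned_ids(arg)
        return found
    if isinstance(node, OpExp):
        return assigned_ids(node.left) | assigned_ids(node.right)
    if isinstance(node, EseqExp):
        return assigned_ids(node.stm) | assigned_ids(node.exp)
    return set()  # IdExp and NumExp assign nothing


# AST for:  a := 5 + 3 ; b := (print(a, a - 1), 10 * a) ; print(b)
prog = CompoundStm(
    AssignStm("a", OpExp(NumExp(5), "+", NumExp(3))),
    CompoundStm(
        AssignStm("b", EseqExp(
            PrintStm([IdExp("a"), OpExp(IdExp("a"), "-", NumExp(1))]),
            OpExp(NumExp(10), "*", IdExp("a")))),
        PrintStm([IdExp("b")])))
```

The walk in assigned_ids is the kind of "moving information around" the tree that this item describes: each case handles one production and combines the results from its children.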

  3. Lexical Analysis
    1. What it is: breaking the program into a stream of tokens -- similar to what happens when we hear or read language, where we actually notice the distinct words.
    2. This allows the parser to deal with the grammar without worrying about whitespace, comments, and other text that plays no role in the grammar.
    3. What are tokens? Make a list...
    4. Task: Give regular expressions or FAs for
      1. identifiers
      2. integers
      3. floats
      4. if
      5. else
      6. while
    5. How do the FAs get implemented?
    6. How can we combine them to make a single scanner?
    7. Is "ifta" two tokens or one?
    8. Can we teach an FA to decide which one? Do we need to add info (and what info)?
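
One possible way to explore questions 5 through 8 is the sketch below, in Python. It uses Python's re module in place of hand-built FAs, and the token definitions (what counts as an identifier, an integer, or a float) are assumptions, since the notes leave them as an exercise. The scanner tries every pattern at the current position and keeps the longest match; when two patterns match the same length, the one listed first wins. Those two rules (longest match, then priority order) are the extra information this sketch adds beyond the individual patterns.

```python
import re

# Order matters only for ties: keywords are listed before ID so that
# "if" is reported as IF rather than ID. (Definitions are assumptions.)
TOKEN_SPECS = [
    ("IF", r"if"),
    ("ELSE", r"else"),
    ("WHILE", r"while"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("FLOAT", r"[0-9]+\.[0-9]+"),
    ("INT", r"[0-9]+"),
    ("SKIP", r"\s+"),       # whitespace is recognized, then discarded
]
COMPILED = [(kind, re.compile(pattern)) for kind, pattern in TOKEN_SPECS]


def scan(text):
    """Return a list of (kind, lexeme) pairs using longest-match scanning."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_kind, best_len = None, 0
        for kind, regex in COMPILED:
            match = regex.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the earlier (higher-priority) pattern is kept.
            if match and match.end() - pos > best_len:
                best_kind, best_len = kind, match.end() - pos
        if best_kind is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if best_kind != "SKIP":
            tokens.append((best_kind, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

Under these rules, scan("ifta") returns [("ID", "ifta")], a single identifier, while scan("if ta") returns [("IF", "if"), ("ID", "ta")].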

Gary Lewandowski
Last modified: Wed Jan 12 10:07:07 EST 2005