10. TOKENS

A program consists of a sequence of tokens which may be delimited by white space. There are two types of tokens:

    identifiers
    literals

Syntax:

    program ::=
      { white_space | token } .

    token ::=
      identifier | literal .

10.1 White space

There are three types of white space:

    spaces
    comments
    line comments

White space always terminates a preceding token. Some white space is required to separate otherwise adjacent tokens.

Syntax:

    white_space ::=
      ( space | comment | line_comment )
      { space | comment | line_comment } .

10.1.1 Spaces

There are several types of space characters which are ignored except as they separate tokens:

    blanks, horizontal tabs, carriage returns and new lines.

Syntax:

    space ::=
      ' ' | TAB | CR | NL .

10.1.2 Comments

Comments are introduced with the characters (* and are terminated with the characters *). For example:

    (* This is a comment *)

Comments can be nested, so it is possible to comment out larger sections of a program that themselves contain comments. Comments cannot occur within string or character literals.

Syntax:

    comment ::=
      '(*' { any_character } '*)' .
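The nesting rule can be illustrated with a depth counter. The following Python sketch is not taken from the Seed7 implementation; the function name skip_comment is hypothetical, and the sketch assumes that only the sequences (* and *) matter inside a comment.

```python
def skip_comment(source: str, start: int) -> int:
    """Return the index just past the comment that opens at source[start].

    Hypothetical helper: it assumes source[start:start + 2] == "(*" and
    tracks nested comments with a depth counter.
    """
    if source[start:start + 2] != "(*":
        raise ValueError("no comment starts at this position")
    depth = 0
    pos = start
    while pos < len(source):
        pair = source[pos:pos + 2]
        if pair == "(*":
            # A comment (outer or nested) opens here.
            depth += 1
            pos += 2
        elif pair == "*)":
            # The innermost open comment closes here.
            depth -= 1
            pos += 2
            if depth == 0:
                return pos
        else:
            pos += 1
    raise ValueError("unterminated comment")
```

With this sketch an inner comment does not end the outer one: scanning continues until every opened comment has been closed.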

10.1.3 Line comments

Line comments are introduced with the character # and are terminated with the end of the line. For example:

    # This is a comment

A # that occurs within a string, character or numerical literal does not introduce a line comment.

Syntax:

    line_comment ::=
      '#' { any_character } NL .

10.2 Identifiers

There are three types of identifiers:

    name identifiers
    special identifiers
    parentheses

Adjacent identifiers need no separating white space, with two exceptions: white space is required between two name identifiers and between two special identifiers.

Syntax:

    identifier ::=
      name_identifier | special_identifier | parenthesis .

10.2.1 Name identifiers

A name identifier is a sequence of letters, digits and underscores ( _ ). The first character must be a letter or an underscore. Examples of name identifiers are:

    NUMBER  integer  const  if  UPPER_LIMIT  LowerLimit  x5  _end

Upper and lower case letters are different. Name identifiers may have any length and all characters are significant. The name identifier is terminated with a character which is neither a letter (or _ ) nor a digit. The terminating character is not part of the name identifier.

Syntax:

    name_identifier ::=
      ( letter | underscore ) { letter | digit | underscore } .

    letter ::=
      upper_case_letter | lower_case_letter .

    upper_case_letter ::=
      'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' |
      'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' |
      'U' | 'V' | 'W' | 'X' | 'Y' | 'Z' .

    lower_case_letter ::=
      'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' |
      'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' |
      'u' | 'v' | 'w' | 'x' | 'y' | 'z' .

    digit ::=
      '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' .

    underscore ::=
      '_' .
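The name_identifier rule maps directly onto a regular expression. The Python sketch below is only an illustration of the grammar; the names NAME_IDENTIFIER and scan_name_identifier are hypothetical and not part of Seed7.

```python
import re
from typing import Optional

# ASCII-only pattern that mirrors the name_identifier rule:
# a letter or underscore, followed by letters, digits and underscores.
NAME_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def scan_name_identifier(source: str, pos: int) -> Optional[str]:
    """Return the name identifier starting at source[pos], or None.

    The match stops at the first character that is neither a letter,
    a digit nor an underscore; that character is not part of the result.
    """
    match = NAME_IDENTIFIER.match(source, pos)
    return match.group(0) if match else None
```

For the input x5+1 the sketch yields x5, because + terminates the identifier. For 5x it yields nothing, because a name identifier cannot start with a digit.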

10.2.2 Special identifiers

A special identifier is a sequence of special characters. Examples of special identifiers are:

    +  :=  <=  *  ->  ,  &

Here is a list of all special characters:

    ! $ % & * + , - . / : ; < = > ? @ \ ^ ` | ~

Special identifiers may have any length and all characters are significant. The special identifier is terminated with a character which is not a special character. The terminating character is not part of the special identifier.

Syntax:

    special_identifier ::=
      special_character { special_character } .

    special_character ::=
      '!' | '$' | '%' | '&' | '*' | '+' | ',' | '-' | '.' | '/' |
      ':' | ';' | '<' | '=' | '>' | '?' | '@' | '\' | '^' | '`' |
      '|' | '~' .

10.2.3 Parentheses

A parenthesis is one of the following characters:

    ( ) [ ] { }

Note that a parenthesis consists of only one character. Except for the character sequence (* , which introduces a comment, a parenthesis is terminated with the next character.

Syntax:

    parenthesis ::=
      '(' | ')' | '[' | ']' | '{' | '}' .
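The three identifier rules together explain when white space is needed between identifiers. The Python sketch below splits a text into identifiers. It is a simplified illustration, not the Seed7 scanner: the names TOKEN and split_identifiers are hypothetical, and the input is assumed to contain only identifiers and space characters (no comments and no literals).

```python
import re

# One alternative per identifier class described in 10.2.
TOKEN = re.compile(
    r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<special>[!$%&*+,\-./:;<=>?@\\^`|~]+)"
    r"|(?P<paren>[()\[\]{}])"
)


def split_identifiers(source):
    """Return a list of (class, text) pairs for the identifiers in source."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos] in " \t\r\n":
            # Space characters only separate tokens.
            pos += 1
            continue
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r}")
        tokens.append((match.lastgroup, match.group(0)))
        pos = match.end()
    return tokens
```

In this sketch a:=b splits into three identifiers without any white space, because a name identifier ends at the first special character and vice versa. The text <- is read as one special identifier, so white space is needed to obtain < and - as separate identifiers. A parenthesis is always a single character, so (( gives two identifiers.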

10.3 Literals

There are three types of literals:

    integer literals
    character literals
    string literals

Syntax:

    literal ::=
      integer_literal | character_literal | string_literal .

10.3.1 Integer literals

An integer literal is a sequence of digits which is taken to be decimal. The sequence of digits may be followed by the letter E or e, an optional + sign and a decimal exponent. Based numbers are written as a sequence of digits followed by the # character and a sequence of extended digits. The decimal number in front of the # character specifies the base of the number that follows it. The base must be between 2 and 36. As extended digits, the letters A or a stand for 10, B or b for 11, and so on up to Z or z, which stand for 35.

Syntax:

    integer_literal ::=
      decimal_integer [ exponent | based_integer ] .

    decimal_integer ::=
      digit { digit } .

    exponent ::=
      ( 'E' | 'e' ) [ '+' ] decimal_integer .

    based_integer ::=
      '#' extended_digit { extended_digit } .

    extended_digit ::=
      letter | digit .
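The value of an integer literal can be computed along the lines of the grammar above. The Python sketch below is an illustration under stated assumptions, not the Seed7 implementation: the names INTEGER_LITERAL and integer_literal_value are hypothetical, and the sketch rejects an extended digit that is not smaller than the base.

```python
import re

# decimal_integer [ exponent | based_integer ], as in the grammar above.
INTEGER_LITERAL = re.compile(
    r"(?P<digits>[0-9]+)"
    r"(?:[Ee]\+?(?P<exponent>[0-9]+)|#(?P<based>[A-Za-z0-9]+))?\Z"
)


def integer_literal_value(text: str) -> int:
    """Return the value of an integer literal such as 1E6 or 16#ff."""
    match = INTEGER_LITERAL.match(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    digits = int(match.group("digits"))
    if match.group("exponent") is not None:
        # Decimal number with a non-negative decimal exponent.
        return digits * 10 ** int(match.group("exponent"))
    if match.group("based") is not None:
        base = digits  # the number in front of # is the base
        if not 2 <= base <= 36:
            raise ValueError("base must be between 2 and 36")
        value = 0
        for ch in match.group("based"):
            digit = int(ch, 36)  # '0'..'9' -> 0..9, 'a'/'A' -> 10, ... 'z'/'Z' -> 35
            if digit >= base:
                raise ValueError(f"digit {ch!r} is not valid in base {base}")
            value = value * base + digit
        return value
    return digits
```

In this sketch 16#ff and 16#FF both give 255, and 1E6 gives 1000000. A text such as 1e-3 is rejected, because the exponent rule allows only an optional + sign.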

10.3.2 String literals

A string literal is a sequence of characters surrounded by double quotes. For example:

    ""   " "   "\""   "'"   "\'"   "String"   "ch=\" "   "\n\n"

In order to represent nonprintable characters and certain printable characters the following escape sequences may be used.

    audible alert    BEL      \a
    backspace        BS       \b
    escape           ESC      \e
    formfeed         FF       \f
    newline          NL (LF)  \n
    carriage return  CR       \r
    horizontal tab   HT       \t
    vertical tab     VT       \v
    backslash        (\)      \\
    apostrophe       (')      \'
    double quote     (")      \"
    control-A                 \A
      ...
    control-Z                 \Z

Additionally there are the following possibilities:

  • Two backslashes with only blanks, horizontal tabs, carriage returns and new lines between them are ignored completely, together with the characters between them. This can be used to continue a string on the following line. Note that in this case the leading spaces on the new line are not part of the string.
  • Two backslashes with an integer literal between them are interpreted as the character with the specified ordinal number. Note that the integer literal is interpreted as decimal unless it is written as a based integer.

Syntax:

    string_literal ::=
      '"' { string_character } '"' .

    string_character ::=
      printable_character | escape_sequence .

    escape_sequence ::=
      '\a' | '\b' | '\e' | '\f' | '\n' | '\r' | '\t' | '\v' |
      '\\' | '\''' | '\"' | '\' upper_case_letter |
      '\' { space } '\' | '\' integer_literal '\' .
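The escape sequences can be decoded with a single pass over the characters between the double quotes. The Python sketch below is a simplified illustration, not the Seed7 implementation. The names SIMPLE_ESCAPES, ordinal_value and decode_string_body are hypothetical, control-A to control-Z are assumed to be the characters with the ordinal numbers 1 to 26, and an integer literal between backslashes is supported only in decimal or based form (no exponent).

```python
import re

SIMPLE_ESCAPES = {
    "a": "\a", "b": "\b", "e": "\x1b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "v": "\v", "\\": "\\", "'": "'", '"': '"',
}


def ordinal_value(text: str) -> int:
    """Value of a decimal or based integer (simplified: no exponent)."""
    match = re.fullmatch(r"([0-9]+)(?:#([0-9A-Za-z]+))?", text)
    if match is None:
        raise ValueError(f"invalid escape sequence: {text!r}")
    if match.group(2) is None:
        return int(match.group(1))
    base = int(match.group(1))
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    value = 0
    for ch in match.group(2):
        digit = int(ch, 36)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return value


def decode_string_body(body: str) -> str:
    """Decode the characters between the double quotes of a string literal."""
    out = []
    pos = 0
    while pos < len(body):
        ch = body[pos]
        if ch != "\\":
            out.append(ch)
            pos += 1
            continue
        pos += 1
        if pos >= len(body):
            raise ValueError("backslash at end of string")
        esc = body[pos]
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
            pos += 1
        elif "A" <= esc <= "Z":
            # Assumption: \A is control-A (1), ..., \Z is control-Z (26).
            out.append(chr(ord(esc) - ord("A") + 1))
            pos += 1
        else:
            end = body.find("\\", pos)
            if end == -1:
                raise ValueError("escape sequence lacks closing backslash")
            inner = body[pos:end]
            if inner.strip(" \t\r\n") != "":
                # Integer literal between two backslashes.
                out.append(chr(ordinal_value(inner)))
            # Otherwise only space characters: the continuation is ignored.
            pos = end + 1
    return "".join(out)
```

In this sketch the bodies \65\ and \16#41\ both decode to the letter A, and a body that is split over two lines with a backslash at the end of the first line and a backslash before the continuation decodes without the line break and the leading spaces.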

10.3.3 Character literals

A character literal is a character enclosed in single quotes. For example:

    'a'   ' '   '\n'   '!'   '\\'   '2'   '"'   '\"'   '\''

To represent control characters and certain other characters in character literals the same escape sequences as for string literals may be used.

Syntax:

    character_literal ::=
      ''' ( printable_character | escape_sequence ) ''' .

    escape_sequence ::=
      '\a' | '\b' | '\e' | '\f' | '\n' | '\r' | '\t' | '\v' |
      '\\' | '\''' | '\"' | '\' upper_case_letter |
      '\' { space } '\' | '\' integer_literal '\' .