Seed7 - The extensible programming language
Compiler Source Code

S7c is the Seed7 compiler.

S7c is written in Seed7 and compiles Seed7 programs to C programs. It uses the analyze phase of the interpreter to convert a program from Seed7 to call-code. Call-code consists of values and function calls and exists only in memory. S7c then uses the call-code to generate a corresponding C program. Afterwards this C program is compiled and linked with the Seed7 runtime library.

Usage

s7c [ options ] source

Possible options are

  • -? Write Seed7 compiler usage.
  • -O and -O2 Tell the C compiler to optimize.
  • -b Specify the directory of the Seed7 runtime libraries (e.g.: -b ../bin).
  • -e Generate code which sends a signal, when an uncaught exception occurs. This option allows debuggers to handle uncaught Seed7 exceptions.
  • -g Tell the C compiler to generate an executable with debug information. This way the debugger will refer to Seed7 source files and line numbers. To generate debug information which refers to the temporary C program the option -g-debug_c can be used.
  • -l Add a directory to the include library search path (e.g.: -l ../lib).
  • -ocn Optimize constants with level n, where n is a digit between 0 and 3 (e.g.: -oc3):
    • 0 Do no optimizations with constants.
    • 1 Use literals and named constants to simplify expressions (default).
    • 2 Evaluate constant expressions to simplify expressions.
    • 3 Like -oc2 and additionally evaluate all constant expressions.
  • -r Suppress the generation of range checks for strings and arrays.
  • -te Generate code to trace exceptions. This option works in the same way as the interpreter option -te: Every exception will write a message to stdout and the user will be asked to continue (with enter) or to terminate (with * ).

Reflection

The Seed7 reflection provides access to the internal data structures of the interpreter. In particular, the call-code of a program can be accessed with the reflection. This makes it suitable for compiling Seed7. The reflection is based on several types:

program
Describes a program and is the entry point to the reflection for the compiler.
reference
Reference to an object (values of plain old data types also count as objects here).
ref_list
List of referenced objects.
type
Describes a type (the types of the compiled program have their own type namespace).

The definitions for reference, ref_list and type are in the seed7_05.s7i library. The advanced features of the reflection and the definition for the type program can be found in the progs.s7i library.

C compiler back end

The Seed7 compiler can use different C compilers and C runtime libraries as back end. The program chkccomp.c determines the properties of the back end. This is done when the Seed7 interpreter and runtime library are compiled. The properties of the back end are available in Seed7 via the library cc_conf.s7i. This library defines ccConf, which is a constant of type ccConfigType. The type ccConfigType contains elements to describe the properties:

Type Name Description
boolean WITH_STRI_CAPACITY TRUE, when the Seed7 runtime library uses strings with capacity. The capacity of a string can be larger than its size. Strings with capacity can be enlarged without calling realloc().
boolean ALLOW_STRITYPE_SLICES TRUE, when the actual characters of a string can be stored elsewhere. This allows string slices without the need to copy characters.
boolean RSHIFT_DOES_SIGN_EXTEND TRUE, when the sign of negative signed integers is preserved with a right shift. The C standard specifies that the right shift of signed integers is implementation defined, when the shifted values are negative.
boolean TWOS_COMPLEMENT_INTTYPE TRUE, when signed integers are represented as two's complement numbers. This allows some simplified range checks in compiled programs.
boolean LITTLE_ENDIAN_INTTYPE TRUE, when the byte ordering of integers is little endian.
boolean NAN_COMPARISON_WRONG TRUE, when a comparison between two NaN values (with == < > <= or >= ) returns TRUE.
boolean POWER_OF_ZERO_WRONG TRUE, when the pow() function does not work correctly in the case when the base is zero and the exponent is negative. If it is TRUE fltPow() should be used instead of pow().
boolean HAS_SIGSETJMP TRUE, when the functions sigsetjmp() and siglongjmp() are available. When it is FALSE the functions setjmp() and longjmp() must be used instead.
boolean SIGILL_ON_OVERFLOW TRUE, when an integer overflow raises the signal SIGILL.
boolean FLOAT_ZERO_DIV_ERROR TRUE, when the C compiler classifies a floating point division by zero as fatal error.
boolean ISNAN_WITH_UNDERLINE TRUE, when the macro/function _isnan() is defined in <float.h> or <math.h> instead of isnan().
boolean DO_SIGFPE_WITH_DIV_BY_ZERO TRUE, when SIGFPE should be raised with a division by zero instead of just calling raise(SIGFPE). Under Windows it is necessary to trigger SIGFPE this way to ensure that the debugger can catch it. The Seed7 to C compiler produces code to raise SIGFPE when an uncaught EXCEPTION occurs (when the compiler was called with the option -e).
boolean CHECK_INT_DIV_BY_ZERO TRUE, when it is necessary to check all integer divisions (div, rem, mdiv and mod) for division by zero. The generated C code should, when executed, raise the exception NUMERIC_ERROR instead of doing the illegal divide operation.
boolean CHECK_FLOAT_DIV_BY_ZERO TRUE, when a C floating point division by zero does not return the IEEE 754 values Infinity, -Infinity or NaN. In this case the interpreter checks all float divisions and returns the correct result. Additionally the Seed7 to C compiler generates C code, which checks all float divisions ( / and /:= ) for division by zero. The generated C code should, when executed, return Infinity, -Infinity or NaN instead of doing the divide operation.
boolean LIMITED_CSTRI_LITERAL_LEN TRUE, when the C compiler limits the length of string literals. Some C compilers limit the maximum string literal length. There are limits of 2,048 bytes and 16,384 (16K) bytes. Only the existence of a limit matters, not its actual value.
boolean CC_SOURCE_UTF8 TRUE, when the C compiler accepts UTF-8 encoded file names in #line directives. The file names from #line directives are used by the debugger to allow source code debugging.
boolean USE_WMAIN TRUE, when the main function is named wmain. This is a way to support Unicode command line arguments under Windows. An alternate way to support Unicode command line arguments under Windows uses the functions getUtf16Argv() and freeUtf16Argv() (both defined in "cmd_win.c").
boolean USE_WINMAIN TRUE, when the main function is named WinMain.
boolean FLOATTYPE_DOUBLE TRUE, when the type floatType is double. When it is FALSE floatType is float.
integer INTTYPE_SIZE Size of the type intType in bits (either 32 or 64).
integer FLOATTYPE_SIZE Size of the type floatType in bits (either FLOAT_SIZE or DOUBLE_SIZE).
integer POINTER_SIZE Size of a pointer in bits.
integer GENERIC_SIZE The maximum of INTTYPE_SIZE, FLOATTYPE_SIZE and POINTER_SIZE. This is also the size in bits of the types rtlValueunion, rtlObjecttype and generictype (defined in data_rtl.h).
integer INT_SIZE Size of the type int in bits.
string INT32TYPE Name of a signed integer type that is 32 bits wide. The runtime library and the compiler use a typedef to define the type int32Type with INT32TYPE.
string UINT32TYPE Name of an unsigned integer type that is 32 bits wide. The runtime library and the compiler use a typedef to define the type uint32Type with UINT32TYPE.
string INT64TYPE Name of a signed integer type that is 64 bits wide. The runtime library and the compiler use a typedef to define the type int64Type with INT64TYPE.
string UINT64TYPE Name of an unsigned integer type that is 64 bits wide. The runtime library and the compiler use a typedef to define the type uint64Type with UINT64TYPE.
string INT32TYPE_LITERAL_SUFFIX The suffix used by the literals of the type int32Type.
string INT64TYPE_LITERAL_SUFFIX The suffix used by the literals of the type int64Type.
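
A few of these properties can be probed with small runtime checks. The following C code is a minimal sketch of that idea and not the actual code of chkccomp.c. The function names are hypothetical and only loosely mirror the property names in the table:

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical probes, loosely modelled on the property names above. */
/* They are NOT taken from chkccomp.c.                                 */

/* Returns 1 when >> keeps the sign of a negative signed value. */
int rshift_does_sign_extend(void)
{
    volatile long minus_one = -1L;  /* volatile: evaluate the shift at run time */
    return (minus_one >> 1) == -1L;
}

/* Returns 1 when signed integers use two's complement: */
/* only then are the two lowest bits of -1 both set.    */
int twos_complement_inttype(void)
{
    return (-1 & 3) == 3;
}

/* Returns 1 when the least significant byte of an integer is stored first. */
int little_endian_inttype(void)
{
    uint32_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Size of a pointer in bits (compare POINTER_SIZE). */
int pointer_size_in_bits(void)
{
    return (int) (sizeof(void *) * CHAR_BIT);
}
```

A configuration program could call such probes once and write the results as constants, so that the compiler can later choose between code variants without testing the back end again.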

Optimizations

The Seed7 compiler performs several optimizations on its own, even when the '-O' option is not used:

Use special case functions

For certain constant values some function calls are replaced by corresponding calls of special case functions:

  • The string comparisons = and <> (primitive actions 'str_eq' and 'str_ne') are simplified when one or both parameters are constant strings.

  • String indexing like stri[num] (primitive action 'str_idx') is simplified when the string or the index is constant.

  • Array indexing like anArray[num] (primitive action 'arr_idx') is simplified when the array or the index is constant.

  • Searches and splits with string constant of length 1 are replaced by equivalent functions which use a character instead:

    Function call | Replaced by | C function | Replacement C function
    pos(stri, "a") | pos(stri, 'a') | 'strPos' | 'strChPos'
    rpos(stri, "a") | rpos(stri, 'a') | 'strRpos' | 'strRChPos'
    pos(stri, "a", start) | pos(stri, 'a', start) | 'strIpos' | 'strChIpos'
    rpos(stri, "a", start) | rpos(stri, 'a', start) | 'strRIPos' | 'strRChIPos'
    split(stri, "a") | split(stri, 'a') | 'strSplit' | 'strChSplit'
  • Initializations of string variables are optimized when an empty string or a string with length 1 is used:

    Seed7 variable declaration | C declaration | C initialization | Replacement C initialization
    var string: stri is ""; | striType o_123/*stri*/; | o_123/*stri*/ = strCreate(""); | o_123/*stri*/ = strEmpty(); /* "" */
    var string: stri is "a"; | striType o_123/*stri*/; | o_123/*stri*/ = strCreate("a"); | o_123/*stri*/ = chrStr('a'); /* "a" */
  • The compiler optimizes integer divisions in the following way:

    Seed7 expression C expression
    a div b a/b
    a div 0 (raise_error(NUMERIC_ERROR),0)
    a div 1 a
    a div -1 -a
    0 div b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a rem b a%b
    a rem 0 (raise_error(NUMERIC_ERROR),0)
    a rem 1 0
    a rem -1 0
    0 rem b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a mdiv b (a>0&&b<0 ? (a-1)/b-1 : a<0&&b>0 ? (a+1)/b-1 : a/b)
    a mdiv 0 (raise_error(NUMERIC_ERROR),0)
    a mdiv 1 a
    a mdiv -1 -a
    a mdiv 8 a>>3 or, when >> does not sign extend: a<0?~(~a>>3):a>>3
    a mdiv -8 -a>>3 or, when >> does not sign extend: a=-a, a<0?~(~a>>3):a>>3
    0 mdiv b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a mod b (c=a%b, a<0^b<0 && c!=0 ? c+b : c)
    a mod 0 (raise_error(NUMERIC_ERROR),0)
    a mod 1 0
    a mod -1 0
    a mod 8 a&7
    a mod -8 -(-a&7)
    0 mod b (b==0?(raise_error(NUMERIC_ERROR),0):0)
  • Operations with bigInteger values are optimized, when a cheaper function can be used:

    Original expression Optimized expression Original C function Optimized C function
    num + 0_ num bigAdd -
    num + 1_ succ(num) bigAdd bigSucc
    num + num num << 1 bigAdd bigLShift
    num - 1_ pred(num) bigSbtr bigPred
    num * -8_ -(num << 3) bigMult bigNegateTemp(bigLShift ...
    num * -1_ -num bigMult bigNegate
    num * 0_ 0_ bigMult -
    num * 1_ num bigMult -
    num * 2_ num << 1 bigMult bigLShift
    num * 8_ num << 3 bigMult bigLShift
    num * num num ** 2 bigMult bigSquare
    -8_ * num -(num << 3) bigMult bigNegateTemp(bigLShift ...
    -1_ * num -num bigMult bigNegate
    0_ * num 0_ bigMult -
    1_ * num num bigMult -
    2_ * num num << 1 bigMult bigLShift
    8_ * num num << 3 bigMult bigLShift
    num ** (-1) raise NUMERIC_ERROR bigIPow -
    num ** 0 1 bigIPow -
    num ** 1 num bigIPow -
    num ** 2 num * num bigIPow bigSquare
    1_ ** num raise NUMERIC_ERROR for num<0 or 1_ bigIPow -
    2_ ** num 1_ << num or raise NUMERIC_ERROR for num<0 bigIPow bigLog2BaseIPow
    8_ ** num 1_ << 3 * num or raise NUMERIC_ERROR for num<0 bigIPow bigLog2BaseIPow
    num div -1_ -num bigDiv bigNegate
    num div 0_ raise NUMERIC_ERROR bigDiv -
    num div 1_ num bigDiv -
    num mdiv -2_ -num >> 1 bigMDiv bigRShiftAssign(bigNegate ...
    num mdiv -1_ -num bigMDiv bigNegate
    num mdiv 0_ raise NUMERIC_ERROR bigMDiv -
    num mdiv 1_ num bigMDiv -
    num mdiv 2_ num >> 1 bigMDiv bigRShift
    num mdiv 8_ num >> 3 bigMDiv bigRShift
    num mod -2_ num + ((-num >> 1) << 1) bigMod bigNegateTemp(bigLowerBitsTemp(bigNegate ...
    num mod -1_ 0 bigMod -
    num mod 0_ raise NUMERIC_ERROR bigMod -
    num mod 1_ 0 bigMod -
    num mod 2_ num - ((num >> 1) << 1) bigMod bigLowerBits
    num mod 8_ num - ((num >> 3) << 3) bigMod bigLowerBits
    num +:= 0_ noop bigGrow -
    num +:= 1_ incr(num) bigGrow bigIncr
    num +:= -1_ decr(num) bigGrow bigDecr
    num -:= 0_ noop bigShrink -
    num -:= 1_ decr(num) bigShrink bigDecr
    num -:= -1_ incr(num) bigShrink bigIncr
    num *:= 0_ num := 0_ bigMultAssign bigCpy
    num *:= 1_ noop bigMultAssign -
    num *:= 2_ num <<:= 1 bigMultAssign bigLShiftAssign
    num *:= 4_ num <<:= 2 bigMultAssign bigLShiftAssign
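
The general rows for mdiv and mod in the integer division table, and the special cases for the divisor 8, can be tried out as plain C functions. This is a minimal sketch that copies the expressions from the table. The function names are hypothetical, 64-bit integers are assumed, and the checks for a zero divisor are left out. The versions with & assume two's complement integers:

```c
#include <stdint.h>

/* Hypothetical helper names, the expressions follow the table above. */

/* a mdiv b: division that rounds towards minus infinity (b must not be 0). */
int64_t int_mdiv(int64_t a, int64_t b)
{
    return a > 0 && b < 0 ? (a - 1) / b - 1
         : a < 0 && b > 0 ? (a + 1) / b - 1
         : a / b;
}

/* a mod b: the result has the sign of the divisor (b must not be 0). */
int64_t int_mod(int64_t a, int64_t b)
{
    int64_t c = a % b;

    return ((a < 0) ^ (b < 0)) && c != 0 ? c + b : c;
}

/* a mdiv 8, written so that it does not rely on a sign extending >>. */
int64_t int_mdiv_8(int64_t a)
{
    return a < 0 ? ~(~a >> 3) : a >> 3;
}

/* a mod 8: keep the three lowest bits. */
int64_t int_mod_8(int64_t a)
{
    return a & 7;
}

/* a mod -8 (overflows for the most negative value of a). */
int64_t int_mod_minus_8(int64_t a)
{
    return -(-a & 7);
}
```

For example, int_mdiv(7, -2) yields -4 and int_mod(7, -2) yields -1, while the C operators / and % give -3 and 1 for the same operands.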

Manage temporary values

When an argument is a temporary value, which would be freed after the function returns, a special case function can take over the temporary value and reuse it. That way it is not necessary to free the temporary value afterwards:

Normal function Function using temporary Comment
arrHead arrHeadTemp Splits the array into the head which is returned and an unused part which is freed later.
arrRange arrRangeTemp Splits the array into the range which is returned and an unused part which is freed later.
arrTail arrTailTemp Splits the array into the tail which is returned and an unused part which is freed later.
strConcat strConcatTemp Resizes the temporary and returns it after concatenating the second parameter.
strAppend strAppendTemp Resizes the temporary and concatenates it to a variable.
strHead strHeadTemp Resizes the temporary to the requested size and returns it.
strUp strUpTemp Converts the parameter to upper case and returns it.
strLow strLowTemp Converts the parameter to lower case and returns it.
strLpad0 strLpad0Temp Resizes the temporary, adds leading zeros and returns it.
bigNegate bigNegateTemp Negates the parameter and returns it.
bigLowerBits bigLowerBitsTemp Takes the lower bits of the parameter and returns them.
bigSucc bigSuccTemp Increments the parameter and returns it.
bigPred bigPredTemp Decrements the parameter and returns it.
bigAdd bigAddTemp Adds a value to the parameter and returns it.
bigSbtr bigSbtrTemp Subtracts a value from the parameter and returns it.
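
The difference between a normal function and its Temp counterpart can be illustrated with string concatenation. The following C code is a minimal sketch under stated assumptions: the type demoStri and all demo_ functions are hypothetical and much simpler than striType and the functions of the Seed7 runtime library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type, NOT the striType of the Seed7 runtime. */
typedef struct {
    size_t size;  /* number of characters, without the final '\0' */
    char *mem;    /* heap allocated, '\0' terminated characters */
} demoStri;

demoStri *demo_create(const char *text)
{
    demoStri *s = malloc(sizeof *s);

    if (s == NULL) return NULL;
    s->size = strlen(text);
    s->mem = malloc(s->size + 1);
    if (s->mem == NULL) { free(s); return NULL; }
    memcpy(s->mem, text, s->size + 1);
    return s;
}

void demo_destroy(demoStri *s)
{
    if (s != NULL) { free(s->mem); free(s); }
}

/* Normal variant: allocates a new result, both arguments stay valid. */
demoStri *demo_concat(const demoStri *a, const demoStri *b)
{
    demoStri *r = malloc(sizeof *r);

    if (r == NULL) return NULL;
    r->size = a->size + b->size;
    r->mem = malloc(r->size + 1);
    if (r->mem == NULL) { free(r); return NULL; }
    memcpy(r->mem, a->mem, a->size);
    memcpy(r->mem + a->size, b->mem, b->size + 1);
    return r;
}

/* Temp variant: takes over the temporary, resizes it and returns it. */
/* The caller must not free or use the old temporary afterwards.      */
demoStri *demo_concat_temp(demoStri *temp, const demoStri *b)
{
    char *grown = realloc(temp->mem, temp->size + b->size + 1);

    if (grown == NULL) return NULL;  /* temp is unchanged in this case */
    memcpy(grown + temp->size, b->mem, b->size + 1);
    temp->mem = grown;
    temp->size += b->size;
    return temp;
}
```

For an expression like a & b & c the intermediate result of a & b is a temporary. Passing it to demo_concat_temp saves one allocation and the later call of demo_destroy for the intermediate value.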

