Programs
Compiler Source Code

S7c is the Seed7 compiler.

S7c is written in Seed7 and compiles Seed7 programs to efficient machine code. It uses the analyze phase of the interpreter to convert a Seed7 program to call-code. Call-code consists of values and function calls and exists only in memory. S7c then uses the call-code to generate a corresponding C program, which is compiled and linked with the Seed7 runtime library. The intermediate C code is treated as portable assembler and is not intended for human readers.

Usage

s7c [ options ] source

Possible options are

  • -? Write Seed7 compiler usage.
  • -On Tell the C compiler to optimize with level n (n is between 1 and 3).
  • -O Equivalent to -O1
  • -S Specify the stack size of the executable (e.g.: -S 16777216).
  • -b Specify the directory of the Seed7 runtime libraries (e.g.: -b ../bin).
  • -c Specify configuration (C compiler, etc.) to be used (e.g.: -c emcc).
  • -e Generate code which sends a signal, if an uncaught exception occurs. This option allows debuggers to handle uncaught Seed7 exceptions.
  • -flto Enable link time optimization.
  • -g Tell the C compiler to generate an executable with debug information. This way the debugger will refer to Seed7 source files and line numbers. To generate debug information which refers to the temporary C program the option -g-debug_c can be used.
  • -l Add a directory to the include library search path (e.g.: -l ../lib).
  • -ocn Optimize generated C code with level n. E.g.: -oc3 The level n is a digit between 0 and 3:
    • 0 Do no optimizations with constants.
    • 1 Use literals and named constants to simplify expressions (default).
    • 2 Evaluate constant parameter expressions to simplify expressions.
    • 3 Like -oc2 and additionally evaluate all constant expressions.
  • -p Activate simple function profiling.
  • -sx Suppress checks specified with x. E.g.: -sr or -sro The checks x are specified with letters from the following list:
  • -tx Set runtime trace level to x. Where x is a string consisting of the following characters:
    • e Trace exceptions and handlers
    • f Trace functions
    • s Trace signals
  • -wn Specify warning level n. E.g.: -w2 The level n is a digit between 0 and 2:
    • 0 Omit warnings.
    • 1 Write normal warnings (default).
    • 2 Write warnings for raised exceptions.

Profiling

The Seed7 compiler supports simple function profiling. The program must be compiled with the option -p. When the compiled program finishes, it writes profiling data to the file profile_out. This file contains a tab-separated table with microseconds, number of calls, place of the function and function name.
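
The table layout described above can be read back with a few lines of C. The following is only a sketch under assumptions: the struct profile_entry, its field names, the 255 character limits and the function parse_profile_line are hypothetical and not part of Seed7. It also assumes that every data line holds exactly the four fields in the order given above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One row of profile_out. The field names and the 255 character
   limits are assumptions made for this sketch. */
struct profile_entry {
    long long microseconds;   /* time spent in the function */
    long long calls;          /* number of calls */
    char place[256];          /* place of the function */
    char name[256];           /* function name */
};

/* Parse one tab-separated line of the table.
   Returns 1 if all four fields were read and 0 otherwise. */
int parse_profile_line(const char *line, struct profile_entry *entry)
{
    assert(line != NULL && entry != NULL);
    memset(entry, 0, sizeof *entry);
    return sscanf(line, "%lld\t%lld\t%255[^\t]\t%255[^\t\n]",
                  &entry->microseconds, &entry->calls,
                  entry->place, entry->name) == 4;
}
```

A caller could read profile_out line by line with fgets() and skip every line for which parse_profile_line() returns 0.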

Reflection

The Seed7 reflection provides access to the internal data structures of the interpreter. In particular, the call-code of a program can be accessed with the reflection. This makes it suitable for compiling Seed7. There are several types on which the reflection is based:

program
Describes a program and is the entry point to the reflection for the compiler.
reference
Reference to an object (plain old data types also count as objects here).
ref_list
List of referenced objects.
type
Describes a type (the types of the compiled program have their own type namespace).

The definitions for reference, ref_list and type are in the seed7_05.s7i library. The advanced features of the reflection and the definition for the type program can be found in the progs.s7i library.

C compiler back end

The Seed7 compiler can use different C compilers and C runtime libraries as back end. The program chkccomp.c determines the properties of the back end. This is done when the Seed7 interpreter and runtime library are compiled. The properties of the back end are available in Seed7 via the library cc_conf.s7i. This library defines ccConf, which is a constant of type ccConfigType. The type ccConfigType contains elements to describe the properties:

Type Name Description
boolean WITH_STRI_CAPACITY TRUE, if the Seed7 runtime library uses strings with capacity. The capacity of a string can be larger than its size. Strings with capacity can be enlarged without calling realloc().
boolean ALLOW_STRITYPE_SLICES TRUE, if the actual characters of a string can be stored elsewhere. This allows string slices without the need to copy characters.
boolean ALLOW_BSTRITYPE_SLICES TRUE, if the actual characters of a bstring can be stored elsewhere. This allows bstring slices without the need to copy characters.
boolean RSHIFT_DOES_SIGN_EXTEND TRUE, if the sign of negative signed integers is preserved with a right shift. The C standard specifies that the right shift of signed integers is implementation defined, if the shifted values are negative.
boolean TWOS_COMPLEMENT_INTTYPE TRUE, if signed integers are represented as twos complement numbers. This allows some simplified range checks in compiled programs.
boolean LITTLE_ENDIAN_INTTYPE TRUE, if the byte ordering of integers is little endian.
boolean FLOAT_COMPARISON_OKAY TRUE, if float comparisons with NaN and negative zero work okay. A comparison between NaN and any other value should return FALSE. Negative zero should be considered as identical to positive zero. Comparison refers to comparisons with == < > <= or >= . If FLOAT_COMPARISON_OKAY is FALSE fltEq(), fltLt(), fltGt(), fltLe() and fltGe() should be used to do comparisons of float values.
boolean POW_FUNCTION_OKAY TRUE, if pow() works okay for a base of zero, one or NaN. If it is FALSE fltPow() should be used instead of pow().
boolean FMOD_FUNCTION_OKAY TRUE, if fmod() works okay for Infinity, NaN and zero. If it is FALSE fltRem() should be used instead of fmod().
boolean SQRT_FUNCTION_OKAY TRUE, if sqrt() works okay for a negative argument. If it is FALSE fltSqrt() should be used instead of sqrt().
boolean EXP_FUNCTION_OKAY TRUE, if exp() works okay for NaN. If it is FALSE fltExp() should be used instead of exp().
boolean LOG_FUNCTION_OKAY TRUE, if log() works okay for zero, negative values and NaN. If it is FALSE fltLog() should be used instead of log().
boolean LOG10_FUNCTION_OKAY TRUE, if log10() works okay for zero, negative values and NaN. If it is FALSE fltLog10() should be used instead of log10().
boolean LOG1P_FUNCTION_OKAY TRUE, if log1p() works okay for -1.0, values < -1.0 and NaN. If it is FALSE fltLog1p() should be used instead of log1p().
boolean LOG2_FUNCTION_OKAY TRUE, if log2() works okay for zero, negative values and NaN. If it is FALSE fltLog2() should be used instead of log2().
boolean LDEXP_FUNCTION_OKAY TRUE, if ldexp() works okay for NaN. If it is FALSE fltLdexp() should be used instead of ldexp().
boolean FREXP_FUNCTION_OKAY TRUE, if frexp() works okay for Infinity, NaN and subnormal numbers. If it is FALSE fltDecompose() should be used instead of frexp().
boolean HAS_SIGSETJMP TRUE, if the functions sigsetjmp() and siglongjmp() are available. If it is FALSE the functions setjmp() and longjmp() must be used instead.
boolean CHECK_INT_DIV_BY_ZERO TRUE if integer divisions must be checked for a division by zero. This applies to the division operations div and mdiv. The generated C code should, if a division by zero occurs, raise the exception NUMERIC_ERROR instead of doing the illegal divide operation. If CHECK_INT_DIV_BY_ZERO is FALSE a division by zero always triggers SIGFPE. SIGFPE is caught by the Seed7 run-time library and triggers a NUMERIC_ERROR.
boolean CHECK_INT_DIV_ZERO_BY_ZERO TRUE if the C expression 0/0 might not trigger SIGFPE. C compilers assume that so called "undefined behavior" will not happen. According to the C standard a division by 0 triggers undefined behavior. This way a C compiler is allowed to optimize the expressions 0/0 and 0/variable to 0. Likewise the expression variable/variable can be optimized to 1. In Seed7 a division by zero is defined behavior, since it raises the exception NUMERIC_ERROR. This configuration setting applies to the division operations div and mdiv. The generated C code should, if a division by zero occurs, raise the exception NUMERIC_ERROR instead of allowing the C compiler to do its optimization.
boolean CHECK_INT_REM_BY_ZERO TRUE if integer remainder must be checked for a division by zero. This applies to the division operations rem and mod. The generated C code should, if a remainder by zero occurs, raise the exception NUMERIC_ERROR instead of doing the illegal divide operation. If CHECK_INT_REM_BY_ZERO is FALSE a remainder by zero always triggers SIGFPE. SIGFPE is caught by the Seed7 run-time library and triggers a NUMERIC_ERROR.
boolean CHECK_INT_REM_ZERO_BY_ZERO TRUE if the C expression 0%0 might not trigger SIGFPE. C compilers assume that so called "undefined behavior" will not happen. According to the C standard a division by 0 triggers undefined behavior. This way a C compiler is allowed to optimize the expressions 0%0 and 0%variable to 0. Likewise the expression variable%variable can be optimized to 0. In Seed7 a division by zero is defined behavior, since it raises the exception NUMERIC_ERROR. This configuration setting applies to the division operations rem and mod. The generated C code should, if a division by zero occurs, raise the exception NUMERIC_ERROR instead of allowing the C compiler to do its optimization.
boolean FLOAT_ZERO_DIV_ERROR TRUE, if the C compiler classifies a floating point division by zero as fatal error.
boolean CHECK_FLOAT_DIV_BY_ZERO TRUE, if a C floating point division by zero does not return the IEEE 754 values Infinity, -Infinity or NaN. In this case the interpreter checks all float divisions and returns the correct result. Additionally the Seed7 to C compiler generates C code, which checks all float divisions ( / and /:= ) for division by zero. The generated C code should return Infinity, -Infinity or NaN instead of doing the divide operation.
boolean HAS_EXP2 TRUE, if the C function exp2() is present.
boolean HAS_EXP10 TRUE, if the C function exp10() is present.
boolean HAS_EXPM1 TRUE, if the C function expm1() is present.
boolean HAS_CBRT TRUE, if the C function cbrt() is present.
boolean LIMITED_CSTRI_LITERAL_LEN TRUE, if the C compiler limits the length of string literals. Some C compilers limit the maximum string literal length. Known limits are 2,048 bytes and 16,384 (16K) bytes. Only the existence of a limit matters here, not its actual value.
boolean SWITCH_WORKS_FOR_INT64TYPE TRUE, if switch statements work with 64-bit values.
boolean STMT_BLOCK_IN_PARENTHESES_OK TRUE, if a compound statement is allowed in a parenthesized expression.
boolean CC_SOURCE_UTF8 TRUE, if the C compiler accepts UTF-8 encoded file names in #line directives. The file names from #line directives are used by the debugger to allow source code debugging.
boolean USE_WMAIN TRUE, if the main function is named wmain. This is a way to support Unicode command line arguments under Windows. An alternate way to support Unicode command line arguments under Windows uses the functions getUtf16Argv() and freeUtf16Argv() (both defined in "cmd_win.c").
boolean USE_WINMAIN TRUE, if the main function is named WinMain.
boolean USE_DO_EXIT TRUE, if the main function must be terminated with doExit().
boolean FLOATTYPE_DOUBLE TRUE, if the type floatType is double. If it is FALSE floatType is float.
integer INTTYPE_SIZE Size of the type intType in bits (either 32 or 64).
integer FLOATTYPE_SIZE Size of the type floatType in bits (either FLOAT_SIZE or DOUBLE_SIZE).
integer POINTER_SIZE Size of a pointer in bits.
integer GENERIC_SIZE The maximum of INTTYPE_SIZE, FLOATTYPE_SIZE and POINTER_SIZE. This is also the size in bits of the types rtlValueunion, rtlObjecttype and generictype (defined in data_rtl.h).
integer INT_SIZE Size of the type int in bits.
integer INT_MIN Minimum value of the type int.
integer INT_MAX Maximum value of the type int.
integer LONG_SIZE Size of the type long in bits.
integer FLOATTYPE_MANTISSA_BITS Number of mantissa bits in the binary floatType representation.
integer FLOATTYPE_EXPONENT_OFFSET Exponent offset in the binary floatType representation. To get the actual exponent the offset must be subtracted.
integer INT_RANGE_IN_FLOATTYPE_MAX Maximum from the continuous range of integers that map to floats. All integers from -INT_RANGE_IN_FLOATTYPE_MAX to INT_RANGE_IN_FLOATTYPE_MAX can be converted to floatType and back to intType without loss.
integer MINIMUM_TRUNC_ARGUMENT Minimum value that trunc() or round() can convert. Values below MINIMUM_TRUNC_ARGUMENT raise RANGE_ERROR, if trunc() or round() is applied to them.
integer MAXIMUM_TRUNC_ARGUMENT Maximum value that trunc() or round() can convert. Values above MAXIMUM_TRUNC_ARGUMENT raise RANGE_ERROR, if trunc() or round() is applied to them.
integer PIXEL_RED_MASK Mask for the red color in a pixel. 0 if there is no mapping and drwRgbColor() should be used.
integer PIXEL_GREEN_MASK Mask for the green color in a pixel. 0 if there is no mapping and drwRgbColor() should be used.
integer PIXEL_BLUE_MASK Mask for the blue color in a pixel. 0 if there is no mapping and drwRgbColor() should be used.
string RGB_TO_PIXEL_FLAG_NAME Name of the variable deciding between macro and drwRgbColor(). "" if no variable needs to be checked.
integer RAND_MULTIPLIER Multiplier for the linear congruential generator. A well known pseudorandom number generator algorithm.
integer RAND_INCREMENT Increment for the linear congruential generator. A well known pseudorandom number generator algorithm.
string BOOLTYPE Name of a type for the boolean values 0 and 1. The runtime library and the compiler use a typedef to define the type boolType with BOOLTYPE.
string INT32TYPE Name of a signed integer type that is 32 bits wide. The runtime library and the compiler use a typedef to define the type int32Type with INT32TYPE.
string UINT32TYPE Name of an unsigned integer type that is 32 bits wide. The runtime library and the compiler use a typedef to define the type uint32Type with UINT32TYPE.
string INT64TYPE Name of a signed integer type that is 64 bits wide. The runtime library and the compiler use a typedef to define the type int64Type with INT64TYPE.
string UINT64TYPE Name of an unsigned integer type that is 64 bits wide. The runtime library and the compiler use a typedef to define the type uint64Type with UINT64TYPE.
string INT128TYPE Name of a signed integer type that is 128 bits wide. The runtime library and the compiler use a typedef to define the type int128Type with INT128TYPE. Empty string if there is no 128-bit signed integer type.
string UINT128TYPE Name of an unsigned integer type that is 128 bits wide. The runtime library and the compiler use a typedef to define the type uint128Type with UINT128TYPE. Empty string if there is no 128-bit unsigned integer type.
string INT32TYPE_LITERAL_SUFFIX The suffix used by the literals of the type int32Type.
string INT64TYPE_LITERAL_SUFFIX The suffix used by the literals of the type int64Type.
string MACRO_DEFS Definition of several macros (likely, unlikely, noreturn).
string OVERFLOW_SIGNAL Name of the signal that is raised if an integer overflow occurs. Empty string if integer overflow does not raise a signal.
string BUILTIN_ADD_OVERFLOW Name of a C compiler builtin function to add with overflow check. Empty string if there is no such builtin function.
string BUILTIN_SUB_OVERFLOW Name of a C compiler builtin function to subtract with overflow check. Empty string if there is no such builtin function.
string BUILTIN_MULT_OVERFLOW Name of a C compiler builtin function to multiply with overflow check. Empty string if there is no such builtin function.
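
To illustrate how two of these properties can shape C code, here is a minimal sketch. It is not the code that s7c emits. It assumes a 64-bit intType, and the names raise_numeric_error, numeric_error_raised, checked_div and arithmetic_rshift are hypothetical stand-ins. The first function shows a division guarded as described for CHECK_INT_DIV_BY_ZERO. The second shows a right shift that preserves the sign even if RSHIFT_DOES_SIGN_EXTEND is FALSE.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t intType;        /* assumption: INTTYPE_SIZE is 64 */

/* Hypothetical stand-in for raising NUMERIC_ERROR in the runtime. */
int numeric_error_raised = 0;

intType raise_numeric_error(void)
{
    numeric_error_raised = 1;
    return 0;
}

/* Division with an explicit check, as needed if the C division
   cannot be relied upon to trigger SIGFPE for a zero divisor.
   The overflow case (minimum value divided by -1) is not handled. */
intType checked_div(intType a, intType b)
{
    return b == 0 ? raise_numeric_error() : a / b;
}

/* Right shift that keeps the sign without relying on the
   implementation-defined behavior of >> for negative values.
   For a negative a the complement ~a is non-negative, so the
   inner shift is well defined. */
intType arithmetic_rshift(intType a, int n)
{
    assert(n >= 0 && n < 64);
    return a < 0 ? ~(~a >> n) : a >> n;
}
```

A code generator would choose between the checked and the plain form once, when the C program is generated, based on the values in ccConf.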

Optimizations

The Seed7 compiler supports low level optimizations (option -O), which are done by the C compiler. Additionally s7c does several high level optimizations (option -ocn) that the C compiler cannot do, because a C compiler has no knowledge of Seed7 semantics.

Use special case functions

For certain constant values some function calls are replaced by corresponding calls of special case functions:

  • The string comparisons = and <> (primitive actions 'STR_EQ' and 'STR_NE') are simplified if one or both parameters are constant strings.

  • The string indexing like stri[num] (primitive action 'STR_IDX') is simplified if the string or the index are constant.

  • The array indexing like anArray[num] (primitive action 'ARR_IDX') is simplified if the array or the index are constant.

  • Searches and splits with string constant of length 1 are replaced by equivalent functions which use a character instead:

    Function call replaced by C function replacement C function
    pos(stri, "a") pos(stri, 'a') 'strPos' 'strChPos'
    rpos(stri, "a") rpos(stri, 'a') 'strRpos' 'strRChPos'
    pos(stri, "a", start) pos(stri, 'a', start) 'strIpos' 'strChIpos'
    rpos(stri, "a", start) rpos(stri, 'a', start) 'strRIPos' 'strRChIPos'
    replace(stri, "a", repl) - 'strRepl' 'strChRepl'
    replace(stri, "a", "b") - 'strRepl' 'strChChRepl'
    split(stri, "a") split(stri, 'a') 'strSplit' 'strChSplit'
    "a" mult n - 'strMult' 'strChMult'
  • Initializations of string variables are optimized if an empty string or a string with length 1 is used:

    Seed7 variable declaration C declaration C initialization replacement C initialization
    var string: stri is ""; striType o_123/*stri*/; o_123/*stri*/ = strCreate(""); o_123/*stri*/ = strEmpty(); /* "" */
    var string: stri is "a"; striType o_123/*stri*/; o_123/*stri*/ = strCreate("a"); o_123/*stri*/ = chrStr('a'); /* "a" */
  • The compiler optimizes integer divisions in the following way:

    Seed7 expression C expression
    a div b a/b
    a div 0 (raise_error(NUMERIC_ERROR),0)
    a div 1 a
    a div -1 -a
    0 div b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a rem b a%b
    a rem 0 (raise_error(NUMERIC_ERROR),0)
    a rem 1 0
    a rem -1 0
    0 rem b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a mdiv b (a>0&&b<0 ? (a-1)/b-1 : a<0&&b>0 ? (a+1)/b-1 : a/b)
    a mdiv 0 (raise_error(NUMERIC_ERROR),0)
    a mdiv 1 a
    a mdiv -1 -a
    a mdiv 8 a>>3 or, if >> does not sign extend: a<0?~(~a>>3):a>>3
    a mdiv -8 -a>>3 or, if >> does not sign extend: a=-a, a<0?~(~a>>3):a>>3
    0 mdiv b (b==0?(raise_error(NUMERIC_ERROR),0):0)
    a mod b (c=a%b, a<0^b<0 && c!=0 ? c+b : c)
    a mod 0 (raise_error(NUMERIC_ERROR),0)
    a mod 1 0
    a mod -1 0
    a mod 8 a&7
    a mod -8 -(-a&7)
    0 mod b (b==0?(raise_error(NUMERIC_ERROR),0):0)
  • Operations with bigInteger values are optimized, if a cheaper function can be used:

    Original expression Optimized expression Original C function Optimized C function
    num + 0_ num bigAdd -
    num + 1_ succ(num) bigAdd bigSucc
    num + num num << 1 bigAdd bigLShift
    num - 1_ pred(num) bigSbtr bigPred
    num * -8_ -(num << 3) bigMult bigNegateTemp(bigLShift ...
    num * -1_ -num bigMult bigNegate
    num * 0_ 0_ bigMult -
    num * 1_ num bigMult -
    num * 2_ num << 1 bigMult bigLShift
    num * 8_ num << 3 bigMult bigLShift
    num * num num ** 2 bigMult bigSquare
    -8_ * num -(num << 3) bigMult bigNegateTemp(bigLShift ...
    -1_ * num -num bigMult bigNegate
    0_ * num 0 bigMult -
    1_ * num num bigMult -
    2_ * num num << 1 bigMult bigLShift
    8_ * num num << 3 bigMult bigLShift
    num ** (-1) raise NUMERIC_ERROR bigIPow -
    num ** 0 1 bigIPow -
    num ** 1 num bigIPow -
    num ** 2 num * num bigIPow bigSquare
    1_ ** num raise NUMERIC_ERROR for num<0 or 1_ bigIPow -
    2_ ** num 1_ << num or raise NUMERIC_ERROR for num<0 bigIPow bigLog2BaseIPow
    8_ ** num 1_ << 3 * num or raise NUMERIC_ERROR for num<0 bigIPow bigLog2BaseIPow
    num div -1_ -num bigDiv bigNegate
    num div 0_ raise NUMERIC_ERROR bigDiv -
    num div 1_ num bigDiv -
    num mdiv -2_ -num >> 1 bigMDiv bigRShiftAssign(bigNegate ...
    num mdiv -1_ -num bigMDiv bigNegate
    num mdiv 0_ raise NUMERIC_ERROR bigMDiv -
    num mdiv 1_ num bigMDiv -
    num mdiv 2_ num >> 1 bigMDiv bigRShift
    num mdiv 8_ num >> 3 bigMDiv bigRShift
    num mod -2_ num + ((-num >> 1) << 1) bigMod bigNegateTemp(bigLowerBitsTemp(bigNegate ...
    num mod -1_ 0 bigMod -
    num mod 0_ raise NUMERIC_ERROR bigMod -
    num mod 1_ 0 bigMod -
    num mod 2_ num - ((num >> 1) << 1) bigMod bigLowerBits
    num mod 8_ num - ((num >> 3) << 3) bigMod bigLowerBits
    num +:= 0_ noop bigGrow -
    num +:= 1_ incr(num) bigGrow bigIncr
    num +:= -1_ decr(num) bigGrow bigDecr
    num -:= 0_ noop bigShrink -
    num -:= 1_ decr(num) bigShrink bigDecr
    num -:= -1_ incr(num) bigShrink bigIncr
    num *:= 0_ num := 0_ bigMultAssign bigCpy
    num *:= 1_ noop bigMultAssign -
    num *:= 2_ num <<:= 1 bigMultAssign bigLShiftAssign
    num *:= 4_ num <<:= 2 bigMultAssign bigLShiftAssign
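
The C expressions from the integer division table can be tried out in isolation. The sketch below wraps some of them in functions. The function names and the 64-bit intType are assumptions made for illustration. The expressions themselves are taken from the table, with parentheses added for readability. The mask variants assume a twos complement representation.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t intType;        /* assumption: INTTYPE_SIZE is 64 */

/* a mdiv b: division that rounds towards negative infinity.
   The divisor must not be zero. */
intType int_mdiv(intType a, intType b)
{
    assert(b != 0);
    return a > 0 && b < 0 ? (a - 1) / b - 1
         : a < 0 && b > 0 ? (a + 1) / b - 1
         : a / b;
}

/* a mod b: the result has the sign of the divisor.
   The divisor must not be zero. */
intType int_mod(intType a, intType b)
{
    intType c;

    assert(b != 0);
    c = a % b;
    return ((a < 0) ^ (b < 0)) && c != 0 ? c + b : c;
}

/* a mod 8 as a bit mask (assumes twos complement). */
intType int_mod_8(intType a)
{
    return a & 7;
}

/* a mod -8 as a negated bit mask (assumes twos complement). */
intType int_mod_neg_8(intType a)
{
    return -(-a & 7);
}
```

For example, int_mdiv(-7, 2) gives -4 while the C expression -7/2 gives -3, and int_mod_8(a) agrees with int_mod(a, 8) for positive and negative a.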

Manage temporary values

A temporary expression which would be freed after the return from a function can be used by special case functions. That way it is not necessary to free the temporary value afterwards:

Normal function Function using temporary Comment
arrHead arrHeadTemp Splits the array into the head which is returned and an unused part which is freed later.
arrRange arrRangeTemp Splits the array into the range which is returned and an unused part which is freed later.
arrSubarr arrSubarrTemp Splits the array into the sub array which is returned and an unused part which is freed later.
arrTail arrTailTemp Splits the array into the tail which is returned and an unused part which is freed later.
strConcat strConcatTemp Resizes the temporary and returns it after concatenating the second parameter.
strAppend strAppendTemp Resizes the temporary and concatenates it to a variable.
strHead strHeadTemp Resizes the temporary to the requested size and returns it.
strTail strTailTemp Moves the characters, resizes the temporary to the requested size and returns it.
strUp strUpTemp Converts the parameter to upper case and returns it.
strLow strLowTemp Converts the parameter to lower case and returns it.
strLpad0 strLpad0Temp Resizes the temporary, adds leading zeros and returns it.
bigNegate bigNegateTemp Negates the parameter and returns it.
bigLowerBits bigLowerBitsTemp Takes the lower bits of the parameter and returns them.
bigSucc bigSuccTemp Increments the parameter and returns it.
bigPred bigPredTemp Decrements the parameter and returns it.
bigAdd bigAddTemp Adds a value to the parameter and returns it.
bigSbtr bigSbtrTemp Subtracts a value from the parameter and returns it.
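
The idea behind the functions that use a temporary can be sketched in C with a simplified string type. This is not the implementation of the Seed7 runtime library. The type simple_stri and the functions stri_create, concat and concat_temp are hypothetical. The sketch only shows the difference between allocating a new result and reusing the memory of a temporary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified string, a hypothetical stand-in for striType. */
typedef struct {
    size_t size;    /* number of characters */
    char *mem;      /* characters followed by a terminating zero */
} simple_stri;

simple_stri stri_create(const char *text)
{
    simple_stri result;

    assert(text != NULL);
    result.size = strlen(text);
    result.mem = malloc(result.size + 1);
    if (result.mem == NULL) abort();
    memcpy(result.mem, text, result.size + 1);
    return result;
}

/* Normal case: a new result is allocated, a and b stay untouched. */
simple_stri concat(simple_stri a, simple_stri b)
{
    simple_stri result;

    result.size = a.size + b.size;
    result.mem = malloc(result.size + 1);
    if (result.mem == NULL) abort();
    memcpy(result.mem, a.mem, a.size);
    memcpy(result.mem + a.size, b.mem, b.size + 1);
    return result;
}

/* Temporary case: the memory of temp is resized and returned.
   The caller must not use or free temp afterwards. */
simple_stri concat_temp(simple_stri temp, simple_stri b)
{
    char *resized = realloc(temp.mem, temp.size + b.size + 1);

    if (resized == NULL) abort();
    memcpy(resized + temp.size, b.mem, b.size + 1);
    temp.mem = resized;
    temp.size += b.size;
    return temp;
}
```

In an expression like a & b & c the result of a & b is a temporary. With concat_temp(concat(a, b), c) this temporary is extended in place, so no separate allocation and no later free is needed for it.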


Compiling chkint

Speed improvement
with compiled program
