8. THE FILE SYSTEM

The file system is used for communication in various ways. For example: To write strings on the screen we use the following statements:
write("hello world");
writeln;
'writeln' means write newline. We can also write data of various types with 'write':
write("result = ");
write(number div 5);
write(" ");
writeln(not error);
The 'writeln' above writes data and then terminates the line. This is equal to a 'write' followed by a 'writeln'. Instead of multiple write statements the '<&' operator can be used to concatenate the elements to be written:
writeln("result = " <& number div 5 <& " " <& not error);
The '<&' operator needs a 'string' as left operand and is overloaded for various types as right operand. To allow things like
write(next_time <& " \r");
the '<&' operator is also overloaded for various types as left operand and a 'string' as right operand. This allows you to concatenate several objects with '<&' when at least the first or the second object is a 'string'. We can also read data from the keyboard:
write("Amount? ");
read(amount);
The user is allowed to use backspace and sends the input to the program with the RETURN-key. To let the user respond with the RETURN-key we can write:
writeln("Type RETURN");
readln;
To read a line of data we can use 'readln':
write("Your comment? ");
readln(user_comment_string);
In the previous examples all 'read' statements read from the file IN and all 'write' statements write to the file OUT. The files IN and OUT are initialized with 'STD_IN' and 'STD_OUT' which are the stdin and stdout files of the operating system. (Usually the keyboard and the screen). When we want to write to other files we use write statements with the file as first parameter. To write a line of text to the file "info.fil" we use the following statements:
info_file := open("info.fil", "w");
writeln(info_file, "This is the first line of the info file.");
close(info_file);
First the external file is opened for writing and then it is used. To read the file back in the string 'stri' we write:
info_file := open("info.fil", "r");
readln(info_file, stri);
close(info_file);
It is also possible to write values of other types to 'info_file':
writeln(info_file, number);
Here the 'number' is converted to a string which is written to the file. A 'number' is read back with:
readln(info_file, number);
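The write and read steps above can be put together into one small program. This is only a sketch: it assumes that 'info_file' is declared as a 'file' variable initialized with 'STD_NULL', and it checks the result of 'open' against 'STD_NULL' as described later in this chapter.

```seed7
$ include "seed7_05.s7i";

const proc: main is func
  local
    var file: info_file is STD_NULL;
    var string: stri is "";
  begin
    # Write one line to the file.
    info_file := open("info.fil", "w");
    if info_file <> STD_NULL then
      writeln(info_file, "This is the first line of the info file.");
      close(info_file);
    end if;
    # Read the line back and show it on the screen.
    info_file := open("info.fil", "r");
    if info_file <> STD_NULL then
      readln(info_file, stri);
      close(info_file);
      writeln(stri);
    end if;
  end func;
```

When both 'open' calls succeed the program writes the line to "info.fil" and then prints the same line to the standard output.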
For doing I/O to a window on the screen we write:
window1 := open_window(SCREEN, 10, 10, 5, 60);
box(window1);
setPos(window1, 3, 1);
write(window1, "hello there");
This opens the window 'window1' on the SCREEN at the position 10, 10. This window has 5 lines and 60 columns. A box (of characters: - | + ) is written to surround 'window1' and finally the string "hello there" is written in 'window1' at position 3, 1. To clear 'window1' we write:
clear(window1);
Files can be used for many more things. Here is a list of goals for a file system:
In the following subchapters we discuss each of these goals.

8.1 Conversion to strings and back

We achieve the goal of doing I/O for arbitrary types with two conversion functions. In order to do I/O with a type the 'str' and 'parse' functions must be defined for that type. As an example we show the conversion functions for the type 'boolean':
const func string: str (in boolean: aBool) is func
  result
    var string: result is "";
  begin
    if aBool then
      result := "TRUE";
    else
      result := "FALSE";
    end if;
  end func;

const func boolean: (attr boolean) parse (in string: stri) is func
  result
    var boolean: result is FALSE;
  begin
    if stri = "TRUE" then
      result := TRUE;
    elsif stri = "FALSE" then
      result := FALSE;
    else
      raise RANGE_ERROR;
    end if;
  end func;
The 'str' function must deliver a corresponding string for every value of the type. The 'parse' function parses a string and delivers the converted value as result. If the conversion is not successful the exception RANGE_ERROR is raised. The attribute used with 'parse' allows it to be overloaded for different types. After defining the 'str' and 'parse' functions for a type the 'enable_io' function can be called for this type as in:
enable_io(boolean);
The 'enable_io' template declares various io functions like 'read', 'write' and others for the provided type (in this example 'boolean'). If only output (or only input) is needed for a type it is possible to define just 'str' (or 'parse') and activate just 'enable_output' (or 'enable_input'). There is also a formatting operator called 'lpad' which is based on the 'str' function. The statements
write(12 lpad 6);
write(3 lpad 6);
writeln(45 lpad 6);
write(678 lpad 6);
write(98765 lpad 6);
writeln(4321 lpad 6);
produce the following output:
    12     3    45
   678 98765  4321
As we see the 'lpad' operator can be used to produce right justified output. There is also a 'rpad' operator to produce left justified output. The basic definitions of the 'lpad' and 'rpad' operators work on strings and are as follows:
const func string: (ref string: stri) lpad (in integer: leng) is func
  result
    var string: result is "";
  begin
    if leng > length(stri) then
      result := " " mult (leng - length(stri)) & stri;
    else
      result := stri;
    end if;
  end func;

const func string: (ref string: stri) rpad (in integer: leng) is func
  result
    var string: result is "";
  begin
    if leng > length(stri) then
      result := stri & " " mult (leng - length(stri));
    else
      result := stri;
    end if;
  end func;
The 'enable_io' template contains definitions of 'lpad' and 'rpad' to work on the type specified with 'enable_io':
const func string: (in aType: aValue) lpad (in integer: leng) is func
  result
    var string: stri is "";
  begin
    stri := str(aValue) lpad leng;
  end func;

const func string: (in aType: aValue) rpad (in integer: leng) is func
  result
    var string: stri is "";
  begin
    stri := str(aValue) rpad leng;
  end func;
For 'float' values there is an additional way to convert them to strings. The 'digits' operator allows the specification of a precision. For example the statements
writeln(3.1415 digits 2);
writeln(4.0 digits 2);
produce the following output:
3.14
4.00
A combination with the 'lpad' operator as in
writeln(3.1415 digits 2 lpad 6);
writeln(99.9 digits 2 lpad 6);
is also possible and produces the following output:
  3.14
 99.90
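The formatting operators described above can be combined to print a small table. The following program is a sketch only: the item names and prices are made up for illustration, the library "float.s7i" is assumed to be needed for the 'float' literals, and parentheses are used so that the example does not depend on operator priorities.

```seed7
$ include "seed7_05.s7i";
  include "float.s7i";

const proc: main is func
  begin
    # Left justified names (rpad) and right justified prices (lpad).
    writeln(("item" rpad 8) <& ("price" lpad 8));
    writeln(("tea" rpad 8) <& (3.5 digits 2 lpad 8));
    writeln(("coffee" rpad 8) <& (12.25 digits 2 lpad 8));
  end func;
```

With the definitions of 'lpad', 'rpad' and 'digits' given in this chapter the program should print:

```
item       price
tea         3.50
coffee     12.25
```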
8.2 Basic input and output operations

To allow arbitrary user defined file-types beside the operating system files we chose a model in which the I/O methods are assigned to the type of the file-value and not to the type of the file-variable. This allows a file variable to point to any file-value. The file-variables have the type 'file' which has only the assignment method defined. For the operating system files and for each user defined file a file-type must be declared which has the I/O methods defined. These file-types are derived (directly or indirectly) from the type 'null_file' for which all I/O methods are defined upon a base of basic string I/O methods. So for a new user defined file-type only the basic string I/O methods must be defined. The two basic I/O methods defined for the 'null_file' are
const proc: write (ref null_file param, in string param) is noop;
const string: gets (ref null_file param, ref integer param)

This means that writing any string to the 'null_file' has no effect and reading any number of characters from the 'null_file' delivers the empty string. When a user defined file type is declared these are the two methods that must be redefined for the new file-type. Based upon these two methods three more methods are defined for the 'null_file' named 'getc', 'getwd' and 'getln'. These methods get a character, a word and a line respectively. A word is terminated by a space, a tab or a linefeed. A line is terminated by a linefeed. These methods need not be redefined for a user defined file type but for performance reasons they can also be redefined. The definitions for 'getc', 'getwd' and 'getln' for the 'null_file' are
const func char: getc (inout null_file: aFile) is func
  result
    var char: ch is ' ';
  local
    var string: buffer is "";
  begin
    buffer := gets(aFile, 1);
    if buffer = "" then
      ch := EOF;
    else
      ch := buffer[1];
    end if;
  end func;

const func string: getwd (inout null_file: aFile) is func
  result
    var string: stri is "";
  local
    var string: buffer is "";
  begin
    repeat
      buffer := gets(aFile, 1);
    until buffer <> " " and buffer <> "\t";
    while buffer <> " " and buffer <> "\t" and
        buffer <> "\n" and buffer <> "" do
      stri &:= buffer;
      buffer := gets(aFile, 1);
    end while;
    if buffer = "" then
      aFile.bufferChar := EOF;
    else
      aFile.bufferChar := buffer[1];
    end if;
  end func;

const func string: getln (inout null_file: aFile) is func
  result
    var string: stri is "";
  local
    var string: buffer is "";
  begin
    buffer := gets(aFile, 1);
    while buffer <> "\n" and buffer <> "" do
      stri &:= buffer;
      buffer := gets(aFile, 1);
    end while;
    if buffer = "" then
      aFile.bufferChar := EOF;
    else
      aFile.bufferChar := buffer[1];
    end if;
  end func;
Note that 'getwd' skips leading spaces and tabs while 'getc' and 'getln' do not. When 'getc', 'getwd' or 'getln' is not defined for a new user defined file type the declarations from the 'null_file' are used instead. These declarations are based on the method 'gets' which must be defined for every new user defined file-type. Note that there is an assignment to the variable 'bufferChar'. This variable is an element of 'null_file' and therefore also an element of all derived file types. This allows an 'eoln' function to test if the last 'getwd' or 'getln' reached the end of a line. Here is a definition of the 'eoln' function:
const func boolean: eoln (ref null_file: aFile) is func
  result
    var boolean: result is TRUE;
  begin
    result := aFile.bufferChar = '\n';
  end func;
Besides assigning a value to 'bufferChar' in 'getwd' and 'getln' and using it in 'eoln' the standard 'file' functions do nothing with 'bufferChar'. The functions of the "scanfile.s7i" library use the 'bufferChar' variable as current character in the scan process. As such all functions of the "scanfile.s7i" library assume that the first character to be processed is always in 'bufferChar'. Since the standard 'file' functions do not have this behaviour, care has to be taken when mixing scanner and file functions. The next declarations allow various I/O operations for strings:
const proc: writeln (inout file: aFile, in string: stri) is func
  begin
    write(aFile, stri);
    writeln(aFile);
  end func;

const proc: read (inout file: aFile, inout string: stri) is func
  begin
    stri := getwd(aFile);
    aFile.io_empty := stri = "";
    aFile.io_ok := TRUE;
  end func;

const proc: readln (inout file: aFile, inout string: stri) is func
  begin
    stri := getln(aFile);
    aFile.io_empty := stri = "";
    aFile.io_ok := TRUE;
  end func;
8.3 Input and output with conversion

Normally we need a combination of an I/O operation with a conversion operation. There are several functions which are based on the 'str' and 'parse' conversions and on the basic I/O-functions. The declaration of these functions is done by the templates 'enable_io', 'enable_input' and 'enable_output'. The templates 'enable_io' and 'enable_output' define the following 'write' function:
const proc: write (in file: aFile, in aType: aValue) is func
  begin
    write(aFile, str(aValue));
  end func;
The templates 'enable_io' and 'enable_input' define the following 'read' and 'readln' functions:
const proc: read (inout file: aFile, inout aType: aValue) is func
  local
    var string: stri is "";
  begin
    stri := getwd(aFile);
    aFile.io_empty := stri = "";
    block
      aValue := aType parse stri;
      aFile.io_ok := TRUE;
    exception
      catch RANGE_ERROR:
        aFile.io_ok := FALSE;
    end block;
  end func;

const proc: readln (inout file: aFile, inout aType: aValue) is func
  local
    var string: stri is "";
  begin
    stri := getln(aFile);
    aFile.io_empty := stri = "";
    block
      aValue := aType parse stri;
      aFile.io_ok := TRUE;
    exception
      catch RANGE_ERROR:
        aFile.io_ok := FALSE;
    end block;
  end func;
The next two declarations define 'writeln' and 'backSpace':
const proc: writeln (ref external_file: aFile) is func
  begin
    write(aFile, "\n");
  end func;

const proc: backSpace (ref external_file: aFile) is func
  begin
    write(aFile, "\b \b");
  end func;
8.4 Simple read and write statements

The simple input/output for the standard I/O-files are 'read' and 'write' which are defined with 'enable_io'. Simple I/O may look like:
write("Amount? ");
read(amount);
'read' and 'write' use the files IN and OUT which are described in the next chapter. Here is the definition of the 'read' and 'write' procedures done with 'enable_io':
const proc: read (inout aType: aValue) is func
  begin
    read(IN, aValue);
  end func;

const proc: readln (inout aType: aValue) is func
  begin
    readln(IN, aValue);
  end func;

const proc: write (in aType: aValue) is func
  begin
    write(OUT, aValue);
  end func;

const proc: writeln (in aType: aValue) is func
  begin
    write(OUT, aValue);
    writeln(OUT);
  end func;
Additional procedures defined outside of 'enable_io' are:
const proc: readln is func
  local
    var string: stri is "";
  begin
    stri := getln(IN);
    IN.io_empty := stri = "";
    IN.io_ok := TRUE;
  end func;

const proc: read (NL) is func
  begin
    readln;
  end func;

const proc: writeln is func
  begin
    writeln(OUT);
  end func;

const proc: write (NL) is func
  begin
    writeln(OUT);
  end func;
As an example when you call
readln(number);
the readln(integer) procedure calls
readln(IN, number);
If the file IN has not redefined readln(IN, integer) this procedure calls
stri := getln(IN);
and 'getln' may call gets(IN, 1) in a loop or may be defined for the file IN. Finally the 'parse' function converts the string read into an 'integer' and assigns it to 'number':
number := integer parse stri;
8.5 Standard input and output files

The standard I/O files are OUT for output and IN for input. These two are file-variables which are declared as follows:
var file: IN is STD_IN;
var file: OUT is STD_OUT;
The files 'STD_IN' and 'STD_OUT' are the standard input and output files of the operating system (Usually the keyboard and the screen). Because IN and OUT are variables redirection of standard input or standard output can be done easily by assigning a new value to them:
IN := OTHER_FILE;
After that all 'read' statements refer to OTHER_FILE. Most operating systems also have a stderr file which can be accessed via the name 'STD_ERR'. If you want to write error messages to the screen even when stdout is redirected elsewhere you can write:
writeln(STD_ERR, "ERROR MESSAGE");
To redirect the standard output to 'STD_ERR' you can write:
OUT := STD_ERR;
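Because OUT is an ordinary file variable, a redirection can also be temporary. The following program is a sketch under some assumptions: the file name "out.log" and the variable 'log_file' are made up for illustration, and the result of 'open' is checked against 'STD_NULL' as explained in the chapter about operating system files.

```seed7
$ include "seed7_05.s7i";

const proc: main is func
  local
    var file: log_file is STD_NULL;
  begin
    log_file := open("out.log", "w");
    if log_file <> STD_NULL then
      # Redirect the standard output to the log file.
      OUT := log_file;
      writeln("this line goes to out.log");
      # Restore the original standard output before closing the log file.
      OUT := STD_OUT;
      close(log_file);
    end if;
    writeln("this line goes to the screen");
  end func;
```

Restoring OUT before the file is closed makes sure that later 'write' statements do not refer to a closed file.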
There is also a file 'STD_NULL' defined. Anything written to it is ignored. Reading from it delivers empty strings. This file can be used to initialize file variables as in:
var file: MY_FILE is STD_NULL;
It is also used to represent an illegal file value when for example an 'open' procedure fails.

8.6 Access to operating system files

The interface type 'file' is also used to access operating system files. This is done with the implementation type 'external_file'. The type 'external_file' is defined as:
const type: external_file is sub null_file struct
    var PRIMITIVE_FILE: ext_file is PRIMITIVE_null_file;
    var string: name is "";
  end struct;
This means that every data item of the type 'external_file' has the elements from 'null_file' and additionally the elements 'ext_file' and 'name'. The type 'PRIMITIVE_FILE' points directly to an operating system file. Objects of type 'PRIMITIVE_FILE' can only have operating system files as values while objects of type 'file' can also have other files as values. To allow the implementation of the type 'external_file' several operations for the type 'PRIMITIVE_FILE' are defined. But outside 'external_file' the type 'PRIMITIVE_FILE' and its operations should not be used. There are three predefined external files 'STD_IN', 'STD_OUT' and 'STD_ERR' which have the following declarations:
const func external_file: INIT_STD_FILE (ref PRIMITIVE_FILE: primitive_file,
    in string: file_name) is func
  result
    var external_file: result is external_file.value;
  begin
    result.ext_file := primitive_file;
    result.name := file_name;
  end func;

var external_file: STD_IN is INIT_STD_FILE(PRIMITIVE_INPUT, "STD_IN");
var external_file: STD_OUT is INIT_STD_FILE(PRIMITIVE_OUTPUT, "STD_OUT");
var external_file: STD_ERR is INIT_STD_FILE(PRIMITIVE_ERROR, "STD_ERR");
It is possible to do I/O directly with them, but it is wiser to use them only to initialize user defined file variables as in:
var file: err is STD_ERR;
In the rest of the program references to such a variable can be used:
writeln(err, "Some error occurred");
In this case redirection of the file 'err' can be done very easily. The second way to access external files is to use the 'open' function. Usually a file variable is declared
var file: my_out is STD_NULL;
and the result of the 'open' function is assigned to this file variable
my_out := open("my_file", "w");
The first parameter of 'open' is the path of the file to be opened. Seed7 always uses the slash ('/') as path delimiter. The use of a backslash in a path may raise the exception 'RANGE_ERROR'. The second parameter of 'open' specifies the mode:
Binary mode:
  "r"   ... Open file for reading.
  "w"   ... Truncate to zero length or create file for writing.
  "a"   ... Append; open or create file for writing at end-of-file.
  "r+"  ... Open file for update (reading and writing).
  "w+"  ... Truncate to zero length or create file for update.
  "a+"  ... Append; open or create file for update, writing at end-of-file.
Text mode:
  "rt"  ... Open file for reading.
  "wt"  ... Truncate to zero length or create file for writing.
  "at"  ... Append; open or create file for writing at end-of-file.
  "rt+" ... Open file for update (reading and writing).
  "wt+" ... Truncate to zero length or create file for update.
  "at+" ... Append; open or create file for update, writing at end-of-file.
Note that Seed7 defines the modes "r", "w", "a", "r+", "w+" and "a+" as binary modes. This is different from the definition used by the 'fopen' function of the C library. The difference between binary and text mode is as follows:
The following table compares the file modes of Seed7 and C:
The function 'open' returns 'STD_NULL' when it fails. So it is necessary to check the file variable to be on the safe side:
if my_out <> STD_NULL then
After that output to 'my_out' is possible with
writeln(my_out, "hi there");
Note that 'external_file' describes BYTE files. Writing a character with an ordinal >= 256 such as
writeln(my_out, "illegal char: \256\");
results in the exception 'RANGE_ERROR'. To write unicode characters other file types must be used. The libraries "utf8.s7i" and "utf16.s7i" provide access to UTF-8 and UTF-16 files. The library "utf8.s7i" defines the implementation type 'utf8_file' as
const type: utf8_file is sub external_file struct
end struct;
and the function 'open_utf8' which can be used the same way as 'open':
my_out := open_utf8("utf8_file", "w");
A UTF-8 file accepts all unicode characters. That way
writeln(my_out, "unicode char: \256\");
works without problems.

8.7 Keyboard file

As stated earlier 'STD_IN' provides an interface to the keyboard which is line buffered and echoed on 'STD_OUT'. This means that you can see everything you typed. Additionally you can correct your input with BACKSPACE until you press RETURN. But sometimes an unbuffered and unechoed input is needed. This is provided in the library "keybd.s7i", which defines the type 'keyboard_file' and the file 'KEYBOARD'. Characters typed at the keyboard are queued (first in first out) and can be read directly from 'KEYBOARD' without any possibility to correct. Additionally 'KEYBOARD' does not echo the characters. Reading from 'KEYBOARD' delivers normal UNICODE characters or special codes (which may be or may not be UNICODE characters) for function and cursor keys. UNICODE characters and special codes both are 'char' values. The "keybd.s7i" library defines 'char' constants for various keys:
The following example uses the 'char' constant 'KEY_UP':
$ include "seed7_05.s7i";
  include "keybd.s7i";

const proc: main is func
  begin
    writeln("Please press cursor up");
    while getc(KEYBOARD) <> KEY_UP do
      writeln("This was not cursor up");
    end while;
    writeln("Cursor up was pressed");
  end func;
Programs should use the 'char' constants defined in "keybd.s7i" to deal with function and cursor keys, since the special key codes may change in future versions of Seed7. In addition to the operations possible with a 'file' there are two functions that are applicable only to files of type 'keyboard_file':
Note that 'keypressed' does not actually read a character. Reading must be done with a different function after 'keypressed' returns TRUE. Both functions ('busy_getc' and 'keypressed') are useful when user input is allowed while some processing takes place. The following program uses 'busy_getc(KEYBOARD)' to display the time until a key is pressed:
$ include "seed7_05.s7i";
  include "time.s7i";
  include "keybd.s7i";

const proc: main is func
  begin
    writeln;
    while busy_getc(KEYBOARD) = KEY_NONE do
      write(time(NOW) <& "\r");
      flush(OUT);
    end while;
    writeln;
    writeln;
  end func;
Seed7 programs can run in two modes:
These two modes are supported with two basic keyboard files:
The file 'KEYBOARD' is actually a variable which refers to one of the two basic keyboard files. The declaration of the type 'keyboard_file' and the file 'KEYBOARD' in "keybd.s7i" is:
const type: keyboard_file is subtype file;
var keyboard_file: KEYBOARD is CONSOLE_KEYBOARD;
Graphic programs switch to the 'GRAPH_KEYBOARD' driver with:
KEYBOARD := GRAPH_KEYBOARD;
Some file types are defined to support the 'KEYBOARD'. One such file type is 'echo_file', which is defined in the library "echo.s7i". An 'echo_file' file can be used to write input characters to an output file. This is useful since 'KEYBOARD' does not echo its input, but 'echo_file' is not restricted to support 'KEYBOARD'. The following program writes echos of the keys typed and exits as soon as a '!' is encountered:
$ include "seed7_05.s7i";
  include "keybd.s7i";
  include "echo.s7i";

const proc: main is func
  local
    var char: ch is ' ';
  begin
    IN := open_echo(KEYBOARD, OUT);
    repeat
      ch := getc(IN);
    until ch = '!';
    writeln;
  end func;
An 'echo_file' checks also for control-C (KEY_CTL_C). When control-C is typed an 'echo_file' asks if the program should be terminated:
terminate (y/n)?
Answering 'y' or 'Y' is interpreted as 'yes' and the program is terminated with the following message:
*** PROGRAM TERMINATED BY USER
Any other input removes the question and the program continues to read input. Another helpful file type is 'line_file', which is defined in the library "line.s7i". A 'line_file' allows the input to be corrected with BACKSPACE until a RETURN (represented with '\n') is encountered. In contrast to this editing feature the possibility to edit a line of 'STD_IN' is provided by the operating system. The following program uses 'echo_file' and 'line_file' to simulate input line editing:
$ include "seed7_05.s7i";
  include "keybd.s7i";
  include "echo.s7i";
  include "line.s7i";

const proc: main is func
  local
    var char: ch is ' ';
  begin
    IN := open_echo(KEYBOARD, OUT);
    IN := open_line(IN);
    repeat
      ch := getc(IN);
      write(ch);
    until ch = '!';
  end func;
This program terminates when a line containing '!' is confirmed with RETURN.

8.8 Files with line structure

The type 'text' is a subtype of 'file' which adds a line structure and other features such as scrolling and color. The lines and columns of a type 'text' start with 1 in the upper left corner and increase downward and rightward. The function 'setPos' sets the current line and column of a 'text':
const proc: setPos (inout text: aText, in integer: line, in integer: column) is ...
The functions 'setLine' and 'setColumn' set just the line and column respectively:
const proc: setLine (inout text: aText, in integer: line) is ...
const proc: setColumn (inout text: aText, in integer: column) is ...
The current line and column of a 'text' file can be retrieved with 'line' and 'column':
const func integer: line (ref text: aText) is ...
const func integer: column (ref text: aText) is ...
The current height and width of a 'text' file can be retrieved with 'height' and 'width':
const func integer: height (ref text: aText) is ...
const func integer: width (ref text: aText) is ...
To allow random access output to a text screen (or text window) the type SCREEN_FILE is defined. The function
open(SCREEN_FILE)
returns a SCREEN_FILE.

8.9 Sockets

The library "socket.s7i" defines types and functions to access sockets. The implementation type for sockets is 'socket'. As interface type 'file' is used:
var file: clientSocket is STD_NULL;
With 'openInetSocket' an internet client socket can be opened:
clientSocket := openInetSocket("www.google.com", 80);
The function 'openInetSocket' creates and connects a socket. Opening an internet socket at the local host is done with:
clientSocket := openInetSocket(1080);
Server sockets are supported with the type 'listener'. The type 'listener' is used directly without interface type:
var listener: myListener is listener.value;
The function 'openInetListener' opens a 'listener':
myListener := openInetListener(1080);
The function 'listen' is used to listen for incoming socket connections of a 'listener', and to limit the incoming queue:
listen(myListener, 10);
The function 'accept' returns the first connected socket of the 'listener':
serverSocket := accept(myListener);
8.10 User defined file types

In addition to the predefined file types it is often necessary to define a new type of file. Such a new file has several possibilities:
With the following declaration we define a new file type:
const type: NEW_FILE is sub null_file struct
    ...
    (* Local data *)
    ...
  end struct;
It is not necessary to derive the NEW_FILE type directly from 'null_file'. The NEW_FILE type may also be an indirect descendant of 'null_file'. So it is possible to create file type hierarchies. The interface implemented by the new file needs also to be specified:
type_implements_interface(NEW_FILE, file);
The type 'file' is not the only interface type which can be used. There is also the type 'text' which is derived from 'file'. The type 'text' describes a line oriented file which allows 'setPos' (which moves the current position to the line and column specified) and other functions. It is also possible to define new interface types which derive from 'file' or 'text'. Next an open function is needed to generate a new NEW_FILE:
const func file: open_new_file ( (* Parameters *) ) is func
  result
    var file: result is STD_NULL;
  begin
    ...
    (* Initialisation of the local data *)
    result := malloc( ... );
    ...
  end func;
Note the usage of the 'malloc' function to generate a new data item. This data item is not freed automatically but if you do not open files too often this does not hurt. Now only the two basic I/O operations must be defined:
const proc: write (inout NEW_FILE: new_fil, in string: stri) is func
  begin
    ...
    (* Statements that do the output *)
    ...
  end func;

const func string: gets (inout NEW_FILE: new_fil, in integer: leng) is func
  result
    var string: stri is "";
  begin
    ...
    (* Statements that do the input *)
    ...
  end func;
The I/O concept introduced in the previous chapters separates the input of data from its conversion. The 'read', 'readln', 'getwd' and 'getln' functions are designed to read whitespace separated data elements. When the data elements are not separated by whitespace characters this I/O concept cannot be used. Instead the functions which read from the file need some knowledge about the type which they intend to read. Fortunately this is a well researched area. The lexical scanners used by compilers solve exactly this problem.

Lexical scanners read symbols from a file and use the concept of a current character. A symbol can be a name, a number, a string, an operator, a parenthesis or something else. The current character is the first character to be processed when scanning a symbol. After a scanner has read a symbol the current character contains the character just after the symbol. This character could be the first character of the next symbol or some whitespace character. If the set of symbols is chosen wisely all decisions about the type of the symbol and when to stop reading characters for a symbol can be made based on the current character.

Every 'file' contains a 'bufferChar' variable which is used as current character by the scanner functions defined in the "scanfile.s7i" library. The "scanfile.s7i" library contains skip... and get... functions. The skip... procedures return void and are used to skip input while the get... functions return the string of characters they have read. The following basic scanner functions are defined in the "scanfile.s7i" library:
Contrary to 'read' and 'getwd' basic scanner functions do not skip leading whitespace characters. To skip whitespace characters one of the following functions can be used:
The advanced scanner functions do skip whitespace characters before reading a symbol:
All scanner functions assume that the first character to be processed is in 'bufferChar' and after they are finished the next character which should be processed is also in 'bufferChar'. To use scanner functions for a newly opened file it is necessary to assign the first character to the 'bufferChar' with:
myFile.bufferChar := getc(myFile);
In most cases whole files are either processed with normal I/O functions or with scanner functions. When normal I/O functions need to be combined with scanner functions care has to be taken:
Scanner functions are helpful when it is necessary to read numeric input without failing when no digits are present:
skipWhiteSpace(IN);
if eoln(IN) then
  writeln("empty input");
elsif IN.bufferChar in {'0' .. '9'} then
  number := integer parse getDigits(IN);
  skipLine(IN);
  writeln("number " <& number);
else
  stri := getLine(IN);
  writeln("command " <& literal(stri));
end if;
The function 'getSymbol' is designed to read Seed7 symbols. When the end of the file is reached it returns "". With 'getSymbol' name-value pairs can be read:
name := getSymbol(inFile);
while name <> "" do
  if name <> "#" and getSymbol(inFile) = "=" then
    aValue := getSymbol(inFile);
    if aValue <> "" then
      if aValue[1] = '"' then
        keyValueHash @:= [name] aValue[2 ..];
      elsif aValue[1] in {'0' .. '9'} then
        keyValueHash @:= [name] aValue;
      end if;
    end if;
  end if;
  # Read the next name. Without this the loop would not terminate.
  name := getSymbol(inFile);
end while;
The following loop can be used to process the symbols of a Seed7 program:
inFile.bufferChar := getc(inFile);
currSymbol := getSymbol(inFile);
while currSymbol <> "" do
  ... process currSymbol ...
  currSymbol := getSymbol(inFile);
end while;
Whitespace and comments are automatically skipped with the function 'getSymbol'. When comments should also be returned the function 'getSymbolOrComment' can be used. Together with the function 'getWhiteSpace' it is even possible to get the whitespace between the symbols:
const func string: processFile (in string: fileName) is func
  result
    var string: result is "";
  local
    var file: inFile is STD_NULL;
    var string: currSymbol is "";
  begin
    inFile := open(fileName, "r");
    if inFile <> STD_NULL then
      inFile.bufferChar := getc(inFile);
      result := getWhiteSpace(inFile);
      currSymbol := getSymbolOrComment(inFile);
      while currSymbol <> "" do
        result &:= currSymbol;
        result &:= getWhiteSpace(inFile);
        currSymbol := getSymbolOrComment(inFile);
      end while;
    end if;
  end func;
In the example above the function 'processFile' gathers all symbols, whitespace and comments in the string it returns. The string returned by 'processFile' is equivalent to the one returned by the function 'getf'. That way it is easy to test the scanner functionality. The logic with 'getWhiteSpace' and 'getSymbolOrComment' can be used to add HTML tags to comments and literals. The following function colors comments with green, string and char literals with maroon and numeric literals with purple:
const proc: sourceToHtml (inout file: inFile, inout file: outFile) is func
  local
    var string: currSymbol is "";
  begin
    inFile.bufferChar := getc(inFile);
    write(outFile, "<pre>\n");
    write(outFile, getWhiteSpace(inFile));
    currSymbol := getSymbolOrComment(inFile);
    while currSymbol <> "" do
      # Escape the characters that have a special meaning in HTML.
      currSymbol := replace(currSymbol, "&", "&amp;");
      currSymbol := replace(currSymbol, "<", "&lt;");
      if currSymbol[1] in {'"', '''} then
        write(outFile, "<font color=\"maroon\">");
        write(outFile, currSymbol);
        write(outFile, "</font>");
      elsif currSymbol[1] = '#' or startsWith(currSymbol, "(*") then
        write(outFile, "<font color=\"green\">");
        write(outFile, currSymbol);
        write(outFile, "</font>");
      elsif currSymbol[1] in digit_char then
        write(outFile, "<font color=\"purple\">");
        write(outFile, currSymbol);
        write(outFile, "</font>");
      else
        write(outFile, currSymbol);
      end if;
      write(outFile, getWhiteSpace(inFile));
      currSymbol := getSymbolOrComment(inFile);
    end while;
    write(outFile, "</pre>\n");
  end func;
The functions 'skipSpace' and 'skipWhiteSpace' are defined in the "scanfile.s7i" library as follows:
const proc: skipSpace (inout file: inFile) is func
  local
    var char: ch is ' ';
  begin
    ch := inFile.bufferChar;
    while ch = ' ' do
      ch := getc(inFile);
    end while;
    inFile.bufferChar := ch;
  end func;

const proc: skipWhiteSpace (inout file: inFile) is func
  begin
    while inFile.bufferChar in white_space_char do
      inFile.bufferChar := getc(inFile);
    end while;
  end func;
The functions 'skipComment' and 'skipLineComment', which can be used to skip Seed7 comments, are defined as follows:
const proc: skipComment (inout file: inFile) is func
  local
    var char: character is ' ';
  begin
    character := getc(inFile);
    repeat
      repeat
        while character not in special_comment_char do
          character := getc(inFile);
        end while;
        if character = '(' then
          character := getc(inFile);
          if character = '*' then
            skipComment(inFile);
            character := getc(inFile);
          end if;
        end if;
      until character = '*' or character = EOF;
      if character <> EOF then
        character := getc(inFile);
      end if;
    until character = ')' or character = EOF;
    if character = EOF then
      inFile.bufferChar := EOF;
    else
      inFile.bufferChar := getc(inFile);
    end if;
  end func; # skipComment

const proc: skipLineComment (inout file: inFile) is func
  local
    var char: character is ' ';
  begin
    repeat
      character := getc(inFile);
    until character = '\n' or character = EOF;
    inFile.bufferChar := character;
  end func; # skipLineComment