8. THE FILE SYSTEM
The file system is used for communication in various ways. For example: To write strings on the screen we use the following statements:
write("hello world");
writeln;
'writeln' means write newline. We can also write data of various types with 'write':
write("result = ");
write(number div 5);
write(" ");
writeln(not error);
The 'writeln' above writes data and then terminates the line. This is equal to a 'write' followed by a writeln. Instead of multiple write statements the '<&' operator can be used to concatenate the elements to be written:
writeln("result = " <& number div 5 <& " " <& not error);
The '<&' operator needs a 'string' as left operand and is overloaded for various types as right operand. To allow things like
write(next_time <& " \r");
the '<&' operator is also overloaded for various types as left operand and a 'string' as right operand. This allows you to concatenate several objects with '<&' when at least the first or the second object is a 'string'. We can also read data from the keyboard:
write("Amount? ");
read(amount);
The user is allowed to use backspace and sends the input to the program with the RETURN-key. To let the user respond with the RETURN-key we can write:
writeln("Type RETURN");
readln;
To read a line of data we can use 'readln':
write("Your comment? ");
readln(user_comment_string);
In the previous examples all 'read' statements read from the file IN and all 'write' statements write to the file OUT. The files IN and OUT are initialized with STD_IN and STD_OUT which are the stdin and stdout files of the operating system. (Usually the keyboard and the screen). When we want to write to other files we use write statements with the file as first parameter. To write a line of text to the file "info.fil" we use the following statements:
info_file := open("info.fil", "w");
writeln(info_file, "This is the first line of the info file.");
close(info_file);
First the external file is opened for writing and then it is used. To read the file back in the string 'stri' we write:
info_file := open("info.fil", "r");
readln(info_file, stri);
close(info_file);
It is also possible to write values of other types to 'info_file':
writeln(info_file, number);
Here the 'number' is converted to a string which is written to the file. A 'number' is read back with:
readln(info_file, number);
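The fragments above can be combined into one small program. This is a minimal sketch and not a definitive implementation: the variable names 'infoFile', 'stri' and 'number' and the value 42 are chosen for illustration only, and the check against STD_NULL anticipates the failure value of 'open' that is described in the subchapter about operating system files.

```seed7
$ include "seed7_05.s7i";

const proc: main is func
  local
    var file: infoFile is STD_NULL;   # Illustrative name
    var string: stri is "";
    var integer: number is 0;
  begin
    # Write one line of text and one number to the file.
    infoFile := open("info.fil", "w");
    if infoFile <> STD_NULL then
      writeln(infoFile, "This is the first line of the info file.");
      writeln(infoFile, 42);
      close(infoFile);
    end if;
    # Read both lines back in the same order.
    infoFile := open("info.fil", "r");
    if infoFile <> STD_NULL then
      readln(infoFile, stri);
      readln(infoFile, number);
      close(infoFile);
      writeln(stri);
      writeln("number = " <& number);
    end if;
  end func;
```

The values must be read back in the order in which they were written, because 'readln' for an 'integer' parses whatever the next line contains.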
For doing i/o to a window on the screen we write:
window1 := open_window(SCREEN, 10, 10, 5, 60);
box(window1);
setPos(window1, 3, 1);
write(window1, "hello there");
This opens the window 'window1' on the SCREEN at the position 10, 10. This window has 5 lines and 60 columns. A box (of characters: - | + ) is written to surround the 'window1' and finally the string "hello there" is written in the window 'window1' at position 3, 1. If we want to clear the 'window1' we write:
clear(window1);
Files can be used for many more things, and a file system has to meet several goals. In the following subchapters we discuss each of these goals.
8.1 Conversion to strings and back
We achieve the goal of doing i/o for arbitrary types with two conversion functions. In order to do i/o with a type the 'str' and 'parse' functions must be defined for that type. As an example we show the conversion functions for the type boolean:
const func string: str (in boolean: aBool) is func
result
var string: result is "";
begin
if aBool then
result := "TRUE";
else
result := "FALSE";
end if;
end func;
const func boolean: (attr boolean) parse (in string: stri) is func
result
var boolean: result is FALSE;
begin
if stri = "TRUE" then
result := TRUE;
elsif stri = "FALSE" then
result := FALSE;
else
raise RANGE_ERROR;
end if;
end func;
The 'str' function must deliver a corresponding string for every value of the type. The 'parse' function parses a string and delivers the converted value as result. If the conversion is not successful the exception RANGE_ERROR is raised. The attribute used with 'parse' allows it to be overloaded for different types. After defining the 'str' and 'parse' functions for a type the enable_io function can be called for this type as in:
enable_io(boolean);
The enable_io package declares various io functions like 'read', 'write' and others for the provided type (in this example 'boolean'). If only output (or only input) is needed for a type it is possible to define just 'str' (or 'parse') and activate just enable_output (or enable_input). There is also a formatting operator called 'lpad' which is based on the 'str' function. The statements
write(12 lpad 6);
write(3 lpad 6);
writeln(45 lpad 6);
write(678 lpad 6);
write(98765 lpad 6);
writeln(4321 lpad 6);
produce the following output:
    12     3    45
   678 98765  4321
As we see the 'lpad' operator can be used to produce right justified output. There is also a 'rpad' operator to produce left justified output. The basic definitions of the 'lpad' and 'rpad' operators work on strings and are as follows:
const func string: (ref string: stri) lpad (in integer: leng) is func
result
var string: result is "";
begin
if leng > length(stri) then
result := " " mult leng - length(stri) & stri;
else
result := stri;
end if;
end func;
const func string: (ref string: stri) rpad (in integer: leng) is func
result
var string: result is "";
begin
if leng > length(stri) then
result := stri & " " mult leng - length(stri);
else
result := stri;
end if;
end func;
The enable_io package contains definitions of 'lpad' and 'rpad' to work on the type specified with enable_io:
const func string: (in aType: aValue) lpad (in integer: leng) is func
result
var string: stri is "";
begin
stri := str(aValue) lpad leng;
end func;
const func string: (in aType: aValue) rpad (in integer: leng) is func
result
var string: stri is "";
begin
stri := str(aValue) rpad leng;
end func;
For 'float' values there is an additional way to convert them to strings. The 'digits' operator allows the specification of a precision. For example the statements
writeln(3.1415 digits 2);
writeln(4.0 digits 2);
produce the following output:
3.14
4.00
A combination with the 'lpad' operator as in
writeln(3.1415 digits 2 lpad 6);
writeln(99.9 digits 2 lpad 6);
is also possible and produces the following output:
  3.14
 99.90
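The mechanism of this subchapter can also be applied to a user defined type. The following is a minimal sketch under stated assumptions: the struct type 'point' and its fields are hypothetical and serve only as illustration, and only 'str' is defined, so enable_output is used instead of enable_io.

```seed7
$ include "seed7_05.s7i";

# Hypothetical example type, not part of any library.
const type: point is new struct
    var integer: x is 0;
    var integer: y is 0;
  end struct;

# The conversion to a string is all that output needs.
const func string: str (in point: aPoint) is
  return "(" <& aPoint.x <& "," <& aPoint.y <& ")";

# Declares 'write', 'writeln', 'lpad', 'rpad' and '<&' for 'point'.
enable_output(point);

const proc: main is func
  local
    var point: aPoint is point.value;
  begin
    aPoint.x := 3;
    aPoint.y := 4;
    writeln(aPoint);                       # Uses str(aPoint)
    writeln(aPoint lpad 12);               # Right justified in 12 columns
    writeln("position: " <& aPoint);
  end func;
```

Since no 'parse' function exists for 'point' in this sketch, 'read' and 'readln' are not available for it. Defining 'parse' and calling enable_io instead would add them.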
8.2 Basic input and output operations
To allow arbitrary user defined file-types besides the operating system files we chose a model in which the i/o methods are assigned to the type of the file-value and not to the type of the file-variable. This allows a file variable to point to any file-value. The file-variables have the type 'file' which has only the assignment method defined. For the operating system files and for each user defined file a file-type must be declared which has the i/o methods defined. These file-types are derived (directly or indirectly) from the type 'null_file' for which all i/o methods are defined on top of the basic string i/o methods. So for a new user defined file-type only the basic string i/o methods must be defined. The two basic i/o methods defined for the 'null_file' are
const proc: write (ref null_file param, in string param) is noop;
const string: gets (ref null_file param, ref integer param) is "";
This means that writing any string to the 'null_file' has no effect and reading any number of characters from the 'null_file' delivers the empty string. When a user defined file type is declared these are the two methods that must be redefined for the new file-type. Based upon these two methods three more methods are defined for the 'null_file' named 'getc', 'getwd' and 'getln'. These methods get a character, a word and a line respectively. A word is terminated by a space, a tab or a linefeed. A line is terminated by a linefeed. These methods need not be redefined for a user defined file type, but for performance reasons they can be redefined. The definitions for 'getc', 'getwd' and 'getln' for the 'null_file' are
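const func char: getc (ref null_file: aFile) is func
  result
    var char: ch is EOF;
  local
    var string: buffer is "";
  begin
    buffer := gets(aFile, 1);
    if buffer <> "" then  # An empty string means end of file
      ch := buffer[1];
    end if;
  end func;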
const func string: getwd (inout null_file: aFile) is func
result
var string: stri is "";
local
var string: buffer is "";
begin
repeat
buffer := gets(file conv aFile, 1);
until buffer <> " " and buffer <> "\t";
while buffer <> " " and buffer <> "\t" and
buffer <> "\n" and buffer <> "" do
stri &:= buffer;
buffer := gets(file conv aFile, 1);
end while;
if buffer = "" then
aFile.bufferChar := EOF;
else
aFile.bufferChar := buffer[1];
end if;
end func;
const func string: getln (inout null_file: aFile) is func
result
var string: stri is "";
local
var string: buffer is "";
begin
buffer := gets(file conv aFile, 1);
while buffer <> "\n" and buffer <> "" do
stri &:= buffer;
buffer := gets(file conv aFile, 1);
end while;
if buffer = "" then
aFile.bufferChar := EOF;
else
aFile.bufferChar := buffer[1];
end if;
end func;
Note that 'getwd' skips leading spaces and tabs while 'getc' and 'getln' do not. When 'getc', 'getwd' or 'getln' is not defined for a new user defined file type the declarations from the 'null_file' are used instead. These declarations are based on the method 'gets' which must be defined for every new user defined file-type. Note that there is an assignment to the variable 'bufferChar'. This variable is a component of 'null_file' and therefore also a component of all derived file types. This allows an eoln-function to test if the last 'getwd' or 'getln' reached the end of a line. Here is a definition of the eoln-function:
const func boolean: eoln (ref null_file: aFile) is func
result
var boolean: result is TRUE;
begin
result := aFile.bufferChar = '\n';
end func;
Besides assigning a value to 'bufferChar' in 'getwd' and 'getln' and using it in 'eoln' the standard 'file' functions do nothing with 'bufferChar'. The functions of the "scanfile.s7i" library use the 'bufferChar' variable as current character in the scan process. As such all functions of the "scanfile.s7i" library assume that the first character to be processed is always in 'bufferChar'. Since the standard 'file' functions do not have this behaviour, care has to be taken when mixing scanner and file functions. The next declarations allow various i/o operations for strings:
const proc: writeln (in file: aFile, in string: stri) is func
begin
write(aFile, stri);
writeln(aFile);
end func;
const proc: read (inout file: aFile, inout string: stri) is func
begin
stri := getwd(aFile);
aFile.io_empty := stri = "";
aFile.io_ok := TRUE;
end func;
const proc: readln (inout file: aFile, inout string: stri) is func
begin
stri := getln(aFile);
aFile.io_empty := stri = "";
aFile.io_ok := TRUE;
end func;
8.3 Input and output with conversion
Normally we need a combination of an i/o operation with a conversion operation. There are several functions which are based on the 'str' and 'parse' conversions and on the basic i/o-functions. The following declarations allow the 'write' function to be used for all types which define 'enable_io':
const proc: write (in file: aFile, in aType: aValue) is func
begin
write(aFile, str(aValue));
end func;
To allow the use of the 'read' and 'readln' functions the following declarations are made with 'enable_io':
const proc: read (inout file: aFile, inout aType: aValue) is func
local
var string: stri is "";
begin
stri := getwd(aFile);
aFile.io_empty := stri = "";
block
aValue := aType parse stri;
aFile.io_ok := TRUE;
exception
catch RANGE_ERROR:
aFile.io_ok := FALSE;
end block;
end func;
const proc: readln (inout file: aFile, inout aType: aValue) is func
local
var string: stri is "";
begin
stri := getln(aFile);
aFile.io_empty := stri = "";
block
aValue := aType parse stri;
aFile.io_ok := TRUE;
exception
catch RANGE_ERROR:
aFile.io_ok := FALSE;
end block;
end func;
The next two declarations define 'writeln' and 'backSpace':
const proc: writeln (ref external_file: aFile) is func
begin
write(aFile, "\n");
end func;
const proc: backSpace (ref external_file: aFile) is func
begin
write(aFile, "\b \b");
end func;
8.4 Simple read and write statements
The simple input/output statements for the standard i/o-files are 'read' and 'write' which are defined with 'enable_io'. Simple i/o may look like:
write("Amount? ");
read(amount);
'read' and 'write' use the files IN and OUT which are described in the next chapter. Here is the definition of the 'read' and 'write' procedures done with 'enable_io':
const proc: read (inout aType: aValue) is func
begin
read(IN, aValue);
end func;
const proc: readln (inout aType: aValue) is func
begin
readln(IN, aValue);
end func;
const proc: write (in aType: aValue) is func
begin
write(OUT, aValue);
end func;
const proc: writeln (in aType: aValue) is func
begin
write(OUT, aValue);
writeln(OUT);
end func;
Additional procedures defined outside of 'enable_io' are:
const proc: readln is func
local
var string: stri is "";
begin
stri := getln(IN);
IN.io_empty := stri = "";
IN.io_ok := TRUE;
end func;
const proc: read (NL) is func
begin
readln;
end func;
const proc: writeln is func
begin
writeln(OUT);
end func;
const proc: write (NL) is func
begin
writeln(OUT);
end func;
As an example when you call
readln(number);
the readln(integer) procedure calls
readln(IN, number);
If the file IN has not redeclared readln(IN, integer) this procedure calls
stri := getln(IN);
and 'getln' may call gets(IN, 1) in a loop or may be defined for the file IN. Finally the 'parse' function converts the string read into an 'integer' and assigns it to 'number':
number := integer parse stri;
8.5 Standard input and output files
The standard i/o files are OUT for output and IN for input. These two are file-variables which are declared as follows:
var file: IN is STD_IN;
var file: OUT is STD_OUT;
STD_IN and STD_OUT are the standard input and output files of the operating system (Usually the keyboard and the screen). Because IN and OUT are variables redirection of standard input or standard output can be done easily by assigning a new value to them:
IN := OTHER_FILE;
After that all 'read' statements refer to OTHER_FILE. Most operating systems also have a stderr file which can be accessed via the name STD_ERR. If you want to write error messages to the screen even when stdout is redirected elsewhere you can write:
writeln(STD_ERR, "ERROR MESSAGE");
To redirect the standard output to STD_ERR you can write:
OUT := STD_ERR;
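Because OUT is an ordinary variable, a redirection can also be temporary. The following sketch assumes a hypothetical log file name "output.log" and restores the original standard output afterwards. It illustrates the idea and is not a prescribed pattern:

```seed7
$ include "seed7_05.s7i";

const proc: main is func
  local
    var file: logFile is STD_NULL;   # Illustrative name
  begin
    logFile := open("output.log", "w");
    if logFile <> STD_NULL then
      OUT := logFile;                # Redirect the standard output
      writeln("This line goes to output.log");
      OUT := STD_OUT;                # Restore the standard output
      close(logFile);
    end if;
    writeln("This line goes to the screen again");
  end func;
```

Statements such as writeln(STD_ERR, ...) are not affected by this redirection, since they name their file explicitly.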
There is also a file STD_NULL defined. Anything written to it is ignored. Reading from it delivers empty strings. This file can be used to initialize file variables as in:
var file: MY_FILE is STD_NULL;
It is also used to represent an illegal file value when for example an 'open' procedure fails.
8.6 Access to operating system files
The access to operating system files is done via files of the types 'external_file', KEYBOARD_FILE and SCREEN_FILE. The type 'external_file' is defined as:
const type: external_file is sub null_file struct
    var PRIMITIVE_FILE: ext_file is PRIMITIVE_NULL_FILE;
    var string: name is "";
  end struct;
This means that every data item of the type 'external_file' has the components from 'null_file' and additionally the components ext_file and name. Note the type PRIMITIVE_FILE which points directly to an operating system file. Objects of type PRIMITIVE_FILE can only have operating system files as values while objects of type 'file' can also have other files as values. To allow the implementation of the type 'external_file' several operations for the type 'PRIMITIVE_FILE' are defined. But outside 'external_file' the type 'PRIMITIVE_FILE' and its operations should not be used. There are three predefined external files STD_IN, STD_OUT and STD_ERR which have the following declarations:
const func external_file: INIT_STD_FILE (ref PRIMITIVE_FILE: primitive_file,
in string: file_name) is func
result
var external_file: result is external_file.value;
begin
result.ext_file := primitive_file;
result.name := file_name;
end func;
var external_file: STD_IN is INIT_STD_FILE(PRIMITIVE_INPUT, "STD_IN");
var external_file: STD_OUT is INIT_STD_FILE(PRIMITIVE_OUTPUT, "STD_OUT");
var external_file: STD_ERR is INIT_STD_FILE(PRIMITIVE_ERROR, "STD_ERR");
It is possible to do i/o directly with them, but it is wiser to use them only to initialize user defined file variables as in:
var file: ERR is STD_ERR;
In the rest of the program references to such a variable can be used:
writeln(ERR, "Some error occurred");
In this case redirection of the ERR file can be done very easily. The second way to access external_files is to use the 'open' function. Usually a file variable is declared
var file: MY_OUT is STD_NULL;
and the result of the 'open' function is assigned to this file variable
MY_OUT := open("my_file", "w");
If the 'open' has failed it returns STD_NULL so we must check the file variable to be on the safe side
if MY_OUT <> STD_NULL then
After that output to MY_OUT is possible with
writeln(MY_OUT, "hi there");
As stated earlier STD_IN provides an interface to the keyboard which is line buffered and echoed on STD_OUT. This means that you can see everything you typed in and correct it with BACKSPACE until you press RETURN. But sometimes an unbuffered and unechoed input is needed. This is provided with the file KEYBOARD. These are the declarations of the type KEYBOARD_FILE and the file KEYBOARD itself:
const type: KEYBOARD_FILE is subtype file;
var KEYBOARD_FILE: KEYBOARD is SCREEN_KEYBOARD;
Reading from KEYBOARD may deliver simple ascii characters or special codes for function keys. These special codes can also be stored in character variables because the type 'char' is not limited to 8 bits. For each function key there is a predefined constant that can be used to test which key is pressed. In addition to the operations possible with other files there are two functions that are applicable only to the file KEYBOARD. The busy_getc(KEYBOARD) function delivers the next character from the keyboard or the character KEY_NONE if no key has been pressed. The busy_gets(KEYBOARD, integer) function delivers a string consisting of the characters in the keyboard buffer, up to the maximum length specified in the second parameter.
To allow random access output to a text screen (or text window) the type SCREEN_FILE is defined. The function
open(SCREEN_FILE)
returns a SCREEN_FILE.
8.7 User defined file types
In addition to the predefined file types it is often necessary to define a new type of file. Such a new file has several possibilities:
With the following declaration we define a new file type:
const type: NEW_FILE is sub null_file struct
...
(* Local data *)
...
end struct;
It is not necessary to derive the NEW_FILE type directly from 'null_file'. The NEW_FILE type may also be an indirect descendant of 'null_file'. So it is possible to create file type hierarchies. The interface implemented by the new file also needs to be specified:
type_implements_interface(NEW_FILE, file);
The type file is not the only interface type which can be used. There is also the type text which is derived from file. The type text describes a line oriented file which allows 'setPos' (which moves the current position to the line and column specified) and other functions. It is also possible to define new interface types which derive from file or text. Next an open function is needed to create a new NEW_FILE:
const func file: open_new_file ( (* Parameters *) ) is func
result
var file: result is STD_NULL;
begin
...
(* Initialisation of the local data *)
result := malloc( ... );
...
end func;
Note the usage of the 'malloc' function to generate a new data item. This data item is not freed automatically, but if you do not open files too often this does not hurt. Now only the two basic i/o operations must be defined:
const proc: write (inout NEW_FILE: new_fil, in string: stri) is func
begin
...
(* Statements that do the output *)
...
end func;
const func string: gets (inout NEW_FILE: new_fil, in integer: leng) is func
  result
    var string: stri is "";
  begin
    ...
    (* Statements that do the input *)
    ...
  end func;
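The steps of this subchapter can be put together in a small output filter. This is only a sketch under stated assumptions: the type 'shoutFile', its component 'destFile' and the function 'openShout' are hypothetical names, and the file value is created with 'toInterface' as current Seed7 releases do it, which is an assumption that differs from the 'malloc' call shown above. Only 'write' is redefined, so reading from such a file delivers empty strings, as inherited from 'null_file'.

```seed7
$ include "seed7_05.s7i";

# Hypothetical file type that forwards all output in upper case.
const type: shoutFile is sub null_file struct
    var file: destFile is STD_NULL;
  end struct;

type_implements_interface(shoutFile, file);

# Open function: creates the file value and returns it as 'file'.
const func file: openShout (in file: destFile) is func
  result
    var file: newFile is STD_NULL;
  local
    var shoutFile: new_shoutFile is shoutFile.value;
  begin
    new_shoutFile.destFile := destFile;
    newFile := toInterface(new_shoutFile);   # Assumption: current Seed7 API
  end func;

# The basic output method of the new file type.
const proc: write (inout shoutFile: outFile, in string: stri) is func
  begin
    write(outFile.destFile, upper(stri));
  end func;

const proc: main is func
  local
    var file: loud is STD_NULL;
  begin
    loud := openShout(OUT);
    write(loud, "hello there");   # Dispatches to write for shoutFile
    writeln;                      # Line end written directly to OUT
  end func;
```

The variable 'loud' has the type 'file', so the i/o method is selected by the type of the file value it refers to, which is the model described for the basic input and output operations.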
The I/O concept introduced in the previous chapters separates the input of data from its conversion. The 'read', 'readln', 'getwd' and 'getln' functions are designed to read whitespace separated data elements. When the data elements are not separated by whitespace characters this I/O concept cannot be used. Instead the functions which read from the file need some knowledge about the type which they intend to read. Fortunately this is a well researched area. The lexical scanners used by compilers solve exactly this problem.
Lexical scanners read symbols from a file and use the concept of a current character. A symbol can be a name, a number, a string, an operator, a parenthesis or something else. The current character is the first character to be processed when scanning a symbol. After a scanner has read a symbol the current character contains the character just after the symbol. This character could be the first character of the next symbol or some whitespace character. If the set of symbols is chosen wisely all decisions about the type of the symbol and when to stop reading characters for a symbol can be made based on the current character.
Every 'file' contains a 'bufferChar' variable which is used as current character by the scanner functions defined in the "scanfile.s7i" library. The "scanfile.s7i" library contains skip... and get... functions. The skip... procedures return void and are used to skip input while the get... functions return the string of characters they have read. The following basic scanner functions are defined in the "scanfile.s7i" library:
Contrary to 'read' and 'getwd' basic scanner functions do not skip leading whitespace characters. To skip whitespace characters one of the following functions can be used:
The advanced scanner functions do skip whitespace characters before reading a symbol:
All scanner functions assume that the first character to be processed is in 'bufferChar' and after they are finished the next character which should be processed is also in 'bufferChar'. To use scanner functions for a new opened file it is necessary to assign the first character to the 'bufferChar' with:
myFile.bufferChar := getc(myFile);
In most cases whole files are either processed with normal I/O functions or with scanner functions. When normal I/O functions need to be combined with scanner functions care has to be taken:
Scanner functions are helpful when it is necessary to read numeric input without failing when no digits are present:
skipWhiteSpace(IN);
if eoln(IN) then
writeln("empty input");
elsif IN.bufferChar in {'0' .. '9'} then
number := integer parse getDigits(IN);
skipLine(IN);
writeln("number " <& number);
else
stri := getLine(IN);
writeln("command " <& literal(stri));
end if;
The function 'getSymbol' is designed to read Seed7 symbols. When it returns "" the end of the file is reached. With 'getSymbol' name-value pairs can be read:
name := getSymbol(inFile);
while name <> "" do
  if name <> "#" and getSymbol(inFile) = "=" then
    aValue := getSymbol(inFile);
    if aValue <> "" then
      if aValue[1] = '"' then
        # String symbol: store the value without the leading quote
        keyValueHash @:= [name] aValue[2 ..];
      elsif aValue[1] in {'0' .. '9'} then
        keyValueHash @:= [name] aValue;
      end if;
    end if;
  end if;
  name := getSymbol(inFile);  # Advance to the next name
end while;
The following loop can be used to process the symbols of a Seed7 program:
inFile.bufferChar := getc(inFile);
currSymbol := getSymbol(inFile);
while currSymbol <> "" do
... process currSymbol ...
currSymbol := getSymbol(inFile);
end while;
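The loop above can be tried without an external file by scanning a string. The following sketch assumes the function 'openStriFile' from the "strifile.s7i" library of current Seed7 releases (older releases may use a different spelling), and the scanned text is made up for illustration:

```seed7
$ include "seed7_05.s7i";
  include "scanfile.s7i";
  include "strifile.s7i";

const proc: main is func
  local
    var file: inFile is STD_NULL;
    var string: currSymbol is "";
  begin
    # Assumption: openStriFile opens a file that reads from the given string.
    inFile := openStriFile("sum := 12 + x; # trailing comment");
    # Scanner functions expect the first character in 'bufferChar'.
    inFile.bufferChar := getc(inFile);
    currSymbol := getSymbol(inFile);
    while currSymbol <> "" do
      writeln(literal(currSymbol));   # One symbol per line
      currSymbol := getSymbol(inFile);
    end while;
  end func;
```

Each symbol is written as a string literal on its own line, which makes it easy to see where the scanner splits the input.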
Whitespace and comments are automatically skipped with the function 'getSymbol'. When comments should also be returned the function 'getSymbolOrComment' can be used. Together with the function 'getWhiteSpace' it is even possible to get the whitespace between the symbols:
const func string: processFile (in string: fileName) is func
result
var string: result is "";
local
var file: inFile is STD_NULL;
var string: currSymbol is "";
begin
inFile := open(fileName, "r");
if inFile <> STD_NULL then
inFile.bufferChar := getc(inFile);
result := getWhiteSpace(inFile);
currSymbol := getSymbolOrComment(inFile);
while currSymbol <> "" do
result &:= currSymbol;
result &:= getWhiteSpace(inFile);
currSymbol := getSymbolOrComment(inFile);
end while;
end if;
end func;
In the example above the function 'processFile' gathers all symbols, whitespace and comments in the string it returns. The string returned by 'processFile' is equivalent to the one returned by the function 'getf'. That way it is easy to test the scanner functionality. The logic with 'getWhiteSpace' and 'getSymbolOrComment' can be used to add HTML tags to comments and literals. The following function colors comments with green, string and char literals with maroon and numeric literals with purple:
const proc: sourceToHtml (inout file: inFile, inout file: outFile) is func
local
var string: currSymbol is "";
begin
inFile.bufferChar := getc(inFile);
write(outFile, "<pre>\n");
write(outFile, getWhiteSpace(inFile));
currSymbol := getSymbolOrComment(inFile);
while currSymbol <> "" do
currSymbol := replace(currSymbol, "&", "&amp;");
currSymbol := replace(currSymbol, "<", "&lt;");
if currSymbol[1] in {'"', '''} then
write(outFile, "<font color=\"maroon\">");
write(outFile, currSymbol);
write(outFile, "</font>");
elsif currSymbol[1] = '#' or startsWith(currSymbol, "(*") then
write(outFile, "<font color=\"green\">");
write(outFile, currSymbol);
write(outFile, "</font>");
elsif currSymbol[1] in digit_char then
write(outFile, "<font color=\"purple\">");
write(outFile, currSymbol);
write(outFile, "</font>");
else
write(outFile, currSymbol);
end if;
write(outFile, getWhiteSpace(inFile));
currSymbol := getSymbolOrComment(inFile);
end while;
write(outFile, "</pre>\n");
end func;
The functions 'skipSpace' and 'skipWhiteSpace' are defined in the "scanfile.s7i" library as follows:
const proc: skipSpace (inout file: inFile) is func
local
var char: ch is ' ';
begin
ch := inFile.bufferChar;
while ch = ' ' do
ch := getc(inFile);
end while;
inFile.bufferChar := ch;
end func;
const proc: skipWhiteSpace (inout file: inFile) is func
begin
while inFile.bufferChar in white_space_char do
inFile.bufferChar := getc(inFile);
end while;
end func;
The functions 'skipComment' and 'skipLineComment', which can be used to skip Seed7 comments, are defined as follows:
const proc: skipComment (inout file: inFile) is func
local
var char: character is ' ';
begin
character := getc(inFile);
repeat
repeat
while character not in special_comment_char do
character := getc(inFile);
end while;
if character = '(' then
character := getc(inFile);
if character = '*' then
skipComment(inFile);
character := getc(inFile);
end if;
end if;
until character = '*' or character = EOF;
if character <> EOF then
character := getc(inFile);
end if;
until character = ')' or character = EOF;
if character = EOF then
inFile.bufferChar := EOF;
else
inFile.bufferChar := getc(inFile);
end if;
end func; # skipComment
const proc: skipLineComment (inout file: inFile) is func
local
var char: character is ' ';
begin
repeat
character := getc(inFile);
until character = '\n' or character = EOF;
inFile.bufferChar := character;
end func; # skipLineComment