Part 2 — The Library cxxxix
_:int The precision, in the range (1, 65535).
_:fppPad> An element specifying the padding character.
<DefaultFppPad> Use the default pad character for output, i.e. a space.
<FppPad Specify the precision and padding character for basic output.
_:int> The decimal Unicode number of the padding character, in the range (0, 65535). See table 2 on page 16.
Table 10: Styling elements for basic file input and output.
22.6 Programming note
You may find it helpful to use the basic input/output paths described in this chapter in conjunction with the path alu.unicode.cat, see chapter 7.5 on page 53. An output message is prepared in three steps:
1. Use io.format.itouni, io.format.etouni, io.format.ftouni, io.format.gtouni and io.format.utouni to prepare the components of the message.
2. Use alu.unicode.cat to pull the components together into a complete unicode message.
3. Use io.file.uwrite to display the message.
See template catch_ping in program ping for an example.
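The three steps can be sketched outside Pint. The Python below is an analogy only, not the Pint library: the functions itouni, ftouni, cat and uwrite are hypothetical stand-ins named after the Pint paths, their signatures are assumptions, and an in-memory buffer stands in for the output stream.

```python
import io

def itouni(value: int) -> str:
    """Hypothetical stand-in for io.format.itouni: integer to text."""
    return str(value)

def ftouni(value: float, precision: int = 6) -> str:
    """Hypothetical stand-in for io.format.ftouni: real to [-]ddd.ddd text."""
    return f"{value:.{precision}f}"

def cat(parts: list) -> str:
    """Hypothetical stand-in for alu.unicode.cat: join the components."""
    return "".join(parts)

def uwrite(stream, message: str) -> None:
    """Hypothetical stand-in for io.file.uwrite: write the message."""
    stream.write(message)

def report(stream, count: int, mean: float) -> None:
    # Step 1: prepare the components of the message.
    parts = ["count = ", itouni(count), ", mean = ", ftouni(mean, 2), "\n"]
    # Step 2: pull the components together into one message.
    message = cat(parts)
    # Step 3: display the message.
    uwrite(stream, message)

buffer = io.StringIO()
report(buffer, 3, 2.5)
```

Keeping the steps separate means each component is converted by a function of one fixed type, which is the property the next chapter relies on.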
23 Formatted input and output
It is often convenient to specify a message to be printed by marking it up with “tags” which indicate which parts are text, and where integers, reals and other texts are to be inserted. Examples of such markup are Erlang’s io:format function and C’s printf function.
The problem with printf-like functions is that they do not work in a strictly typed language such as ML, or Pint. Olivier Danvy [Dan98] describes the problem:
In ML, expressing a printf-like function is not as trivial as in C. For example, we would like that evaluating the expression
format "%i is %s%n" 3 "x"
yields the string "3 is x\n", as specified in the pattern "%i is %s%n", which tells format to issue an integer, followed by the constant string " is ", itself followed by a string and ended by the newline character.
What is the type of format? In this example it is
string -> int -> string -> string
but we would like our printf-like function to handle any kind of pattern. For example, we would like
format "%i/%i" 10 20
to yield "10/20". In that example, format is used with the type
string -> int -> int -> string
However we cannot do that in ML: format can only have one type.
The Pint language solves the problem by marking up the format specification as an element rather than a string. The values are included in nested elements rather than curried behind the specification string.
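The idea of carrying each value inside its own tagged element can be sketched in Python. This is an illustration under assumptions, not Pint code: the class names IntField, TextField and Newline are hypothetical counterparts of the _i, _u and _n elements, and render is a hypothetical helper. The point is that render has a single type (a list of fields in, a string out) whatever the pattern is.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class IntField:
    """Hypothetical counterpart of an _i element: carries an integer."""
    value: int

@dataclass(frozen=True)
class TextField:
    """Hypothetical counterpart of a _u element: carries a text."""
    value: str

@dataclass(frozen=True)
class Newline:
    """Hypothetical counterpart of the _n element."""

Field = Union[IntField, TextField, Newline]

def render(fields: list) -> str:
    """Render any list of fields; the function needs only one type."""
    out = []
    for field in fields:
        if isinstance(field, IntField):
            out.append(str(field.value))
        elif isinstance(field, TextField):
            out.append(field.value)
        elif isinstance(field, Newline):
            out.append("\n")
        else:
            raise TypeError(f"not a format field: {field!r}")
    return "".join(out)

# The two patterns from the quotation, with the values nested in the fields.
first = render([IntField(3), TextField(" is "), TextField("x"), Newline()])
second = render([IntField(10), TextField("/"), IntField(20)])
```

Here first is "3 is x" followed by a newline and second is "10/20", matching the two examples in the quotation.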
The specification has type format where
822 elatt format = Format of format_field list
where the type format_field is defined by
The final letter of each element’s identifier, e.g. the “i” in _fppi, indicates the formatting style:
Final letter   Input effect for fread
i              As io.format.ureadi, chapter 22.2 on page 142
e, f, g        As io.format.ureadf, chapter 22.2 on page 142
u              As io.format.ureadu, chapter 22.2 on page 142
n              As io.format.ureadw, chapter 22.2 on page 143
r              As described by the nested Format . . . element.
When the specification is used by fread, the precision and padding character specifications are ignored.
Final letter   Output formatting style for fwrite
i              As io.format.itouni, chapter 22.3 on page 143
e              As io.format.etouni, chapter 22.3 on page 143
f              As io.format.ftouni, chapter 22.3 on page 143
g              As io.format.gtouni, chapter 22.3 on page 143
u              As io.format.utouni, chapter 22.3 on page 143
n              Insert a newline character
r              As described by the nested Format . . . element.
For example, an element _fppe . . . corresponds to ~F.P1.P2c in Erlang and produces the same effect as the Erlang specification:
• F specifies minimum field width, left justified if negative.
• P1 specifies precision.
• P2 specifies padding character. Default = space.
• c specifies i = integer, e,f,g = real, u = unicode, n = newline and r = recursive specification.
Pattern Description
<_b Format a boolean as if it were a text “true” or “false”, i.e. a string of Unicode numbers.
_:bool> The value to be formatted.
<_fb Format a boolean as if it were a text “true” or “false”, i.e. a string of Unicode numbers.
_:int The minimum field width, left justified value if negative.
_:bool> The value to be formatted.
<_fpb Format a boolean as if it were a text “true” or “false”, i.e. a string of Unicode numbers.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:bool> The value to be formatted.
<_fppb Format a boolean as if it were a text “true” or “false”, i.e. a string of Unicode numbers.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:bool> The value to be formatted.
<_i Format an integer value.
_:int> The value to be formatted.
<_fi Format an integer value.
_:int The minimum field width, left justified value if negative.
_:int> The value to be formatted.
<_fpi Format an integer value.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:int> The value to be formatted.
<_fppi Format an integer value.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:int> The value to be formatted.
<_e Format a floating point value in the style [-]d.ddde±ddd
_:real> The value to be formatted.
<_fe Format a floating point value in the style [-]d.ddde±ddd
_:int The minimum field width, left justified value if negative.
_:real> The value to be formatted.
<_fpe Format a floating point value in the style [-]d.ddde±ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The number of digits written. The default precision is 6.
_:real> The value to be formatted.
<_fppe Format a floating point value in the style [-]d.ddde±ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The number of digits written. The default precision is 6.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:real> The value to be formatted.
<_f Format a floating point value in the style [-]ddd.ddd
_:real> The value to be formatted.
<_ff Format a floating point value in the style [-]ddd.ddd
_:int The minimum field width, left justified value if negative.
_:real> The value to be formatted.
<_fpf Format a floating point value in the style [-]ddd.ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The number of digits after the decimal point. The default precision is 6.
_:real> The value to be formatted.
<_fppf Format a floating point value in the style [-]ddd.ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The number of digits after the decimal point. The default precision is 6.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:real> The value to be formatted.
<_g If floating point value f is in the range 0.1 < f < 10^4 then format it in the style [-]ddd.ddd, otherwise format it in the style [-]d.ddde±ddd
_:real> The value to be formatted.
<_fg If floating point value f is in the range 0.1 < f < 10^4 then format it in the style [-]ddd.ddd, otherwise format it in the style [-]d.ddde±ddd
_:int The minimum field width, left justified value if negative.
_:real> The value to be formatted.
<_fpg If floating point value f is in the range 0.1 < f < 10^4 then format it in the style [-]ddd.ddd, otherwise format it in the style [-]d.ddde±ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. If floating point value f is in the range 0.1 < f < 10^4, the precision is the number of digits after the decimal point, otherwise the precision is the number of digits written. The default precision is 6.
_:real> The value to be formatted.
<_fppg If floating point value f is in the range 0.1 < f < 10^4 then format it in the style [-]ddd.ddd, otherwise format it in the style [-]d.ddde±ddd
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. If floating point value f is in the range 0.1 < f < 10^4, the precision is the number of digits after the decimal point, otherwise the precision is the number of digits written. The default precision is 6.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:real> The value to be formatted.
<_u Format a text, i.e. a string of Unicode numbers.
_:unicode> The value to be formatted.
<_fu Format a text, i.e. a string of Unicode numbers.
_:int The minimum field width, left justified value if negative.
_:unicode> The value to be formatted.
<_fpu Format a text, i.e. a string of Unicode numbers.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:unicode> The value to be formatted.
<_fppu Format a text, i.e. a string of Unicode numbers.
_:int The width of the field in which the value is displayed, left justified value if negative.
_:int The precision. The sequence of characters representing the value will be truncated or padded out to fill a string of this length.
_:int The Unicode number of a padding character to be used for field width and precision. The default value is 32, a space.
_:unicode> The value to be formatted.
<_n> Insert a new line sequence into the output.
<_r Format a substring specified using recursive element <Format>.
_:format> The substring to be formatted.
Table 11: Fields used by fwrite and fread.
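The choice made by the g-style fields between the two real styles can be sketched as follows. This Python is an approximation under stated assumptions, not the Pint implementation: the function name format_g is hypothetical, the range test is applied to the magnitude of the value (an assumption, so that negative values are treated like positive ones), and Python prints at least two exponent digits, so the exponent layout differs from [-]d.ddde±ddd.

```python
def format_g(value: float, precision: int = 6) -> str:
    """Pick the f style inside the range 0.1 < |value| < 10^4, else the e style.

    In the f style the precision counts digits after the decimal point.
    In the e style it counts all digits written, so one digit sits before
    the decimal point and precision - 1 digits follow it.
    """
    if precision < 1:
        raise ValueError("precision must be at least 1")
    if 0.1 < abs(value) < 10 ** 4:
        return f"{value:.{precision}f}"
    return f"{value:.{precision - 1}e}"

inside = format_g(3.14159, 3)     # f style, three digits after the point
too_big = format_g(123456.0, 3)   # e style, three digits written
too_small = format_g(0.05, 3)     # e style, three digits written
```

With these arguments the three results are "3.142", "1.23e+05" and "5.00e-02".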
The templates in this library are made available by placing the declaration
850 import
851 "-//Pint//NONSGML Formatted input output library//EN"
at the head of your program. It is not a problem if multiple programs import the same file: only the first such declaration applies. The others are ignored.
fwrite Receives record {stream spec c e}, where
• stream is the identifier of the stream to which the characters are to be written.
If stream is specified as stdOut the output is displayed on the screen, if stream is specified as stdErr the output is placed in the log file, and if stream is specified as stdNull the output disappears forever. In these cases it is not necessary to open any file.
• spec is the specification of type format. See line 23.
• c is a path on which a handclap {} will be sent if the operation is successful.
• e is a path on which a handclap {} will be sent if an error occurs.
An example will make this clearer. First we define a format specification:
852 val Spec =
853 <Format [<_u "He asked \"">,
854 <_r Format [_u "Is the answer ",
855 _i 3,
856 _u "?"]>,
857 <_u "\",">,
858 <_n>,
859 <_u "and she replied \"">,
860 <_r Format [_u "No, its ",
861 _fppf ~10 3 183 3.14159,
862 _u "."]>,
863 <_u "\".">,
864 <_n>]>
Note on lines 854 and 860 the recursive use of the Format element to specify the quoted sentences.
Note on line 861 that the element _fppf . . . specifies the field width F to be ~10 which means that the characters will be left-justified in the field. The precision is 3 digits after the decimal point, and the pad character is the center dot, ISO Latin 9 character position 183. The value supplied, 3.14159, is sufficient for the required precision.
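The effect of the width, precision and pad arguments on line 861 can be checked with a short Python sketch. It is an illustration only: the function name format_fppf is hypothetical, and right-justification for a positive width is an assumption inferred from the rule that a negative width left-justifies.

```python
def format_fppf(width: int, precision: int, pad: int, value: float) -> str:
    """Format a real in the style [-]ddd.ddd inside a padded field.

    width     -- minimum field width; negative means left-justified
    precision -- number of digits after the decimal point
    pad       -- Unicode number of the padding character
    """
    text = f"{value:.{precision}f}"
    fill = chr(pad)
    if width < 0:
        return text.ljust(-width, fill)   # value first, padding after
    return text.rjust(width, fill)        # assumed: padding first

# The arguments of line 861: width -10, precision 3, pad 183 (center dot).
field = format_fppf(-10, 3, 183, 3.14159)
```

The value rounds to 3.142, which occupies five characters, so five center dots fill the rest of the ten-character field.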
The following program fragment
865 new k2 new k3 new k4 new k5 new e
866 offer
867 ( io.file.open!Open "/tmp/test" [<ModeWrite>] k2 e
868 || k2?Stream = ( fwrite!{Stream Spec k3 e}
869 || k4!Stream
870 )
871 || k3?_ = k4?Stream = io.file.close!Close Stream k5
872 || k5?_ = ¶fin
873 || e?_ = ¶postmortem
874 )
places the result
875 He asked "Is the answer 3?",
876 and she replied "No, its 3.142·····.".
in file /tmp/test. Note on line 867 that the element Open . . . provides the output file name and the list of modes, each of type mode, see line 806. The resulting stream identifier of type (stream file) is sent on path k2. On line 868 the stream identifier is recovered and used by fwrite, and concurrently forwarded on path k4.
On line 871 the stream identifier is not recovered until the handclap {} has been received on path k3 indicating that output is complete and the file may be safely closed.
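The ordering guaranteed by lines 867 to 872 can be imitated in Python with asyncio queues standing in for the Pint paths. This is a loose analogy under assumptions, not a translation: the function run_fragment and its inner coroutines are hypothetical, the error path e is omitted, the text is written directly instead of through a format specification, and a temporary file replaces /tmp/test.

```python
import asyncio
import os
import tempfile

async def run_fragment(path: str, text: str) -> None:
    # One queue per Pint path k2..k5.
    k2, k3, k4, k5 = (asyncio.Queue() for _ in range(4))

    async def open_file():            # compare line 867
        await k2.put(open(path, "w", encoding="utf-8"))

    async def write_and_forward():    # compare lines 868 to 870
        stream = await k2.get()
        stream.write(text)
        # Send the handclap on k3 and forward the stream on k4.
        await asyncio.gather(k3.put(()), k4.put(stream))

    async def close_when_done():      # compare line 871
        await k3.get()                # wait for the handclap first
        stream = await k4.get()       # only then recover the stream
        stream.close()
        await k5.put(())

    async def finish():               # compare line 872
        await k5.get()

    await asyncio.gather(open_file(), write_and_forward(),
                         close_when_done(), finish())

path = os.path.join(tempfile.mkdtemp(), "test")
asyncio.run(run_fragment(path, 'He asked "Is the answer 3?",\n'))
with open(path, encoding="utf-8") as handle:
    written = handle.read()
```

As in the Pint fragment, the stream is taken from k4 only after the handclap has arrived on k3, so the file cannot be closed before the write is complete.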
fwrite_to_cont Receives record {spec c}, where
• spec is the specification of type format. See line 23.
• c is a path on which the formatted string will be sent.
The formatting done is exactly the same as fwrite.
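A Python sketch of the same idea, formatting a nested specification into a string and handing it to a continuation, is given below. It is an assumption-laden illustration rather than the library's behaviour: fields are modelled as hypothetical (tag, value) tuples using the final letters u, i, n and r, width and precision are left out, and a Python callback plays the role of path c.

```python
from typing import Callable

def render(spec: list) -> str:
    """Render a list of (tag, value) fields; an 'r' field nests another list."""
    out = []
    for field in spec:
        tag = field[0]
        if tag == "u":                    # text
            out.append(field[1])
        elif tag == "i":                  # integer
            out.append(str(field[1]))
        elif tag == "n":                  # newline
            out.append("\n")
        elif tag == "r":                  # recursive specification
            out.append(render(field[1]))
        else:
            raise ValueError(f"unknown field tag: {tag!r}")
    return "".join(out)

def fwrite_to_cont(spec: list, c: Callable[[str], None]) -> None:
    """Hypothetical analogue: send the formatted string to continuation c."""
    c(render(spec))

# First sentence of the example specification, in the tuple notation.
spec = [("u", 'He asked "'),
        ("r", [("u", "Is the answer "), ("i", 3), ("u", "?")]),
        ("u", '",'),
        ("n",)]
received = []
fwrite_to_cont(spec, received.append)
```

After the call, received holds a single string, the first line of the example output followed by a newline.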
fread Receives record {stream spec c eof e}, where
• stream is the identifier of the stream from which the characters are to be read.
If stream is specified as stdIn the characters are taken from the terminal. In this case it is not necessary to open any file.
• spec is the specification of type format. See line 23.
• c is a path on which a specification of type format will be sent if the operation is successful.
• eof is a path on which a possibly partial specification of type format will be sent if the operation was terminated prematurely by an end-of-file marker.
• e is a path on which a handclap {} will be sent if an error occurs.
This library template is currently not supported due to an anomaly in the Erlang runtime system.
24 Token scanning
This chapter describes a lexical analysis of Unicode values. It returns tokens based on those of the Pint language itself.
The library template is made available by placing the declaration
877 import "-//Pint//DTD Token scanner library//EN"
at the head of your program. It is not a problem if multiple programs import the same file: only the first such declaration applies. The others are ignored.
The Pint language provides a library path which analyzes values of type unicode, extracting a sequence of white-space separated tokens. The tokens belong to type tok, and fall into four classes: keywords, operators, processes and values. The keywords are defined by:
878 elatt tok =
879 T_kw_and of int * int | T_kw_anon of int * int
880 | T_kw_as of int * int | T_kw_bool of int * int
889 | T_kw_stdErr of int * int | T_kw_stdIn of int * int
890 | T_kw_stdNull of int * int | T_kw_stdOut of int * int
891 | T_kw_stream of int * int | T_kw_then of int * int
892 | T_kw_unicode of int * int | T_kw_val of int * int
and correspond to the keywords of the Pint language defined in chapter 4 on page 21.
The two integers provide the line number and column number at which the token was found in the Unicode value.
The operators are defined by:
893 elatt tok =
894 T_op_caret of int * int | T_op_colon of int * int
895 | T_op_comma of int * int | T_op_div of int * int
896 | T_op_dlquot of int * int | T_op_dot of int * int
897 | T_op_drquot of int * int | T_op_dvbar of int * int
898 | T_op_equals of int * int | T_op_input of int * int
899 | T_op_lang of int * int | T_op_lbra of int * int
900 | T_op_lcurl of int * int | T_op_lguill of int * int
901 | T_op_lpar of int * int | T_op_minus of int * int
902 | T_op_output of int * int | T_op_plus of int * int
903 | T_op_prod of int * int | T_op_rang of int * int
904 | T_op_rbra of int * int | T_op_rcurl of int * int
905 | T_op_rguill of int * int | T_op_rpar of int * int
906 | T_op_slash of int * int | T_op_times of int * int
907 | T_op_vbar of int * int
and again the two integers provide the line number and column number at which the token was found in the Unicode value.
The built-in processes are defined by:
908 elatt tok =
909 T_pr_die of int * int | T_pr_fin of int * int
910 | T_pr_post of int * int
Values are defined by:
911 elatt tok =
912 T_int of int * int * int
913 | T_name of unicode * int * int
914 | T_real of real * int * int
915 | T_unicode of unicode * int * int
For example, the Unicode value
916 "val Y = {~13 3.14 3.3e~10 \n \"VoilU+00e0\"}"
is tokenized as
917 [<T_kw_val 1 1>, <T_name "Y" 1 5>, <T_op_equals 1 7>,
918 <T_op_lcurl 1 9>, <T_int ~13 1 10>, <T_real 3.14 1 14>,
919 <T_real 3.3e~10 1 19>, <T_unicode "Voilà" 2 2>,
920 <T_op_rcurl 2 15>]
Note in line 916 that the quote marks around the contained Unicode value are escaped to prevent confusion with the outer quote marks.
Pint comments may be placed between the tokens, and will be ignored:
921 "hello % Line comment
922 35 (* block comment
923 over two lines *) 36"
is tokenized as
924 [<T_name "hello" 1 1>, <T_int 35 2 1>, <T_int 36 3 25>]
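How a scanner can attach line and column numbers to tokens while skipping comments is sketched below in Python. It is a minimal illustration, not the Pint scanner: it recognises only names and unsigned integers, the tuple layout and the function name scan are assumptions, and columns are counted from 1. The input is the comment example above typed without any indentation, so the last column differs from the one shown on line 924.

```python
import re

# White space and both comment forms are skipped; names and integers are kept.
TOKEN = re.compile(r"""
    (?P<skip>\s+|%[^\n]*|\(\*.*?\*\))
  | (?P<int>\d+)
  | (?P<name>[A-Za-z_]\w*)
""", re.VERBOSE | re.DOTALL)

def scan(text: str) -> list:
    """Return (kind, value, line, column) tuples for the tokens in text."""
    tokens = []
    line, col, pos = 1, 1, 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at line {line}, column {col}")
        lexeme = match.group()
        if match.lastgroup == "int":
            tokens.append(("T_int", int(lexeme), line, col))
        elif match.lastgroup == "name":
            tokens.append(("T_name", lexeme, line, col))
        # Advance the position, restarting the column after each newline.
        newlines = lexeme.count("\n")
        if newlines:
            line += newlines
            col = len(lexeme) - lexeme.rfind("\n")
        else:
            col += len(lexeme)
        pos = match.end()
    return tokens

source = "hello % Line comment\n35 (* block comment\nover two lines *) 36"
tokens = scan(source)
```

Tracking the position while consuming skipped text, not just tokens, is what keeps the reported positions correct after a block comment that spans lines.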
24.1 Lexical analysis using tok.io.scan
Built-in path tok.io.scan receives an element of type tokscanner:
925 elatt tokscanner =
926 TokScanner of unicode * ∧a(tok list) * ∧a{}
Pattern Description
<TokScanner Scan a unicode value for Pint tokens.
_:unicode The value to be scanned.
_:∧a(tok list) A path on which the list of tokens will be sent if the scanning is successful.
_:∧a{}> A handclap is sent on this path if the scan fails, and a message explaining the failure is placed in the log file.
Table 12: Tokenize a Unicode value
25 OASIS catalogue access
This chapter describes the basic OASIS catalogue access of the Pint language.
The Pint programming system supports the OASIS catalogue [Gro97] as described in chapter 3 on page 17. The library templates are made available by placing the declaration
927 import "-//Pint//DTD Catalogue access library//EN"
at the head of your program. It is not a problem if multiple programs import the same file: only the first such declaration applies. The others are ignored.
To access the catalogue, you send elements of type catlookup to the library paths which provide the access.
928 %% Interface to the catalogue
929 elatt catlookup = CatLookup of catkey * ∧acatval * ∧a{}
Pattern Description
<CatLookup Search for an item in the OASIS catalogue. The key is specified by