3 Built-In Datatypes
3.5 Bytes and Byte Strings
A byte is an exact integer between 0 and 255, inclusive. The byte? predicate recognizes numbers that represent bytes.
Examples:
> (byte? 0)
#t
> (byte? 256)
#f
A byte string is similar to a string (see §3.4 “Strings (Unicode)”), but its content is a sequence of bytes instead of characters. Byte strings can be used in applications that process pure ASCII instead of Unicode text. The printed form of a byte string supports such uses in particular, because a byte string prints like the ASCII decoding of the byte string, but prefixed with a #. Unprintable ASCII characters or non-ASCII bytes in the byte string are written with octal notation.
§1.3.7 “Reading Strings” in The Racket Reference documents the fine points of the syntax of byte strings.
Examples:
> #"Apple"
#"Apple"
> (bytes-ref #"Apple" 0)
65
> (make-bytes 3 65)
#"AAA"
> (define b (make-bytes 2 0))
> b
#"\0\0"
> (bytes-set! b 0 1)
> (bytes-set! b 1 255)
> b
#"\1\377"
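Because a byte string is a sequence of bytes, it can also be built from, and taken apart into, exact integers. The following is a minimal sketch using bytes, bytes->list, and bytes-append from racket/base; the name apple is chosen only for illustration.

```racket
#lang racket/base

;; Build a byte string from individual bytes (the ASCII codes of "Apple").
(define apple (bytes 65 112 112 108 101))

apple                          ; #"Apple"
(bytes->list apple)            ; '(65 112 112 108 101)
(bytes-append #"App" #"le")    ; #"Apple"
(bytes=? apple #"Apple")       ; #t
```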
The display form of a byte string writes its raw bytes to the current output port (see §8 “Input and Output”). Technically, display of a normal (i.e., character) string prints the UTF-8 encoding of the string to the current output port, since output is ultimately defined in terms of bytes; display of a byte string, however, writes the raw bytes with no encoding.
Along the same lines, when this documentation shows output, it technically shows the UTF-8-decoded form of the output.
Examples:
> (display #"Apple")
Apple
> (display "\316\273") ; same as "Î»"
Î»
> (display #"\316\273") ; UTF-8 encoding of λ
λ
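One way to observe the bytes that display actually writes is to direct the output to an in-memory byte port. The sketch below assumes only open-output-bytes and get-output-bytes from racket/base; the helper name displayed-bytes is hypothetical.

```racket
#lang racket/base

;; Hypothetical helper: return the bytes that `display` writes for v.
(define (displayed-bytes v)
  (define out (open-output-bytes))
  (display v out)
  (get-output-bytes out))

(displayed-bytes "λ")          ; string is UTF-8 encoded: #"\316\273"
(displayed-bytes #"\316\273")  ; bytes pass through unchanged: #"\316\273"
(displayed-bytes "\316\273")   ; two characters (Î and »), four bytes: #"\303\216\302\273"
```

The last call shows why the character string "\316\273" and the byte string #"\316\273" display differently: each of the two characters is encoded separately on output.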
For explicitly converting between strings and byte strings, Racket supports three kinds of encodings directly: UTF-8, Latin-1, and the current locale’s encoding. General facilities
for byte-to-byte conversions (especially to and from UTF-8) fill the gap to support arbitrary string encodings.
Examples:
> (bytes->string/utf-8 #"\316\273")
"λ"
> (bytes->string/latin-1 #"\316\273")
"Î»"
> (parameterize ([current-locale "C"])  ; C locale supports ASCII,
    (bytes->string/locale #"\316\273")) ; only, so...
bytes->string/locale: byte string is not a valid encoding for the current locale
  byte string: #"\316\273"
> (let ([cvt (bytes-open-converter "cp1253" ; Greek code page
                                   "UTF-8")]
        [dest (make-bytes 2)])
    (bytes-convert cvt #"\353" 0 1 dest)
    (bytes-close-converter cvt)
    (bytes->string/utf-8 dest))
"λ"
§4.5 “Byte Strings” in The Racket Reference provides more on byte strings and byte-string procedures.
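The explicit conversions also work in the string-to-bytes direction. The following sketch, which uses string->bytes/utf-8 and string->bytes/latin-1 from racket/base, shows a round trip and illustrates that a string's length in characters can differ from its length in bytes. The names s and utf8 are chosen only for illustration.

```racket
#lang racket/base

(define s "λx")
(define utf8 (string->bytes/utf-8 s))

utf8                         ; #"\316\273x"
(string-length s)            ; 2 characters
(bytes-length utf8)          ; 3 bytes: λ takes two bytes in UTF-8
(bytes->string/utf-8 utf8)   ; "λx"

;; Latin-1 cannot represent λ; the optional second argument supplies
;; a byte to use in place of each unencodable character.
(string->bytes/latin-1 s (char->integer #\?))   ; #"?x"
```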
3.6 Symbols
A symbol is an atomic value that prints like an identifier preceded with '. An expression that starts with ' and continues with an identifier produces a symbol value.
Examples:
> 'a
'a
> (symbol? 'a)
#t
For any sequence of characters, exactly one corresponding symbol is interned; calling the string->symbol procedure, or reading a syntactic identifier, produces an interned symbol.
Since interned symbols can be cheaply compared with eq? (and thus eqv? or equal?), they serve as convenient values to use for tags and enumerations.
Symbols are case-sensitive. By using a #ci prefix or in other ways, the reader can be made to case-fold character sequences to arrive at a symbol, but the reader preserves case by default.
Examples:
> (eq? 'a 'a)
#t
> (eq? 'a (string->symbol "a"))
#t
> (eq? 'a 'b)
#f
> (eq? 'a 'A)
#f
> #ci'A
'a
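As an illustration of symbols used as an enumeration, the following sketch dispatches on a symbol with case. The function name light-action and the particular symbols are hypothetical, chosen only for this example.

```racket
#lang racket/base

;; Hypothetical enumeration: a traffic light is one of 'red, 'yellow, 'green.
(define (light-action light)
  (case light
    [(red) 'stop]
    [(yellow) 'slow]
    [(green) 'go]
    [else (error 'light-action "unknown light: ~e" light)]))

(light-action 'green)                  ; 'go
(light-action (string->symbol "red"))  ; 'stop, since the symbol is interned
```

Because string->symbol produces the same interned symbol that the reader produces for red, a tag built at run time matches the literal in the case clause.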
Any string (i.e., any character sequence) can be supplied to string->symbol to obtain the corresponding symbol. For reader input, any character can appear directly in an identifier, except for whitespace and the following special characters:
( ) [ ] { } " , ' ` ; # | \
Actually, # is disallowed only at the beginning of a symbol, and then only if not followed by %; otherwise, # is allowed, too. Also, . by itself is not a symbol.
Whitespace or special characters can be included in an identifier by quoting them with | or \. These quoting mechanisms are used in the printed form of identifiers that contain special characters or that might otherwise look like numbers.
Examples:
> (string->symbol "one, two")
'|one, two|
> (string->symbol "6")
'|6|
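The two quoting mechanisms can also be used when writing symbols in source code. The following short sketch shows that | and \ quoting name the same interned symbols that string->symbol produces, and that quoting is what distinguishes the symbol |6| from the number 6.

```racket
#lang racket/base

(symbol->string '|one, two|)          ; "one, two"
(eq? '|hello world| 'hello\ world)    ; #t: both quoting styles give the same symbol
(eq? '|6| (string->symbol "6"))       ; #t
(symbol? '|6|)                        ; #t
(number? '6)                          ; #t: without quoting, 6 reads as a number
```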
§1.3.2 “Reading Symbols” in The Racket Reference documents the fine points of the syntax of symbols.
The write function prints a symbol without a ' prefix. The display form of a symbol is the same as the corresponding string.
Examples:
> (write 'Apple)
Apple
> (display 'Apple)
Apple
> (write '|6|)
|6|
> (display '|6|)
6
The gensym and string->uninterned-symbol procedures generate fresh uninterned symbols that are not equal (according to eq?) to any previously interned or uninterned symbol. Uninterned symbols are useful as fresh tags that cannot be confused with any other value.
Examples:
> (define s (gensym))
> s
'g42
> (eq? s 'g42)
#f
> (eq? 'a (string->uninterned-symbol "a"))
#f
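One possible use of a fresh tag is as a sentinel that no stored value can equal. The sketch below passes a gensym result as the failure result of hash-ref, so that a missing key can be told apart from a key whose value is #f. The names missing, describe, and table are hypothetical.

```racket
#lang racket/base

;; A fresh uninterned symbol; no value in any table can be eq? to it.
(define missing (gensym 'missing))

;; Hypothetical helper: describe the entry for key, if there is one.
(define (describe table key)
  (define v (hash-ref table key missing))
  (if (eq? v missing)
      "no entry"
      (format "~a" v)))

(define table (hash 'a 1 'b #f))

(describe table 'a)   ; "1"
(describe table 'b)   ; "#f": the key is present, with value #f
(describe table 'c)   ; "no entry"
```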
3.7 Keywords
A keyword value is similar to a symbol (see §3.6 “Symbols”), but its printed form is prefixed with #:.
§1.3.15 “Reading Keywords” in The Racket Reference documents the fine points of the syntax of keywords.
> (eq? '#:apple (string->keyword "apple"))
#t
More precisely, a keyword is analogous to an identifier; in the same way that an identifier can be quoted to produce a symbol, a keyword can be quoted to produce a value. The same term “keyword” is used in both cases, but we sometimes use keyword value to refer more specifically to the result of a quote-keyword expression or of string->keyword. An unquoted keyword is not an expression, just as an unquoted identifier does not produce a symbol:
Examples:
> not-a-symbol-expression
not-a-symbol-expression: undefined;
 cannot reference an identifier before its definition
  in module: top-level
> #:not-a-keyword-expression
eval:2:0: #%datum: keyword misused as an expression
  at: #:not-a-keyword-expression
Despite their similarities, keywords are used in a different way than identifiers or symbols.
Keywords are intended for use (unquoted) as special markers in argument lists and in certain syntactic forms. For run-time flags and enumerations, use symbols instead of keywords. The example below illustrates the distinct roles of keywords and symbols.
Examples:
> (define dir (find-system-path 'temp-dir)) ; not '#:temp-dir
> (with-output-to-file (build-path dir "stuff.txt")
    (lambda () (printf "example\n"))
    ; optional #:mode argument can be 'text or 'binary
    #:mode 'text
    ; optional #:exists argument can be 'replace, 'truncate, ...
    #:exists 'replace)
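The same division of roles applies when defining a function: keywords mark the arguments, and symbols serve as the run-time flag values. The following is a minimal sketch; the function greet and its #:greeting and #:style arguments are hypothetical and not part of any Racket library.

```racket
#lang racket/base

;; Hypothetical function: #:greeting and #:style are keyword markers in
;; the argument list; 'plain and 'loud are symbols used as flag values.
(define (greet name
               #:greeting [greeting "Hello"]
               #:style [style 'plain])
  (define text (string-append greeting ", " name))
  (case style
    [(plain) text]
    [(loud) (string-append (string-upcase text) "!")]
    [else (error 'greet "unknown style: ~e" style)]))

(greet "Ada")                   ; "Hello, Ada"
(greet "Ada" #:style 'loud)     ; "HELLO, ADA!"
(greet #:greeting "Hi" "Ada")   ; "Hi, Ada": keyword arguments can appear in any position
```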