Published on august, 8, 2019 at starynkevitch.net/Basile/guile-tutorial-1.html. this is work in progress and incomplete.
Please bear in mind that I am French, so not a native English speaker.
The audience for this is mostly software developers using GNU/Linux (e.g. some Debian or Ubuntu or most other Linux distributions), knowing some programming language (like Python, PHP, JavaScript, C or C++, Rust, etc...; in particular at least some C or Java or JavaScript), notably those interested by my bismon persistent system (see also this draft technical report, but skip there the text required by European bureaucracy). It is about learning basic Scheme programming using Guile, not about embedding the Guile interpreter in your C program.
My (Basile's) personal motivation for writing this short tutorial is to popularize many concepts important to me and in my bismon, such as homoiconicity, metaprogramming, hygenic macros, reflection, introspection (both at compile-time and at run-time). So this tutorial mostly aims to be food for thought. My scariest resource is my own time spent in writing this tutorial.
You are expected to read this twice to undestand all of it. On the second read, do follow the hyperlinks given here at least to check that the terminology is familiar to you, or else to bookmark that hyperlink for later reading, and also refer to the Revised5 Report on the Algorithmic Language Scheme, called R5RS. Take in mind that I am not a native English speaker.
Refer also to the R5RS and its index, and to the Guile reference and tutorial and this Guile/Scheme tutorial. I greatly recommend reading Queinnec's Lisp in Small Pieces book (after SICP and this tutorial).
You need a recent Linux distribution such as Debian, Ubuntu, Arch etc. Install some Guile 2 (preferably Guile 2.2) package on your system.
I am using the guile-2.2 package on
Debian/Sid, on a x86-64 Linux desktop, and my favorite editor is
GNU emacs. I
heard that Guile could be usable on Windows thru
WSL
and on MacOSX by compiling its source code, but I don't use
these proprietary
operating systems, because I strongly
prefer free
software. Notice that Guile is a nice implementation
of Scheme, but there are other good free software
implementations of Scheme on Linux,
notably Chichen/Scheme
or Bigloo
(both are somehow compilers, and less focused on interactive
usage) or Racket (which
implements interactively several programming languages,
including Scheme). But Guile is
a GNU software and is
easily embeddable,
e.g. in GNU make
.
You really want auto-completion and parenthesis matching during REPL interaction. For that you need to fill, with your favorite source code editor your ~/.guile configuration file in your home directory to use several common modules, e.g. to get GNU readline Guile support and have formatted output and pretty-printing then initialize the readline facility:
;; file ~/.guile
(use-modules
(ice-9 readline)
(ice-9 format)
(ice-9 pretty-print))
(activate-readline)
The usual file suffix for Scheme source code is .scm on Linux, for example you could have some hello-world.scm source file. But you would prefer using the Read-Eval-Print Loop of your guile interpreter.
The GNU
emacs has a M-x scheme-mode
for Scheme; see also this.
And the vim editor knows
about
Lisp and Scheme. If your
editor don't know about Scheme choose at least some
Lisp
support in it. Parenthesis matching is of practical paramount
importance, so some of them are shown here as ( matched with its corresponding ).
Norvig's Teach Yourself Programming in Ten Years should be read before this mini-tutorial.
Many other texts are available, including better tutorials.
Of course SICP should be read after this mini-tutorial!
Scheme is a minimal and elegant programming language
which has popularized
important computer
science concepts, such
as continuations
with
its call/cc,
and this tutorial is not a reference. For reference
about Scheme, refer to the
famous R5RS
(or some later standard, such
as R6RS). If you are
also interested about Common
Lisp (see also
its wikipage),
refer to
the Common
Lisp HyperSpec. My biased opinion is that
If you happen to code
your scripts
in Python, you should
notice, after having read this tutorial, how Scheme
is semantically
close to Python or even
to JavaScript
or to Lua (all
being dynamic
programming languages), even if
their syntax
is widely different. I personally hate Python's syntax
so I
prefer GNU Guile
to Python, notably because white space are semantically
significant in Python, and this practically requires extra
support from
your source
code editor, so it is becomes difficult to
edit python
scripts with a plain editor
like vi
or leafpad.
In Guile, upper or lower cases are significant,
and comments start with a
semi-colon ;
and symbols (they
are a bit like identifiers or keywords in other
languages) may contain non-letter characters such
as - or ? or * or ! etc.... In
practice, the minus sign is almost like the underscore in many other
programming languages. So if you name a
variable like_this
in your favorite programming
language, you should name it like-this in
Scheme. Hence x-y is not a
substraction and a+b is not
the sum of a
and b, it is a symbol name!! And
"predicate" like names are conventionally ending with a
question mark, such as integer? which test
for integerness. An example of guile code is :
;; some comment, we are inputting a number
2
Notice that Guile don't have much real keywords (they are actually predefined macros), but it is advised to avoid redefining core language operators even if in some occasions, it might be useful. And the macro system of Scheme is specially powerful because of its hygenic macros. If you are used to the C preprocessor, forget everything you know about C or C++ macros. In Scheme, you could in principle redefine the if operator but you usually don't want to (except in rare cases!).
You'll often run the simplest examples directly into your guile interpreter. The prompt of that interpreter could be scheme@(guile-user)> but we abreviate it as guile> . The result of each input is often immediate. For example:
guile> 3
3 is the result given by guile
Actually, with readline support, Guile maintains some history of input, and keep several past results. Strings are like in C or Java or C++, with backslash escapes. But Guile will show you them, in results, with appropriate escapes. So, actually you'll have:
% guile
GNU Guile 2.2.4
Copyright (C) 1995-2017 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guile-user)> "a\x47b\n"
$1 =
"aGb\n"
Then you could type Ctrl-D for the end-of-file, or also the ,quit command, to exit from Guile.
Don't forget to use your TAB key for auto-completion! In many (but not all) cases, just typing a few characters, immediately followed by a tabulation keypress (but without the ENTER key!), should be worthwhile. Also, take advantage (with the GNU readline extension of Guile) of parenthesis matching: if your cursor is on a right parenthesis, the corresponding left parenthesis will blink.
The left ( and
right ) parenthesis, quote or
apostrophe ', backquote `, dot .,
and comma , characters have very special
meanings. Some of these characters are
actually syntactic
sugar, in particular, within expressions, 'a
is a shorthand for (quote a)
, ,a is a
shorthand for (unquote a)
, and `a is a shorthand
for (quasiquote a)
, and all
of quote, unquote, quasiquote operators are
Scheme macros. In some contexts -and this is specific to Guile-
a prefix comma is starting a command, for example Guile
understands ,help
to provide you with on-line help,
and ,quit
to quit the guile interpreter. To
write a character constant (rarely useful), you need to
use #\a
for the character a (as if in C you
code 'a'), and #\newline
for the newline
character (in C, you code it '\n'). Integers (decimal
notation) and floating-point literals are like in C or JavaScript. The
boolean literal values are #f
(like false in
JavaScript or C++) and #t
(like true); notice
that Common Lisp don't have such boolean literal values: it
uses
instead nil
and t.
The Scheme language has no statements, just expressions (with a few other core syntactic constructs, such as bindings inside let, or conditions inside cond, etc...). Every expression gives some value. The interpreter is running some Read-Eval-Print Loop (or REPL) so it is looping on:
is evaluated tosome-value relation, e.g. (+ 2 3) → 5, etc...
$n
notation
(e.g. $3) where n is a small
integer.,quit
command, or
some end-of-file input condition. The values
are garbage
collected (this GC is required
by R5RS)
so there is no need to explicitly release any allocated
memory. The Guile interpreter
uses conservative tracing
garbage collection techniques, but don't expect them to be
your silver
bullet.
The Scheme language
is homoiconic:
expressions of the Scheme language can be easily manipulated as
data, so code is data
, and that homoiconicity permits to
have some
clever eval
function, and to
have hygienic
macros.
As in most Lisp dialects -since that is the acronym
of List Processing-, an important
data structure of Scheme is
the linked
list, as shown by the following figure taken
from wikipedia:
and the function to make them is naturally
called list. Of
course, these lists may easily
represent abstract
syntax trees of the Scheme language, thus enabling
its homoiconicity.
The Scheme language is minimal, so its syntax, inspired by Polish notation, is extremely simple and orthogonal, and might be percieved as scary or unfriendly. Any operator is before its (optional) operands, and enclosed in significant parenthesis. Of course the already mentioned syntactic sugar rule[s] mentioned above do apply, so '(+ 1 2) is equivalent to (quote (+ 1 2)) and will have exactly the same semantics and behavior.
Guile can be used as a simple Polish-notation calculator. For example: (- 7 2) → 5, (+ 3 5 10) → 18 and of course (+ 3 (- 8 4) 5) → 12.
Simple output can be done with
the display unary operation,
with newline niladic operation for
newlines. Formatted
output (a bit like printf in C, but with a different
format syntax, where ~
is an escape character, similiar
to what %
is in C's printf) can be done
with format. For
example
(format #f "ten hexa=~x; some list=~a!~%" 10 (list 12
13))
→
"ten hexa=a; some list=(12 13)!\n"
and works a bit à
la sprintf. If #f is replaced by #t we get
a printf-like behavior (output to
the current
output port, that is to stdout). But most of the time, we
would use pretty-print to nicely output one
single value.
Scheme being an expression-based language, the begin operator generalizes the comma operator of C to several arguments, evaluating all of them in sequence and giving the last value. Hence (begin (pretty-print (+ 4 5)) (* 5 6) (- 7 2)) → 5 but prints before 9 and heats the processor (as a side-effect) to compute the product 30. The if conditional operator is like the ?: ternary conditional operator of C, but you also have the quite useful when, cond, unless, case conditional operators.
Simple input happens with the read
primitive,
so (+ (read)
2)→ 5,
but during evaluation will wait for your input, supposed here to
be 3. More general input and output are done
thru ports,
a bit similar to C FILE*
handles.
Variables (that is, global ones) can be defined with define followed by the variable name. For example: after (define some-number 2) we have (+ some-number 3) → 5. Variables can be assigned with set! so after (set! some-number 5) we get (+ some-number 4) → 9.
Functions are also defined with define followed by some
kind of template looking like a function call. For example the
recursive factorial is of course defined with :
(define (fact n) (if (< n 1) 1 (* n (fact (- n 1)))))
and after that (fact 40) → 815915283247897734345611269596115894272000000000. Notice that Guile uses arbitrary-precision arithmetic, also known as bignums. The body of a function can contain several expressions, à la begin, for example: (define (verbose-double x) (pretty-print x) (+ x x))
Of course, functions are procedures, so, are first-class values;
hence : fact →
#<procedure fact (n)> and
+ → #<procedure + (#:optional _ _ . _)>. However, if
or define
cannot be evaluated, since they are not procedures. Use the unary procedure?
primitive to test if some value is a procedure.
Local
variables are introduced with the let syntax (which has
inspired the same
construct of JavaScript). The let construct is followed
by one or several local bindings, for
example (let ( (x 1) (y 2) ) (+ x
y)) → 3
but the bindings (x 1)
and (y 2)
are lexically
scoped to just the entire let expression.
Some anonymous functions can be obtained with lambda. For example (let ( (double (lambda (x) (+ x x)) ) ) (double 3) ) → 6, so observe that (define (triple x) (* 3 x)) is just a short-hand (some syntactic sugar) for (define triple (lambda (x) (* 3 x))).
The previously seen let construct is not enough in all cases. It cannot define recursive local functions, because when such functions are recursively called in their body, their name is not yet bound to them. For example, if we want to locally define two mutually recursive functions for testing odd and even numbers, we need to use letrec with two local bindings and the below example is (almost) exactly from the Guile manual:
(letrec ((even? (lambda (n)
(if (zero? n)
#t
(odd? (- n 1)))))
(odd? (lambda (n)
(if (zero? n)
#f
(even? (- n 1))))))
(even? 8888))
Be careful to note that the mutually recursive calls above are
in tail call
position,
and R5RS requires
that to be
computed iteratively. So
even if you replace 8888
by a larger number
like 998877
, the computation cannot blow up
your call
stack. Actually, such tail-recursive letrec
constructs are the recommended way to code iterations in
Scheme. As an exercise, code in such a way the factorial, using an
internal tail-recursive function of two arguments: the
number n
of which you compute the factorial and some
accumulator p
holding
the partial result
. Be aware that Scheme has
other iteration
mechanisms.
Of course in real life, few recursive functions or mutually-recursive functions are involving just tail-recursive calls. For example, most recursive descent parser or macro-expanders involve mutually-recursive functions with some call sites which are not tail-recursive. And it is common practice to code recursive functions with both tail-recursive and full-recursive calls.
Read more about primitive recursive functions which illustrate the mathematical equivalence between iteration and recursion, and study also continuation-passing style as a way to construct that equivalence.
Thanks to Abhishek Chakravarti (Kolkota, India) for constructive feedback and help on CSS.