1. MELT: Middle End Lisp Translator

The MELT branch introduces a powerful Lisp dialect to express middle-end analyzers and passes. This chapter describes the dialect and how to use it. A working knowledge of Scheme or Lisp is presupposed.

See the MELT wiki page and the GCC MELT site.

1.1 MELT Prerequisites		Prerequisites and topics not yet covered in this MELT chapter.
1.2 MELT overview		An overview of MELT.
1.3 Building the MELT branch		Configuration and building requirements and instructions for MELT.
1.4 MELT as a plugin		Building and using MELT as a plugin.
1.5 Invoking MELT
1.6 Tutorial about MELT		Tutorial describing MELT.
1.7 Reference on MELT		MELT language reference.
1.8 Writing C code for MELT		How to write C code for MELT.

1.1 MELT Prerequisites

The reader is expected to have a working knowledge of some Lisp dialect (Common Lisp, Emacs Lisp, Guile, ...) and to be somewhat familiar with the internal architecture of GCC (i.e. to know what GCC gimples and trees are).

MELT differs from other Lisps because it is tailored to GCC internals. For that purpose, it has several peculiarities; MELT can:

handle two kinds of things. The MELT infrastructure can handle both MELT values (closures, lists, objects, ...) and GCC stuff (plain long integers, gimples, trees, ...), that is, datatypes appearing inside GCC. Both values and stuff are called MELT things. Notice that stuff is not handled polymorphically (due to a limitation of the GCC garbage collector).
generate C code. MELT source code (either in ‘*.melt’ files, or inside memory) is translated into C code suitable for GCC internals, in the style expected inside GCC. That generated C code is compiled into a MELT binary module, which is dynamically loaded by the MELT infrastructure.
provide linguistic devices. The MELT language has several linguistic devices to generate C code suitable for GCC internals, in the style expected by GCC. So MELT code contains constructs to fit into GCC, and to define operators related to GCC coding style.

1.2 MELT overview

Any MELT-enabled compilation is a long-lasting compilation. You are expected to use a powerful workstation (or laptop) with enough memory (at least 4 gigabytes of RAM is recommended on a 64-bit machine like x86-64), and the MELT-enabled compilation will run a lot slower than a simple gcc -O1 compilation (hopefully while doing some useful work). Notice that a MELT-enabled compilation usually generates C code, compiles it (using another GCC compilation process) into a dynamically loadable library, and loads it into the MELT-enabled GCC compilation process started by the user. In practice, the compilation of the generated C code (which is much bigger than the original MELT source) is the main bottleneck. Often, when using an existing MELT module, no C code has to be generated (it already exists).

The MELT plugin or branch contains several related components. Each can be enabled or disabled at GCC run time:

a Lisp dialect compiled into C code, with which one can code sophisticated or prototypical middle-end passes.
a runtime which extends the GCC infrastructure to support the dialect above, in particular a generational copying garbage collector well suited to that dialect, which is built on top of the existing GGC (which deals with old values).

MELT is bootstrapped, in the sense that the translation from the MELT dialect to C is coded in MELT (hence the MELT generated C code is available from the source code).

The generated C code includes only one file, run-melt.h, which in turn includes many GCC header files internal to the compiler. It is compiled into a dynamic library by a shell script, *melt-cc-script*, which invokes the host GCC with appropriate flags.

MELT obviously needs the binaries (dynamic libraries warm*.so) of the MELT translator to be already available. More generally, it uses several kinds of files:

the script used to compile generated C files into dynamically loadable libraries. This script may be invoked by MELT GCC. In common cases, the first argument to the script is the MELT-generated input *.c file and the second argument is the output *.so dynamic library to be loaded by MELT.
an include directory (passed by -I to the compiler) containing all the useful GCC headers. This directory is only written by the installation procedure.
a permanent generated C code directory which contains some essential files, in particular the C form of the MELT translator.

MELT can be used as a plugin for GCC (and can also be compiled as a separate GCC branch). It uses some of the plugin machinery, even inside the MELT branch.

When using MELT, it is important in practice to give it a work directory (where all generated C or object files go).

1.3 Building the MELT branch

To compile the MELT branch, you need the Parma Polyhedra Library (PPL), a free C++ library (GPLv3 licensed) handling lattices such as intervals. Also, the host compiler (the compiler which compiles the source code of GCC), which is also used to compile MELT-generated C code during MELT-enabled gcc execution, should be some version of gcc (preferably 4.x or later).

Note that currently MELT is only compiled on Linux machines.

MELT can also be used as a plugin to GCC (4.6 or later).

1.4 MELT as a plugin

MELT can be used as a plugin to a GCC 4.6 or 4.7 (or later, e.g. the future gcc 4.8) binary built with plugins enabled. You’ll need the GCC headers available to plugins, ‘gengtype’ and its state file to build and run the MELT plugin.

Detailed instructions about building MELT as a plugin are available in the MELT plugin source tarball.

1.5 Invoking MELT

Without any MELT-specific program flags, the MELT variant of gcc behaves like the trunk. So to get or use MELT features, you need to pass some special flags. Most of these flags start with -fmelt for the MELT branch or with -fplugin-arg-melt- for the plugin. They apply to the middle end of GCC, so they are common to every source language (i.e. to the gcc, g++, … commands) and target.

MELT is usually invoked while compiling a (C, C++, …) source file but may occasionally be invoked with an empty C input to perform tasks which are not related to a particular GCC input source file. In practice, you should pass an empty C file to gcc for that purpose. In particular, the translation of a MELT file foo.melt into C code foo.c is done with a special invocation like gcc -fmelt-mode=translatefile -fmelt-arg=foo.melt -fmelt-secondarg=foo.c (possibly with other options like some appropriate -fmelt-init=). It is possible but deprecated to invoke with -fmelt-mode=compilefile instead of -fmelt-mode=translatefile. In other words, the MELT translator to C is not a GCC front-end, unlike e.g. g++, which is a C++ front-end of GCC.

The table below lists all MELT-specific options. We list both MELT branch options like -fmelt-arg= and MELT plugin options like -fplugin-arg-melt-arg=.

-fmelt-mode=

-fplugin-arg-melt-mode=

This flag (called the MELT mode flag) is required for every MELT-enabled compilation. If it is not given, no MELT-specific processing is done. If given, it names the mode to be used before any MELT passes. MELT uses the :sysdata_mode_dict field of its INITIAL_SYSTEM_DATA internal object to determine the MELT function applied to execute the mode. If this application returns nil, no GCC compilation occurs (i.e. no *.c or *.cc etc… source file is read). Hence, some modes may be used for their side effects. In particular, the compilation of a MELT Lisp source file *.melt into C code *.c is done this way.

Several modes may be given by separating them with commas. They are handled in that case in succession.

-fmelt-arg=

-fplugin-arg-melt-arg=

This gives the first argument string to MELT. It is incompatible with the -fmelt-arglist= option.

-fmelt-arglist=

-fplugin-arg-melt-arglist=

This gives the first argument list of strings to MELT. It is incompatible with the -fmelt-arg= option. The string program argument is split into a list of strings using the comma separator. For example, -fmelt-arglist=1,BB,3 makes a three-element list argument with first string 1, second string BB and third string 3. There is no way to give a string subargument containing a comma.
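
The comma splitting described above can be sketched in C as follows. This is only an illustration, not the code used by the MELT runtime: the function name split_arglist is hypothetical, and the handling of an empty argument (one empty string) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a MELT runtime function): split ARG at every
   comma.  Returns a malloc-ed array of malloc-ed strings and stores the
   number of elements in *COUNT; returns NULL if allocation fails.  */
char **split_arglist(const char *arg, size_t *count)
{
    assert(arg != NULL && count != NULL);

    /* There is one element more than there are commas.  */
    size_t n = 1;
    for (const char *p = arg; *p != '\0'; p++)
        if (*p == ',')
            n++;

    char **items = calloc(n, sizeof *items);
    if (items == NULL)
        return NULL;

    const char *start = arg;
    for (size_t i = 0; i < n; i++) {
        size_t len = strcspn(start, ",");   /* length up to ',' or end */
        items[i] = malloc(len + 1);
        if (items[i] == NULL) {
            while (i > 0)
                free(items[--i]);
            free(items);
            return NULL;
        }
        memcpy(items[i], start, len);
        items[i][len] = '\0';
        start += len;
        if (*start == ',')
            start++;                        /* skip the separator */
    }
    *count = n;
    return items;
}
```

Since the comma is always taken as a separator, such a splitter has no way to produce an element containing a comma, which matches the restriction stated above.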

-fmelt-print-settings=

-fplugin-arg-melt-print-settings=

The builtin settings (notably the MELT builtin modules directory and the MELT builtin source directory) used by MELT are output in the given file, which should be source-able by a POSIX shell if you are lucky enough. This is mostly useful in configuration, building, or packaging scripts.

-fmelt-coutput=

-fplugin-arg-melt-coutput=

This flag gives the name of the generated C file.

-fmelt-bootstrapping

-fplugin-arg-melt-bootstrapping

This flag is useless to most users. When given, MELT is bootstrapping (translating its own translator from MELT to C), so some environment variables and options are ignored. Only for gurus working on the MELT translator. See melt-runtime.c for dirty details.

-fmelt-output=

-fplugin-arg-melt-output=

This flag gives the name of the generated files.

-fmelt-debug

-fplugin-arg-melt-debug

This flag has no argument and asks for a lot of debugging output. It is only useful to debug MELT code and is unrelated to the -g flag asking GCC to output debug information. Obsolete, use -fmelt-debugging=mode instead.

-fmelt-generate-work-link

-fplugin-arg-melt-generate-work-link

This flag, when used in translating modes (for MELT to C translations), generates the files as a unique name in the work directory, with a symbolic link to it from the output path.

-fmelt-generated-c-file-list=

-fplugin-arg-melt-generated-c-file-list=

When given, this is a file name into which MELT lists the set of really written or overwritten emitted C file names, one per line (and also some lines starting with #). Unchanged files are prefixed with =, new or changed files are prefixed with +.

-fmelt-debugging=

-fplugin-arg-melt-debugging=

This flag should be set either to mode or to all. When set to mode (e.g. with -fplugin-arg-melt-debugging=mode), debugging messages happen only after mode processing; when set to all, they happen everywhere.

-fmelt-source-path=

-fplugin-arg-melt-source-path=

This flag sets the path (colon separated list of directories) for sources (i.e. ‘*.melt’ and ‘*.c’). Otherwise use the GCCMELT_SOURCE_PATH environment variable.

-fmelt-module-path=

-fplugin-arg-melt-module-path=

This flag sets the path (colon separated list of directories) for MELT binary modules (i.e. ‘*.so’). Otherwise use the GCCMELT_MODULE_PATH environment variable.

-fmelt-module-make-command=

-fplugin-arg-melt-module-make-command=

This flag defines the make command used to build MELT binary modules (i.e. ‘*.so’) from a small set of generated C files. The default is the GNU make utility used to build MELT, very often just make or perhaps gmake.

-fmelt-module-makefile=

-fplugin-arg-melt-module-makefile=

This flag defines the makefile used to build MELT binary modules (i.e. ‘*.so’) from a small set of generated C files. The default is a file ‘melt-module.mk’.

-fmelt-module-cflags=

-fplugin-arg-melt-module-cflags=

This flag defines the CFLAGS passed to make to build MELT binary modules. If not given, the environment variable GCCMELT_MODULE_CFLAGS is used if it was set.

-fmelt-init=

-fplugin-arg-melt-init=

This flag sets the initial MELT modules. They are separated by colons or semi-colons. So -fmelt-init=foo:bar or '-fmelt-init=foo;bar' (quotes are useful for the shell running GCC) loads first the foo module and then the bar module. A module starting with an at sign @ is handled as a module list file. The .modlis extension is added, and then a file of that name is searched for. This file is read line by line (with empty or blank lines skipped, and comment lines starting with a hash # skipped). Each line is the name of a module to be loaded in sequence. For example, -fmelt-init=@mylist:bar with a file ‘mylist.modlis’ containing

# file mylist.modlis ; just a comment
alpha
beta

would have the same effect as -fmelt-init=alpha:beta:bar. Notice that modules are searched for in several directories. The notation @@ is a shorthand for the default module list called ‘melt-default-modules.modlis’ and is the default value of this flag.
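
As an illustration of how a module list could be read, here is a minimal C sketch working on the text of a ‘.modlis’ file already loaded in memory. The names read_modlis_text and MAX_NAME are hypothetical; trimming white space around each line and silently dropping over-long names are assumptions of this sketch, not documented MELT behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 64   /* illustrative limit, not a MELT constant */

/* Hypothetical helper: copy the module names found in TEXT (the content
   of a module list) into NAMES, skipping blank lines and lines whose
   first non-blank character is '#'.  Returns the number of names
   stored, at most MAX_NAMES.  */
size_t read_modlis_text(const char *text, char names[][MAX_NAME],
                        size_t max_names)
{
    assert(text != NULL && names != NULL);
    size_t count = 0;
    const char *line = text;

    while (*line != '\0' && count < max_names) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        /* Trim leading and trailing white space (an assumption).  */
        const char *begin = line;
        const char *end = line + len;
        while (begin < end && isspace((unsigned char)*begin))
            begin++;
        while (end > begin && isspace((unsigned char)end[-1]))
            end--;

        size_t n = (size_t)(end - begin);
        if (n > 0 && *begin != '#' && n < MAX_NAME) {
            memcpy(names[count], begin, n);
            names[count][n] = '\0';
            count++;
        }
        line = eol ? eol + 1 : line + len;
    }
    return count;
}
```

Applied to the ‘mylist.modlis’ content shown above, the sketch yields the two names alpha and beta, in that order.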

-fmelt-extra=

-fplugin-arg-melt-extra=

This flag sets the extra MELT modules. They are separated by semi-colons or (on Unix only) colons. Extra modules are also searched in the current directory, and are loaded after processing of MELT options. In practice, to use your own MELT module foo you should pass -fmelt-extra=foo because your module needs the default modules.

-fmelt-tempdir=

-fplugin-arg-melt-tempdir=

This flag sets the temporary MELT directory. If specified, it is not cleaned. If it does not exist, it is created (with mkdir) and cleaned. Avoid setting it to a non-empty directory which may contain files named like MELT modules (such as ‘warmelt-*.so’ etc.).

-fmelt-option=

-fplugin-arg-melt-option=

This sets some options for MELT. The argument is a comma-separated sequence of option settings, each being an option name possibly followed by an equal sign and an option value. For example, -fmelt-option=foo,bar=x sets the option foo and sets the option bar to x. An option name is case-insensitive and may appear several times.
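
The option syntax above can be illustrated by a small C lookup routine. This is a sketch under assumptions: find_melt_option is a hypothetical name, and letting the first occurrence of a repeated name win is a choice made here, since the text does not say which occurrence is used.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare the LEN first characters of A with the whole of NAME,
   ignoring case.  */
static int same_name(const char *a, size_t len, const char *name)
{
    if (strlen(name) != len)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)name[i]))
            return 0;
    return 1;
}

/* Hypothetical helper: look NAME up in OPTS, a string like "foo,bar=x".
   Returns 1 and copies the value ("" when there is none, truncated to
   fit VALSIZE) into VALUE if the option is set, and 0 otherwise.
   Assumption: the first occurrence of a repeated name wins.  */
int find_melt_option(const char *opts, const char *name,
                     char *value, size_t valsize)
{
    assert(opts != NULL && name != NULL && value != NULL && valsize > 0);
    const char *p = opts;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");           /* this setting's length */
        const char *eq = memchr(p, '=', len);   /* optional "=value" */
        size_t namelen = eq ? (size_t)(eq - p) : len;

        if (same_name(p, namelen, name)) {
            size_t vlen = eq ? len - namelen - 1 : 0;
            if (vlen >= valsize)
                vlen = valsize - 1;
            if (vlen > 0)
                memcpy(value, eq + 1, vlen);
            value[vlen] = '\0';
            return 1;
        }
        p += len;
        if (*p == ',')
            p++;
    }
    return 0;
}
```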

-fmelt-workdir=

-fplugin-arg-melt-workdir=

This flag sets the working MELT directory. If specified, all generated files go inside it, and MELT modules are also loaded from it. Use that flag if you don’t want MELT-related generated files to clobber your source tree.

1.6 Tutorial about MELT

More up to date information may be found on GCC MELT pages.

As in all Lisps, parentheses are important, so a and (a) do not mean the same thing. The first item after an opening parenthesis usually plays the role of an operator or syntactic keyword.

MELT is a Lisp dialect translated into (unreadable, or at least unfriendly) C code. Some MELT constructs, and some MELT limitations (e.g. lack of tail-recursion) are related to this C translatability. The MELT translator is itself written in MELT (files ‘gcc/melt/warmelt-*.melt’) and is bootstrapped; the translated C files are in ‘gcc/warmelt-*-0.c’; they are quite big and are distributed with the GCC source code; use the upgrade-warmelt target of ‘gcc/Makefile.in’ to regenerate these C translations.

MELT is closely related to GCC internal passes and internal middle-end representations and runtime. Hence (in contrast to other Lisp dialects) MELT deals with both boxed values and unboxed stuff (e.g. plain long integers as in C, but also trees and gimples, etc…, as inside GCC), separating them using their ctype. Always keep in mind the boxed versus unboxed distinction. Because of that, and because of the GCC runtime (in particular the GGC garbage collector), MELT is neither polymorphic (you cannot deal with unboxed stuff as you do with boxed values) nor variadic (there is no variable arguments facility).

Some familiarity with other Lisp dialects and with GCC internals is required to code in MELT.

The MELT runtime contains a copying generational garbage collector (GC) implemented in ‘gcc/melt-runtime.c’, backed up by the previously existing GCC ordinary (precise, marking) garbage collector GGC. The MELT-specific copying GC is designed for efficiency (but requires a very specific C coding style, easy to achieve in generated C code, but uncomfortable for human C developers), and handles well the quick allocation of many short-lived objects (which is not a goal of GGC). Therefore, don’t be afraid of allocating a lot of values inside MELT code.

This section has to be completed.

1.6.1 Reserved MELT syntax and symbols

The following symbols have a specific meaning in MELT. Use them only as described here and avoid redefining them:

and assert_msg comment compile_warning cond cppif current_module_environment_container debug_msg defciterator defclass definstance defprimitive defselector defun exit export_class export_macro export_values fetch_predefined forever get_field if instance lambda let make_instance match multicall or parent_module_environment progn put_fields quote return setq store_predefined unsafe_get_field unsafe_put_fields update_current_module_environment_container

Also avoid symbols starting with def.

1.6.2 Primitives in MELT

A MELT primitive defines an operator by specifying how to translate each of its invocations into C. As a simple example, the less-than integer operator <i is defined as

(defprimitive <i                ; define the primitive 
  (:long                        ; next formal arguments are longs
   a b)                         ; the two formal arguments
  :long                         ; the type of the result (also long)
  "((" a ") < (" b "))")        ; how to expand into C code

Later on, a MELT expression like (<i a b) gets translated into C code similar to ((curfnum[3]) < (curfnum[7])) where curfnum[3](1) is the translation of the normalized form(2) of a, etc.

Note that the above primitive accepts raw long integers (exactly the C long type) and returns such a long integer [0 if ((a)<(b)) was false in the C sense, and non-zero (1, as yielded by the C < operator) if it was true]. We say that such integers are unboxed stuff (we don’t speak of values in that case). The symbol :long represents the C type long and we call it a ctype.
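
To make the expansion mechanism more concrete, here is a minimal C sketch of how a primitive's expansion (a sequence of string chunks and formal arguments) could be turned into C text once the translations of the actual arguments are known. The struct chunk type and the expand_chunks function are hypothetical illustrations and do not reproduce the MELT translator, which is itself written in MELT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One piece of an expansion: either literal C text, or (when TEXT is
   NULL) a reference to the formal argument of rank ARGIX.  */
struct chunk {
    const char *text;
    int argix;
};

/* Hypothetical sketch: write into BUF the C text obtained by replacing
   each formal of TMPL by the matching string of ARGS.  Returns 0 on
   success and -1 if BUF is too small.  */
int expand_chunks(const struct chunk *tmpl, size_t nchunks,
                  const char *const *args, char *buf, size_t bufsize)
{
    assert(tmpl != NULL && args != NULL && buf != NULL && bufsize > 0);
    size_t used = 0;
    buf[0] = '\0';

    for (size_t i = 0; i < nchunks; i++) {
        const char *piece = tmpl[i].text ? tmpl[i].text
                                         : args[tmpl[i].argix];
        size_t len = strlen(piece);
        if (used + len + 1 > bufsize)
            return -1;
        memcpy(buf + used, piece, len + 1);   /* copies the final NUL too */
        used += len;
    }
    return 0;
}
```

With the five chunks "((", a, ") < (", b, "))" of the <i primitive and the argument texts curfnum[3] and curfnum[7], this sketch produces the string ((curfnum[3]) < (curfnum[7])).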

1.6.3 Citerators in MELT

A MELT c-iterator or citerator is a construct which generalizes iterative loops (like the for loop in C). As a trivial example, to iterate on non-negative integers below a limit, define

(defciterator each-posint-till     ; define each-posint-till citerator
 (:long lim)                      ; start formal argument is lim
 eachposint                       ; state symbol - uniquely substituted
 (:long cur)                      ; local formals
 (                                ; start of before expansion
 "long " eachposint ";"
 " for (" eachposint"=0; " 
        eachposint "<" lim ";"
        eachposint "++) {"
   cur " = " eachposint ";"
 )
 (                                ; start of after expansion
 "}" 
 )
)

When used in a MELT expression like (each-posint-till (5) (:long v) (print-long v)), which has :void ctype because citerators are only useful for their side effects, the C translation (assuming print-long is a primitive expanding to printf("%d\n", …)) looks vaguely like

curfnum[11] /*LIM*/ = 5;
{long eachposint_24;
 for (eachposint_24=0; eachposint_24<curfnum[11]; eachposint_24++) {
   curfnum[3] /*V*/ = eachposint_24;
   printf("%d\n", curfnum[3] /*V*/);
 }
}

So the start formals are translated as local variables in the MELT frame, the state symbol eachposint is only used to generate a C identifier (unique to each occurrence of the citerator), and the local formals are translated to local variables bound inside the iterator’s body.

In practice, citerators are very useful for interfacing to the various iterating idioms in GCC. A more realistic example is

;;;; iterate on a gimpleseq
(defciterator each_in_gimpleseq
  (:gimpleseq gseq)			;start formals
  eachgimplseq
  (:gimple g)				;local formals
  ( ;;; before expansion
   "gimple_stmt_iterator gsi_" eachgimplseq ";\n"
   ;; test that gseq is not null to be safe
   "if (" gseq ") for (gsi_" eachgimplseq " = gsi_start (" gseq
        "); !gsi_end_p (gsi_" eachgimplseq ");"
   " gsi_next (&gsi_" eachgimplseq ")) {\n"
    g " = gsi_stmt (gsi_" eachgimplseq ");"
   )
  ( ;;; after expansion
   "}"
   )
)

1.6.4 Functions in MELT

As in many Lisp dialects (e.g. Common Lisp), MELT functions are defined using the defun construct. The first argument (and the primary result) of every MELT function should always be a value, so it is not possible to give unboxed gimple stuff as the first argument of a function; hence we box it (pack it into a MELT value) before passing it as an argument.

The following defines a second-order function (actually defined in ‘ana-base.melt’) called do_each_gimpleseq, which takes two arguments, the first being itself a MELT function and the second being unboxed gimpleseq stuff, and applies the first argument to boxes packing each gimple inside the given gimpleseq.

;; apply a function to each boxed gimple in a gimple seq
(defun do_each_gimpleseq (f :gimpleseq gseq)
  (each_in_gimpleseq 
   (gseq) (:gimple g)
   (let ( (gplval (make_gimple discr_gimple g)) )
     (f gplval)))
)

This function is only useful for its side effect (calling a function for each member of a gimpleseq). It returns the nil value.

The real translation to C of the above is a quite big and messy C function, actually:

static melt_ptr_t
rout_9_DO_EACH_GIMPLESEQ (meltclosure_ptr_t closp_,
			  melt_ptr_t firstargp_, const char xargdescr_[],
			  union meltparam_un *xargtab_,
			  const char xresdescr_[],
			  union meltparam_un *xrestab_)
{
#if ENABLE_CHECKING
  static long call_counter__;
  long thiscallcounter__ ATTRIBUTE_UNUSED = ++call_counter__;
#define callcount thiscallcounter__
#else
#define callcount 0L
#endif
  struct frame_rout_9_DO_EACH_GIMPLESEQ_st
  {
    unsigned nbvar;
#if ENABLE_CHECKING
    const char *flocs;
#endif
    struct meltclosure_st *clos;
    struct excepth_melt_st *exh;
    struct callframe_melt_st *prev;
#define CURFRAM_NBVARPTR 5
    void *varptr[5];
/*no varnum*/
#define CURFRAM_NBVARNUM /*none*/0
/*others*/
    gimple_seq loc_CTYPE_GIMPLESEQ__o0;
    gimple loc_CTYPE_GIMPLE__o1;
    long _spare_;
  }
  curfram__;
  memset (&curfram__, 0, sizeof (curfram__));
  curfram__.nbvar = 5;
  curfram__.clos = closp_;
  curfram__.prev = (struct callframe_melt_st *) melt_topframe;
  melt_topframe = (struct callframe_melt_st *) &curfram__;
  melt_trace_start ("DO_EACH_GIMPLESEQ", callcount);

The generated C function has a strange C formal argument list: every applicable routine has the same signature in C. All arguments except the first are passed in an array of unions, described by a short constant string, one character per argument, encoding its ctype. Secondary results are handled likewise. Some code is only enabled with #if ENABLE_CHECKING, when GCC is configured for debugging (not for release). The MELT call frame is declared explicitly as a structure called curfram__, which is properly initialized and set as the melt_topframe. The melt_trace_start, MELT_LOCATION and callcount C macros are significant only when ENABLE_CHECKING is set.
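
The uniform calling convention can be illustrated by a much simplified, self-contained C sketch. All names here (demo_param_un, DEMO_PAR_LONG, DEMO_PAR_CSTRING, demo_sum_routine) are hypothetical; the real union meltparam_un and the BPAR_* descriptor codes are defined by the MELT runtime and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-character codes describing the ctype of each
   secondary argument; the real BPAR_* codes belong to the MELT runtime.  */
#define DEMO_PAR_LONG    'l'
#define DEMO_PAR_CSTRING 's'

/* Simplified stand-in for the union carrying secondary arguments.  */
union demo_param_un {
    long p_long;
    const char *p_cstring;
};

/* Every routine shares this shape: a first (boxed) argument, then a
   descriptor string and an array of unions for the other arguments.
   This demo routine adds up to two unboxed longs.  */
long demo_sum_routine(void *firstarg, const char descr[],
                      union demo_param_un *tab)
{
    long sum = 0;
    assert(descr != NULL);
    (void) firstarg;            /* unused in this illustration */

    /* Fetch each argument only if its descriptor has the expected
       ctype; otherwise stop fetching, as the generated code does.  */
    if (descr[0] != DEMO_PAR_LONG)
        goto endgetargs;
    sum += tab[0].p_long;
    if (descr[1] != DEMO_PAR_LONG)
        goto endgetargs;
    sum += tab[1].p_long;
endgetargs:
    return sum;
}
```

Because the descriptor is an ordinary C string, a missing argument shows up as the terminating NUL character, so the callee never reads past the arguments actually supplied.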

  /*getarg#0 */
  MELT_LOCATION ("ana-base.melt:436:/ getarg");
#ifndef MELTGCC_NOLINENUMBERING
#line 436 "ana-base.melt" /**::getarg::**/
#endif /*MELTGCC_NOLINENUMBERING */
 /*_.F__V2*/ curfptr[1] = (melt_ptr_t) firstargp_;

We start by fetching the first argument into the current frame; curfptr is actually a C macro defined as curfram__.varptr. The MELT_LOCATION macro call (significant only when checking is enabled, in which case it sets the flocs field of the current frame) and the #line directive(3) refer to the MELT source location. For clarity, we now skip them, but there is a lot of such positional information in the generated C code. Note that a single MELT source line produces many lines of C code (hence the line numbering seen in a debugger might be slightly wrong), and that some comments are generated (notably explaining what each curfptr occurrence means).

  /*getarg#1 */
  if (xargdescr_[0] != BPAR_GIMPLESEQ)
    goto lab_endgetargs;
  curfram__.loc_CTYPE_GIMPLESEQ__o0 = xargtab_[0].bp_gimpleseq;
  goto lab_endgetargs;
lab_endgetargs:;

The second argument is fetched likewise, but only if the actual argument is of gimpleseq ctype. The useless goto is optimized away by any serious C compiler (like gcc).

/*block*/
  {
    /*citerblock EACH_IN_GIMPLESEQ */
    {
      gimple_stmt_iterator gsi_cit1__EACHGIMPLSEQ;
      if ( /*_?*/ curfram__.loc_CTYPE_GIMPLESEQ__o0)
	for (gsi_cit1__EACHGIMPLSEQ =
	     gsi_start ( /*_?*/ curfram__.loc_CTYPE_GIMPLESEQ__o0);
	     !gsi_end_p (gsi_cit1__EACHGIMPLSEQ);
	     gsi_next (&gsi_cit1__EACHGIMPLSEQ))
	  {
/*_?*/ curfram__.loc_CTYPE_GIMPLE__o1 =
	      gsi_stmt (gsi_cit1__EACHGIMPLSEQ);
	    /*block */
	    {
   /*_.GPLVAL__V4*/ curfptr[3] =
		(meltgc_new_gimple
		 ((meltobject_ptr_t)
		  (( /*!DISCR_GIMPLE */ curfrout->tabval[0])),
		  ( /*_?*/ curfram__.loc_CTYPE_GIMPLE__o1)));;

This is the beginning of a block generated by a citerator. It contains the translation of the make_gimple primitive use as a call to the meltgc_new_gimple C function.

	      /*apply */
	      {
		/*_.F__V5*/ curfptr[4] =
		  melt_apply ((meltclosure_ptr_t)
				 ( /*_.F__V2*/ curfptr[1]),
				 (melt_ptr_t) ( /*_.GPLVAL__V4*/
						  curfptr[3]), 
                                 "", (union meltparam_un *) 0, 
                                 "", (union meltparam_un *) 0);
	      };

This is the translation of the application of f. Since there is only one argument and there are no secondary results, we pass null union meltparam_un pointers described by empty strings, following the peculiar conventions required by melt_apply(4) and respected by MELT generated C functions implementing MELT routines.

	      /*epilog */
	     /*clear *//*_.GPLVAL__V4*/ curfptr[3] = 0;
	     /*clear *//*_.F__V5*/ curfptr[4] = 0;
	    };
	  }
      /*citerepilog */
	    /*clear *//*_?*/ curfram__.loc_CTYPE_GIMPLE__o1 = 0;
	    /*clear *//*_.LET___V3*/ curfptr[2] = 0;
    }				/*endciterblock EACH_IN_GIMPLESEQ */

Some MELT local variables are explicitly cleared. This helps the MELT garbage collector. The block generated for the citerator then ends, again by clearing some locals.

    /*epilog */ };
  goto labend_rout;
labend_rout:
  melt_trace_end ("DO_EACH_GIMPLESEQ", callcount);
  melt_topframe = (struct callframe_melt_st *) curfram__.prev;
  return (melt_ptr_t) ( /*noretval */ NULL);
#undef callcount
#undef CURFRAM_NBVARNUM
#undef CURFRAM_NBVARPTR
}				/*end rout_9_DO_EACH_GIMPLESEQ */

This is the whole function epilog. The MELT top frame is popped, and the previous is reinstated.
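
The discipline of explicit call frames (push on entry, clear locals, pop on exit) can be shown in isolation with the following C sketch. It is only an illustration with hypothetical names (demo_frame, demo_topframe, demo_count_roots, demo_nested_call) and is far simpler than the real MELT frames; it only shows why a collector walking the frame chain can find every live local pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified explicit call frame: a link to the caller's frame and a
   fixed number of pointer slots for local values.  */
struct demo_frame {
    unsigned nbvar;
    struct demo_frame *prev;
    void *varptr[4];
};

/* Top of the chain of active frames (hypothetical global).  */
static struct demo_frame *demo_topframe = NULL;

/* What a collector would do to find its roots: walk every active
   frame and look at each non-null pointer slot.  */
size_t demo_count_roots(void)
{
    size_t n = 0;
    for (struct demo_frame *f = demo_topframe; f != NULL; f = f->prev)
        for (unsigned i = 0; i < f->nbvar; i++)
            if (f->varptr[i] != NULL)
                n++;
    return n;
}

/* A routine following the discipline: push a zeroed frame, keep locals
   in its slots, clear them, then pop the frame before returning.
   Returns the number of roots visible at the deepest nesting level.  */
size_t demo_nested_call(void *arg, int depth)
{
    struct demo_frame fram;
    memset(&fram, 0, sizeof fram);
    fram.nbvar = 4;
    fram.prev = demo_topframe;
    demo_topframe = &fram;                  /* push */

    fram.varptr[0] = arg;                   /* local kept in the frame */
    size_t seen = depth > 0 ? demo_nested_call(arg, depth - 1)
                            : demo_count_roots();

    fram.varptr[0] = NULL;                  /* clear the local */
    demo_topframe = fram.prev;              /* pop */
    return seen;
}
```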

Of course, nobody wants to read or understand the generated code above.

In practice, such second-order functions (second order because they are functionals, consuming function arguments) are often used with anonymous functions built with the lambda construct, e.g.

(do_each_gimpleseq
 (lambda (boxgimp) ;anonymous function with argument boxgimp
  (let ( (:gimple gimp
          ; fetch the content of the boxed gimple value as an unboxed stuff
          (gimple_content boxgimp)) )
  ;; …. do something with gimp stuff ….
 ))
bgs ; some boxed gimple value
)

1.7 Reference on MELT

1.7.1 Lexical MELT conventions

It is recommended to edit MELT files with a Lisp-aware editor (e.g. the GNU Emacs Lisp mode).

As in Lisp dialects:

parentheses are essential and must be matched. It is an error to add extra right parentheses.
brackets are like parentheses and must also be matched (but you probably don’t want to use them). [a b] is the same as (a b) but both [a b) and (a b] are incorrect.
comments start with a semicolon (;) and run to the end of the line. This is the preferred way to put comments in a MELT file.
block comments start with hash-bar (#|), may take several lines, and end with bar-hash (|#). Don’t nest block comments.
space characters are token separators, but indentation does not matter (we strongly recommend that MELT code be properly indented, e.g. using Emacs Lisp mode, for readability).
case is not significant; words, i.e. identifiers and keywords, are all converted to uppercase.
strings are denoted as in C between double quotes, with backslash escapes (e.g. double-backslash \\ to represent a single backslash, backslash doublequote \" to represent a doublequote, backslash t \t for a tab, and \xfe to represent the character coded 0xfe in hex, etc.). In addition, a backslash-leftbrace \{ reads verbatim all characters up to the first rightbrace }. A string whose last doublequote is followed by an underscore, like "do that"_, is localized using the gettext host system function; this could be useful for some user messages (to be translated to other languages like French).
symbols (i.e. identifiers) are case insensitive and may contain non-alphanumerical characters like _+-*/<>=!?:%~&@$. It is advised to use these special characters sparingly. Symbols cannot start with any of ?%. Because symbols are related to their C translation, it is advised to avoid digits after underscores in symbols like x_12 and to have each symbol contain at least one letter (e.g. use <i instead of <).
the quote character ' is special. 'x is parsed the same as (quote x).
the backquote character ` is special. `x means the same as (backquote x).
the comma character , is special. So ,x means (comma x) and ,(a b) is (comma (a b)).
the question mark character ? is special when it is the first character of a token (it may appear inside a symbol otherwise). For instance, ?x means (question x) but x? is a symbol of two characters. So ?y? is bad taste but means (question y?).
the hash character # is special. In particular, #| starts multiline comments; #\space is the integer code of the space character; #b10 is a binary number (i.e. two), #o12 is octal (i.e. ten), #xffff is a hexadecimal number (i.e. 65535). #{ starts macrostrings.
macro strings. To avoid escaping many C-like characters in C code chunks used for primitives, c-iterators, c-matchers etc., an alternative multi-line lexical construct exists: the macro string, started with #{ and ending with }# (possibly on a different line), with $ escapes. For example, the #{if ($A>0) printf("%s", $B);}# macrostring is parsed exactly as the 5-element s-expression ("if (" A ">0) printf(\"%s\", " B ");"). In a macrostring, all characters are taken as is, except the dollar sign $; the macro-string itself is always read as an S-expr. When a dollar is followed by alphanumerical (or underscore) characters like a C identifier, it is parsed as a symbol. If the symbol is followed by a hash # character, that hash character is skipped and terminates the symbol. The $. sequence is skipped and ignored, the double-dollar $$ is read as a single dollar, and $# is read as a single hash #.
A macro-string starting with the four characters #{$' is expanded into a (quote ...) expression and should preferably not contain symbols like $symb. This special meaning of $' is only relevant when it appears at the very start of the macro-string.
braces { and } are special.
numbers are integers in decimal like -123 or +22 or 33. Notice that 1.2 is invalid; it is not a floating point number.
colons (i.e. :) start constant (Lisp-like) keywords, which always evaluate to themselves.
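
As an illustration of the numeric notations listed above (#b, #o, #x and plain decimal), here is a small C sketch of a token-to-integer conversion. The function parse_melt_number is hypothetical and is not the MELT reader; it relies on the standard strtol function, does not check for overflow, and leaves out character codes such as #\space.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: convert TOKEN (e.g. "#b10", "#o12", "#xffff",
   "-123") into a long.  Returns 1 and stores the result in *OUT on
   success; returns 0 if TOKEN is not a valid integer in this notation
   (for instance "1.2").  Overflow is not checked in this sketch.  */
int parse_melt_number(const char *token, long *out)
{
    assert(token != NULL && out != NULL);
    int base = 10;
    const char *digits = token;

    if (token[0] == '#') {
        switch (token[1]) {
        case 'b': base = 2;  break;
        case 'o': base = 8;  break;
        case 'x': base = 16; break;
        default:  return 0;
        }
        digits = token + 2;
    }
    if (*digits == '\0')
        return 0;

    char *end;
    long v = strtol(digits, &end, base);
    if (end == digits || *end != '\0')   /* no digits, or trailing junk */
        return 0;
    *out = v;
    return 1;
}
```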

Unlike some or most other Lisp dialects:

- don’t use the dot for cons-ing, e.g. (a b . c) is not legal.
- strings may contain escaped braces with special verbatim-like meaning.
- a string whose ending doublequote is immediately followed by an underscore (e.g. "example of international"_) is localized by calling gettext at read time.

1.7.2 Main MELT syntax and features

We list each key symbol in alphabetical order and provide a short description. Familiarity with some Lisp or Scheme dialect is required.

1.7.2.1 MELT formals

A formal argument list is a possibly empty list (between parentheses). This list contains either ctype keywords or formal names. A ctype keyword applies to all further formals (until another ctype keyword, or the end of the formal arguments list). Ctypes have a keyword and are each described by a predefined instance (of CLASS_CTYPE) with a name conventionally starting with ctype_. [For experts: to add a new ctype, define a BGLOB_CTYPE_* predefined in ‘gcc/melt.h’ and an instance in ‘warmelt-first.melt’ using install_ctype_descr, then regenerate all the ‘gcc/warmelt-*.c’ files.]
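
To illustrate how a ctype keyword carries over to the formals that follow it, here is a rough Python model. It is a sketch only: the function names are hypothetical, formals are represented as plain strings, and the set of keywords is the one listed in this chapter (with :void rejected, since it should not be the type of a formal argument).

```python
# Ctype keywords accepted for formals in this sketch (:void is excluded).
KNOWN_CTYPES = {":value", ":long", ":tree", ":gimple", ":gimpleseq",
                ":basicblock", ":edge", ":cstring"}


def parse_formals(formals):
    """Return (name, ctype) pairs for a formal argument list.

    A ctype keyword applies to every following name until the next keyword;
    names before any keyword get the default ctype :value.
    """
    current = ":value"
    result = []
    for item in formals:
        if item.startswith(":"):
            if item not in KNOWN_CTYPES:
                raise ValueError("not usable as a formal ctype: " + item)
            current = item
        else:
            result.append((item, current))
    return result


def first_formal_is_value(formals):
    """Check the rule that the first formal, if given, should be a :value."""
    parsed = parse_formals(formals)
    return not parsed or parsed[0][1] == ":value"
```

For instance, the formals (x :long i j :tree t :value v) are read as x and v being :value, i and j being :long, and t being :tree.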

1.7.2.2 MELT ctypes

Here is the list of ctypes.

:value (ctype instance ctype_value) This ctype is for MELT [boxed] values. It is the default ctype of arguments.
:long (ctype instance ctype_long) This ctype is for unboxed long integers; it is also used for conditions and tests.
:tree (ctype instance ctype_tree) This ctype is for GCC tree raw pointers, as in ‘gcc/tree.h’.
:gimple (ctype instance ctype_gimple) This ctype is for GCC gimple raw tuple pointers, as in ‘gcc/gimple.h’.
:gimpleseq (ctype instance ctype_gimpleseq) This ctype is for GCC gimple_seq raw pointers, representing sequences of gimple instructions, as in ‘gcc/gimple.h’
:basicblock (ctype instance ctype_basicblock) This ctype is for GCC basic_block raw pointers, representing basic blocks, as in ‘gcc/basic-block.h’
:edge (ctype instance ctype_edge) This ctype is for GCC edge raw pointers, representing edges of the control flow graph, as in ‘gcc/basic-block.h’
:void (ctype instance ctype_void) This ctype is the same as C void type. It should not be the type of formal arguments. It is only useful as the result type of side-effecting primitives.
:cstring (ctype instance ctype_cstring) This ctype is only for constant strings (like const char[] in C). It is not possible to build an unboxed :cstring. Every :cstring variable may only be bound to constant strings (not to something inside some heap).

MELT formal arguments appear in lambda, defun, defprimitive, defciterator and multicall forms. The first formal argument of the defun, lambda and multicall constructs should (if given) be a :value. Ctypes also appear in let bindings. Each MELT expression (or constant or variable) has a ctype (usually :value).

The :value ctype is the only ctype for boxed values. Every other ctype is for unboxed stuff.

1.7.2.3 MELT boxed values

Most data manipulated by MELT code are values. Values are allocated in the nursery generation of the MELT heap, and are later (if alive) copied into the GGC heap. A minor MELT garbage collection, which runs quickly and often, only copies live values (in particular, local variables of MELT functions) out of the nursery, into the GGC heap. A full MELT garbage collection also invokes the GGC collector, and so scans the entire heap.

MELT boxed values can be one of:

nil (represented by the C NULL pointer and noted () in MELT) is a value. It is the initial or default value everywhere.
multiples (or MELT tuples) - they are a fixed array of values agglomerated as a multiple.
closures (or MELT functional values) represent a functional value, containing a routine and closed values; the only way of making closures is through the lambda and defun syntactic constructs.
routines are the reification of MELT functions (generated internally).
lists are singly linked lists of pairs. Efficient access to the first and last pair of the list is provided. Unlike in many other Lisps, lists are not simply pairs (but implemented as the grouping of the first and last pair contained in the list), so appending a list to another one, or a single value at the beginning or the end of a list, is a simple operation. Lists are never circular and have a finite length.
pairs are like CONS pairs in most other Lisps. In particular, a list knows its first and last pair. The head of a pair is an arbitrary boxed value, but its tail is a pair or nil.
triples (rarely used) have arbitrary head and middle values, but the tail is a triple or nil. They could be used like A-lists’ nodes in Lisp.
integers (are actually boxed longs).
strings (are like boxed cstrings; they are immutable, so the characters inside them do not change; they are terminated by a null byte, like in C).
string-buffers (are mutable buffers of strings and may grow appropriately; they are a bit like C++ string streams).
boxes (like references in ML, are mutable boxed containers).
objects have values in their fields (or slots) and are described below; each MELT object has a class (which is also a MELT object), which are organized in a single-inheritance class hierarchy rooted at CLASS_ROOT.
mixints are mixing an arbitrary mutable MELT value and an integer.
mixlocs (for experts; they are mixing an arbitrary mutable MELT value and a location_t indicating a location inside e.g. a MELT or C source file).
object maps are a hash table association between MELT objects and arbitrary non-null MELT [boxed] values.
string maps are a hash table dictionary mapping strings to arbitrary non-null MELT values.
boxed ctypes. Each ctype has its boxed representation, which is a value containing the raw (unboxed) stuff of that ctype, like a gimple, etc.
boxed ctype maps. Each ctype [except :long, :void and :cstring] has its boxed map, a hash table associating (non-null) stuff of the given ctype with arbitrary non-null MELT [boxed] values. For example, a gimple map associates GCC gimples to arbitrary non-null MELT values (usually MELT objects). This is very useful to represent a relationship (conceptually an attribute) between gimples and MELT values such as objects, without having to enhance the definition of the gimple structure inside ‘gcc/gimple.h’.
special values. They are useful to represent stuff like MPFR things (arbitrary precision numbers), PPL coefficients, etc. The MELT runtime is able to run a sort of destructing C function when a special value is no longer used, so the handling of special values is more expensive than for other values.

Notice that (contrary to most other Lisps) MELT symbols and MELT s-expressions are both objects (respectively of class CLASS_SYMBOL and CLASS_SEXPR). The reader function (which is not as versatile as in Common Lisp) deals with them.

Adding MELT value types requires enhancing the ‘gcc/melt.h’ and ‘gcc/melt.c’ files.

Each MELT [boxed] value starts with a discriminant. This discriminant is a MELT object (it cannot be nil). The nil value has conceptually its own discriminant DISCR_NULLRECV, but is of course represented by C NULL pointer. Discriminants are used by the garbage collector (precisely to discriminate various MELT boxed data types using the object number of their discriminant), and by the MELT message sending machinery (hence messages sent to the nil MELT value are processed using the DISCR_NULLRECV discriminant). Each kind of MELT value has its own discriminant, but sometimes it is useful to have several discriminants possible for the same kind of MELT value. For example, MELT strings can have DISCR_STRING or DISCR_VERBATIMSTRING etc., and verbatim strings are handled specially (in particular when printing them inside generated C code). Every MELT [boxed] value has an immutable discriminant, set at the time of the value’s creation.

Conventionally MELT non-object values have a primitive to test them called like is_*, a primitive to build them called like make_* [which takes a discriminant as the first argument], and the accessing and modifying primitives share a common prefix. In particular, object maps are tested with is_mapobject, built with make_mapobject, accessed with mapobject_get and updated using the mapobject_put and mapobject_remove primitives. For more details, look into file ‘gcc/melt/warmelt-first.melt’.
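
The list representation described above (a list value that records both its first and its last pair) can be pictured with the following Python sketch. It is a toy model under stated assumptions, not the runtime's C data structure: the class and method names (Pair, MeltList, append, prepend, to_python) are hypothetical and do not correspond to MELT primitives.

```python
class Pair:
    """A pair: an arbitrary head, and a tail that is another Pair or None (nil)."""

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


class MeltList:
    """A list groups the first and the last pair of a chain of pairs."""

    def __init__(self):
        self.first = None
        self.last = None

    def append(self, value):
        # Adding at the end only touches the last pair: no traversal needed.
        pair = Pair(value)
        if self.last is None:
            self.first = self.last = pair
        else:
            self.last.tail = pair
            self.last = pair

    def prepend(self, value):
        # Adding at the beginning only touches the first pair.
        pair = Pair(value, self.first)
        self.first = pair
        if self.last is None:
            self.last = pair

    def to_python(self):
        # Walk the pairs from first to last, collecting the heads.
        out, pair = [], self.first
        while pair is not None:
            out.append(pair.head)
            pair = pair.tail
        return out
```

Because the last pair is always at hand, adding a value at either end takes the same small amount of work however long the list is, which is the point made in the description of lists.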

1.7.2.4 MELT objects and classes

An important (and common) kind of MELT [boxed] value is the MELT object. A MELT object contains exactly:

the discriminant or class of the object; like every [boxed] value, a MELT object starts with a discriminant; for objects, it is their class, which is an object itself. We say “Cl is the class of Ob” or equivalently “Ob is a [direct] instance of Cl” when Ob is a MELT object of discriminant Cl.
the hash code of the object is an unsigned non-zero (more or less random, immutable i.e. fixed) integer, given at object build time (i.e. instantiation time).
the object number or objnum of the object is a small unsigned short integer. It is usually assigned at object build time. For discriminants Di, their objnum is also called the magic number of the values Va of the given discriminant Di.
the object length or size is the number of slots or fields of the objects. All objects Ob of a given class Cl have the same fixed number of slots (no more than 32767 slots and almost always a lot less, e.g. at most a dozen). Some objects could have a length of 0 (if their class is the CLASS_ROOT or has no direct or inherited fields), but this is very unusual.
the object slots or object fields are the values contained inside the object. These fields may be mutable; their number is fixed (it is the object length).

In practice, every object’s slot is described by a field object (of class CLASS_FIELD) inside the object’s class.

Every discriminant (in particular every class) is an object with the following fields (or slots):

prop_table is the property object map associating objects to values, and usable as a P-list.
named_name is the boxed string naming the discriminant.
disc_methodict is an object map associating selectors to closures (method implementations).
disc_super is the super-discriminant (or the super-class for objects)

The root discriminant is DISCR_ANYRECV. The discriminant of the nil value is DISCR_NULLRECV. Other types of values have discriminants like DISCR_ANYRECV DISCR_BASICBLOCK DISCR_BOX DISCR_CHARINTEGER DISCR_CLOSURE DISCR_EDGE DISCR_GIMPLE DISCR_GIMPLESEQ DISCR_INTEGER DISCR_LIST DISCR_MAPBASICBLOCKS DISCR_MAPEDGES DISCR_MAPGIMPLES DISCR_MAPGIMPLESEQS DISCR_MAPOBJECTS DISCR_MAPSTRINGS DISCR_MAPTREES DISCR_METHODMAP DISCR_MIXEDINT DISCR_MIXEDLOC DISCR_MULTIPLE DISCR_NAMESTRING DISCR_NULLRECV DISCR_PAIR DISCR_ROUTINE DISCR_SEQCLASS DISCR_SEQFIELD DISCR_STRBUF DISCR_STRING DISCR_TREE DISCR_VERBATIMSTRING. Some discriminants are specialized by having a meaningful (i.e. not DISCR_ANYRECV) super-discriminant (i.e. the value inside the :disc_super slot). For example, DISCR_METHODMAP is used for object maps which are method maps (mapping a selector to a function implementing a method), instead of the plain DISCR_MAPOBJECTS. [For experts:] It is possible to make additional discriminants using definstance with CLASS_DISCR as the class.

Classes are discriminants, but in addition have the following fields (or slots):

class_ancestors is the multiple (of discriminant DISCR_SEQCLASS) of the class’s ancestors. Testing that a given object has some given class as its direct class or indirect ancestor is quick (is_a primitive in MELT, melt_is_instance_of function in C code).
class_fields is the multiple (of DISCR_SEQFIELD) of the class’s fields (both those inherited from ancestors and those of the class itself).
class_objnumdescr is usable for describing the objnum of instances.
class_data is an additional slot for holding class data.

Fields are slot descriptors (objects of CLASS_FIELD). They are named (so they inherit the fields prop_table and named_name). Their objnum is their index. Their specific slots are:

fld_ownclass gives the class defining the field.
fld_typinfo can be used for describing the field’s type in instances.

Beware that the structure of classes, fields and discriminants is described not only in ‘warmelt-first.melt’ but also “built-in” in files ‘melt.c’ and ‘melt.h’ so changing them is very tricky.

Fields should have a globally unique name. Conventionally, fields common to the same class share a common prefix for their name.

The defclass construct builds and fills class and fields objects. Don’t make instances of CLASS_CLASS or CLASS_FIELD otherwise!

Objects are built using the make_instance construct, or statically using definstance. In addition, defselector and defclass also statically build objects (like classes and fields).

Exporting a class means exporting the class object and its own fields.
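
One way to see why a class_ancestors sequence makes the instance test quick is the Python sketch below. It is an illustration under assumptions, not a transcription of melt_is_instance_of: the ordering of class_ancestors (root first, direct super-class last) and the lookup by depth are assumptions of the sketch, and CLASS_SHAPE, CLASS_CIRCLE and CLASS_LABEL are made-up classes.

```python
class MeltClass:
    """Toy class object with disc_super and class_ancestors slots."""

    def __init__(self, name, disc_super=None):
        self.named_name = name
        self.disc_super = disc_super
        # Assumed ordering: root first, direct super-class last.
        if disc_super is None:
            self.class_ancestors = ()
        else:
            self.class_ancestors = disc_super.class_ancestors + (disc_super,)


class MeltObject:
    """Toy object: only its class matters here."""

    def __init__(self, cls):
        self.cls = cls


def is_a(value, cls):
    """True if value is a direct or indirect instance of cls."""
    if not isinstance(value, MeltObject):
        return False
    own = value.cls
    if own is cls:
        return True
    # cls sits at a fixed depth in the hierarchy, so one indexed lookup in
    # the ancestors of the object's class is enough (no walk up the chain).
    depth = len(cls.class_ancestors)
    ancestors = own.class_ancestors
    return depth < len(ancestors) and ancestors[depth] is cls


# Hypothetical single-inheritance hierarchy rooted at CLASS_ROOT.
CLASS_ROOT = MeltClass("CLASS_ROOT")
CLASS_SHAPE = MeltClass("CLASS_SHAPE", CLASS_ROOT)
CLASS_CIRCLE = MeltClass("CLASS_CIRCLE", CLASS_SHAPE)
CLASS_LABEL = MeltClass("CLASS_LABEL", CLASS_ROOT)
```

With this layout, the cost of the test does not depend on how deep the hierarchy is, which is consistent with the remark that the test is quick.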

1.7.2.5 MELT function application

Function applications are noted (fun args …). There may be no arguments, e.g. just (fun). If arguments are given, the first argument must be a :value. So (f 1 x) is incorrect (because 1 is an unboxed :long); use (f x 1) instead. Usually, the function is just a variable bound to a function, but it may be a more complex expression, like ((if (p x) f g) x y) which, depending on the test (p x) applies either f or g to x y.

The application of a non-function returns null. The melt_apply C function doing the application checks that the applied function is indeed a function (ie a closure). Function applications are never tail-recursive; they always consume some stack space.

Named functions are defined using the defun construct, using a Common Lisp like syntax (not the Scheme define). If the formal arguments list is not empty, its first element (the first formal argument of a named or anonymous function) should be a :value.

Functions are neither variadic nor polymorphic; their signature is essentially fixed. They should expect a fixed number of arguments [there is no variable argument facility in MELT], each with a defined ctype (the first argument should be a :value), and return a fixed number of results (the first result should be a :value), each with a defined ctype. An argument which does not have the expected ctype, or which is missing, is initialized to null or 0. Likewise, a secondary result which does not have the expected ctype is ignored or set to null or 0.

A function should [always] return a primary result of ctype :value and may also return secondary results (using the return construct). The only way of getting the secondary results of a function call (or a message send) is through the multicall construct, which binds all the results of the call or send to the formal arguments in the multicall. Function applications not done in a multicall have all their secondary results (if any) ignored.
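
The handling of missing or mistyped arguments can be sketched as follows in Python. This is a model of the rule stated above, not the code of melt_apply: the function names are hypothetical, actual arguments are represented as (ctype, value) pairs, and silently dropping extra actual arguments is an assumption of the sketch.

```python
def default_for(ctype):
    """Cleared content for a formal: 0 for :long, null (None) otherwise."""
    return 0 if ctype == ":long" else None


def bind_arguments(formals, actuals):
    """Bind actual arguments to formals.

    formals is a list of (name, ctype) pairs; actuals is a list of
    (ctype, value) pairs. A missing actual, or one whose ctype differs from
    the formal's ctype, leaves the formal cleared (null or 0).
    """
    bound = {}
    for index, (name, ctype) in enumerate(formals):
        if index < len(actuals) and actuals[index][0] == ctype:
            bound[name] = actuals[index][1]
        else:
            bound[name] = default_for(ctype)
    return bound
```

The same function can model how a multicall binds results: its result formals play the role of formals, and the primary and secondary results play the role of actuals, so a secondary result of the wrong ctype leaves its formal cleared.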

1.7.2.6 MELT function abstraction and closures

Function abstraction (i.e. making anonymous functions) is done using the lambda construct. Only values can be closed, hence it is not possible to close a non-boxed value, so (let ( (:long one 1) ) (lambda (a) (f a one))) is incorrect (and rejected by the MELT translator).

Actually, every MELT function is really a closure, so defun binds a name to the closure which is the named function.

Closures are :values. Use the is_closure primitive to test that a given value is indeed a closure. The only way of building closures is through lambda or defun. Closures contain a routine pointer (routines are also :values) and closed values. [For experts] The size of a closure is available through the closure_size primitive. Its routine is available through the closure_routine primitive. To get its n-th closed value, use the closure_nth primitive. At MELT runtime, each MELT call frame for MELT function application (or message sending) knows its closure.

Routines correspond to MELT generated C functions (with their constant values).

1.7.2.7 MELT message sending

A message invocation is done using the construct (selector-name receiver args …). This construct is syntactically the same as function (or primitive) application, and is distinguished by the fact that the selector-name has been previously defined with defselector or is imported as bound to an instance of CLASS_SELECTOR. The selector should be such a name and cannot be an expression. The receiver can be any :value (even null). The args are optional and can have any ctype (but a selector should have a fixed and well defined signature). Use export_values to export selectors.

A method is just a functional value, installed through the install_method function. This function expects a discriminant or class, a selector, and a function (the method). Method installation is very dynamic and can be done at any time.

A message invocation (i.e. an expression starting with a selector) can be done on any boxed value. If it is an object, its class is used; otherwise its discriminant is used (so DISCR_NULLRECV is used when sending to nil). To send a message of selector sel (an instance of CLASS_SELECTOR) to a receiver recv of discriminant (e.g. the class of an object) dis, the following procedure is used:

dis should be a discriminant; if it is not an instance of CLASS_DISCR, stop and do nothing.
get the disc_methodict slot of dis; it should be an object map (i.e. a “dictionary” of methods) that we call md.
get the method meth associated to sel in md; if meth is a function (i.e. a boxed MELT closure), apply meth to the receiver recv and any additional arguments. This ends the message invocation.
otherwise, no method is found, so replace dis by its super-discriminant (its slot disc_super) and repeat from the first step. Hence, methods are also looked up in superclasses, etc., and so are properly inherited.

Notice that message invocation is more dynamic (hence slower) than e.g. C++ virtual member functions, and that method maps can be upgraded at any time.
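
The lookup procedure above can be modelled in a few lines of Python. This is a sketch under assumptions rather than the MELT runtime: selectors are represented by plain strings instead of instances of CLASS_SELECTOR, the Discr and Value classes and the describe selector are hypothetical, and giving DISCR_NULLRECV and DISCR_STRING the root DISCR_ANYRECV as super-discriminant is an assumption made for the example.

```python
class Discr:
    """Toy discriminant with a method dictionary and a super-discriminant."""

    def __init__(self, name, disc_super=None):
        self.named_name = name
        self.disc_super = disc_super
        self.disc_methodict = {}      # maps a selector to a method


class Value:
    """Toy boxed value: a discriminant plus some payload."""

    def __init__(self, discr, payload=None):
        self.discr = discr
        self.payload = payload


DISCR_ANYRECV = Discr("DISCR_ANYRECV")
DISCR_NULLRECV = Discr("DISCR_NULLRECV", DISCR_ANYRECV)   # assumed super
DISCR_STRING = Discr("DISCR_STRING", DISCR_ANYRECV)       # assumed super


def install_method(discr, selector, method):
    # Installation is dynamic: it may happen at any time, even after sends.
    discr.disc_methodict[selector] = method


def send(selector, receiver, *args):
    """Send selector to receiver, walking up the super-discriminants."""
    dis = DISCR_NULLRECV if receiver is None else receiver.discr
    while isinstance(dis, Discr):     # otherwise: stop and do nothing
        meth = dis.disc_methodict.get(selector)
        if callable(meth):
            return meth(receiver, *args)
        dis = dis.disc_super          # not found here: try the super
    return None
```

In this model, a method installed on DISCR_ANYRECV is found for every receiver, including nil, until a more specific discriminant gets its own method for the same selector.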

1.7.2.8 MELT syntax constructs

The table below gives MELT syntax constructs, in alphabetical order. [Experts can add new constructs using macros, and implementing appropriate methods in the MELT translator].

and: (and e1 e2 e3 …) is (like in all Lisps) used for sequential conjunction; it is the same as (if e1 (if e2 e3)) etc… Any number (at least one) of conjuncts are possible. All the conjuncts (e1 …) should have the same ctype (usually :value).
assert_msg: (assert_msg msg check) aborts when check is false (using the assert_failed primitive, giving the source file position) and displays the given msg, when GCC is built for debugging with ENABLE_CHECKING. If GCC is not built for debugging, neither operand is used. The entire assert_msg expression evaluates to nil.
comment: (comment msg) evaluates to nil and outputs the msg as a C comment in the C translation. Don’t use /* or */ in msg. When a comment appears at the beginning of a MELT compilation unit, it appears at the beginning of the generated C file; this is useful for making copyright notices appear both in the MELT source file and the generated C code.
compile_warning: (compile_warning msg exp) evaluates like exp but also emits a message at MELT compilation time. Intended use is similar to #warning in C.
cond: (cond condition1 condition2 …) is (like in all Lisps) a conditional evaluation. Each condition has the form (test then1 then2 … thenk). The tests are evaluated in order; for the first test that is true, all its thens are evaluated in sequence, and the last is the result of the whole cond expression. The last condition can be (:else else1 … elsek); if no previous test succeeded, all the elses are sequentially evaluated, and the last of them is the whole cond result. Notice that (cond (test1 then1) (test2 then2a then2b) (:else else1 else2 else3)) is the same as (if test1 then1 (if test2 (progn then2a then2b) (progn else1 else2 else3))).
cppif: [for experts] (cppif name then-cpp else-cpp) is translated using a C directive #if name to the translation of then-cpp or else-cpp.
current_module_environment_container: [for experts] (current_module_environment_container) evaluates to an object of CLASS_CONTAINER containing the current module environment.
debug_msg: (debug_msg expv msg [count]), where the count expression (of ctype :long) is usually omitted, is useful for debugging output of the value of expv (with the -fmelt-debug program option), using the debug_msg_fun function. The entire debug_msg expression is roughly equivalent to (cppif ENABLE_CHECKING (debug_msg_fun expv msg count filename lineno) ()) and evaluates to nil.
defciterator: The form (defciterator iter-name start-formals state-symbol local-formals before-expansion after-expansion) defines a C-iterator named iter-name. The start-formals is a [binding] list of formal arguments [given to the C-iterator]. The state-symbol is usable in the expansions, where it is expanded to a unique C identifier. The local-formals is a [binding] list of variables local to the expanded block. The before-expansion and after-expansion are lists of items like strings (appearing as is in the C expansion) or symbols (either from the start formals, or the local formals, or the state symbol).
defclass: The form (defclass class-name [:predef predefined] [:super superclass-name] :fields fields-list) defines a class named class-name of super-class named superclass-name with the given fields-list (a list of field names) and an optional predefined name (for predefined classes [giving a predefined is for experts]).
defcmatcher: The form (defcmatcher cmatcher-name match&in-formals out-formals state-sym test-expansion fill-expansion oper-expansion) defines a matching construct by its C translation. The match&in-formals gives the matched thing ctype (as the first formal argument, either a boxed value or a raw stuff) and input arguments (rest of formals). The out-formals are the signature of the deconstructed things. The test-expansion expands (as a C boolean-like expression) to the test part of the match. The fill-expansion expands (as a sequence of C instructions) to the deconstructing part. The oper-expansion is used, much like in primitives, when the cmatcher-name appears as an operator in an expression context.
definstance: The form (definstance instance-name class-name [:predef predefined] [:obj_num object-number] :field-name field-value …) statically defines an instance of name instance-name of the class class-name. [expert usage: a predefined name and an object-number may also be given].
defprimitive: The form (defprimitive primitive-name formals-list ctype expansion …) statically defines a C primitive named primitive-name with the given formals-list and the given return ctype. The expansions are either strings or formal names.
defselector: The form (defselector selector-name selector-class :field-name field-value …) defines a selector. Usually selector-class is CLASS_SELECTOR, and no other fields are given. Once a name is bound to a selector, every further occurrence of that name in operator position is considered as a message invocation.
defun: The form (defun function-name formals-list body …) defines a function named function-name. The ctype of the first (if any) formal argument (in the formals-list) should be :value. The function-name can appear in the given body (for recursion).
exit: The form (exit loop-label expr …), only used inside forever loops, causes the lexically enclosing forever loop named by loop-label to be exited, after evaluation of the exprs. The last such value (or nil if no expr is given) is the result returned by the forever loop. exit forms are similar to Ada’s exit or C break (not to longjmp). The exit should be local to the containing procedure: it cannot jump across lambdas.
export_class: The form (export_class class-name …) exports all the given class-names and their fields.
export_macro: [For experts] The form (export_macro macro-symbol expander) exports a macro binding for the given macro-symbol with the expander function. The macro macro-symbol is defined in the environment exported by the current module, so is available in other modules only (but not in the current one).
export_patmacro: [For experts, not implemented] The form (export_patmacro patmacro-symbol pat-expander mac-expander) exports a pattern macro binding for the given patmacro-symbol with the pat-expander as a pattern expanding function (used in patterns) and the mac-expander as a macro expanding function (used in expressions).
export_values: The form (export_values exported-name …) exports all the names given as arguments, as values. For classes, export_class should be used, otherwise the fields are not exported.
fetch_predefined: [For experts] (fetch_predefined predefined-name-or-number)
forever: (forever label-name body …) when evaluated, the bodies are evaluated in sequence, and indefinitely re-evaluated again. The only way of getting out from a forever loop is with exit (using the given label-name, lexically inside the body) or return. Avoid using a bound variable name as a label-name.
get_field: (get_field :field-name expr) retrieves the field named :field-name from the object returned by the expr expression. If that value is not an appropriate object (of the class owning the :field-name), the result is nil.
if: (if test then-exp [else-exp]). When evaluated, the test is first evaluated. If it is true, the then-exp is evaluated and is the result of the whole if. If it is false (either 0 if ctype-d :long, or the null pointer for :value and other ctypes), the optional else-exp is evaluated and is the result of the whole if (0 or null when no else-exp is given). Both the then-exp and the else-exp (if given) should have the same ctype.
instance: (instance class-name [:field-name field-value] …) is a constructive expression for instances, where the class-name is the name of a class (it cannot be a complex expression but should be a class statically known) and where each :field-name keyword (starting with a colon) is the name of some field (direct or inherited) of the class and the following field-value is an expression giving its initial value; the result of instance is a freshly built instance of the given class-name initialized with the fields (fields which are not mentioned are initialized with nil).
lambda: (lambda formal-args body …) is a constructive expression for function abstraction, it returns a closure, the anonymous function taking formal-args as arguments and evaluating sequentially the body expressions, returning the value of the last one. The first argument of a function and the first result that it is returning should be a :value.
let: (let (let-binding …) body …) is a sequential binding construct (closer to let* in other Lisps). The first operand should be a list of let-bindings. Others operands make the body, evaluated in sequence with the new bindings applied with lexical scoping. A let-binding is an optional ctype (:value by default) followed by a variable name (ie a symbol) followed by one expression. Variables bound by previous let-bindings are visible in the expression inside the current let-binding (so recursion is not permitted like with flet or letrec in some Lisps). Notice that a let-binding can bind a variable to unboxed stuff (like a plain long integer). The result of the whole let expression is the result of the evaluation of the last body expression, done with the new bindings.
letrec: (letrec (letrec-binding …) body …) is a recursive binding construct. The letrec bindings should only bind constructive expressions, that is lambda-s, tuple-s, instance-s and list-s.
list: (list expr …) is a constructive expression for lists. It returns a list made of the arguments.
match: (match expr match-case … ) NOT IMPLEMENTED YET. Do a pattern match. Evaluate expr and, for the first matching match-case, do its body. There is no :else clause; use the joker pattern ?_ for that purpose. A match-case is a simple match case (pattern body …) where body is evaluated with the pattern variables appearing in pattern bound. A match-case can also be a when match case (:when pattern when-cond body …) where the body is done when that pattern matches and the when-cond (evaluated with the pattern variables bound) is a true condition.
multicall: (multicall (result-formals) call-expr body …) is the only way to retrieve multiple (one primary and some secondary) results from a function application or a message invocation call-expr (which should syntactically be an application or an invocation, not anything else). The result-formals are syntactically like formal arguments; See section MELT formals. The first result formal should be of ctype :value. Secondary result formals which are not matching the ctype of the actual secondary result are cleared. The bindings of the result formals are local to the multicall expression and usable in the body sequence.
or: (or e1 e2 e3 …) is the sequential disjunction of e1 … (at least one disjunct). In particular (or a b) is the same as (if a a b) except that a is evaluated once. All the disjuncts should have the same ctype (usually :value).
parent_module_environment: [For experts] (parent_module_environment) returns the parent module’s environment.
progn: (progn e1 e2 … en) evaluates successively e1, then e2, and so on, and returns the value of the last expression en.
put_fields: (put_fields obj :field-name1 val1 …) updates the object value of obj by changing its field named :field-name1 to the value of val1 etc… (all the fields are updated at once). It is safe, in the sense that if obj is not an object of the appropriate class, nothing happens.
quote: (quote x) is the same as 'x and returns the symbol x itself (as an instance of CLASS_SYMBOL). When applied to an integer, like '1, it gives a constant boxed integer value (of DISCR_INTEGER). When applied to a string, like '"string", it gives a constant boxed string value (of DISCR_STRING). Therefore, when passed as an actual argument (to a primitive, a function, ...) '1 (a boxed integer value) is not the same as 1 (a raw integer stuff), and likewise '"abc" is a boxed string value, different from "abc" (a raw string stuff). This is very different from other Lisps! Only symbols, strings and integers can be quoted.
return: (return e1 …) returns from the entire containing function (i.e. defun or lambda). The first expression e1 should be of ctype :value and is evaluated as the primary result. Other expressions are evaluated (and can have different ctypes) and returned as secondary results. A (return) without argument is a convenience for returning the nil value. The ctype of the return is :value even if the return expression itself does not give a value (because it breaks the control flow), hence (or (return) 'x) is acceptable but tasteless.
setq: (setq var exp) assigns to the local variable var the value of exp (which is also the value of the entire setq expression). Both var and exp should have the same ctype.
store_predefined: [Expert] (store_predefined predef-name-or-number expr) Don’t use it if you don’t understand.
tuple: (tuple expr …) is a constructive expression for tuples (or multiples). It returns a tuple made of the arguments.
unsafe_get_field: (unsafe_get_field :field-name expr) retrieves the field named :field-name from the object returned by the expr expression (of ctype :value). If expr does not evaluate to an object instance (directly or indirectly) of the class defining the :field-name, the behavior is undefined and unsafe (GCC usually crashes).
unsafe_put_fields: (unsafe_put_fields obj :field-name1 val1 …) updates the object value of obj by changing its field named :field-name1 to the value of val1 etc… (all the fields are updated at once). If obj is not an object of the appropriate class for the fields, the behavior is undefined and unsafe (usually GCC crashes).
update_current_module_environment_container: [Expert] (update_current_module_environment_container) don’t use it if you don’t understand.
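
The equivalences given in the table for and and cond (in terms of if and progn) can be written down as a small rewriting function. The Python sketch below is only an illustration, not the MELT translator: s-expressions are represented as nested Python lists with symbols as strings, the function names are hypothetical, and returning the empty list (nil) for a cond without clauses, as well as requiring at least one expression after each test, are assumptions of the sketch.

```python
def expand_and(conjuncts):
    """(and e1 e2 ... en) becomes (if e1 (if e2 ... en))."""
    if not conjuncts:
        raise ValueError("and needs at least one conjunct")
    if len(conjuncts) == 1:
        return conjuncts[0]
    return ["if", conjuncts[0], expand_and(conjuncts[1:])]


def _sequence(exprs):
    # A single expression stands for itself; several are wrapped in progn.
    if not exprs:
        raise ValueError("a cond clause needs at least one expression")
    return exprs[0] if len(exprs) == 1 else ["progn", *exprs]


def expand_cond(clauses):
    """Rewrite the clauses of a cond into nested if expressions."""
    if not clauses:
        return []                      # assumption: an empty cond is nil
    test, *thens = clauses[0]
    rest = clauses[1:]
    if test == ":else":
        if rest:
            raise ValueError(":else must be the last clause")
        return _sequence(thens)
    if not rest:
        return ["if", test, _sequence(thens)]
    return ["if", test, _sequence(thens), expand_cond(rest)]
```

For the clauses (test1 then1) (test2 then2a then2b) (:else else1 else2 else3), expand_cond produces the nested form (if test1 then1 (if test2 (progn then2a then2b) (progn else1 else2 else3))), in which every else expression is kept.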

1.7.3 MELT modules and translation

[For experts mostly; familiarity with the notions of bindings and environments is expected.]

1.7.3.1 MELT environments and bindings

A MELT module uses previously available bindings (imported values, etc.) and provides its own bindings (exported values, etc.). Bindings are objects (of superclass CLASS_ANY_BINDING, e.g. of some class like CLASS_VALUE_BINDING, CLASS_MACRO_BINDING, CLASS_PATMACRO_BINDING, CLASS_INSTANCE_BINDING, etc.). Bindings are grouped in environments (themselves objects of class CLASS_ENVIRONMENT). Each environment is linked to its parent. So a MELT module is initialized in its parent module environment and yields its own module environment.

Hence MELT environments are objects with an env_bind field (the object map of bindings), an env_prev field (the previous environment), etc. All bindings are objects with a binder field (the bound “name”, e.g. a symbol, used as the key in the binding map of environments).
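
As a rough illustration of environments linked to their parents, here is a minimal C sketch. It is a toy model, not MELT runtime code: the names toy_env, toy_binding, toy_env_put and toy_env_find are hypothetical, a small fixed array stands in for the object map held in env_bind, and the lookup rule (search an environment, then its previous environment, and so on) is assumed rather than taken from the MELT sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these types are hypothetical, not the MELT runtime's. */
#define TOY_MAX_BINDINGS 8

struct toy_binding {
  const char *binder;  /* the bound name; plays the role of the binder field */
  int payload;         /* stands in for whatever the binding carries */
};

struct toy_env {
  struct toy_binding bind[TOY_MAX_BINDINGS];  /* role of env_bind */
  size_t nbind;
  struct toy_env *prev;                       /* role of env_prev */
};

/* Add a binding to ENV; returns 0 on success, -1 if the toy table is full. */
int toy_env_put (struct toy_env *env, const char *name, int payload)
{
  assert (env != NULL && name != NULL);
  if (env->nbind >= TOY_MAX_BINDINGS)
    return -1;
  env->bind[env->nbind].binder = name;
  env->bind[env->nbind].payload = payload;
  env->nbind++;
  return 0;
}

/* Look NAME up in ENV, then in each previous environment in turn
   (assumed lookup rule).  Returns NULL when NAME is bound nowhere. */
const struct toy_binding *toy_env_find (const struct toy_env *env,
                                        const char *name)
{
  for (; env != NULL; env = env->prev)
    for (size_t i = 0; i < env->nbind; i++)
      if (strcmp (env->bind[i].binder, name) == 0)
        return &env->bind[i];
  return NULL;
}
```

In this sketch a module environment whose prev points to its parent sees the parent's bindings, while the parent does not see the module's.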

User MELT code is ordinarily not supposed to explicitly change environments and bindings (but they are changed implicitly at module initialization).

Advanced MELT extension developers may occasionally, and with caution, use the current_module_environment_container macro (which is actually a reference, not a container) and the parent_module_environment macro to obtain the current and parent environments.

1.7.3.2 Translating a MELT module

A MELT file ‘foo.melt’ (which can be viewed as defining the foo MELT module) is translated into a C source file ‘foo.c’, which is then compiled into a dynamically loadable shared library, usually ‘foo.so’ on Linux. The translation to C is done using cc1 or gcc -c with the -f[plugin-arg-]melt-mode=translatefile -f[plugin-arg-]melt-arg=foo.melt -fmelt-secondarg=foo.c options. The generated file ‘foo.c’ is usually quite big, and #include-s only one file, "run-melt.h", which includes all the rest. It essentially contains one static C function (of signature compatible with melt_apply) for each defun or lambda function in MELT, one big exported melt_start_this_module C function which does all the initializations, and some other code. The initialization code builds all the required data (quoted symbols, closures, classes, fields, boxed strings, static instances defined thru definstance, etc.); MELT modules have no data outside of this melt_start_this_module function.

The start function melt_start_this_module (which is found by dynamic loading of the module, usually thru dlopen and dlsym or their equivalent, and called only once) expects a parent environment and returns the newly filled module environment(5).

To generate a MELT binary module from a MELT source file, use -fmelt-mode=translatetomodule.

1.7.3.3 MELT module initialization and exports

Names defined in a module (as a function thru defun, as a class thru defclass, as a field, etc.) are not visible outside it (i.e. to MELT modules loaded afterwards) unless they are exported. Most names (e.g. functions, selectors, instances) are exported as values using the export_values construct. Classes are usually exported using export_class(6), which also exports all the fields defined by the exported class itself (inherited fields are not exported, unless their class was itself export_class-ed).

Advanced users can extend the MELT language by exporting macros using the export_macro construct, which gets a macro name and its macro expander function. The expander takes as arguments the source expression (of CLASS_SEXPR), the environment (of CLASS_ENVIRONMENT) and the current expander, and produces an instance of a subclass of CLASS_SRC.

1.7.3.4 MELT translation steps

The generated C code is at a much lower level than the MELT source. The MELT source code is usually in a file but can be elsewhere (a list of s-exprs in memory).

The generated C code interacts with the MELT runtime and garbage collector; in particular, every value (even a temporary one) should be explicitly stored in MELT frames known to the GC. Hence, MELT expressions are quickly normalized: (f (g x) y) becomes something similar to let gg = g x in f gg y(7) where gg is a fresh variable (actually an instance of CLASS_CLONEDSYMBOL).
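
To make the idea of normalization concrete, here is a small self-contained C sketch that flattens nested calls into a chain of let bindings. It is only an illustration of the general technique and not the MELT normalizer: the toy_expr and toy_norm types and all toy_ functions are hypothetical, expressions are limited to variables and calls with one or two arguments, and fresh names are simply t1, t2, …

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy expression: a leaf variable (nargs == 0), or a call of the
   function NAME to one or two argument expressions. */
struct toy_expr {
  const char *name;
  int nargs;
  const struct toy_expr *args[2];
};

struct toy_norm {
  char out[256];  /* the let bindings emitted so far, then the final call */
  int counter;    /* source of fresh temporary names */
};

static void toy_atomize (struct toy_norm *n, const struct toy_expr *e,
                         char *atom, size_t atomsz);

/* Append TEXT to the output buffer (truncating if the buffer is full). */
static void toy_emit (struct toy_norm *n, const char *text)
{
  size_t used = strlen (n->out);
  snprintf (n->out + used, sizeof n->out - used, "%s", text);
}

/* Write into LINE the flat call "name a1 a2" in which every argument
   is a plain name; bindings for nested calls are emitted first. */
static void toy_flatten_call (struct toy_norm *n, const struct toy_expr *e,
                              char *line, size_t linesz)
{
  char atoms[2][32];
  assert (e->nargs >= 1 && e->nargs <= 2);
  for (int i = 0; i < e->nargs; i++)
    toy_atomize (n, e->args[i], atoms[i], sizeof atoms[i]);
  int used = snprintf (line, linesz, "%s", e->name);
  for (int i = 0; i < e->nargs && used >= 0 && (size_t) used < linesz; i++)
    used += snprintf (line + used, linesz - (size_t) used, " %s", atoms[i]);
}

/* Put into ATOM a name holding the value of E: E itself for a variable,
   otherwise a fresh temporary bound to the flattened call. */
static void toy_atomize (struct toy_norm *n, const struct toy_expr *e,
                         char *atom, size_t atomsz)
{
  char call[96], binding[160];
  if (e->nargs == 0)
    {
      snprintf (atom, atomsz, "%s", e->name);
      return;
    }
  toy_flatten_call (n, e, call, sizeof call);
  snprintf (atom, atomsz, "t%d", ++n->counter);
  snprintf (binding, sizeof binding, "let %s = %s in ", atom, call);
  toy_emit (n, binding);
}

/* Normalize E into N->out. */
void toy_normalize (struct toy_norm *n, const struct toy_expr *e)
{
  char call[96];
  n->out[0] = '\0';
  n->counter = 0;
  if (e->nargs == 0)
    {
      toy_emit (n, e->name);
      return;
    }
  toy_flatten_call (n, e, call, sizeof call);
  toy_emit (n, call);
}
```

Given the tree for (f (g x) y), toy_normalize leaves let t1 = g x in f t1 y in the buffer. Inner calls are bound before the call that uses them, which is what allows every intermediate value to get a named slot.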

The reader, or some other source, provides a list of s-expressions to be translated. Each such s-expression is an instance of CLASS_SEXPR and so has the fields prop_table, loca_location and sexp_contents. The :loca_location field is a mixloc giving the starting position and file of the s-expr. The :sexp_contents field is a list value containing the s-expression elements. Leaves are read specifically, e.g. as boxed integers (of DISCR_INTEGER) for integers, or as symbols (instances of CLASS_SYMBOL) or keywords (instances of CLASS_KEYWORD), etc. All these classes are defined in ‘warmelt-first.melt’.

Then s-expressions are macro-expanded into objects of subclasses of CLASS_SRC. Standard macros (in particular all the constructs defined above, see section MELT syntax constructs) are defined in ‘warmelt-macro.melt’. For instance, the if macro is expanded by the mexpand_if expander function (private to ‘warmelt-macro.melt’), which makes an instance of CLASS_SRC_IFELSE with fields :src_loc :sif_test :sif_then :sif_else; this mexpand_if expander is given to export_macro. Macro expanders might need some of the functions expand_apply, lambda_arg_bindings, macroexpand_1, … defined in ‘warmelt-macro.melt’.

After macro-expansion, the expanded source code (instances of some subclass of CLASS_SRC) is normalized into instances of subclasses of CLASS_NREP (for normal representations, i.e. nreps) by code in ‘warmelt-normal.melt’. Normal expressions are not nested, so we separate simple nreps from complex normal expressions (CLASS_NREP_SIMPLE vs CLASS_NREP_EXPR). Normalization means not only adding extra internal lets (i.e. instances of CLASS_NREP_LET) but sometimes also computing additional information, such as the ctype of many expressions. Normalization is in particular done with the normal_exp selector (returning the nrep primarily and, secondarily, a list of additional bindings), and with other utilities such as normalize_tuple, get_ctype, wrap_normal_letseq, etc. For instance, the normalization of if constructs is done in the normal_exp method for CLASS_SRC_IF, in a private function called normexp_if, which returns an instance of CLASS_NREP_IF with fields :nrep_loc :nif_test :nif_then :nif_else :nif_ctyp and a list of additional normal bindings (of CLASS_NORMLET_BINDING). Macro-expansion and normalization sometimes give simpler representations; e.g. the if and or constructs all get normalized as instances of CLASS_NREP_IF.

After normalization, nreps (which are expression-like) are transformed in the “code generation”(8) step into instruction-like representations called objcodes, i.e. instances of subclasses of CLASS_OBJCODE. This happens in ‘warmelt-genobj.melt’ using the compile_obj selector, which, applied to an nrep and a generation context (a merge of various information), produces objcodes. Moving from nrep expressions to instructions very often involves putting a destination on an nrep thru the put_objdest selector.

Finally, the objcode is output by code in the ‘warmelt-outobj.melt’ file into two string buffers (one for the header part, one for the body part), using several selectors like output_c_code, output_c_declinit, output_c_initfill, output_c_initpredef. Only once all the objcodes have been output into the string buffers is their content actually written to the generated C file, all at once.

Advanced users can extend the MELT language by implementing extensions at various levels of the MELT translator.

Several important data and functions are available thru the initial_system_data instance (the only instance of CLASS_SYSTEM_DATA), including the exporting and importing machinery, the fresh module environment maker, and the symbol and keyword dictionaries and internizers.

All of the MELT translation occurs in ‘warmelt-*.melt’ files, which generate their ‘warmelt-*.c’ counterparts (these generated files are distributed with the GCC sources). Be careful to minimize the interaction between these files and the rest of GCC (in particular, avoid strong dependencies on GCC internal data representations like gimple), so that the translating and translated files ‘warmelt-*.c’ can be regenerated from ‘warmelt-*.melt’ even when GCC internal passes evolve(9).

1.7.4 Writing GCC passes in MELT

[For experts, knowing about GCC passes in general]

GCC passes can be written in MELT. See the ‘ana-*.melt’ files.

1.8 Writing C code for MELT

[For experts] Sometimes (e.g. to implement a new primitive) it may be necessary to write some C code for MELT. We describe here the coding conventions to follow, in particular because MELT has a copying generational garbage collector (which changes pointers when copying values out of the nursery).

Above all, avoid coding in C (a cumbersome task) and prefer writing MELT code when possible.

Remember that MELT pointers can move at every allocation and every MELT related call.

First, a real example. To box a long integer into a MELT value, MELT code has to use the make_integerbox primitive, defined in ‘warmelt-first.melt’ as

(defprimitive make_integerbox (discr :long n) :value
  #{(meltgc_new_int((meltobject_ptr_t)($discr), ($n)))}#)

If the passed discr is not a discriminant for boxed integers, make_integerbox gives nil.

To get the boxed integer’s content, use the getint primitive in MELT. To test if a value is a boxed integer, use the is_integerbox primitive.

The meltgc_new_int routine is implemented in ‘melt.c’ with the following code. We give it in full, with additional comments:

melt_ptr_t
meltgc_new_int (meltobject_ptr_t discr_p, long num)
{
  MELT_ENTERFRAME (2, NULL);
#define newintv curfram__.varptr[0]
#define discrv  curfram__.varptr[1]
#define object_discrv ((meltobject_ptr_t)(discrv))
#define int_newintv ((struct meltint_st*)(newintv))

We first create a MELT frame using the MELT_ENTERFRAME macro, which creates and initializes the frame and installs it as the topmost frame. The first argument is the number of local MELT values; the second argument is the current MELT closure (so it is NULL for C code which is not the code of a routine). Instead of writing curfram__.varptr[0] we #define some more descriptive names for readability. The frame is initially filled with nil values. The value pointer arguments (here discr_p) of the C function are conventionally named with a _p suffix. Every local MELT value should be inside your curfram__.varptr array.

  discrv = (void *) discr_p;

Every value passed as a C argument should be immediately copied into the MELT frame (i.e. as a local value) and the C argument should not be used directly afterwards. So never use _p suffixed arguments after having copied them into the frame.

  if (melt_magic_discr ((melt_ptr_t) (discrv)) != OBMAG_OBJECT)
    goto end;
  if (object_discrv->object_magic != OBMAG_INT)
    goto end;

We try to be safe, so we at least test that the discriminant is an object. We could have tested that it is indeed an instance of CLASS_DISCR, which would be safer but slower. However, we do test that the discriminant’s magic is indeed OBMAG_INT(10). If either test fails, we return nil by goto end. We cannot code a direct return statement, because that would not pop the topmost MELT frame.

  newintv = meltgc_allocate (sizeof (struct meltint_st), 0);
  int_newintv->discr = object_discrv;
  int_newintv->val = num;

We allocate space in the nursery with meltgc_allocate. This C function sometimes triggers a MELT garbage collection, so it may move any pointer inside any MELT frame. The first argument to meltgc_allocate is the size (sizeof) of the fixed part of the value, and the second is the size of its trailing variable part. The allocated zone should be immediately filled to make a valid MELT value.

end:
  MELT_EXITFRAME ();
  return (melt_ptr_t) newintv;
#undef newintv
#undef discrv
#undef int_newintv
#undef object_discrv
}

We end by popping the current MELT frame and returning. Popping the frame should always be done, so conventionally we use an end: label. To be good citizens for further C functions, we #undef every C macro defined for readability.
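
The frame discipline walked through above can be mimicked in a self-contained toy. The following sketch is not MELT runtime code: TOY_ENTERFRAME, TOY_EXITFRAME, toy_topframe and the other toy_ names are hypothetical stand-ins, and no garbage collector is involved. It only shows why every exit path goes through the end: label, so that the frame pushed on entry is always popped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the real MELT macros or types. */
struct toy_frame {
  struct toy_frame *prev;  /* the frame below this one */
  int nbvar;               /* number of value slots in use */
  void *varptr[4];         /* value slots a collector would scan */
};

struct toy_frame *toy_topframe = NULL;  /* top of the frame stack */
int toy_depth_seen = 0;                 /* depth recorded inside the call */

/* Expands to a declaration followed by a statement, so it must come
   after every other declaration of the enclosing function. */
#define TOY_ENTERFRAME(N) \
  struct toy_frame curfram__ = { toy_topframe, (N), { NULL } }; \
  toy_topframe = &curfram__

#define TOY_EXITFRAME() (toy_topframe = curfram__.prev)

/* Count the frames currently pushed. */
int toy_frame_depth (void)
{
  int depth = 0;
  for (struct toy_frame *f = toy_topframe; f != NULL; f = f->prev)
    depth++;
  return depth;
}

/* Return NUM_P if it points to a positive number, else NULL (the toy "nil"). */
void *toy_positive_or_nil (int *num_p)
{
  TOY_ENTERFRAME (1);
#define resv curfram__.varptr[0]
  toy_depth_seen = toy_frame_depth ();
  assert (curfram__.nbvar == 1 && resv == NULL);  /* slots start as "nil" */
  if (num_p == NULL || *num_p <= 0)
    goto end;              /* not "return": the frame must still be popped */
  resv = num_p;
end:
  TOY_EXITFRAME ();
  return resv;
#undef resv
}
```

Whichever branch is taken, toy_frame_depth() is back to its previous value once the function returns; an early return before TOY_EXITFRAME() would leave toy_topframe pointing at a dead C stack frame.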

More generally, every C function which may (directly or in any deeply called function) trigger the MELT garbage collector should follow these rules:

avoid coding in C. The whole purpose of MELT is to make coding more fun.
make an explicit MELT frame and enter it. The C routine should start by making a frame, usually with the MELT_ENTERFRAME macro (which expands to a C declaration followed by some C statements, so it should be the last declaration-like construct in your function). For readability, you may want to define C macros (conventionally ending with v) to access the local values in your frame instead of writing curfram__.varptr[index].
put every value in the MELT frame. This means that every value should be kept in a local inside the MELT frame, accessed thru curfram__. In particular, nesting function calls is prohibited; never code f(g(x)) if g may trigger a MELT garbage collection; use a local value for g(x) instead, and avoid declaring any MELT value as a C local.
try to code safely. Unless you have specific reasons to avoid that, try to test MELT values before using them.
notify on MELT updates. When a MELT value is updated by changing some MELT pointer inside it, you have to notify the garbage collector (write barrier) using the meltgc_touch function (taking as argument the modified MELT value) or the meltgc_touch_dest function (which is also given the new MELT pointer stored inside). These functions have to be called just after writing the MELT pointer into the data. They can call the MELT garbage collector (which may change any local value in the MELT frame).
allocate MELT data appropriately. Use meltgc_allocate, or preferably some existing allocating function (like meltgc_new_*) to allocate new MELT values. Never forget that such an allocation may trigger the MELT GC and change every local pointer in the current MELT frame curfram__. Most C functions which may directly or indirectly trigger a MELT garbage collection are prefixed with meltgc (but melt_apply could also trigger that).
don’t use longjmp, because longjmp won’t pop the MELT frames.
always exit the MELT frame explicitly using the MELT_EXITFRAME() macro, which usually is the last statement of your function (so avoid return-ing before it; always use a goto end instead).
avoid using global MELT values. If you really need some, use the MELTGOB or MELTG macros. Adding additional MELT globals is tricky (edit files ‘melt.h’ and ‘warmelt-normal.melt’). Using existing MELT globals is simpler, e.g. MELTGOB(DISCR_LIST) to fetch the predefined discriminant DISCR_LIST.
apply MELT functions and send MELT messages using melt_apply and meltgc_send with their peculiar calling conventions (a constant string describing an array of unions).
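
To see why the rules above insist on keeping every value in a slot known to the collector, consider the following self-contained toy. It is a sketch under loud assumptions: none of the toy_ names belong to MELT, and the collector is a simple two-semispace copier that runs on every allocation (chosen for brevity), not MELT's generational collector with its nursery. What it has in common with MELT is only that an allocation may move live values and that only registered slots are updated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names belong to the MELT runtime. */
struct toy_box { long val; };

#define TOY_SPACE 4
static struct toy_box toy_spaces[2][TOY_SPACE];  /* two semi-spaces */
static int toy_cur = 0;                          /* active semi-space */
static size_t toy_used = 0;                      /* boxes used in it */

static struct toy_box **toy_roots[TOY_SPACE];    /* registered slots */
static size_t toy_nroots = 0;

void toy_push_root (struct toy_box **slot)
{
  assert (toy_nroots < TOY_SPACE);
  toy_roots[toy_nroots++] = slot;
}

void toy_pop_root (void)
{
  assert (toy_nroots > 0);
  toy_nroots--;
}

/* Copy each box referenced by a registered slot into the other
   semi-space and update the slot.  Simplification: no forwarding
   pointers, so two slots must not refer to the same box. */
void toy_collect (void)
{
  int to = 1 - toy_cur;
  size_t n = 0;
  for (size_t i = 0; i < toy_nroots; i++)
    if (*toy_roots[i] != NULL)
      {
        toy_spaces[to][n] = **toy_roots[i];
        *toy_roots[i] = &toy_spaces[to][n];
        n++;
      }
  /* Wipe the old space so that stale pointers are easy to notice. */
  memset (toy_spaces[toy_cur], 0, sizeof toy_spaces[toy_cur]);
  toy_cur = to;
  toy_used = n;
}

/* Allocate a box; deliberately collects first, so that every
   allocation moves all live boxes (the worst case). */
struct toy_box *toy_new_box (long val)
{
  toy_collect ();
  if (toy_used >= TOY_SPACE)
    return NULL;
  struct toy_box *box = &toy_spaces[toy_cur][toy_used++];
  box->val = val;
  return box;
}

struct toy_demo_result { long via_slot; long via_raw; int moved; };

/* Keep one pointer in a registered slot and one in a plain C local,
   allocate again, and report what each of them sees afterwards. */
struct toy_demo_result toy_demo (void)
{
  struct toy_demo_result res;
  struct toy_box *firstv = NULL;   /* registered, like a frame slot */
  toy_push_root (&firstv);
  firstv = toy_new_box (11);
  struct toy_box *raw = firstv;    /* unregistered copy: the mistake */
  (void) toy_new_box (22);         /* this allocation moves the first box */
  res.moved = (raw != firstv);
  res.via_slot = firstv->val;      /* the slot was updated by the collector */
  res.via_raw = raw->val;          /* reads the wiped old copy */
  toy_pop_root ();
  return res;
}
```

After the second allocation the registered slot still yields 11, while the plain C local points into the wiped old space. In real MELT code the stale pointer would not read as a tidy zero; it would refer to memory that is no longer a valid value.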

This document was generated by Basile Starynkevitch on July 27, 2015 using texi2html 1.82.