We introduce metaprogramming in a completely informal way, and sketch out a theory of it. We explain why it is a major stake for computing today, by considering the processes underlying software development. We show, from the same perspective, how metaprogramming is related to another challenge of computing, the free availability of the sources of software, and how these two phenomena naturally complement each other.
In the primitive settings of early computing, the size of machines was so small that a one man’s mind could fully embrace the whole of a program’s activity down to the least details. Now, the human faculties haven’t evolved since, whereas the capacities of computers have increased at a sustained geometric progression. To benefit from the technological advance, ever more abstract conceptual tools and programming languages had to be developed and used.
But however great the progress made by programmers as individuals, the level of complexity has been long exceeded beyond which no one can conceive the entirety of a program that would take full advantage of existing computers. It is therefore of utmost importance to develop methods of exchange, cooperation, accumulation, construction, so as to elaborate complex programs; such is the field of software engineering [Brooks1995].
Now, central to any process of building software is manipulation of the source code of programs. To enhance these processes, to ease the task of the programmer, is to relieve the programmer from all repetitive and conceptually redundant operations, so that he may focus on the essence of programming, that is, on the problems that have never been solved yet. This means manipulation of code must be automated as much as possible, by having machines handle reliably all the inferior tasks that do not pose theoretical problems anymore. And to automate programming is by definition metaprogramming.
Thus, metaprogramming, the art of programming programs that read, transform, or write other programs, appears naturally in the chain of software development, where it plays an essential role, be it “only” under the form of compilers, interpreters, debuggers. However, metaprogramming is almost never consciously integrated in the processes of development, and gaining awareness of its role is precisely the first step towards progress in this domain.
Now, even if we would like to focus on the technical task, which is sufficiently hard in itself, consisting in the exploration of automation methods for the software development process, it is impossible for us to ignore the precondition necessary to the use of any such method as well as to any cooperative or incremental work: the availability of sources.
This availability is less than ever a technical problem, thanks to the advent of digital telecommunications; but it is more than ever a political problem, at a time when the near-totality of information on the planet is under control of publishing monopolies organized in powerful lobbies. The Free Software movement fights for the free availability of source code, nay, information in general; it opposes the artificial tollgates that the legal priviledges of “intellectual property” are.
Metaprogramming and free availability of sources, such are the two major challenges, intimately related, that computing has to face today, with a growing importance. Both find the same cybernetical1 justification in the necessity to adapt the software development process to programs that are more and more elaborate, that concern a wider and wider public, where everyone can only bring a small contribution each. At least, such is our conviction, that we’ll try to share with you.
It is possible to see the impact of metaprogramming on traditional development processes, by sorting existing programs as transformational processes with inputs and outputs, and considering specifically those of these inputs and outputs that can be categorized (with some arbitrariness) as “code”, as opposed to “data”. We’ll note n → p the sort of programs n of whose inputs are “code”, and p of whose outputs are “code”. Then, here are a few examples of programs whose sort range in the simplest of those rough categories.
We could go on toward arbitrarily sophisticated examples, or simply combine the above examples. We could also refine this type system to consider, beyond the mere arity of programs, the type of their arguments and results (and so on, recursively), with criteria more precise than that of being or not “code” (with each chunk of code being associated the language in which it is written), etc. These refinements are left as an exercise to the reader.
By reading of the above example list, one can’t help but notice that in every listed sort exists pieces of software that are actually used daily, at a smaller or larger scale, by programmers world-wide. Thus, metaprogramming already exists, and proves its utility every day.
However, inspection of the same list leads to one’s remarking that most of these metaprograms are written in an ad-hoc manner, and a posteriori, according to immediate needs, without any general framework to develop them (though with a disconnected collection of tools), and worse even, without any framework to think them. Certainly, metaprogramming is present, but few are those who are aware of it, and fewer are those who actively use it [Pitrat1990]; the tradition in computing seems more prone to create particular cases for every of the set out concepts than to put them together through a useful and systematic theory.
Metaprogramming isn’t particularly complicated idea to formalize, once you decide to do it; actually, all the underlying ideas already exist, and await to be gathered into a coherent whole. Below is shortly sketched out such a theory, accessible to anyone with a basic culture in theoretical computer science or in mathematical logic. We invite other readers to read ahead, and skip without regret the paragraphs that be too hard to them. As for those who’d be interested in more technical details rather than less, we’re preparing another, more formal, article, for them.
First, we need to define the abstract notion of a computer langage. After examining the properties we expect from such notion, we can identify it to the notion of abstract type of data structures, or to that of classes of syntax trees, or even to that of “abstract algebra”. To achieve that, we may start from the notion of inductive structure (or “initial” structure) freely generated by a signature (or abstract grammar), that is, constituted of the terms that can be written from the constructors given in the signature. To every such structure is associated a logical theory wherein the fact that the structure be freely generated corresponds to a schema of inductive reasoning (proof by induction). From freely generated structure, we can obtain new, more “constrained”, structures in two ways: firstly, by adding well-formedness requirements on terms (sorting, typing, lexical rules, etc), which is the same as restricting the considered terms to only part of the original structure; and secondly, by adding equality relations between terms (for instance, commutativity or associativity of some operations), which is the same as quotienting the structure by an equivalence relation. For purposes of relevance, we make sure that the constructors and constraints that define the source language satisfy some strong enough criterion of effectiveness: computability, primitive recursivity, computability in polynomial time, etc, so that the synthesis, the analysis, the comparison or the copy of terms be operations effectively simple to implement.
Then, we can define the notion of a mapping between langages as a binary relation between the elements of a language and those of the other language. We thus introduce the notion of a semantics, as a mapping between a “concrete” language and another “abstract” language, that maps every “signifier” term in the concrete language with its possible “signified” terms in the abstract language. We’ll be especially interested in observational semantics that give as an abstract signification to concrete terms the set of valid answers to a set of “relevant” observations on these terms. We’ll also introduce effectiveness criteria on mappings between languages and on semantical relations; for instance, in the most general case of observational semantics, we’ll require every observation to be semi-computable.
A programming langage is then the data of a source language and a semantics that relates this source language to a set of observable behaviors.
By describing these mappings between languages themselves with programming languages, we obtain the notion of metalanguage, whose terms are metaprograms. We’ll be able to assert that some metaprograms translate faithfully a language into another if they preserve the semantics of terms (that is if some diagram commutes). It is thus possible to express various criteria of correctness for metaprograms, to begin with interpreters and compilers. With appropriate effectiveness criteria, we can thus use such a framework to express not only computations, but also logical reasoning on programs.
Finally, when a metaprogramming framework as described above is able to represent itself, it is said to be reflective. Such a framework allows not to have to introduce a new metalanguage (with all associated axioms) each time we want to explore the behavior of the system in a deeper “meta” direction; the distinction between code and data is thus blurred, which leads to a more integrated vision of software development 2.
If we investigate existing applications of metaprogramming, we realize that it is used to automatically manage the transition between several different aspects of same computational objects: for instance, when one compiles a program, one is interested in the “same” program, under different forms (source or object code), each suited to its own set of tasks (human modification, or machine execution).
Conceptually, we consider a “same” program, a “same” object, a “same” idea, independently of the various representations through which we communicate and manipulate them; one form as well as one other can be used to describe the whole of the “interesting” properties of an object. Nevertheless, whatever the chosen representation, there must still be one, and considering the effectiveness criteria imposed by physical and economical constraints, one would better chose a representation “adapted” to implementing as efficiently as possible the manipulations expected to be effected on the considered object.
Now, the “interesting” properties of an object will inevitably vary through time and space, depending on the persons who work with the object, on the machine components that handle it, on the field of activity, the expectations, the proficiencies of ones, the usage goals and the technical constraints of others. To provide each program and each user at each moment with a representation of the considered objects that be as adapted as possible to his current interests, there needs be metaprograms that enforce consistence in time and space between the various aspects of these objects.
In this way, a plane will be for a engineer designing the fuselage, a set of curves and equations of which some parameters must be optimized, for a component manufacturer, it will be a schedule of conditions, for the builder, it will be an assembly process, for the maintenance officer, a set of tasks to perform, for the restoration engineer, a set of problems to fix, for the hardware manager, a set of spare parts, for the pilot, a ship to take to destination safe and sound, for the passenger, a disagreement to minimize to get to another place, for the traffic controller, a dot to route among many others, for the commercial clerk, a set of seats to fill, for the commercial strategist, an element of a float to deploy, for the director of human resource, people to manage, for the accountant, a historic of transactions, for the insurance agent, a risk to evaluate, etc, etc. The computerization of some or the whole of the management of the life of a airplane implies providing each actor in presence with a point of view of the plane that suits his needs, that will contain parameters most of which are only useful to him, but some of which also involve other actors, with whom it is essential that some agreements be found, that data be synchronized. Metaprogramming enables to handle this synchronization, this consistency between the points of views of multiple actors [Kiczales et al.1997].
Even for the computing projects that are simpler than the complete management of a air float, there necessarily appears a divergence of point of views, as soon as multiple actors (programmers, users) or in presence, and/or as soon as these actors evolve and that their points of focus change with time (a same person can take several roles in succession). As soon as a problem is complex enough, no one can embrace at once all the aspects of it, and it is necessary to treat them separately. Hence, metaprogramming is useful to enforce consistency between these aspects, that else would have to be managed manually; it allows to handle once and for all an uninteresting aspect, so as to forget it afterwards.
We have presented metaprogramming as a very useful technique (and a very used one, despite lack of awareness of it) to solve some problems. The question is thus to determine whether this technique be original, or would only be a combination of well-known existing techniques (which combination it would then be interesting to detail), whether or not it brings new solutions to known or previously unsolved problems, whether, in the end, it be an essential and indispensible technique, or on the contrary it would be but accessory.
This question is actually that of the expressiveness of programming languages, as algebras allowing to combine multiple techniques: what makes a language more expressive than another? what enables the programmer better or worse to solve his problems? And, to begin with, what are these problems, how can they and their solutions be characterized, compared?
The fundamental result concerning the expressive power of computation systems is the one by Turing [Turing1936], that shows the existence of a class of functions, said to be computable, able to perform any thinkable mechanical computation. Given a set of possible inputs (questions) and a set of possible outputs (answers), both digitizable (encodable with, for instance, natural integers), any mechanically implementable programming language can express at most as many functions from the input set to the output set as a Turing machine. A programming language that can express all computable functions from one set to the other is said to be universal, or Turing-equivalent (assuming both sets are infinite). All such languages are as powerful as all others, from the point of view of the functions they can express.
Nevertheless, it is obvious that not all Turing-equivalent languages are worth the same from the point of view of the programmer: all those who made the respective experiences will agree that it is easier to program in a high-level language (like LISP) than in a low-level language (like C), which in turn is easier to use than assembly language, which is better than binary code, which is simpler than the transistor-per-transistor design of a dedicated electronic circuit, or the specification of a Turing machine. Actually, Turing’s result is but the very beginning of a theory of the expressive power of computing systems, and certainly not the end of it. To stop with this mere result, and say that “since all languages, in practice, are not equivalent to each other as they are according to Turing, then theory cannot say anything”, would be to abdicate one’s reason and to look in a vague nowhere for an unspeakable explanation, it would be giving superstition and ignorance a gratuitous and preposterous dignity.
Before we go further, let us remark that the very demonstration of Turing’s result is based on metaprogramming: universal languages are equivalent one to the other in as much as it is always possible to write in one an interpreter for any other one, so that any program in the other language can find, through the use of a translator, an equivalent in the starting language. Besides, this is precisely the reason why these languages are called universal. Now, metaprogramming is a programming style ignored by all the software development methods as applied to most languages3, for it consists precisely in not using the studied language, but another language, through metaprograms. We may thus wonder how can this reluctance can be formalized, and what will then remain of the equivalence between universal languages.
There unhappily, doesn’t exist any fully satisfying theory of expressiveness yet; the only recent work that we could find on the subject [Felleisen1991], very interesting as it is, limits its ambition to the relative macro-expressibility of extensions of a same language one into the other. However, we do have some scattered elements of a general theory, that we think are largely enough first to justify the rather “intuitive” idea that metaprogramming brings an increase of expressiveness, and then to refine or reject a few common assertions heard about the “excessive” expressiveness of some languages.
The key point on which the notion of computability fails to express(!) the expressive power of programming languages, is that software development cannot be summed up as the writing of a one perfect monolithic program that either succeeds or fails to solve a given static problem, but is instead a dynamic and evolving process, in which man and machine interact in a more or less elaborate way.
Thus, for any given program to write on a given machine, the truly optimal solution can only be expressed in binary code, and contains lots of nifty “hacks”, of encodings based on fortunate coincidences, of “puns” on the encoding of data, addresses, and machine instructions. But the fact is that finding such a solution would demand gigantic efforts, and the advent of new hardware architectures will make all these pun optimization obsolete. Now the human forces spent during the development are important; there are even essential, since the ultimate purpose of a computer, like that of any tool, is to minimize the amount of human efforts required for the completion of ever more advanced tasks. The relative cost of human and mechanical resources may have been such that, in the 50’s, the hand-writing of super-optimized binary code was economically feasible (see the story of Mel, exerpt from the Jargon File [Raymond1996]); such thing is impossible today.
Thus, the problem is not only technical, it is also economical, moral, and political, in as much as it involves shifting of human efforts. In the development of computer programs, as anywhere else, the process matters. And it is this very process that the popular notions of code reuse, modularity, dynamic or incremental programming, development methods, etc, attempt to enhance, though without a formal rational approach. Similarly, the trend in computer science to leave too low-level languages in favor of higher-level languages is precisely due to human intelligence being a limited resource, a scarce one indeed, that has to be saved for the most important tasks, the ones where it is indispensible (if not forever, at least for now).
Therefore, a satisfying modelling of the expressiveness of programming languages, even though it be abstract enough not to overly depend on ephemeral technological concerns, must take into account the man-computer interaction in a way that includes a notion of human cost (and perhaps also a notions of error and confidence). We suspect such a modelling to be possible using the theory of games.
Though we lack a consistent algebraic theorization of expressiveness, at least we can give the following first intuitive approximation , taking into account a dead simple notion of human cost: a programming system will be more adapted than another to a given usage pattern if it requires “on the long run” less human interaction to solve in succession an indefinite series of problems within the given domain. The usual approximate “metrics” of production can be used (number of lines of code, or abstract syntactic constructs, or of symbols having been typed, number of man-months spent). This approximation provides only a gross characterization of expressiveness, and doesn’t allow to define an efficient programming style, but suffices to justify use of metaprogramming.
In the traditional development process, that excludes metaprogramming, the near totality of the code constituting software is hand-typed by humans; the details of execution, the overall behavior, everything must follow directly from human thought; every stage of the development, every modification is the work of man. Now, the catch is that sooner or later, some new functionalities to add to a software project will require global architectural changes of some depth: all the global modifications of the program will have to be performed by hand, with all the inconsistency problems unwillingly introduced by these changes, that induce numerous bugs and cost lots of iterations of development (for a publicly documented such phenomenon, see the occasional API changes in the Linux kernel; for a spectacular example, see the explosive bug of the first Ariane V rocket). The alternative to these modifications is a complete reimplementation, whose cost is that of a rewrite from scratch. All in all, the incremental cost of traditional development is proportional to the size of code that differs between two iterations of the process.
Let us now make metaprogramming part of the development cycle. Since metaprogramming allows arbitrary processing of code by the computers themselves, it enables the programmer who has to face a structural change in his program to perform semi-automatically any task of instrumentation or transformation of his program, so as to make it conform the new architecture, to check his new code with respect to declared invariants, to enforce consistency of various parts of his program. “Semi-automatically” here means that the simplest “metaprogram” to handle a few particular cases sometimes happens to consist in manually specifying them case-by-case. In brief, the incremental cost of software development between successive iterations of process is proportional not to the size of their difference in code, but to the size of the smallest metaprogram that allows to transform the previous state of the project into the new one, that is, to the conditional Kolmogorov complexity [Li and Vitanyi1998] of the next step relatively to the previous one!
In the worst case, metaprogramming won’t have served, and the cost will be exactly the same (up to a negligible additive constant) as that of traditional programming4; theory tells us that, by the very definition of the concept of randomness, this means that the project has evolved in a completely random way with respect to the existing code base. In the best case, there can be arbitrary gains. In general, metaprogramming allows for optimal cost with respect to complexity; its gains as compared to the traditional approach are proportional to the size of the project in the inevitable case where structural changes with global effects are necessary.
Thus, the theory of Kolmogorov complexity allows us to see how metaprogramming always win, and most often quite a lot, over non-meta programming, by optimizing the sharing of information between iterations of the development process. Besides, it is clear how the tiny doses of metaprogramming used in traditional software projects are a determining factor for the continuation of these projects: choice, often static, of a development environment; filtering of programs with static analyzers, checkers of more or less sophisticated invariants, or other kinds of lint; use of editors that manage the indentation and structure of programs, that they can colorize and fontify; occasional use of search-and-replace tools on the source code; in the case of more daring hackers, “macros” and “scripts” written in elisp, perl or other languages, to edit and manipulate sources.
However, and this will prove very important in a further part of this essay, metaprogramming wins if and only if on the one hand an infrastructural effort is spent to allow easy manipulation of programs, and only if on the other hand there is enough redundancy between successive developments to allow some non-trivial sharing of information accross time; the benefit enabled by metaprogramming is thus asymptotic: it is low to begin with, but gets better quickly with the size of programs and the number of development iterations, that, with the space-time extension of software development projects.
Let us admit the interest of of metaprogramming techniques. Why are they not more commonly known and consciously used? Just what is it that has held up their spreading more widely?
We personally find it obvious that barriers to the distribution of sources are as many brakes to the development of metaprogramming. In fact, the very condition for the use of a program-reading metaprogram is the availability of a program to read; the condition of usefulness of a program-writing metaprogram is that the output program may be distributed and used; and these conditions combine when a metaprogram at the same time reads and writes programs, and even more when the metaprogram depends on the long term accumulation of knowledge about programs! Every limitation on the manipulation of programs is as much of a limitation on the feasability or the utility of metaprograms, and a discourages as much their potential authors.
Now what are the implications of the current regime of “intellectual property” on software?
Firstly, there is the so called “authorship” right, which is actually a publisher’s right, since every author abandons all his rights to the corporation that salaries or publishes him (unless he becomes his own publisher; but then, it will be not as an author, but as a publisher, that he will earn money, which doesn’t remove anything to the problem). To respect this “right”, a metaprogram must refuse to make some inferences, some manipulations, that would violate (according to the local country’s legislation) the license under which the input program is available. Which manipulations exactly are forbidden isn’t clear.
For instance, some licenses are per-user, which forbids to a metaprogram any inference that will be useful to two users (or to a non-authorized user) for unclear and varying notions of “useful” and “user” (what happens when two people collaborate on a same document? or when one supports the another one or serves him as a secretary? what if the final result of the computation is broadcast to millions of persons?). When they are per-machine, or per-simultaneous-user, the problem gets hairier, as the notions invoked by these licenses lose any effectiveness in the presence of time-shared machines, of multiprocessors, of clusters of machines, or worse, of faster machines: no software parameter can correspond to these notions, and even less with software architecture emulators (e.g. 680x0 or x86 emulators), and worse even if these emulators use techniques of analysis and translation of code (themselves prohibited by many licenses) that completely set apart any precise correspondance between on the one hand the “abstract” machine for which the code is written and that the license expects, and on the other hand the “concrete” machine on which the program runs. The notion of user is also blurred by metaprograms that manipulate more than one program at once, and will become even more so with the hoped-for advent of “artificial intelligence”. All these problems can be avoided at the enormous cost of never using as input of metaprograms but software and information that are free of rights (aka free software and free information), which, in a strictly legal setting, would require anyone who introduces any original information to sign a form (or check a button in a menu?).
A trickier problem, that isn’t solved by this measure, is distribution of the results of metaprograms. For, even with free software licenses, that can be mutually incompatible, and even more with proprietary licenses, some operations are allowed, but only under condition of non-distribution of results (“private use”). But what exactly mustn’t be transmitted, when a metaprogram included in its database a synthesis of the analyzes of numerous programs? Where does redistribution start? Does it cover exchange between people in a same company, or a same moral entity? What if this entity is an association that admits anyone as a member? Does it apply to machines? What if this machine is a cluster covering several ones? What if this cluster covers the entire world?
The most refined is the existence of patents. These affect metaprograms universally, independently from the base material being used and their licenses. Patents do not introduce any structural rules that forbid a logical inference in a metaprogram; rather, if at a given moment, the metaprogram entails the use of some algorithms for some goals, then the inference that induced this result must be retroactively invalidated, least some license be obtained from the patent holder. But since the notions of algorithm and goal are quite fuzzy, it would require quite an elaborate meta-meta-program to monitor that the metaprogram never infringes any of the millions of registered patents.
If all the above rules concerning the use of metaprograms seem ridiculously absurd, do not forget that after all, though “artificial intelligences” differ greatly in their current quality from “natural intelligences”, they are based on the same principles, and the same constraints. Hence, be aware that the most widespread metaprogram in the world, that is the human mind, is the first to be compelled to these absurd constraints, by the same absurd laws!