-----------------------------------------------------------------------
This text is about programming with classes and objects, but without subclasses. It will first introduce a very general manner in which to combine independent classes, and then show how a usable programming language can be built upon this model. To do so, the first two sections will each describe one basic semantic rule for the language; subsequent sections will show how these two rules allow for a novel programming style, as well as for expressing mixins and templates, and emulating good old subclassing.

For the examples we will use the notation of the programming language Tingle, which was designed to incarnate the ideas in this text. A description of features of Tingle not relevant to this discussion can be found on http://dirk.rave.org/tingle/.
-----------------------------------------------------------------------

I The Predominant Method

Let us define a simple class Human as follows:

    class Human {
        method play_tetris() { ... }
    }

Now in languages with subclassing, to derive a subclass Woman, we would, using some specialized syntax like "extends", reopen the scope of Human, and embed the class Woman inside it, so that the methods and data of the class Human would be visible to any newly added methods for Woman. Abusing Tingle syntax, as the language does not allow subclasses, we would write something like:

    class Human {
        class Woman {
            method buy_more_shoes() { ... }
            method give_birth() { ... }
        }
    }

The two new methods are able to use the Human method play_tetris(), which is in an enclosing scope (for instance to while away the time until a shoe salesperson is available).

In our language without subclassing, the above code is syntactically valid; but it actually means that the classes Human and Woman overlap, and that the two new methods exist in their intersection, available only to objects that are both Human and Woman.
As all women are by definition also human, the class Woman and the intersection coincide. Maybe we can make the example more interesting by making it about the intersection of Human and Female instead:

    class Human {
        class Female {
            method buy_more_shoes() { ... }
            method give_birth() { ... }
        }
    }

    class Woman = Human Female

We now have two independent classes, Human and Female, neither of which is a super- or a subclass of the other; and thus we see that we need a separate composite class Woman, for objects that are both Human and Female.

As order is irrelevant for set intersection, we can change the order of nesting:

    class Female {
        class Human {
            method buy_more_shoes() { ... }
            method give_birth() { ... }
        }
    }

Now we can express something that we could not express earlier: the fact that give_birth() is typically female, but buy_more_shoes() only applies to human females:

    class Female {
        method give_birth() { ... }
        class Human {
            method buy_more_shoes() { ... }
        }
    }

As might be expected, a more specialized give_birth() for human females can be added in the intersection.

If we created an object of the class Female without composing it with Human, it would provide the basic method give_birth(), and no buy_more_shoes(). If we create an object of the class Woman, on the other hand, the most specialized version of every method will be available. Methods in the intersection of Female and Human will predominate over methods with the same name in the composing classes.

For a Woman named mary, mary.method() will refer to a method in the intersection if a method with the right name is available there; if not, it will refer to a method in either Human or Female. As there is no hierarchic relationship between Human and Female, when both the classes Human and Female define a method and the intersection does not resolve the conflict, neither version of the method is predominant, and it is an error to try to use it.
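The lookup rule can be made concrete with a small model. What follows is a minimal Python sketch, not Tingle and not a description of how Tingle is implemented. It assumes that each method definition is recorded together with the set of classes whose intersection holds it, and that "more specialized" can be read as "defined in an intersection of strictly more classes". The names predominant, DEFS, WOMAN and COW, the labels, and the method sleep() are hypothetical and exist only for this illustration.

```python
from typing import FrozenSet, List, Optional, Tuple

# (method name, classes whose intersection holds the definition, label)
Definition = Tuple[str, FrozenSet[str], str]


def predominant(defs: List[Definition], composite: FrozenSet[str],
                name: str) -> Optional[str]:
    """Return the label of the predominant method, or None if there is none."""
    # Only definitions whose classes are all part of the composite take part.
    candidates = [d for d in defs if d[0] == name and d[1] <= composite]
    for i, cand in enumerate(candidates):
        # cand predominates if its intersection strictly contains
        # the intersection of every other candidate.
        if all(i == j or cand[1] > other[1]
               for j, other in enumerate(candidates)):
            return cand[2]
    return None  # no definition at all, or an unresolved conflict


# Hypothetical encoding of the running example; sleep() is invented
# here to show a conflict that no intersection resolves.
DEFS: List[Definition] = [
    ("play_tetris", frozenset({"Human"}), "Human.play_tetris"),
    ("give_birth", frozenset({"Female"}), "Female.give_birth"),
    ("give_birth", frozenset({"Female", "Human"}), "Female+Human.give_birth"),
    ("buy_more_shoes", frozenset({"Female", "Human"}),
     "Female+Human.buy_more_shoes"),
    ("sleep", frozenset({"Human"}), "Human.sleep"),
    ("sleep", frozenset({"Female"}), "Female.sleep"),
]

WOMAN = frozenset({"Human", "Female"})
COW = frozenset({"Female", "Bovine"})
```

In this model, predominant(DEFS, WOMAN, "give_birth") picks the version in the intersection, predominant(DEFS, COW, "give_birth") falls back on the plain Female version because the intersection with Human does not take part, and predominant(DEFS, WOMAN, "sleep") yields None, which stands for the error case.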
For a composite class that combines more than two classes in which the same method name occurs several times, the predominant method, if it exists, is the unique method that occurs in the intersection where all the classes and all the intersections that define that name overlap. Or to put it differently: specialization is a partial order among methods; if there is one version that specializes all the others, that one is the predominant method. If not, there is none.

Now let us reuse the class Female:

    class Bovine {
        method look_at_passing_trains() { ... }
    }

    class Cow = Female Bovine

For a Cow, only the part of Female outside of the intersection with Human is relevant: the intersection with Human is simply ignored in composing this new class. The intersection of Bovine and Female on the other hand is still empty, so let us now add a method moo():

    class Bovine {
        method look_at_passing_trains() { ... }
        class Female {
            method moo() { ... }
        }
    }

The method moo() characterizes Bovine rather than Female -- I personally associate the concept of mooing with bovines rather than with femininity -- and that is why I add moo() to the definition of Bovine, further nested in Female, rather than the other way around. In section II we will see that the order of nesting, which is irrelevant for determining the predominant method, is nevertheless relevant for scoping, so in many cases the order of nesting will be chosen with scoping in mind.

The class Female was reopened for this example: classes can be reopened as often as needed, as an outer nesting as well as nested in others. On the other hand, I did add the new code to the Bovine-block I wrote earlier, instead of reopening Bovine, but only because the two methods seem to go well together thematically.
When you arrange methods thematically by reopening classes several times, tools can easily be provided, for instance as editor plugins, to view the code by complete classes, or to see all variables in a certain scope together, when this helps understanding. Tools for the opposite direction would be much harder to conceive.

II Scoping

The order of nesting of classes is irrelevant to the identity of predominant methods; but it remains the foundation of scoping, and so determines the internal structure of composite classes.

Methods in an intersection can access only variables from classes that make up the intersection. There are no free variables that get caught at composition time: the order of nesting around a method definition determines locally which variable an identifier refers to in that method. The same goes for method lookup, when methods are called from the active object -- that is, as a function call, as in "method()", not by sending a message, as in "mary.method()".

    class Female {
        data children
    }

    class Bovine {
        class Female {
            method moo() { if (children > 3) ... }
        }
    }

In the body of moo(), the inner scope is formed by the intersection of Female and Bovine; the next scope by Female only; and the next by Bovine only. The variable children in the class Female is not in the same definition, but it is in scope. (Here code browser plugins for the editor as mentioned before would be useful.)

If the variable children had not been declared in Female, but in Human, moo() would not work, not even for the goddess Isis:

    class Isis = Female Bovine Human

The method moo() does not see the variable children in the class Human, although it is present in the composite class Isis.

Using the notation class::variable the programmer can explicitly override the scoping order, but not refer to a class outside of the scope.
In moo() the distinction between Female::children and Bovine::children could be made, but trying to use Human::children would be an error, as Human is not in scope. To explicitly refer to a variable or method in an intersection, use constructions like Female::Bovine::moo(). (Order is unimportant.)

In our moo(), super.moo() will refer to Female::moo() if that exists; if not, Bovine::moo() is tried. Super calls are function calls, not messages.

For methods defined only two levels deep, I think everyone except the most quarrelsome will accept this scoping rule as the obvious and "correct" one. The complete rule as currently used in Tingle, however, is only one of the possible generalizations of the more obvious one for two levels. It is rather complex: it was chosen because it ensures that intersections always have precedence over their separate classes, and also that inner scopes have precedence over outer ones.

    class A {
      class B {
        ...
          class Y {
            class Z {
              method myMethod() { print(v) }
    ... }

The variable v will first be looked for in the intersection of all classes A through Z (where it would reside if it were declared immediately above myMethod in the source code), then in the smaller intersections B through Z, C through Z etc. When the intersection Z through Z is reached, i.e. the class Z alone, and nothing has been found, then we continue with the intersections A through Y until only Y, then the intersections A through X until only X etc.

This way all continuous intersections (like C+D+E and P+Q+R+S+T) will be searched; intersections are always searched before the single classes that make them up, and also before intersections of only some of the same classes; and inner classes before outer ones.

If you want to use a variable or method that is in a non-continuous intersection, you can use the ::-notation; or change the order of nesting for the current method.
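The search order just described is easy to get wrong by hand, so here is a minimal Python sketch that enumerates it. It is only a model of the rule as stated in the text, not Tingle code, and it assumes that a nesting can be represented as a list of class names from outermost to innermost. The function names scope_search_order and resolve, and the dictionary of declarations, are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence, Set, Tuple


def scope_search_order(nesting: Sequence[str]) -> List[Tuple[str, ...]]:
    """List the continuous intersections in the order they are searched.

    `nesting` runs from the outermost class to the innermost one.
    """
    order: List[Tuple[str, ...]] = []
    # Fix the innermost class of the run first (Z, then Y, then X, ...).
    for end in range(len(nesting), 0, -1):
        # Then shrink the run from the outside: A..Z, B..Z, ..., Z alone.
        for start in range(end):
            order.append(tuple(nesting[start:end]))
    return order


def resolve(nesting: Sequence[str],
            declared: Dict[FrozenSet[str], Set[str]],
            name: str) -> Optional[Tuple[str, ...]]:
    """Return the first intersection in search order that declares `name`."""
    for scope in scope_search_order(nesting):
        if name in declared.get(frozenset(scope), set()):
            return scope
    return None  # not in scope for a method in this nesting


# The moo() example: moo() is nested in Bovine (outer) and Female (inner).
MOO_NESTING = ["Bovine", "Female"]
```

For MOO_NESTING the order is the intersection of Bovine and Female, then Female alone, then Bovine alone, which agrees with the description of moo() earlier in this section. A variable children declared in Female resolves to Female; declared only in Human it resolves to nothing, because Human never appears in the nesting around moo().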
Changing the order of nesting for a single method is possible because, as classes can be reopened as often as desired, each method can have its own nesting order; then again, nested blocks provide visible structure to the program, so they should not be chopped up too much. Some taste might be required.

This might also be the right moment to remark that when you have to think hard about scoping while reading a program, the program is probably badly written and might benefit from more diverse and more descriptive identifiers.

III Programming Style

If the previous section leaves you with the impression that scoping for methods many levels deep will be awfully complex, this section is all about avoiding deep nestings. That does help in understanding programs; however the real reason to avoid nesting wherever possible is to promote reusability. (Every nesting is a dependency.)

We can emulate boring old single inheritance in Tingle:

    class car {
        method start() { ... }
        method stop() { ... }
    }

    class bus {
        class car {
            method bus_stop() { ... stop() ... start() ... }
        }
        // nothing here -> classical single inheritance
    }

    class Car = car
    class Bus = bus car

(There is no such concept as bus-ness that can be combined with the concept "car" to make a bus out of it, so I use "bus" and "Bus" for respectively the base class and the composite class. And "car" versus "Car" is only for symmetry.)

This is all very well, but emulating multiple inheritance next would lead to overly deep nesting: note that we only nest with base classes, not with composite classes. (Composite classes have no internal order; and if they had one, it would probably not be the appropriate one in all circumstances.) A "literal translation" of code with multiple inheritance would therefore be ugly, and its scoping complex:

    class carGUIObject {
        class car {
            method draw() { ... }
        }
    }

    class busGUIObject {
        class bus {
            class carGUIObject {
                class car {
                    method draw() { ... super.draw() ... }
                }
            }
        }
    }

    class CarGUIObject = car carGUIObject
    class BusGUIObject = bus car busGUIObject carGUIObject

There is a better way, with shallow scoping, and all draw() methods snugly together in the source code:

    class GUIdraw {
        class car {
            method draw() { ... }
            class bus {
                method draw() { ... super.draw() ... }
            }
        }
    }

    class CarGUIObject = GUIdraw Car
    class BusGUIObject = GUIdraw Bus

This shows that there is no dominant decomposition by the kind of vehicle, and also that the "diamond problem" does not exist here.

Note that, if we want to add another concern later, we can write this up without taking GUIdraw in scope, when the two concerns are independent of each other:

    class Accounting {
        class car {
            method costs() { ... }
            class bus {
                method costs() { ... }
            }
        }
        class employee {
            ...
        }
    }

Again, scoping is shallow. We can compose all this with:

    class CompanyCar = Accounting GUIdraw Car
    class CompanyBus = Accounting GUIdraw Bus

All these good things come with a trade-off: in a system based on intersecting classes, emulating "subclassing for construction" or "subclassing for combination" is the way madness lies. The relationship between a composite class and its parts should always be "is a" or "is" or something similar: never "has a" or "is a bit like a". (Use an instance variable to express "has a" relationships.)

We conclude this section with a rather more superficial example of programming style: a more or less satisfying attempt to write down what in a traditional object oriented language would be subclasses that are smaller than their superclass:

    class ellipse {
        data colour, axis
        class generic {
            data second_axis
            method area() { ... }
        }
    }

    class circular {
        class ellipse {
            method area() { ... }
        }
    }

    class Ellipse = generic ellipse
    class Circle = circular ellipse

IV Mixins

Note again the contrast between the rule from section I to determine the predominant method in a composite class, and the scoping rule from section II.
The first is for when an object is accessed "from the outside", with "object.methodname()"; the second is for accesses of methods or variables inside an object, with "methodname()" or "variablename" only. ("Object.variablename" is not allowed in Tingle; if it were, it would be the "predominant variable".) Two distinct rules are needed because order of nesting is only relevant for separate methods; when classes are composed, there is no general order of nesting.

"Virtual" function calls, to abuse C++ terminology, as well as mixin behaviour can be obtained by "stepping out of the scope" by using "self.methodname()" instead of "methodname()". Note that this "virtuality" here is a property of the call, not of the method itself.

There is no technical difference between mixins and classes, but the name mixin would typically fit a class without any intersections, which accesses other methods of the same object through "self" so that it can be mixed with any class that provides the necessary methods.

Therefore we recognize the following code snippet as a mixin. It can be mixed with other classes that define isEqual and isLess:

    class OrderMixin {
        method isGreater(x) { ! (self.isEqual(x) || self.isLess(x)) }
        method isUnequal(x) { ! self.isEqual(x) }
    }

Without nesting, the source code of this mixin does not show what classes it should or should not be mixed with, nor what kind of object "self" will then refer to. As a matter of fact, in the dynamically typed language Tingle, you can mix OrderMixin with classes that do not provide isEqual and isLess, and so compose broken code. For a very general mixin like this one, all this is not a real problem; but often templates with clean scoping are preferable to mixins.

V Templates

These are two independent classes, without any intersections:

    class Tapereader {
        method read() { ... }
    }

    class Diskreader {
        method read() { ... }
    }

And this is how we extend them together:

    class Buffered {
        class Tapereader, Diskreader {
            method read() { ... super.read() ... }
        }
    }

    class BufferedTapereader = Buffered Tapereader
    class BufferedDiskreader = Buffered Diskreader

Which is short for:

    class Buffered {
        class Tapereader {
            method read() { ... super.read() ... }
        }
        class Diskreader {
            method read() { ... super.read() ... }
        }
    }

The syntax used in the examples so far does not, however, allow us to express the following as a template:

    class Buffered {
        class Diskreader {
            class Floppy {
                method read() { ... super.read() ... }
            }
        }
        class Tapereader {
            method read() { ... super.read() ... }
        }
    }

So we add an alternative syntax for nestings that only consist of an intersection:

    class Buffered {
        class Diskreader->Floppy {
            method read() { ... super.read() ... }
        }
        class Tapereader {
            method read() { ... super.read() ... }
        }
    }

And then we can write the template:

    class Buffered {
        class Tapereader, Diskreader->Floppy {
            method read() { ... super.read() ... }
        }
    }

Whether every situation will be served well by either mixins or these simple templates, practice will show. When both are possible, templates seem the cleanest choice.

VI Scalability

Although the section on programming style illustrates that nesting of scopes need not become as deep as a classical subclassing programmer would expect, and complexity is thus kept in check, the language as presented up to here does not scale to programming in the large. That is because we have not considered that one would want to reuse composite classes without studying their composing parts.

Suppose that a library or framework exports a class Widget, which is itself composed of ten other classes. We will probably want to use it as a black box, and in blissful ignorance of its internals derive a class MyWidget.
In the system as described up to here, the only way to do this is by writing myWidget as a mixin for Widget; because if we work with intersections, we need to know about the composing classes of Widget to nest in them in a sensible manner. From the viewpoint of a mixin, Widget will indeed be a black box, which only exports its predominant methods: this takes care of encapsulation and a clean API, and so provides scalability. But this is not a very elegant solution, as:

- we have to write "self." all the time;
- we lose coupling: the source code of the mixin does not specify which one specific other class it is meant as a mixin for;
- most importantly, when the mixin and the old class provide methods with the same name, no new predominant method can be added to resolve the conflict, as the classes do not intersect.

Therefore we introduce the possibility of including composite classes in nesting:

    class Widget {
        class myWidget {
            ...
        }
    }

    class MyWidget = Widget myWidget

This will mean that for the scoping rule, Widget will be considered as a class without data members, and with only the predominant methods of the composite. This corresponds with the viewpoint of a mixin, except that:

- methods of Widget can be called without "self.";
- coupling is explicit: if Widget is not part of a new composite class, then the intersection with myWidget will not be either, and no broken code will be present in the system;
- the methods of Widget are less dominant than those in the new intersection with myWidget.

Composite classes are self-contained for scoping: if they are nested themselves, their methods do not suddenly get the enclosing nestings in scope, as they were not defined in them. Composite classes will probably mostly be used as the outermost nesting, though.

Of course, if several composite classes occur in a nesting, they can share data indirectly, if they themselves share composing classes.
(A class cannot occur several times in a composite class, not even if that is composed of other composite classes that share the first as an identical component. Variables of such an identical component are therefore shared -- variables that should be duplicated should be defined in well-chosen intersections.)

The use of composite classes in nesting gives us a means for falling back on good old (single and multiple) inheritance when we need more encapsulation than can be provided in the new, very open style. It should be combined with the use of name spaces to hide the composing parts that we do not want to know about.

VII Short Note on a Semantic Subtlety

Although it may not necessarily be a good idea to do so, you can add code directly into a composite class, that is, as follows:

    class Widget {
        class myWidget {
            ...
        }
        method grok_this() { ... super.method() ... }
    }

Scoping: the method grok_this() is in scope for the intersection of Widget and myWidget, and hides any predominant method with the same name in the class Widget. The body of grok_this() itself also only has access to the predominant methods of Widget. Thus super.method() can only refer to a method outside of the scope Widget, which in the example can only mean a globally defined method.

Predominant method: it is as if grok_this() is in the intersection of all classes out of which Widget itself is composed. One could expect that if a method with the same name already existed in that intersection, it would be overwritten; and that would be a perfectly good semantics. That is not what happens in Tingle though: there it means that two methods with the same name are both present in the same intersection, which of course means that neither can be predominant -- a design decision taken because of language features that are not relevant to the present exposition.

Here also ends the exposition of what is currently implemented.

VIII Public, Protected, Private

Currently all data is protected, i.e. unreachable from outside of the object; and all (predominant) methods are public.

We could add a keyword "protected" to the language to exclude a method, and also all methods dominated by it, from the predominant method lookup.

Access from inside, i.e. via scoping, could be limited by a keyword "private", for data as well as for methods. Adapting its meaning from other languages, this would mean "only accessible from the current intersection". Private does not have to imply protected, as the first is about scoping and the second about predominant method lookup.

The practice of reopening scopes and textually grouping code by concern would suggest the introduction of an extra keyword, let us say "local", which would limit the visibility of data or methods not to a scope, but to a nesting block. Only an experiment will tell whether this is a practical idea.

Note that "protected" should be ignored for self sends, or it would sabotage "virtual calls".

IX Loose Remarks and Future Extensions

The current system does not provide class methods and class variables, but there is no good reason why these could not be added in an analogous way, with the same rules for scoping and composition. The semantics of a class variable of a class that occurs in several compositions remains to be chosen, however: is there one instance for all of them, or one per composite class? In the first case, one could often want to write class methods and variables directly into composed classes, as in section VII.

Modelled after self.method() and super.method(), one could define previous.method(). If a class can be reopened several times, it could also be nice to be able to overwrite a method, and call the old version from the new one with previous.method(). A simple application would be tweaking a framework without touching the source code.
(Changing a method "in place" by reopening the original scope also affects all methods that have the original one in scope, an effect that cannot be achieved in any other way.)

Name spaces can be useful in all kinds of programming languages, but here in particular, to prevent classes from being reopened without the programmer realizing that they already existed. This is extra useful for base classes with very generic names, like indeed the class "generic" from the circles and ellipses example; and also for base classes of composite classes that one uses through nesting with the composite class: the base classes are irrelevant for scoping, but they are still considered individually when determining predominant methods, so they should be available but protected from name clashes.

We could introduce virtual methods as methods for which every call is automatically translated to a self send.

One might be tempted to improve the language by checking all self sends at composition time, or to add a provides-requires-mechanism to the language, to make mixins safe. However, I do not see the sense in adding a little bit of static type checking to a dynamically typed language. I would rather consider going with static typing all the way. (Static typing would probably require that adding more classes to a composite class should never cause conflicts that eliminate predominant methods, but that seems like a reasonable restriction.)

X Vision in Three Paragraphs

In class based languages, class names are usually nouns. The system described above invites the programmer to write many classes as adjectives, or concerns; from these composite classes are constructed, the names of which will again often be nouns.
The second step, composition, means that we need a second mechanism besides scoping to determine which methods are predominant in the composite class; in traditional class based languages these mechanisms can be one and the same, because there is a hierarchy among superclasses and subclasses.

I hope that writing concerns as classes can contribute to untangling code conceptually, albeit at the price of more complex scoping mechanics. A good development environment should help the programmer cope with the latter in a way that it cannot do for the former.

Dirk van Deun, December 2007, April 2008 (dirk@dinf.vub.ac.be)