Class-based object-oriented languages have two fundamental operations - class declaration and object instantiation. Object-based languages (hi, jim!) have only one. The fundamental operation is to copy an existing object, which I'll call "cloning".

Semantics of Cloning

What does it mean to be a "clone" of another object? The new object will have "copies" of all its progenitor's fields and methods. But are these full copies, or merely shared references?

One obvious way to clone an object is to make a shallow copy. This means that when object B is cloned from object A, B will have a snapshot of all of A's members at that point in time. Reassigning members of A will not update B, or vice versa. However, some of the members may be references to complex structures. A and B will (at least initially) share the same reference, so if the contents of such a structure are changed, both A and B will reflect that.

To illustrate:

  >>>A.x    # x is an int
  5
  >>>A.l    # l is a list
  [1, 2, 3]
  >>>B = A.clone()
  >>>A.x = 6    # updates A.x but not B.x
  >>>A.x
  6
  >>>B.x
  5
  >>>B.x = 7    # updates B.x but not A.x
  >>>A.x
  6
  >>>B.x
  7
  >>>B.l.append(4)    # the list referenced by B.l is updated, not B.l itself
  >>>B.l
  [1, 2, 3, 4]
  >>>A.l
  [1, 2, 3, 4]
  >>>B.l = [10, 11, 12]    # reassign B.l
  >>>B.l
  [10, 11, 12]
  >>>A.l
  [1, 2, 3, 4]
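
For comparison, here is a minimal sketch of shallow-copy cloning in present-day Python 3. The Obj class and its clone() method are made up for illustration (they are not from any library); the only real API used is the standard copy.copy, which duplicates an object's member table without duplicating the values it refers to.

```python
import copy


class Obj:
    """Hypothetical prototype object, for illustration only."""

    def __init__(self, **members):
        self.__dict__.update(members)

    def clone(self):
        # copy.copy gives the new object its own member table,
        # but the values in that table are still shared references.
        return copy.copy(self)


A = Obj(x=5, l=[1, 2, 3])
B = A.clone()

A.x = 6                # rebinding a member affects only A
B.x = 7                # ... and rebinding on B affects only B
B.l.append(4)          # mutating the shared list shows through both
shared = A.l is B.l    # still the same list object at this point
B.l = [10, 11, 12]     # rebinding B.l ends the sharing
```

After this runs, A.l still holds the list with the appended 4, while B.l is the new list.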

Another method is to delegate. B will get an internal pointer to A, and any member access which it can't handle directly will be delegated to A. This means that updates to A will affect B, but updates to B will not affect A. The most straightforward way to handle the contents of structures stored by reference is as in a shallow copy, but it is possible that the system could notice when B is updating one of A's structures and make a copy of it instead. Note that this is very similar to inheritance; in fact, this may be how the system provides inheritance.

  >>>A.x
  5
  >>>B = A.clone()
  >>>B.x    # falls through to A.x
  5
  >>>A.x = 6    # updates A.x; B.x still falls through
  >>>A.x
  6
  >>>B.x
  6
  >>>B.x = 7    # assigns B.x for the first time
  >>>A.x
  6
  >>>B.x
  7
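
A rough sketch of the same behaviour in Python 3, assuming a hypothetical Proto class. It relies on the fact that Python calls __getattr__ only after ordinary lookup on the object has failed, which is exactly the "fall through" described above. It makes no attempt at the copy-on-update refinement for shared structures.

```python
class Proto:
    """Hypothetical delegating object: nearly empty, plus a parent pointer."""

    def __init__(self, parent=None):
        self.parent = parent

    def clone(self):
        # The "clone" is a blank object that remembers its progenitor.
        return Proto(parent=self)

    def __getattr__(self, name):
        # Reached only when the object itself has no such member.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


A = Proto()
A.x = 5
B = A.clone()
first = B.x      # falls through to A.x
A.x = 6
second = B.x     # still falls through, so it sees the update
B.x = 7          # B now has its own x
third = A.x      # A is unaffected
```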

The most complex method to implement is a deep copy. The complexity comes from efficiency: a naive deep copy of large structures is extremely wasteful, so it would almost certainly need to be implemented through copy-on-write. With this method, object B gets its own copy of all members of A, including a private copy of any structure stored by reference. No further operation on A or B can affect the other object.
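
As a point of reference, the naive version is a one-liner in Python 3 with the standard copy.deepcopy. The Obj class is again hypothetical, and this sketch copies eagerly; it does not attempt the copy-on-write optimisation.

```python
import copy


class Obj:
    """Hypothetical prototype object, for illustration only."""

    def __init__(self, **members):
        self.__dict__.update(members)

    def clone(self):
        # Eager deep copy: every referenced structure is duplicated up front.
        return copy.deepcopy(self)


A = Obj(x=5, l=[1, 2, 3])
B = A.clone()
B.l.append(4)      # touches only B's private list
```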

There is no clear winner here. Deep copies, if they are implemented efficiently, hold the fewest pitfalls for the programmer, but make it more difficult for two objects to share state. The other two techniques are easy to implement, but the programmer must keep track of what data is shared between objects. Note that it is easy to provide an init() method which makes new, private members if they are necessary, so sharing data by default is not a huge hurdle.

Syntax of Cloning

Python has no generic scoping block, so the only way to clone an object and then modify it is to repeatedly refer to the object. With long names, this gets cumbersome. It would be useful to have some analogue of the "class" syntax to create a named object and modify it in one block statement. However, a "clone()" method is still needed for creating anonymous, identical objects. Prothon provides the "with" statement (similar to a proposed Python 3.0 extension) so that the object to be edited need only be given once.

An extended block syntax is especially important for Python because its support for anonymous functions (the lambda keyword) is limited to one-line functions. So, outside the class statement, the only way to bind a complex method to an object is to make a global function and then assign it, polluting the global namespace. Except for the lambda problem, it would not be too onerous to use a short alias to avoid repeating the object name over and over.

No extra syntax:

  >>>B = A.clone()
  >>>B.num = 0
  >>>def print_num(self):
  ...   print "My number is ", self.num
  >>>B.print_num = print_num
  >>>for i in range(1, 1000):
  ...   anon = B.clone()
  ...   anon.num = i

"class"-like statement:

  >>>object B(A):
  ...  num = 0
  ...  def print_num(self):
  ...    print "My number is ", self.num
  >>>for i in range(1, 1000):
  ...  anon = B.clone()
  ...  anon.num = i
  ...  # limited to basic Python syntax with anonymous objects

"with" statement (note that this is not exactly the Prothon syntax):

  >>>B = A.clone()
  >>>with B:
  ...  num = 0
  ...  def print_num(self):
  ...    print "My number is ", self.num
  >>>for i in range(1, 1000):
  ...  anon = B.clone()
  ...  anon.num = i
  ...  # could also use "with anon", but it's wasteful for a single line
  ...  # however, can still use "with" to add method definitions to "anon"

The "object" statement is slightly more limited than "with". However, this isn't fatal - if we want to create 1000 objects identical to B but with a new method, we could always create B' for the sole purpose of holding that method and then clone B' 1000 times.

Notice that, except with the "object" statement, there's no need for objects to be modified at creation time. The object can be "re-opened" and new members added at any time. Python can already have methods rebound in this way, of course, but without the shortcut of the "class" statement it's cumbersome. The "object" statement could be brought even with the other syntaxes by allowing a second use of "object Foo" to re-open the object, but this is problematic. For instance, what would it mean to say "object Foo(A)" followed by "object Foo(B)"? Also, should the programmer be encouraged to edit system core objects in this way?

Inheritance

I mentioned above that cloning by delegation is quite similar to inheritance. In fact, that's how Io implements inheritance, so I'll refer to this as "Io-style". In this model, a "clone" of an object isn't a clone at all; it's a blank object with a parent slot. (You could also use a shallow-copy clone and add a parent slot, but this wouldn't be very useful, since the parent slot would only be used for members added after the clone. This might make sense in a language where fields and methods were treated separately, so fields would be copied and methods delegated, but that's not Python.)

The other model of inheritance is Self-style. In Self, cloning an object makes an exact duplicate, and the new object's parent is the same as the old object's. The parent slot is publicly visible, so you can change the inheritance by reassigning it. The important difference is that a child object could be cloned from any other object, not just its parent, so the way to create an object which delegates most requests is to simply clone the empty Object and then assign its parent slot.
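
A small Python 3 sketch of this model, assuming a hypothetical SelfObj class with a single parent slot literally named "parent". This only imitates the behaviour described here; it is not how Self itself is implemented.

```python
class SelfObj:
    """Hypothetical Self-style object: own slots plus a reassignable parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.__dict__.update(slots)

    def clone(self):
        # Exact duplicate: every slot is copied, including the parent slot.
        new = SelfObj()
        new.__dict__.update(self.__dict__)
        return new

    def __getattr__(self, name):
        # Anything the object lacks is delegated to its parent.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


traits = SelfObj(greeting="hello")
point = SelfObj(parent=traits, x=1)

other = point.clone()
same_parent = other.parent is point.parent    # clone shares point's parent

french_traits = SelfObj(greeting="bonjour")
other.parent = french_traits                  # change inheritance at run time
```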

The Self language itself implements the parent slot with a lexical rule. Any member whose name ends in a * is treated as a parent slot (the canonical name is "parent*"). If there is more than one slot named like this, we have multiple inheritance. (I admit that I'm not quite clear on how Self resolves this situation, since I'm used to having an ordered list of parents.) This has the advantage that parents can be given descriptive names, such as "interface_parent*" and "data_parent*". However, we could just as well implement Self-style inheritance with a more typical list of parents (held in a slot always called "parents", for example) and use ordinary member variables to give descriptive names. This merely duplicates some information. (It's also possible to update "parents" and forget to update the corresponding member variable, but I doubt this is an important consideration. Most of the time parents won't be reassigned at run-time, so it's enough to know the name of the parent object without storing it in a named slot.)
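
The list-based variation might look like the following Python 3 sketch. Obj and the slot name "parents" are assumptions taken from the paragraph above, and the search is a plain depth-first, left-to-right walk rather than Python's real MRO algorithm.

```python
class Obj:
    """Hypothetical object with an ordered list of parents."""

    def __init__(self, *parents, **slots):
        self.parents = list(parents)
        self.__dict__.update(slots)

    def __getattr__(self, name):
        # Try each parent in order; the first one that answers wins.
        for parent in self.__dict__.get("parents", ()):
            try:
                return getattr(parent, name)
            except AttributeError:
                continue
        raise AttributeError(name)


interface = Obj(name="widget", visible=True)
data = Obj(name="record", size=3)

widget = Obj(interface, data)
widget.interface_parent = interface    # descriptive alias; duplicates the list entry

name_before = widget.name      # found on the first parent
size = widget.size             # only the second parent has it

widget.parents.reverse()       # reordering the list changes the tie-breaking
name_after = widget.name
```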

Self-style inheritance requires that parents be reassignable, because otherwise there would be no way to assign them at all. Io-style inheritance does not require this. Interestingly, Io-style inheritance allows "cloning" multiple objects at once if multiple inheritance is allowed, since it really creates a blank object that can delegate using any method of tie-breaking (such as Python's standard MRO). This means that the "class-like" syntax described above is most suited for Io-style inheritance, since this most closely parallels the standard Python "class Class(bases)". However, it could be restricted to listing a single object as progenitor in order to use Self-style inheritance.

I believe Self-style and Io-style inheritance are roughly equivalent in power. Self-style can exactly duplicate Io-style by cloning an empty object. Self-style can duplicate one object and inherit from another; Io-style could duplicate this by manually copying the fields of the one and then cloning from the other. (This is cumbersome, but Python's dir() function should be enough to allow it to be done automatically. A library implementing Io-style inheritance may wish to provide utilities to do shallow copies as well.)
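
Such a utility could be quite small. In this Python 3 sketch, Proto and clone_with_fields() are hypothetical names, and I use the built-in vars() rather than dir() because it returns only the fields stored on the object itself.

```python
class Proto:
    """Hypothetical Io-style object: blank, with a parent pointer."""

    def __init__(self, parent=None):
        self.parent = parent

    def clone(self):
        return Proto(parent=self)

    def __getattr__(self, name):
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


def clone_with_fields(parent, source):
    """Hypothetical utility: clone parent Io-style, then shallow-copy
    the fields that source holds directly."""
    new = parent.clone()
    for name, value in vars(source).items():
        if name != "parent":          # keep the new object's own parent
            setattr(new, name, value)
    return new


base = Proto()
base.kind = "shape"

template = Proto()
template.x = 1
template.tags = ["a"]

obj = clone_with_fields(base, template)
```

Here obj delegates to base but starts out with template's fields, and as in any shallow copy the tags list is shared with template.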

Inherited Method Calls (aka super())

Implementations making use of Python's MethodResolutionOrder will need to use the built-in function super(). The syntax is super(Class, object).method(...). Obviously, the need for a class is bad news for an object-based language or library. The system will need to provide a wrapper or analogue for super.

Some terminology is needed. In a class-based language, a method is defined in a class and executes in the context of an object. In an object-based language, it is defined in an object, and executes in the context of another object (or possibly the same one). For instance, in the following code:

 >>>class A:
 ...  name = "A"
 ...  def foo(self):
 ...    print "A.foo running in %s's context" % self.name
 >>>class B(A):
 ...  name = "B"
 ...  def foo(self):
 ...    print "B.foo running in %s's context" % self.name
 ...    A.foo(self)
 >>>b = B()
 >>>b.foo()
 B.foo running in B's context
 A.foo running in B's context

There are two copies of foo(), attached to A and B. b.foo() first runs B.foo in object b's context, and then it runs A.foo, still in b's context (that is, self is always set to b).

This becomes more confusing in an object-based language, since we can't simply remember that the method is attached to a class and runs in the context of an object. We must keep in mind which object might supply the execution context, even when the code we are looking at is attached to a different object.

The closest analogue to the above code is:

 >>>A = Object.clone()
 >>>with A:
 ...  name = "A"
 ...  def foo(self):
 ...    print "A.foo running in ", self.name, "'s context"
 >>>B = A.clone()
 >>>with B:
 ...  name = "B"
 ...  def foo(self):
 ...    print "B.foo running in ", self.name, "'s context"
 ...    A.foo()        # note not A.foo(self) - this is a regular object method call!
 >>>B.foo()

Here, we run B.foo in B's context, and it in turn runs A.foo in A's context. In class-based Python, we keep the execution context explicitly by passing self as a parameter; this is possible because Python knows that A is a class and not an object, so A.foo is an UnboundMethod call. In an object-based language, we don't have this handy distinction - when self is A, A.foo(self) would end up calling foo(A, A) instead of foo(A), since Python implicitly adds the self parameter in a method call. We need to add some syntax to retrieve an unbound method (or otherwise keep track of the execution context).
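
One way to keep the two kinds of access apart, sketched in Python 3 with made-up names: lookup() returns the raw function stored in a slot (the "unbound" form), while send() performs an ordinary method call with the receiver as the execution context. This is an illustration, not a proposal for the actual syntax.

```python
class Proto:
    """Hypothetical object whose members live in an explicit slot table."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def clone(self, **slots):
        return Proto(parent=self, **slots)

    def lookup(self, name):
        # Return the raw slot value, searching up the parent chain.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name, *args):
        # Ordinary method call: the receiver becomes self.
        return self.lookup(name)(self, *args)


def a_foo(self):
    return ["A.foo running in %s's context" % self.lookup("name")]


def b_foo(self):
    here = ["B.foo running in %s's context" % self.lookup("name")]
    # Fetch A's function unbound and pass our own context explicitly.
    return here + A.lookup("foo")(self)


A = Proto(name="A", foo=a_foo)
B = A.clone(name="B", foo=b_foo)

result = B.send("foo")
direct = A.send("foo")
```

With the unbound lookup, A's foo runs in B's context, which matches the class-based example; calling A.send("foo") instead would run it in A's context.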

Even once we add syntax to allow "A.foo(self)" to work unambiguously, we still need to add an analogue to super(). super(Class, obj) allows Python to use the nifty MethodResolutionOrder to return class A automatically, which spares the programmer from naming the parent class manually and resolves diamond inheritance patterns cleanly. In this call, Class is the class where the method is defined and obj is the current execution context - thus, in a method of A, the call will always look like "super(A, self)". In an object-based variant, super would simply take two objects instead of a class and an object.
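
A sketch of what the two-object form could look like, in Python 3. Everything here is hypothetical, including the name super_send(); it handles only a single parent chain, so it sidesteps the MethodResolutionOrder question entirely.

```python
class Proto:
    """Hypothetical object whose members live in an explicit slot table."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def clone(self, **slots):
        return Proto(parent=self, **slots)

    def lookup(self, name):
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name, *args):
        return self.lookup(name)(self, *args)


def super_send(owner, receiver, name, *args):
    """Hypothetical analogue of super(Class, obj).name(*args).

    owner is the object where the running method is defined;
    receiver is the current execution context.
    """
    if owner.parent is None:
        raise AttributeError(name)
    # Start the search above owner, but keep receiver as self.
    return owner.parent.lookup(name)(receiver, *args)


def a_foo(self):
    return ["A.foo in %s" % self.lookup("name")]


def b_foo(self):
    return ["B.foo in %s" % self.lookup("name")] + super_send(B, self, "foo")


A = Proto(name="A", foo=a_foo)
B = A.clone(name="B", foo=b_foo)
C = B.clone(name="C")       # defines no foo of its own

trace = C.send("foo")
```

The method defined on B never names A: it passes B (where it is defined) and self (the context), just as super(A, self) does in class-based Python.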

As an aside, the only reason that super() needs the class and object passed explicitly is that Python's parser hasn't been extended to supply this information automatically. Bonus points to any implementation that can beat Python to this.

One final problem with super in an object-based language - it is, of course, short for superclass, which simply makes no sense without classes!

Summary

  • There are three semantics for object cloning: CloningByShallowCopy, CloningByDeepCopy, and CloningByDelegation.
  • Additional syntax is helpful. I've described the ObjectDefinitionStatement and the WithStatement.
  • There are two types of inheritance - IoStyleInheritance (which implies CloningByDelegation) and SelfStyleInheritance.
  • True SelfStyleInheritance uses named slots, which makes MultipleInheritance complicated. A slight variation using an ordered list allows use of the standard MethodResolutionOrder. IoStyleInheritance can always use the MethodResolutionOrder if the designer chooses to allow MultipleInheritance.
  • An implementation must provide a way to call an inherited method without using the parent class to get an UnboundMethod to call. Similarly, the super() call should be tweaked to work without a class.