Self programming language
Self is an object-oriented programming language based on the concept of prototypes. It was used primarily as an experimental test system for language design in the 1990s; however, as of September 2004, Self is still being actively developed. The last major version is 4.2.1, which was released in April 2004.
History
Self was designed primarily by David Ungar and Randall Smith in 1986 while working at Xerox PARC. Their objective was to push forward the state of the art in object-oriented programming language research, once Smalltalk-80 had left the labs and begun to be taken seriously by the industry. They moved to Stanford University and continued work on the language, building the first working compiler in 1987. At that point the focus changed to bringing up an entire system for Self, as opposed to just the language.
The first public release was in 1990, and the next year the team moved to Sun Microsystems where they continued work on the language. Several new releases followed until the project fell largely dormant in 1995 with the 4.0 version. The latest 4.2 version was released in 2004 and runs on Mac OS X and Solaris.
Self also inspired a number of languages based on its concepts. Most notable, perhaps, was the NewtonScript language for the Apple Newton. Other examples include Io, Cel and Agora.
The problem
Traditional object languages are based on a deep-rooted duality. Classes define the basic qualities and behaviours of objects, and an instance is a particular object based on a class.
For instance you might have a Vehicle class that has a "name" and the ability to perform "drive to work" and "deliver construction materials". Porsche 911 is a particular instance of the class Vehicle with the name set to "Porsche 911". In theory you can then send a message to Porsche 911, telling it to "deliver construction materials".
This example shows one of the problems with this approach. A Porsche is not capable of delivering construction materials (in any general sense anyway!) but this is something that a vehicle can do. In order to avoid this problem we have to add additional specialization to Vehicle via the creation of subclasses. In this case one could imagine "sports car" and "flatbed truck".
This is a contrived example but it illustrates a very real problem. Unless you can predict with certainty what qualities the objects will have in the distant future, you cannot design your class hierarchy properly. All too often the program will evolve to require additional behaviours, and suddenly the entire system has to be re-designed (or refactored) to break out the objects in a different way.
Experience with early OO languages like Smalltalk showed that this sort of issue came up time and time again. Systems would tend to grow to a point and then become very rigid, as the basic classes deep below the programmer's code were simply "wrong". Without some way of easily changing the original class, you might have a serious problem.
Dynamic languages such as Smalltalk allowed for this sort of change via well-known methods in the classes; by changing the class, the objects based on it would change their behaviour. But in other languages like C++ there is no such ability, and making such a change can actually break other code, a problem known as the fragile base class problem. In general such changes had to be done very carefully, as other objects based on the same class might be expecting this "wrong" behaviour: "wrong" is often dependent on the context.
The solution
The problem here is the duality between classes and instances. Self simply eliminated this duality.
Instead of having an "instance" of an object that is based on some "class", in Self you make a copy of an existing object, and change it. So "Porsche 911" would be created by making a copy of an existing "Vehicle" object, and then adding the "drive to work" method. Basic objects that were used primarily to make copies of were known as prototypes.
This may not sound earth shattering, but in fact it greatly simplifies dynamism. If you have to fix a problem in some "base class" because your program has a problem, simply change it and make copies of that new object instead. No other program will see this change. If at some point in the future Porsches can "deliver construction materials", big deal, add it.
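The copy-and-change idea can be sketched in Python, used here only as an analogy since Python itself is class-based. The helper `clone` and the slot names are hypothetical and chosen for illustration; this is not how Self implements copying.

```python
from types import SimpleNamespace


def clone(obj):
    """Return a new object holding the same attributes as obj (shallow copy)."""
    return SimpleNamespace(**vars(obj))


# A hypothetical prototype: an ordinary object that we intend to copy.
vehicle = SimpleNamespace(name="vehicle")

# Copy the prototype and change the copy, rather than instantiating a class.
porsche_911 = clone(vehicle)
porsche_911.name = "Porsche 911"
porsche_911.drive_to_work = lambda: "driving to work"

# Changes made to the copy are invisible to the prototype it came from.
print(hasattr(vehicle, "drive_to_work"))  # prints False
print(porsche_911.drive_to_work())        # prints driving to work
```

Adding "deliver construction materials" later would be one more assignment on whichever object needs it.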
Better yet, this actually dramatically simplifies the entire OO concept as well. Everything might be an object in a traditional system, but there is a very fundamental difference between classes and instances. In Self, there isn't.
The language
Self objects are a collection of "slots". Slots are accessor methods that return values, and placing a colon after the name of a slot sets the value. For instance if you have a slot called "name",
myPerson name
returns the value in name, and
myPerson name:'gizifa'
sets it.
Self, like Smalltalk, uses blocks for flow control and other duties. Methods are objects containing code in addition to slots (which they use for arguments and temporary values), and can be placed in a Self slot just like any other object: a number for instance. The syntax remains the same in either case.
Note that there is no distinction in Self between fields and methods: everything is a slot. Since accessing slots via messages forms the majority of the syntax in Self, many messages are sent to "self", and the "self" can be left off (hence the name).
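As a rough illustration of "everything is a slot", the following Python sketch models an object whose state is reachable only through messages. The class `SlotObject`, its `send` method and the `greeting` slot are assumptions made for this example; they are not Self's implementation or API.

```python
class SlotObject:
    """Toy object whose data and code live in the same table of slots."""

    def __init__(self, **slots):
        self._slots = dict(slots)

    def send(self, selector, *args):
        # A selector with a trailing colon ("name:") assigns to the slot.
        if selector.endswith(":"):
            self._slots[selector[:-1]] = args[0]
            return self  # returning the receiver is a choice of this sketch
        value = self._slots[selector]
        # A slot holding code is run; a slot holding data is simply returned.
        return value(self) if callable(value) else value


my_person = SlotObject(
    name="nobody",
    greeting=lambda me: "hello, " + me.send("name"),
)

my_person.send("name:", "gizifa")
print(my_person.send("name"))      # prints gizifa
print(my_person.send("greeting"))  # prints hello, gizifa
```

The caller uses the same `send` call for the data slot and the method slot, which mirrors the point that the syntax is the same in either case.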
Basic syntax
The syntax for talking to slots is Smalltalk-like. Three kinds of messages are available:
- unary: receiver slot_name
- binary: receiver + argument
- keyword: receiver keyword: arg1 With: arg2
All messages return results, so the receiver (if present; otherwise "self" is implied) and the arguments can themselves be messages. Following a message with a period means you want to discard the returned value. For instance:
'Hello, World!' print.
This is the Self version of the hello world program. The ' syntax indicates a literal string object. Other literals include numbers, blocks and general objects.
Grouping is as in math, using parentheses. In the absence of explicit grouping, unary messages have the highest precedence, followed by binary messages, with keyword messages having the lowest. Using keywords for assignment would require extra parentheses where expressions also contain keyword messages, so to avoid that Self requires the first part of a keyword message to start with a lower-case letter and all the remaining parts to start with an upper-case letter. So:
valid: base bottom between: ligature bottom + height And: base top / scale factor.
This has exactly the same meaning as:
valid: ((base bottom) between: ((ligature bottom) + height) And: ((base top) / (scale factor))).
In Smalltalk-80, the same expression would look like:
valid := self base bottom between: self ligature bottom + self height and: self base top / self scale factor.
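The capitalization rule can be illustrated with a deliberately naive Python sketch. The function `keyword_selectors` is a hypothetical helper written for this example, not part of any Self tool; it assumes whitespace-separated tokens and ignores parentheses, string literals and blocks, so it shows only how keyword parts are grouped into messages.

```python
def keyword_selectors(expression):
    """Group the keyword parts of a flat Self expression into selectors.

    A part starting with a lower-case letter begins a new keyword message;
    a part starting with an upper-case letter continues the current one.
    """
    selectors = []
    for token in expression.split():
        if not token.endswith(":"):
            continue  # receivers, arguments, unary and binary messages
        if token[0].isupper() and selectors:
            selectors[-1] += token
        else:
            selectors.append(token)
    return selectors


expr = ("valid: base bottom between: ligature bottom + height "
        "And: base top / scale factor.")
print(keyword_selectors(expr))  # prints ['valid:', 'between:And:']
```

The result shows two keyword messages, `valid:` and `between:And:`, which matches the fully parenthesized form of the Self expression.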
Making new objects
Consider a slightly more complex example:
labelWidget copy label: 'Hello, World!'.
This makes a copy of the "labelWidget" object with the copy message (no shortcut this time), then sends it a message to put "Hello, World!" into the slot called "label". Now let's do something with it:
(desktop activeWindow) draw: (labelWidget copy label: 'Hello, World!').
In this case the (desktop activeWindow) is performed first, returning the active window from the list of windows that the desktop object knows about. Next (read inner to outer, left to right) the code we examined earlier returns the copy of labelWidget. Finally the widget is sent into the draw slot of the active window.
Inheritance
In theory, every Self object is a stand-alone entity. There are no classes, meta-classes and so on to help it do its job. Changes to this object don't affect any other, but in some cases it would be nice if they did. Normally an object can understand only messages corresponding to its local slots, but by having one or more slots indicating parent objects the object can delegate any message it doesn't understand itself to them. Any slot can be made a parent pointer by adding an asterisk as a suffix. In this way Self handles duties that would use inheritance in more traditional languages. Delegation is also used to implement name spaces and lexical scoping.
For instance, you might have an object defined called "bank account" that is used in a simple book keeping application. Typically this object would be created with the methods inside, perhaps "deposit" and "withdraw", and any data slots needed by them. This is a prototype, which is only special in the way it is used since it also happens to be a fully functional bank account.
Making a clone of this object for "Bob's account" will create a new object which starts out exactly like the prototype. In this case we have copied the slots including the methods and any data. However a more common solution is to first make a simpler object called a traits object which contains the items that one would normally associate with a class.
In this example the "bank account" object would not have the deposit and withdraw method, but would have as a parent an object that did. In this way many copies of the bank account object can be made, but we can still change the behaviour of them all by changing the slots in that root object.
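The traits arrangement can be sketched in Python as a toy model. The class `Proto`, the `balance` and `report` slots, and the single-parent lookup are assumptions made for this example (Self allows several parent slots); the sketch only shows shared behaviour living in one object that many copies delegate to.

```python
class Proto:
    """Toy prototype object with a single parent (Self allows several)."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def send(self, name, *args):
        # Look in the local slots first, then delegate up the parent chain.
        obj = self
        while obj is not None:
            if name in obj.slots:
                value = obj.slots[name]
                # Methods run with the original receiver, not the holder.
                return value(self, *args) if callable(value) else value
            obj = obj.parent
        raise AttributeError(name)

    def copy(self):
        # The copy gets its own slots but shares the same parent object.
        return Proto(self.parent, **self.slots)


def deposit(account, amount):
    account.slots["balance"] += amount
    return account


def withdraw(account, amount):
    account.slots["balance"] -= amount
    return account


# The traits object holds the shared behaviour; the prototype holds data.
account_traits = Proto(deposit=deposit, withdraw=withdraw)
bank_account = Proto(parent=account_traits, balance=0)

bobs_account = bank_account.copy()
bobs_account.send("deposit", 100).send("withdraw", 30)

# Behaviour added to the traits object later reaches every existing copy.
account_traits.slots["report"] = (
    lambda account: "balance: %d" % account.send("balance")
)

print(bobs_account.send("report"))  # prints balance: 70
print(bank_account.send("report"))  # prints balance: 0
```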
How is this any different from a traditional class? Well, consider the meaning of:
myObject parent: someOtherObject.
This is quite interesting: it changes the "class" of myObject at runtime by changing the value associated with the 'parent*' slot (the asterisk is part of the slot name, but not of the corresponding messages).
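A minimal Python sketch of swapping the parent at runtime follows. The class `Delegating` and the objects `car_traits` and `truck_traits` are hypothetical names for this example; Python's `__getattr__` hook stands in for Self's delegation and is only an approximation of it.

```python
class Delegating:
    """Toy object that forwards unknown attribute lookups to its parent."""

    def __init__(self, parent=None, **slots):
        self.__dict__.update(slots)
        self.parent = parent

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, which is where
        # delegation to the parent takes over.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


car_traits = Delegating(describe="a car")
truck_traits = Delegating(describe="a truck")

my_object = Delegating(parent=car_traits)
print(my_object.describe)  # prints a car

# Reassigning the parent slot changes the inherited behaviour immediately.
my_object.parent = truck_traits
print(my_object.describe)  # prints a truck
```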
Adding slots
How can copied objects in Self be modified to include new slots? Using the graphical programming environment, this is very easy. Programmatically, the proper way to do it is to create a mirror object reflecting the one that will be modified and then send a series of messages to that mirror.
A more direct way is to use the primitive '_AddSlots:'. A primitive has the same syntax as a normal keyword message, but its name starts with the underscore character. The '_AddSlots:' primitive should be avoided because it is a leftover from early implementations. However, we will show it in the example below because it makes the code shorter.
An earlier example was about refactoring a simple class called Vehicle in order to be able to differentiate the behaviour between cars and trucks. In Self one would accomplish this with something like this:
_AddSlots: (| vehicle <- (|parent* = traits clonable|) |).
Since the receiver of the '_AddSlots:' primitive isn't indicated, it is "self". In the case of expressions typed at the prompt, that is an object called the "lobby". The argument for '_AddSlots:' is the object whose slots will be copied over to the receiver. In this case it is a literal object with exactly one slot. The slot's name is 'vehicle' and its value is another literal object. The "<-" notation implies a second slot called 'vehicle:' which can be used to change the first slot's value.
The "=" indicates a constant slot, so there is no corresponding 'parent:'. The literal object that is the initial value of 'vehicle' includes a single slot so it can understand messages related to cloning. A truly empty object, indicated as (| |) or more simply as (), cannot receive any messages at all.
vehicle _AddSlots: (| name <- 'automobile'|).
Here the receiver is the previous object, which now will include 'name' and 'name:' slots in addition to 'parent*'.
_AddSlots: (| sportsCar <- vehicle copy |).
sportsCar _AddSlots: (| driveToWork = (some code, this is a method) |).
Though previously 'vehicle' and 'sportsCar' were exactly alike, now the latter includes a new slot with a method that the original doesn't have. Methods can only be included in constant slots.
_AddSlots: (| porsche911 <- sportsCar copy |).
porsche911 name:'Bobs Porsche'.
The new object 'porsche911' started out exactly like 'sportsCar', but the last message changed the value of its 'name' slot. Note that both still have exactly the same slots even though one of them has a different value.
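The same sequence of steps can be mirrored in Python as an analogy. The class `Obj` and its `add_slots` method are hypothetical stand-ins for the '_AddSlots:' primitive, the 'parent*' slot pointing at traits clonable is left out, and the method body is a placeholder just as in the Self listing.

```python
import copy


class Obj:
    """Bag of slots; add_slots loosely mimics '_AddSlots:' (analogy only)."""

    def add_slots(self, **slots):
        self.__dict__.update(slots)
        return self

    def copy(self):
        # Shallow copy: same slots and values, but an independent object.
        return copy.copy(self)


lobby = Obj()

lobby.add_slots(vehicle=Obj())
lobby.vehicle.add_slots(name="automobile")

lobby.add_slots(sports_car=lobby.vehicle.copy())
lobby.sports_car.add_slots(drive_to_work=lambda: "driving to work")

lobby.add_slots(porsche911=lobby.sports_car.copy())
lobby.porsche911.name = "Bobs Porsche"

# vehicle never gained the method that was added to its copy.
print(hasattr(lobby.vehicle, "drive_to_work"))  # prints False

# sportsCar and porsche911 have the same slots but different values.
print(sorted(vars(lobby.sports_car)) == sorted(vars(lobby.porsche911)))  # prints True
print(lobby.sports_car.name, "/", lobby.porsche911.name)
```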
The environment
Perhaps one of Self's few problems is that it is based on the same sort of virtual machine system that earlier Smalltalk systems used. That is, programs are not stand-alone entities as they are in languages such as C, but need their entire memory environment in order to run. This requires applications to be shipped in chunks of saved memory known as snapshots which tend to be large and annoying to use.
On the upside the Self environment is very powerful. You can stop programs at any point, change values and code, and continue running where you left off. This sort of "on the fly" development can greatly increase productivity.
In addition the environment is tailored to the rapid and continual change of the objects in the system. Refactoring your "class" design is as easy as dragging methods out of the existing ancestors and into new ones. Simple tasks like testing methods can be handled by making a copy, dragging the method into the copy, then changing it. Unlike traditional systems, only that object has the new code, and nothing has to be rebuilt in order to test it. If the method works, simply drag it back into the ancestor.
External links
- Self Home Page at Sun Microsystems (http://research.sun.com/self/)
- Papers on Self from UCSB (mirror for the Sun papers page) (http://www.cs.ucsb.edu/labs/oocsb/self/papers/papers.html)
- Self resources at Cetus Links (http://www.cetus-links.org/oo_self.html)
- Merlin Project (http://www.merlintec.com/lsi/)
- Yahoo! Group on Self (http://groups.yahoo.com/group/self-interest/)
- Self ported to Linux (without many optimizations) (http://gliebe.de/self/index.html)
- Automated Refactoring application on sourceforge.net, written for and in Self (http://selfguru.sourceforge.net/)