1 Mar 2010

The "all you can do is send a message" model isn't enough

I came across a post by Jonathan Rees that I couldn't understand at first, but after several re-readings it made sense, and it makes an interesting point: the "all you can do is send a message" model, a.k.a. the actor model, isn't enough.

First, some background, if you're not a programming language geek: many languages support message passing, which is like calling a method, except that the receiver can define a catch-all method that traps all otherwise unresolved method calls. Ruby, for example, calls this method_missing. This is different from Java, where a class has to specify at compile time what methods it defines. Message passing is, on the face of it, more powerful, since it lets you implement things like proxies for RPC, access control or object persistence that can transparently decorate any object without knowing its interface.
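To make the catch-all idea concrete, here is a minimal sketch in Python, whose `__getattr__` hook plays roughly the role of Ruby's method_missing. The `LoggingProxy` class and its `calls` list are made up for illustration; they are not from Rees's post.

```python
class LoggingProxy:
    """Forwards every message it does not understand to a wrapped target."""

    def __init__(self, target):
        self._target = target
        self.calls = []  # names of the methods that were forwarded

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails,
        # much like Ruby's method_missing.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def forward(*args, **kwargs):
            self.calls.append(name)
            return attr(*args, **kwargs)

        return forward


proxy = LoggingProxy([3, 1, 2])
proxy.sort()
proxy.append(4)
print(proxy.calls)  # ['sort', 'append']
```

The proxy never mentions `sort` or `append`; it decorates the list without knowing its interface, which is exactly what a statically declared set of methods would not allow.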

The "all you can do is send a message" model rules out any other way of interacting with an object: instanceof, object identity (that is, comparing pointers) and, of course, fields, which can anyway be regarded as syntactic sugar for methods.

With that background, the aforementioned post says that this model, despite seeming to be more powerful since it lets objects control and customize all interaction with them, is actually less powerful since you can't deal with unknown objects in a safe way. For instance, if you're building an OS or runtime and implementing file I/O, you can't check paths against an ACL in a safe way. You can ask the string object for its value, but who knows if it's a nefarious imposter that returns one value when you check it, and another when you actually access the path? You could store the individual characters of the string in an array of your own, but who knows if the character objects are themselves safe? To generalize, you can't support any scenario in which you load untrusted or partially trusted code into the same process and attempt to enforce a reasonable security policy. Examples include a JavaScript interpreter in a browser that allows scripts to access the same domain but not others, or an OS that uses sandboxing rather than hardware-supported processes.
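The imposter scenario can be sketched as follows. This is a toy, not a real file API: `ALLOWED`, `TwoFacedPath`, `open_checked` and the `value` message are all hypothetical names, and the only thing the checker is allowed to do is send `value` to the path object.

```python
ALLOWED = {"/tmp/scratch.txt"}  # hypothetical ACL


class TwoFacedPath:
    """Answers the first 'value' message harmlessly and later ones differently."""

    def __init__(self, harmless, real):
        self._answers = [harmless]
        self._real = real

    def value(self):
        return self._answers.pop() if self._answers else self._real


def open_checked(path):
    # Time of check: ask the object for its value.
    if path.value() not in ALLOWED:
        raise PermissionError("denied")
    # Time of use: ask again; nothing forces the same answer.
    return "opening " + path.value()


print(open_checked(TwoFacedPath("/tmp/scratch.txt", "/etc/shadow")))
# opening /etc/shadow
```

Calling `value()` once and keeping the result only moves the problem: in a pure message-passing world the returned object is just another object that answers messages however it likes.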

The only way out is to have an intrinsic notion of the type of an object as something visible to clients, independent of the behavior (methods). There must be a builtin instanceof operator or function as part of the language itself, and a way to compare pointers to see if they point to the same object.
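A rough sketch of what such an intrinsic check buys you, again in Python and with the same hypothetical `ALLOWED` and `open_checked` names: `type()` and `is` are evaluated by the runtime rather than by sending a message to the object, assuming the built-ins themselves have not been tampered with.

```python
ALLOWED = {"/tmp/scratch.txt"}  # hypothetical ACL


class Sneaky(str):
    """A str subclass that overrides comparison; the exact-type test rejects it."""

    def __eq__(self, other):
        return True

    def __hash__(self):
        return hash("/tmp/scratch.txt")


def open_checked(path):
    # The exact-type test does not call any method defined on `path`,
    # so the object cannot customize the answer.
    if type(path) is not str:
        raise TypeError("path must be a built-in str")
    # From here on, `path` has the behavior of the built-in string type.
    if path not in ALLOWED:
        raise PermissionError("denied")
    return "opening " + path
```

The check is deliberately `type(path) is str` rather than a subclass-friendly test, since a subclass can redefine the very methods the ACL check relies on.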

Once you have this notion of the type of an object independent of its behavior (methods), methods don't need to be defined as part of the object; instead they can be defined as independent entities and hang off the object type. That is, an object is merely a map of key-value pairs, with an associated tag or type. Methods can be defined outside the object, and found based on the types of the arguments (multimethod dispatch). And you can build the language all the way up from that.
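As a small illustration of that last idea, here is a sketch of tagged key-value objects with methods kept in an external table and looked up by the tags of both arguments. `METHODS`, `defmethod`, `make`, `call` and the asteroid/ship example are invented for this sketch, and the tag here is an ordinary dict entry; a real language would have to make it unforgeable.

```python
METHODS = {}  # (method name, tag of arg 1, tag of arg 2) -> function


def defmethod(name, tag_a, tag_b):
    """Register a function as the implementation for one pair of tags."""
    def register(fn):
        METHODS[(name, tag_a, tag_b)] = fn
        return fn
    return register


def make(tag, **fields):
    """An object is just a tag plus a map of key-value pairs."""
    return {"tag": tag, "fields": fields}


def call(name, a, b):
    # Dispatch looks at the tags of both arguments; neither object
    # gets to decide how the call is handled.
    fn = METHODS.get((name, a["tag"], b["tag"]))
    if fn is None:
        raise TypeError(f"no method {name} for ({a['tag']}, {b['tag']})")
    return fn(a, b)


@defmethod("collide", "asteroid", "ship")
def _(a, b):
    return "ship destroyed"


@defmethod("collide", "asteroid", "asteroid")
def _(a, b):
    return "asteroids bounce"


print(call("collide", make("asteroid", mass=5), make("ship", name="Rocinante")))
# ship destroyed
```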

2 comments:

  1. Yes, typing provides some safety. But the same typing-based safety can be provided in actors using typed contracts (see the contracts in Singularity OS and Microsoft's Axum language).

    An additional layer of security/safety that actors provide is that of (memory) isolation. If a third-party component communicates with me, I am assured it cannot directly access any of my state/memory, and I can enforce any security/type checking on the messages it sends to me. (Note that this is looking at your scenario from the other side.) This is especially true of the third-party components in browsers and OSes, which you mentioned as examples.

  2. At the risk of repeating myself, my point (actually Rees's point, as I understand it) is that typing provides not only safety, in the sense of catching errors earlier, but also power, in the sense that without typing at all (the "all you can do is send a message" model) you can't load untrusted or partially trusted code and provide reasonable security.
