26 Mar 2017

Programming Languages Should Be Built Up Using Zero-Overhead Abstractions

How do you design a programming language?

You can do it in an unprincipled way: take a popular language and add features to it, without an underlying logic for exactly which features you're adding, why those and not others, and why they took the form they did. This can never produce languages as good as a principled approach can, where everything that's there is there for a justifiable reason, and it all fits together. Great design feels inevitable.

So, what's a principled approach to language design? One is to build the language up using zero-overhead abstractions. As Stroustrup defines them:
What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.
An example is Java: it has both lists, which are resizable, and arrays, which aren't [1]. Using a list requires an additional memory allocation (for the list object itself), garbage collection, and a dereference on each access. Further, lists can't directly store primitive types like ints; you have to box each int into an Integer object.

So, lists impose extra costs over using arrays. But if you don't need the benefits of a list, you don't need to pay the cost — you can use a plain array. That's an example of the zero-overhead principle.

The zero-overhead approach to language design is to carefully identify these zero-overhead abstractions, and build the language up using them [2].

Why is this important? In one word: performance.


For a while, people like Paul Graham thought that the ever-increasing power of computers would mean that we'd be able to use less and less efficient abstractions as time went by. But then came the cloud, and you had to pay for server costs. I've heard it asked at Google, paraphrasing: "Do you want to use C++ and run 100 servers, use Java and run 200, or use Python and run 2000? Your time is not as costly as you may think." [3]

After the cloud came mobile devices, which have weak CPUs and GPUs, and limited memory. Storage is limited, so your binary size matters. Even the power the device does have can't be used continuously, because that would deplete the battery. And the network is costly, with few users having unlimited plans, so you should use it judiciously.

Over the years, smartphones became more powerful, and some of these constraints eased, but then came smartwatches, which are far more resource- and battery-constrained.

As one category of device becomes more powerful, newer, less powerful devices enter the scene. Like AirPods, Apple's wireless earphones, which support interactions like tapping and Siri. Imagine if, next year, they could run third-party apps. They would be even more resource-constrained than smartwatches: AirPods run for only 4-5 hours, compared to phones or smartwatches, which run the better part of the day.

With IoT, you may have a smart key that's expected to last for a year.

The point is that as technology improves and one class of device becomes more powerful, we'll keep building newer, even more constrained devices. We'll never have a situation where performance or battery optimisation doesn't matter: if your language is inefficient, you're ceding territory to other languages.

Then there's processor speed. Single-threaded performance has been increasing only slowly for almost a decade. We don't have 30 GHz CPUs. If you were using Python in 2003 assuming that in a few years it would be as fast as C++ was in 2003, that didn't happen. While single-threaded performance has increased slowly, multi-core performance has continued increasing quickly, so if you want to take advantage of the hardware, you have to multithread your app, which is a significant overhead for you to design, code and debug.

Now, with Moore's law slowing down, even multithreaded performance is going to increase more slowly than it has in the past. Which means we need languages that make more efficient use of the resources we do have.

Another factor is that as hundreds of millions of people in emerging markets like India come online, they're buying cheap phones. They'll use only apps that run well on those phones, which is again an argument for efficiency.

For all these reasons, performance is important in programming language design, and will remain so for the foreseeable future.

A List of Zero-overhead Abstractions

So, if performance is important, what are some zero-overhead abstractions that give us power without hurting performance?

One is type annotations, also known as static typing. These let the compiler generate [4] more efficient code [5].

Second, value types. These don't require dynamic allocation, deallocation [6] and indirection on every access. Languages like Ruby, where even integers are heap objects, fail this criterion, compared to Java, say.

Value types are more than things like numbers. They include structs. In a language like Swift, you can bundle multiple variables into a struct without the overhead of putting it on the heap [7], unlike Java, where objects have to live on the heap [8]. Similarly, an object that's a field of another object requires two allocations in Java, while a struct that's a member of a struct in Swift or C doesn't require even one heap allocation.

Arrays are another example. A Java array of 100 objects requires 101 allocations, whereas in C++, say, you can have just one. Similarly, a multidimensional array in Java requires a separate allocation for each of the inner arrays. Constant-size arrays, like int[100], are another case: they can be allocated inline, with no dynamic memory allocation at all.

Heap allocation is not the only overhead of Java objects. They carry runtime type information, which allows things like instanceof to work. You should be able to omit it when you don't intend to do anything dynamic with an object, again like structs in Swift [9].

Objects in dynamically typed languages like Javascript have even more overhead, because you can dynamically add and remove fields. That's fine when you want it, but you should be able to opt out when you don't, for performance. If you can't opt out, the language isn't zero-overhead.

Templates, also known as generics, are another zero-overhead abstraction. If you want to write code that works with different data types, you have two choices: virtual methods, which dispatch at runtime, or templates, which do it at compile time, making them a zero-overhead abstraction.

Importantly, Java generics aren't a zero-overhead abstraction, because dispatch is still dynamic, with the overhead that entails. By contrast, templates in C++, Swift [10] or Rust are zero-overhead.

Raw pointers are an important zero-overhead abstraction. Swift, for example, uses ref-counting, but it also supports unsafe pointers for situations in which you don't want the overhead of ref-counting [11].

Likewise, Rust supports multiple kinds of references, like ref-counting, unique references and borrowed references, but there are situations when all of them impose runtime overhead [12]. So Rust supports raw pointers as well.

This makes Swift and Rust zero-overhead, like C++ [13] [14].

Related to raw pointers is skipping array bounds-checking, in those few critical sections where it matters, as in Rust [15].

An important abstraction is functions. A language like Ruby that dynamically dispatches all calls is obviously not zero-overhead; you need static dispatch [16]. Overloading is zero-overhead, as are default arguments and keyword arguments.

Varargs as implemented in Java, say, are also zero-overhead: you don't suffer the overhead when you don't use varargs, and the compiler-generated code is no slower than what you'd write by hand.

Another example is properties, as in Swift, which are things that look like instance variables but are actually methods. Since properties are just syntactic sugar, they're zero-overhead abstractions.

Exception-handling can also be zero-cost, as it is in Objective-C [17].

Reflection can also be a zero-cost abstraction, in the sense that while reflective calls may be slow, they shouldn't slow down normal calls.

The opposite of reflection is forwarding, a way for an object to forward calls to another object in a generic way, without having to implement each method separately. In Objective-C, for example, your class can define a method forwardInvocation. When someone calls a method on the object that isn't defined, before generating an error, the runtime invokes forwardInvocation, giving you a chance to deal with the call, usually by forwarding it to another object. Put differently, it treats the method name dynamically as just another argument. Forwarding is also a zero-overhead abstraction, since it doesn't slow down normal method calls.

A language can also support lazy functions, via a lazy keyword. These execute only when the return value is accessed. A lazy function that returns a Foo is effectively a normal function that returns a Callable<Foo>, which has a single get() method that returns a Foo [18]. Have the compiler implicitly call this get() method when needed. With this transformation, lazy evaluation is just syntactic sugar, imposing no overhead when not used, which makes it a zero-overhead abstraction.


Languages should be designed in a principled way, and not by gluing together features without having a strong reason why those features were chosen and not others. When you decide what abstractions your language should have, first choose zero-overhead abstractions, so that the language is powerful and efficient at once. Abstractions that aren't zero-overhead make you choose between programmer productivity and efficiency, but if you can have both at once, that's the best choice.

In addition to new languages, you can retrofit zero-overhead abstractions onto existing languages, say to increase performance. For example, Java could choose to support structs in the next version. This increases performance without hurting productivity. Alternatively, Java could choose to add support for lazy functions, which increases power without hurting performance.

As another example, a dynamically-typed language like Ruby could support objects that have a fixed set of fields at allocation time, rather than dynamically adding and removing them.

A lazy language like Haskell could support an eager keyword for eager evaluation, in those hot spots where lazy evaluation imposes significant overhead.

These are all ways to retrofit zero-overhead abstractions onto existing languages, which increase performance, power or both at once.

[1] In Java, arrays are part of the language, while lists are implemented in the standard library, but even if lists were included in the language in a future version, it wouldn't make a difference to the zero-overhead nature of Java.

Conversely, arrays could have been part of the library rather than the language. As in Python. Again, a Java-like language with this design choice would still be zero-overhead, because you can use arrays when you want them, for greater performance.

Only if a language didn’t offer fixed-size arrays at all would it fail to be a zero-overhead language.

[2] You can and should keep the APIs of arrays and lists as similar as possible. Java doesn't: array.length vs list.size() are named differently, and one is a property while the other is a method. This isn't ideal.

[3] If you're thinking that most of us don't operate at the scale of Google, that's true, but we have less money, as well, to spend on hosting. A single App Engine instance with 1GB memory costs ₹10,000 a month. If I use an inefficient language that requires another instance, I have to pay ₹10,000 more a month. And that's just for one app. I plan to build multiple, so the costs add up.

Not to mention that if my app isn't successful enough, or successful in acquiring users but not in revenue, I have to pay the hosting bill out of my pocket. Since I'm an independent developer living off my savings, as opposed to a company or someone VC-backed, this may force me to shut the app down sooner. Whereas, if I'd built it in a performant language and didn't have a significant hosting bill, I'd be able to let it continue running for a while and iterate on it a few times to give it a chance to succeed.

[4] By definition: If someone comes up with a more efficient dispatch mechanism that doesn't use type information, statically typed code can use that too.

[5] Note that I said that the language must support type annotations, not that it should require them. An optionally-typed language like TypeScript is also zero-overhead, because you can always add a type declaration to eliminate the overhead of dynamic typing. But a language like Ruby that forbids type declarations doesn't meet the bar of being zero-overhead.

We can also have a language with an optimistic type checker, rather than the pessimistic ones we see in most statically typed languages. That is, we can have a language that lets us do:

Collection c = new ArrayList();

The get() method is defined in List, not in Collection, but a language could permit calling c.get(0) here, perhaps with a warning rather than an error. The declaration that c is a Collection then ensures that we can call Collection methods on it without risking a runtime error, and do so more efficiently than in a dynamically-typed language. In other words, the declaration is just an assertion that c instanceof Collection, not a straitjacket that prevents you from calling List methods on it.

Such a language would also be zero-overhead.

[6] Whether garbage-collection, reference-counting or explicit C-style dealloc() calls, it imposes an overhead.

[7] Even better is C++, which lets you allocate a primitive on the heap, if that’s what you need, without having to wrap it in an object, as in Java. Whether something is on the heap and whether it’s an aggregate consisting of multiple variables are logically unrelated questions.

[8] Escape analysis optimises some, but not all, of these cases. In other words, if a function f() allocates an ArrayList and passes it to g(), which passes it on to h(), which doesn’t pass a reference to it elsewhere, escape analysis should stack-allocate it. I don’t think Java, for example, can do that every time. If my supposition is correct, escape analysis isn’t a substitute for explicitly being able to allocate a struct on the stack.

Though one solution to this would be to add annotations to method parameters saying that it doesn’t escape from the function. That is:

int max(@noescape List<Integer> list);

declares that the list doesn’t escape out of the max() function, say to be stored in a long-lived object. This means that max() either doesn’t pass the list to any other function or, if it does, that function itself declares it as @noescape.

[9] Then there are enums, which are effectively structs that take less space. A type-safe language can have tagged unions, which is better than not having enums at all.

[10] Even Swift or Rust generics are limited. For example, in Swift:

func closeBoth<T>(a: T, b: T) {

... generates an error saying that T doesn't have a close() method. You need to say something like:

func closeBoth<T: Closeable>(a: T, b: T) {

... where Closeable is an interface that defines a close() method. But this excludes classes that don't implement Closeable, even if they have a close() method.

The above code works in C++:

template<typename T>
void closeBoth(T *a, T *b) {

This makes C++ templates more flexible, which means you can use them in more situations, without having to dispatch dynamically.

[11] Swift’s raw pointers are very inconvenient to use:

let p = UnsafeMutablePointer<Foo>.alloc(1)

compared to a normal reference:

let p = Foo()

But you can still use them, so Swift remains zero-overhead.

[12] You can’t easily use unique and borrowed references to implement a doubly-linked list, for example.

[13] C++ is more dangerous than Swift or Rust, because you end up using unsafe pointers pervasively, not just when the performance is needed.

[14] Which is not to say that every language should have unsafe pointers. Unsafe pointers clearly don't belong in Javascript, since it runs in the browser. Other languages could also choose to be secure, like Java. That's a fine decision to make; just that they're not zero-overhead, then: they are ceding some use cases to other languages.

[15] Rust again does it better than C++, where you do unchecked accesses pervasively, rather than only in a few critical sections where it matters. The C++ approach makes programs insecure and unreliable.

[16] Again, the default can be virtual, as in Java, or not, as in C++. Both are zero-overhead since you can dispatch statically when it matters.

Even if Java didn’t have final methods, it would remain zero-overhead since it has static methods, which are dispatched without runtime overhead.

[17] Though, if you’re using a language / coding style in which exceptions are regularly thrown and caught, I don’t know if zero-overhead exceptions slow down the program overall.

[18] The get() method can be called multiple times, but executes the underlying computation only once.
