3 Nov 2013

Analyzing the Performance Cost of Language Features

It will be interesting to see an analysis of what makes Java (or any other programming language) slow. Language design necessarily requires compromises between performance and other attributes like convenience, succinctness, programmer productivity, readability, maintainability, dynamic features, potential for run-time errors, etc. We should go back and try to figure out the cost of each language feature or design decision, so that we can hopefully do better the next time.

This is like optimizing a program – how can you do so unless you have profiled it first? By the same token, we should figure out what makes a given programming language slow, so that we can make a more informed decision when we design the next language.

Put differently, if you were given a language (Java, Python, C++, what have you) and asked to design a variant that’s faster, what changes will you make? “Faster” could mean any of the following:

  • throughput (like the number of requests a server can process per second)

  • latency / responsiveness: how soon do we get a response when given input. This could be in the context of a server, or a client app, where it measures how soon the screen changes when the user touches the screen. We could measure any latency at any percentile.

Taking Java as an example, what’s the cost of each of the following language features:

  • Every object requires a separate allocation, indirection to access, and a GC. This goes for arrays, too. Arrays of objects are worse, because they require two levels of indirection, one for the array itself, and another for the object references stored in the array. Multidimensional arrays and objects composed of other objects suffer at least two indirections, too.

  • Garbage collection as opposed to explicit deallocation.

  • Every object needs to know its type. This takes space, requires initialization, and checking whether you do a cast.

  • Java guarantees memory safety: All array elements and objects fields must be zero-initialized, array indices have to be bounds-checked, and references have to be checked for null before dereference.

  • The Java Memory Model imposes more requirements than, say, is the case with C++. How much do these hurt performance, if at all? For example, no matter what you do in a Java program, the language rules cannot be violated. A reference to a Foo must always point to a Foo, pointers cannot point to an invalid location, etc. These kinds of things can happen in C++ due to race conditions, while the JVM has to ensure that they don’t. What’s the cost of these checks?

  • Java code doesn’t execute natively – it requires a VM. You can’t easily take advantage of CPU-specific instructions for things like SIMD types, half-precision floats, AES and SHA-1, permutation and combination instructions, and so forth. Further, GPU offload is hard in Java, while native code offers OpenCL or similar APIs.

  • Java APIs are in effect a platform on top of a platform.

  • The JVM can’t memory-map classes in, as with native code, where the binary or .so file is mmap()ed in.

  • The Java compiler can’t do as many optimizations as possible in a statically typed languages like C++. Methods are virtual by default, and you can dynamically load a class that overrides a given method. Instead of compile-time optimizations, Java uses run-time optimizations, along with runtime de-optimizations when needed. What’s the cost of these?

  • Java doesn’t have an algebraic type, which some languages like Haskell directly support. The usual way to model this in Java is an inheritance hierarchy, which has its overhead – multiple classes need to be loaded, method calls are virtual and harder to inline, casts need to be checked, and so forth.

  • Java practice calls for using the interface type rather than the concrete type, like List instead of ArrayList. Whereas, in C++, you would just pass a pointer or reference to a vector, which is a concrete type.

  • The Java community likes to over-design and build AbstractFactoryFactories. This is not strictly a language issue, but nevertheless may have a cost on performance.

  • On the plus side, Java code doesn’t need to copy arrays or objects to keep them alive. Just keep a reference and the GC will keep the target alive as long as the reference exists. Whereas in C++, if you’re given an object or array, you may sometimes have to copy it to ensure that it’s not deallocated until you’re done with it.

This is a laundry list of features in Java that may decrease performance. It would be interesting to see a study that quantifies the cost of each of these features or design decisions, so that people designing languages in the future can make more informed decisions.

For example, if it turns out that the fact that objects and arrays in Java are always handled by reference hurts performance more than any other factor, then perhaps languages designed in the future could have a struct type that’s passed by value (as Go does), or permit an object to be stored inline in the stack or containing object or array (as C++ does). But if zero-initialization of arrays and objects doesn’t hurt performance much, then a hypothetical future language could retain this feature.

Whatever it is, we need data to make informed decisions, and make the right tradeoffs, retaining features that don’t hurt performance much but make programming more convenient, while abolishing or providing alternatives to features that hurt performance a lot.

No comments:

Post a Comment