Languages have many abstractions that make programming more productive. These abstractions come with their own costs. These costs generally restrict what you can do with the language. For example, you wouldn’t implement a database in Python.
One way out of this problem is to design the language in which a way that you can bypass a costly abstraction to regain performance.
C++ does this right. Many of its abstractions are consciously designed to not impose an overhead when they are not used.
How do we apply this to Java? If Java were designed today from scratch, how would we design it to have more zero-overhead abstractions?
Objects in Java have a lot of overhead, in both time and space. In space, they have runtime type information and locking state for synchronized statements. In time, they need to be dynamically allocated, eventually garbage collected, and each access requires a dereference. How might we fix this overhead in both time and space?
Add structs. Structs have several properties that make them faster.
To begin with, they don’t need to be allocated on the heap. Local variables can be allocated on the stack. An array of structs contains the structs inline rather than needing to be an array of references. When a struct is a field of an object or of another struct, it can be allocated inline. This eliminates allocation, garbage collection and indirections on every class. In terms of space, a struct doesn’t need a header that heap objects have to have.
Structs have no runtime type information. You can’t have a pointer to a struct, because Java doesn’t have a & operator.
Structs can be passed by value to methods, and returned by value .
Structs support methods, like objects, because they are objects, just restricted, optimised ones.
OpenCL and Instruction Set Extensions
Java should support OpenCL, so that one should be able to write an optimised function that can execute on the GPU in addition to the CPU. Perhaps a function can be marked as optimised, in which case the compiler will check that you’re using a subset of features that work on the GPU.
Similarly, Java should support instruction set extensions like ARM’s NEON or Intel’s SSE. It’s possible to do this in a cross-platform way, by letting a vendor ship a library with functions that map to the extended instructions via compiler intrinsics. The library would also come with a pure Java implementation as a fallback, for other platforms. For example, ARM could ship a library for NEON, with functions like vmul, which multiplies a vector. A Java program can invoke this function. When this program is running on an ARM processor with NEON support, the function call maps directly to the vmul instruction. When it’s running on an x86 system, say, the pure Java implementation is used, so that programs remain cross-platform and efficient, at once.
Grand Central Dispatch
Threads have a lot of overhead: they consume a lot of memory for their stack. Creating too many threads slows down all of them, and can cause you to run out of memory. Threads have race conditions, and to fix them, you need locks, which in turn cause deadlocks and performance degradation.
A thread pool isn’t a complete solution, because the number of threads in the pool is generally fixed, while free system resources keep fluctuating over time.
iOS and OS X have a much better API in the form of Grand Central Dispatch. Java should learn from GCD and have a facility to submit a Runnable for execution when system resources are available. It could be as simple as a method in some class:
static void runLater(Runnable runnable);
You can fire off thousands of Runnables at once knowing that you’re not going to strain the CPU, or run out of memory and crash.
Java should support allocation of non-zeroed arrays. These could be a primitive type, or a struct, as long as there are no references. When such an array is garbage-collected, and subsequently reallocated, information can leak, so non-zeroed arrays should be used only for non-privacy-sensitive information. In particular, references can’t leak this way, so you can’t find out the address of an object in memory.
Java should also support unions of primitive, non-reference types. And bitfields.
And tail recursion, and unsigned ints.
Abstractions are what make programming possible, but they also restrict the situations in which a language can be used. As much as possible, abstractions shouldn’t impose an overhead when they are not used, so that you can bypass them for better performance. Or there should be an escape hatch, like a way to allocate non-zeroed arrays.
Such tweaks will make Java faster and usable in a wider variety of circumstances, while not significantly changing the essence of the language, and without compromising on the unique advantages of Java, like memory safety and hardware-independence.
 The compiler can still optimise it to a pointer to const, in C++ terminology.