25 May 2016

How To Evolve Programming Languages

Most progress in programming languages comes from newer languages, as opposed to evolution of existing languages.

Other software tends to evolve a lot. iOS 9 is a hugely different operating system from iOS 1. Apple didn’t have to throw iOS away and write another OS. They were able to evolve iOS to meet today’s needs.

Why isn’t that the case with programming languages? Java or C++ today is not that different a language from what it was five years ago, certainly not as different as iOS today is from what it was five years ago. Languages evolve slowly.

If you ask why this is the case, you’re usually told that once there are millions of lines of code in a language, you can’t break that code. But is that true? Are there some types of changes you can make to a programming language with minimal disruption?

One can classify changes to a language into three categories:

  • Changes that make code that didn’t compile earlier do so now.
  • Changes that make code that compiled earlier no longer do so.
  • Changes to the runtime behavior of code.

Doing away with a compiler error

… is a safe change to make. For example, let’s say we want to enhance Java to implicitly convert arrays to lists, so that you can write:

List<String> cities = ["Bangalore", "San Francisco", "London"];

instead of having to write:

List<String> cities = Arrays.asList("Bangalore", "San Francisco", "London");

Since this makes code that didn’t compile earlier compile now, it’s a safe change to make to the language.

Adding a new compiler error

The second category of change is to introduce a new compiler error, making code that compiled earlier no longer compile.

For example, one mis-feature of Java is that any reference can be null, not just ones declared @Nullable. After all, if a reference is of type String, for example, it means that you can invoke methods like toUpperCase() on it. Since you can’t invoke this method (or other methods declared in the String class) on a null reference, it follows that null logically isn’t of type String any more than a File is of type String.
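To make the argument concrete, here is a small sketch of how Java behaves today. The class and method names are made up for illustration. It shows that the compiler accepts null for a String reference, that Java’s own instanceof check does not consider null a String, and that using the reference as a String fails at run time.

```java
final class NullIsNotAString {
    // instanceof is false for null, whatever the declared type of the reference.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // Reports whether calling a String method on the reference fails at run time.
    static boolean failsOnUse(String s) {
        try {
            s.toUpperCase();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String city = null; // compiles today, with or without an annotation
        System.out.println("instanceof String: " + isString(city));
        System.out.println("fails on use: " + failsOnUse(city));
    }
}
```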

Suppose the Java standards committee wanted to fix this flaw in the next version of Java, Java 9, by allowing a null value only if the reference is declared @Nullable. They should go ahead and make this change. What about old code that no longer compiles?

The solution is to release a tool that automatically adds @Nullable annotations where required. This should come in various flavours — an Eclipse plugin, an IntelliJ plugin, and a standalone command-line tool — so that no matter what IDE you use, you can easily update your code.

Importantly, the output of the tool also compiles on earlier versions of Java, like Java 8, since @Nullable annotations are legal there, too. So running the tool is not a one-way ticket to Java 9: if you later need to compile the code with a Java 8 compiler [1], you can. You lose nothing by running it, and even if you have no intention of upgrading to Java 9, you can still run it for its code quality improvements.
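As a rough sketch of what the tool’s output might look like, consider the code below. Everything in it is an assumption made for illustration: the UserDirectory class, its methods, and the locally declared @Nullable annotation, which stands in for whichever annotation the real tool would use. The point is only that the annotated code is ordinary Java that a Java 8 compiler accepts.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class UserDirectory {
    // Before the tool ran, this was: static String findNickname(String userId)
    // The tool adds @Nullable because one path returns null.
    @Nullable
    static String findNickname(String userId) {
        return "u1".equals(userId) ? "Kay" : null;
    }

    // Not annotated: every path returns a non-null String.
    static String displayName(String userId) {
        @Nullable String nickname = findNickname(userId);
        return nickname != null ? nickname : userId;
    }

    public static void main(String[] args) {
        System.out.println(displayName("u1"));
        System.out.println(displayName("u2"));
    }
}

// Hypothetical stand-in annotation, declared here so the sketch is self-contained.
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.LOCAL_VARIABLE})
@interface Nullable {}
```

Under the proposed rule, a Java 9 compiler would reject returning null from displayName, while findNickname would stay legal because it is annotated.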

You can also run the tool well before migrating to Java 9, rather than having to annotate the code and switch compilers in one atomic step, which is hard. For a big project, you would want to migrate one directory at a time, rather than submitting a huge patch that touches thousands of files, which is risky and can result in nightmarish merge conflicts.

With such an easy upgrade mechanism, a language can introduce a new compiler error with minimal disruption to existing code [2].

Changing runtime behavior

The third category of change is one that alters runtime behavior, without the benefit of compile-time checking.

For example, Objective-C handles null differently from Java. Where Java throws a NullPointerException, Objective-C lets you invoke any method on nil (as null is called in Objective-C), and the call becomes a no-op. It’s as if you’d prefixed each method call with:

if (object != nil)

As a result, Objective-C code can omit a lot of the null checks that Java code needs.
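For comparison, here is roughly what the Java programmer writes by hand today to get the same effect. This is a sketch with made-up names. It mirrors the Objective-C convention that a message to nil yields nil for object results and zero for numeric ones.

```java
final class NilMessaging {
    // Object result: a call on null yields null instead of throwing.
    static String upperOrNull(String s) {
        return s != null ? s.toUpperCase() : null;
    }

    // Numeric result: a call on null yields 0 instead of throwing.
    static int lengthOrZero(String s) {
        return s != null ? s.length() : 0;
    }

    public static void main(String[] args) {
        String missing = null;
        System.out.println(upperOrNull(missing));
        System.out.println(lengthOrZero(missing));
        System.out.println(upperOrNull("london"));
    }
}
```

Under the Objective-C behavior, the guards in these two methods would be implicit, and a plain s.toUpperCase() or s.length() would behave the same way.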

Suppose the Java standards committee wanted to adopt the Objective-C behavior [3] in the next version of Java, Java 9. How would they do this? By relying on the -source flag that javac accepts. This flag tells the compiler to treat the file as if it were written in an earlier version of the language. You would either upgrade each file to work with the new behavior, or modify your build script to invoke javac with -source 8 and keep the old behavior.

In summary, programming language committees shouldn’t be so conservative when it comes to evolving the language. There are ways to make backward-incompatible changes with minimal disruption. That way, programmers get the benefit of the new features, rather than being stuck with what they use, and the language remains productive and relevant longer. Everyone wins.

[1] Or the Java 9 compiler invoked with -source 8, which asks it to treat the file as written in the Java 8 language.

[2] An alternative to upgrading all the source files is to invoke the Java 9 compiler with -source 8, which makes it accept null references whether or not they’re annotated as @Nullable. This would be useful for old code or third-party code that you don’t want to invest time in upgrading, or if you want to upgrade the compiler first and update the code later.

[3] Leaving aside the question of which is better. Alternatively, imagine changing Objective-C to adopt the Java behavior.
