26 May 2016

All Languages Should Have Type Inference

All programming languages, static or dynamic, should use type inference.

But for different reasons. Statically typed languages should infer types to make code less rigid, verbose and bureaucratic, and more reusable. Dynamically typed languages should infer types to detect latent bugs that could have been caught when writing the code.

Let's look at this in detail.

Statically Typed Languages


... require a lot of verbiage: you have to tell the compiler what it already knows. For example, if you write:

File f = "foo.txt";

the compiler will tell you that it's a String, not a File. In that case, why can't the compiler infer the type itself? Why should you have to do something silly like

Foo foo = new Foo();

You should be able to write

var foo = new Foo();
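As a rough illustration, here is how this looks in Java 10 or later, which added var for local variables. The class and method names (VarInferenceSketch, countWords) are made up for this sketch. The inferred types are still checked at compile time; they just aren't written out.

```java
import java.util.HashMap;

class VarInferenceSketch {
    // Hypothetical helper: counts how often each space-separated word occurs.
    static HashMap<String, Integer> countWords(String text) {
        var counts = new HashMap<String, Integer>(); // inferred as HashMap<String, Integer>
        for (var word : text.split(" ")) {           // inferred as String
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        var counts = countWords("a b a");            // inferred from the return type
        System.out.println(counts);
    }
}
```

Note that var only covers local variables here. Method parameters and return types still have to be declared.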

Declaring types is just noise. It makes the code less readable, for no benefit. It also prevents reuse, as with this method:

<T> void printAll(ArrayList<T> list) {
  for (T element: list) System.out.println(element);
}

This could work with all Lists, not just ArrayList, but because the type has been declared explicitly to be an ArrayList, it doesn't. You have to modify the method, but if you can't, say if it's part of a library, you have to reimplement it. The type declaration has prevented reuse.

If you're thinking that this is a beginner mistake that you would never make, and that you would instead declare the argument to be a List, you still can't use your function to print a Set. The function doesn't require its argument to be a List. For example, it doesn't index into the collection with an integer. It doesn't depend on the ability of a List to have duplicates. Since the function doesn't really require a List, logically, you should be able to reuse the function to print a Set. But you can't. The solution would have been to declare the argument as an Iterable. That's not obvious, is it?
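A minimal sketch of the Iterable version, to show that the same method then accepts both a List and a Set. The class name PrintAllSketch and the sample values are only for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PrintAllSketch {
    // Requires only what the body uses: the ability to iterate.
    static <T> void printAll(Iterable<T> items) {
        for (T element : items) {
            System.out.println(element);
        }
    }

    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "a");  // duplicates allowed
        Set<String> set = new TreeSet<>(list);       // sorted, no duplicates
        printAll(list);
        printAll(set);
    }
}
```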

The point is that having to explicitly declare types sometimes prevents reuse.

You shouldn't have to painstakingly construct elaborate class hierarchies, and risk getting them wrong, just to declare types. Instead of worrying about what type a method argument is, let the compiler verify that it implements the methods that are invoked on it.

For example, if you have a method:

void startJourney(vehicle) {
  vehicle.turnEngineOn();
}

the compiler should verify that the argument passed in to this method has a method named turnEngineOn(). It doesn't matter what class it is, or what interfaces it implements, as long as it has this method. That's all that matters to avoid a runtime error. Classes and interfaces are an artifact, not the point of type safety. Unfortunately, many languages confuse the means with the end, and aren't open to other, better ways of achieving the same goals.
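Java doesn't check methods structurally like this, so the following is only an approximation of the idea, not the feature itself. The caller passes a method reference, and the compiler rejects something like x::turnEngineOn if x has no such method. Car, Boat and StartJourneySketch are hypothetical names, and the two classes deliberately share no superclass or interface.

```java
class StartJourneySketch {
    // Two unrelated classes that happen to have the same method.
    static class Car {
        boolean engineOn = false;
        void turnEngineOn() { engineOn = true; }
    }

    static class Boat {
        boolean engineOn = false;
        void turnEngineOn() { engineOn = true; }
    }

    // Accepts the one capability it needs, expressed as a Runnable.
    static void startJourney(Runnable turnEngineOn) {
        turnEngineOn.run();
    }

    public static void main(String[] args) {
        Car car = new Car();
        Boat boat = new Boat();
        startJourney(car::turnEngineOn);   // checked at compile time
        startJourney(boat::turnEngineOn);  // no common interface needed
        System.out.println(car.engineOn && boat.engineOn);
    }
}
```

The cost of this workaround is that the caller has to name the method at every call site, which is exactly the kind of bookkeeping an inferencer could do on its own.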

Dynamically Typed Languages


Dynamically typed languages have the opposite problem. They are very productive and flexible to work in, but since there's no compile-time type checking, type errors can lie hidden in the code. In code that's executed every time, they will manifest quickly and can be fixed. But in code that executes conditionally, like edge cases or error recovery code, type errors can lurk undetected.

In college, I built an assembler in Python, and I found that even after I thought it was done and it worked correctly when I invoked it, it would generate type errors later when invoked with a different input. I got the feeling that the code was fragile and could fail at any time. I didn't have a lot of confidence in it.

Type inference should detect such latent errors: code that is guaranteed to fail if executed. For example:

try {
  ...
} catch (err) {
  errorElement.innerHTML = err.message.toUppercase();
}

Did you see the bug? The method is named toUpperCase(), with a capital C, not toUppercase(). This code works fine as long as nothing goes wrong, but when an exception occurs, the handler itself throws a TypeError, because err.message.toUppercase is undefined. Your error handling code breaks exactly when you need it to work.

IDEs should use type inference to detect such errors. Not all errors can be detected, especially in dynamically typed languages, where types are not declared and the same variable may have different types at different times. But if the inferencer detects any line of code that's guaranteed to fail if executed, the IDE should inform the programmer.

In summary, both statically and dynamically typed languages would benefit from type inference: the former to make code less verbose and bureaucratic and more reusable, and the latter to detect latent bugs. There's no reason for a language not to have type inference.
