26 Mar 2016

How Programming Languages Handle Null Differently

It’s interesting to see how differently programming languages handle null.

The worst case is C and C++, where dereferencing null is undefined behavior and in practice usually just crashes your program.

Java does it better by throwing a NullPointerException, which gives you a chance to recover or clean up (say, by closing files) and prevents malicious input from bringing down a server.
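For example, here’s a minimal Java sketch of what recovering and cleaning up can look like (the handler and its log file are hypothetical):

import java.io.FileWriter;
import java.io.IOException;

class RequestHandler {
    // A null payload from malicious input surfaces as a NullPointerException
    // that we can catch instead of letting it take down the server.
    static void handle(String payload, FileWriter log) throws IOException {
        try {
            log.write("length=" + payload.length() + "\n"); // throws if payload is null
        } catch (NullPointerException e) {
            log.write("bad request\n"); // recover: report the error and keep serving
        } finally {
            log.close(); // clean up: the log file is closed either way
        }
    }
}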

Objective-C does it differently: you can send any message to a nil reference and it’s a no-op [1].

Best are languages like Haskell that don’t assume null is a valid value for a reference. After all, when you declare that a reference is of a certain type, you’re saying that it points to an object that implements a given set of methods. But null doesn’t implement any methods in Java, so it’s wrong to say:

File f = null;

just as it’s wrong to say:

File f = "Hello";

In both cases, the right-hand side doesn’t implement the methods declared by the File type, so it shouldn’t be assignable to File.

The cleanest solution is a separate type named Optional<T>. If you want a nullable [2] reference to T, you declare its type as Optional<T>; otherwise you declare it as a plain T. This lets the compiler ensure that you’re not passing null to a function that doesn’t expect it, preventing a class of bugs that comes up in Java, Objective-C, and most other languages. It can also be faster, since the compiler can eliminate runtime null checks.
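Java’s java.util.Optional is only an approximation of this (it’s an ordinary class, and the compiler won’t stop you from passing null where a plain reference is expected), but it gives a feel for how the types read; a minimal sketch:

import java.util.Optional;

class Greeting {
    // The type says the name may be absent; in a language with real optional
    // types the compiler would force callers to handle that case.
    static String greet(Optional<String> name) {
        return "Hello, " + name.orElse("stranger");
    }

    // A plain String means callers are expected to always pass a real value.
    static int length(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(greet(Optional.of("Ada")));  // Hello, Ada
        System.out.println(greet(Optional.empty()));    // Hello, stranger
        System.out.println(length("Ada"));              // 3
    }
}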

[1] The return value is the zero of the appropriate type: 0 for a number, nil for a pointer, NO for a boolean, and so on.

[2] Optional is a type, a way of telling the compiler that the reference can be null; it’s not an actual class. It doesn’t require a separate memory allocation, for example, and you can’t end up with two distinct Optional objects wrapping the same underlying object, a situation in which comparing two Optional<T> references with == could return false even though the underlying objects are the same.
