If you’re an app developer, you know that the real power is with the platform. It controls the overall experience of using a device, what apps can do, what features apps have, what apps are expected to do to be good citizens, how money can and cannot be made, how privacy and security are handled, what direction the platform is going in, and so on. The real power is with the platform, not with the apps.
But the platform is not the real boss. It itself is controlled by something more fundamental: the interaction model. Steve Jobs, in his famous iPhone introduction video, pointed out that a revolutionary user interface made possible a revolutionary product. The bitmapped display and the mouse made possible the Mac. The clickwheel made possible the iPod. And touch made possible the iPhone.
The interaction model affects which choices work well for the user, what the opportunities are (like direct manipulation), and what the constraints are. It affects the context in which the user is using the app. A pocketable touch device may be used standing at a bus stop, in the heat and dust and noise, at which time the user may not have a lot of patience for screens that aren’t clear or for convoluted flows.
The platform is merely a consequence of the interaction model. If you’re not convinced about that claim, the proof is that OSs for a particular interaction model are very similar to each other, and very different from OSs for a different type of interaction model. GUI OSs like OS X and Windows are very similar to each other, practically identical compared to a command-line OS like DOS. Android and iOS are practically identical when compared to a desktop OS like OS X.
In fact, the major milestones in the industry have been milestones of interaction models.
First we had the command-line, with a keyboard and screens that could display only text. The GUI switched to a bitmapped display, and the interface was primarily driven by the mouse, not the keyboard, which was relegated to the role of text entry. Apps presented information graphically. They also presented controls graphically, in the form of menus and toolbars. Rather than remembering esoteric commands like chown, you could look through the menus to see what commands were available and pick the right one. You might then get a dialog box that tells you what options the selected command has, which of them are optional and what the defaults are. Multitasking became common, with windows serving as a way to divide the screen between multiple apps running at once. Icons served as a visual representation of abstract objects like a file or a network connection. You could directly manipulate these proxies to perform operations on the underlying object. For example, you could drag a file icon to the recycle bin to delete it, rather than having to remember and type poorly named commands like rm. And so on.
All these conventions made sense in the context of a bitmapped display and a mouse to directly manipulate objects, and not on a command-line OS. No wonder OS X and Windows are so similar, just as a Maruti and a Hyundai car are more similar to each other than to a plane. The product is merely a consequence of the chosen approach to solve the problem.
The move from the command-line to the GUI was an epochal event in the industry. Everything changed — the hardware, the software, the features expected from a PC and from apps, how easy a PC is to use, which companies wield the power in the industry, and so on. The industry expanded by at least an order of magnitude from the days of DOS to 2007.
The next pivotal shift came in 2007, with the introduction of the iPhone. Screen sizes changed from 15+ inches to pocketable. Hardware keyboards and mice were out, as were windows, menu bars, and the filesystem. All data was stored on and accessed from the cloud. The hardware was much less powerful, whether CPU, GPU, memory, disk or the network. Energy management was crucial. Some things were better than they were on the PC, like Retina displays or having a plethora of sensors. Multitasking gave way to one app at a time. Windows gave way to full-screen. The platform became far less intimidating to novices, and far cheaper, like sub-10K instead of 50K.
The three eras of the computing world so far are command-line, GUI and touch.
After smartphones came tablets, which were initially criticised as being phones with large screens. As people understood what they were good for, they were recognised as a different kind of device. For a while, people thought that it would be a three-device world, consisting of the phone, tablet and PC.
But, over time, laptops became thinner and lighter. The iPad’s 10-hour battery life was a big deal in a world where laptop batteries lasted a few hours. But many laptops have since approached or exceeded the iPad’s 10-hour battery life. In the other direction, smartphones developed bigger screens and became more powerful, and phablets took off. The tablet ended up being attacked from both sides. Actually, by three different types of devices: phones, phablets and laptops.
The old criticism of tablets as being phones with a big screen became true again. Tablets don’t have a different interaction model from the smartphone. Both use touch. Tablets are a new form factor, not a new interaction model. And sticking with the same interaction model provides limited opportunities to do something truly different.
Making a different OS for the same form factor has a small impact (Android vs iOS). A new form factor with the same interaction model has a medium impact (tablets vs smartphones). A new interaction model (laptops vs phones) is pivotal.
Other examples of new form factors with the same interaction model, which have had medium impact, are: phablets, hybrid tablets like the iPad Pro or Surface Pro, detachable laptops like the Surface Book, convertible laptops like the Yoga, and ultralight laptops like the 12-inch MacBook.
A bigger example of a new form factor with the same interaction model is smartwatches — they are still touchscreens, which you tap and swipe on. They are slightly different from smartphones in that they don’t have an onscreen keyboard, but otherwise their interaction model is very similar to that of smartphones. Smartwatches’ failure till now is a consequence of the general point I’m making here — only a new interaction model can have a pivotal effect.
Other examples of new form factors that aren’t new interaction models are car UIs like Android Auto or CarPlay. They are essentially a tablet mounted in a car, not a new interaction model.
If none of these new form factors (phablets, hybrid tablets, ultralight laptops like the 12-inch MacBook, smartwatches and cars) is a new interaction model, what is? What are some upcoming interaction models?
One is the new Apple TV, with support for apps and a touch remote.
You can swipe on this remote, or you can press the microphone button to issue a voice command. Swiping on this remote is different from swiping on a touchscreen because you’re not swiping the actual object you want to manipulate. It’s also different from a laptop trackpad because laptops have a mouse cursor to tell you where you are on the screen. Apple TV doesn’t have a cursor. Instead, it relies on focus, as with laptop keyboards. The focused object is bigger and animates to indicate that it’s focused. Gestures on an Apple TV remote seem genuinely different from gestures on a touch screen or laptop trackpad.
Beyond input, TVs are different from other screens like phones or laptops because you can’t easily read text on a TV screen that’s 10 feet away or further. TVs work well with images and video, not text. TVs are also social, and therefore the focus is not on a particular individual’s data.
You can have new kinds of apps like a weather app that uses geolocation to show the weather for your city. Or a sports app that lets you change cameras, replay, play in slow motion, and get multiple commentaries or analyses of a critical shot. Or a TV app that lets viewers pose questions to a panel discussion, using a webcam, microphone and Internet connection.
If apps on TVs take off, it will be a genuinely new interaction model.
The second example of an upcoming interaction model is Amazon Echo. Until the Echo, nobody had made a device controlled only by voice. All other devices that run apps have screens to display information and UI. And they have keyboards (hardware or onscreen), trackpads or mice, or touch input. The Echo uses voice exclusively, for both input and output. It’s a genuinely new interaction model.
The third example of an upcoming interaction model, and the most promising one, is VR. It’s the first UI that envelops you, rather than being confined to a screen. And you can interact with it by moving your arms around (while holding a controller), rather than merely swiping on a screen or trackpad. A lot of experiences can be re-imagined for VR. Rather than watching a movie on screen, you might be in the movie. If there’s a TV report of snow in Leh, you can be in the scene with snow falling all around you. If you’re watching a panel discussion, wouldn’t you want to sit at the same table as the panel? Even photos would be in VR. If I visit New Zealand and want to capture a slice of it for my memory, I’d want to capture it in VR so that I can be in New Zealand again, rather than merely looking at it on a rectangular screen. Videos will be even more impressive: seeing things move around you will seem even closer to being there. You can set up a camera on a tripod, leave it there for a few minutes, and the resulting video would be interesting, both as a memory for you and as a way of experiencing New Zealand for people who haven’t been there. Maybe two-dimensional photos will be looked at as being a poor medium, like black-and-white photos are today. If VR delivers on its promises, its success is guaranteed.
New interaction models are exciting, and pivotal in the long term. Nothing has as much effect as a new interaction model: not a new OS (Android vs iOS), and not a new form factor with the same interaction model (tablets vs phones). I look forward to VR, Apple TV and Amazon Echo.
Footnote on TVs being social: this holds unless you have a notion of logging in.