A few years back, I wrote about the way we communicate with our technology. It was obvious even then that a big game-changer would be enabling a reliable conversational interaction with technology in order to overcome the friction humans experience when we use our modern tools, be they apps, phones, cars or semi-autonomous coffee makers. Too much typing and swiping and app management crowds our experiences with our connected “things.”
To some degree, this game-changer has come to pass.
Voice interaction is now a big part of the technology interface in everything from smartphones to virtual assistants and smart speakers to connected home and vehicle solutions, and it will remain so going forward. While this is marked progress, it is not really “conversation.”
For the most part, the state of voice interaction is more akin to commanding a four-year-old to do your bidding than having a useful, rich conversation with a friend or assistant. As we continue to minimize friction and advance usability of technology via voice, it is clear that more is needed. I’ll predict right here that the next big game-changer in technology interface is ambient contextuality.
Ambient contextuality hinges on the idea that there is information hidden all around us that helps clarify our intent in any given conversation. Answering the simple questions of who, what, where and when is now easier than ever as IoT continues to mine and mind the data of our lives. I once sketched out a derivative needs pyramid for IoT devices, modeled on Maslow’s hierarchy of needs, to chart a course for “thing-actualization,” whereby our technology could use analytics, learned logic and predictive behavior to establish groups and networks of things and enable other, more “complex” things. The voice interfaces and natural-language processing technology on display in interactive speakers such as Amazon’s Alexa-powered Echo or Apple’s HomePod are examples of this actualization in action: predictive analytics and machine learning imbued into objects and interfaces to technology that collect data and collectively power progressively complex functions, often in real time.
But it is still not conversation. There is a new, nascent communications triangle between people, processes and things that fuels usability, and it still has a bit of its own growing up to do.
Deeper questions like how and why are also key to human conversation. To achieve truly conversational interactions, the answers to one or more of these questions need not only to be captured but also to be learned and retained. Recently, Google has made some good strides here for targeted types of online search. But we have to do much more before something akin to natural conversation emerges.
Most human conversation is abridged. Known quantities may not even be discussed, but they are deeply factored into interaction. A simple example is shifting from nouns and proper names to pronouns. “I asked about Dave’s vacation and Jen said she’d take him to the airport to kick it off right.” This may seem like a small thing, but think about how unnatural a conversation is when you cannot use human “shorthand.” Referring to every subject in every sentence by its proper name quickly becomes as uncomfortable as it is unnatural.
A simple definition of a conversation is an informal exchange of sentiment and ideas, and it’s the way people naturally communicate with each other. Informal conversation is contextual, cohesive and comprehensive. It involves a lot of storytelling. It ebbs and flows, jumps around in time and tense, and references shared experience or knowledge to exchange new experiences and knowledge. It is inference-infused and doesn’t require adherence to strict conventions. But this is pretty much the exact opposite of the way “things” are designed to communicate. Machine communication is specific to whatever technology drives it and is based on code. It is binary, resource-constrained, inflexible, standalone, purely informational and lacks context. It is rigid and formal. It is very much not storytelling.
This elemental difference in communication creates a usability gap, which we have traditionally bridged by forcing people to learn to “speak” machine — download a new app to control every new device, use this set of wake words or language constructs for one device and an entirely different set for another, update, update, update, and if-this-then-that for everything. It’s why so many “things” end up thrown in a drawer after two weeks, never to be used again. This is not the kind of conversation humans want to have.
Putting aside the creepiness factor and important privacy issues surrounding devices that constantly collect information about us, establishing ambient contextuality to enable the kinds of conversations we do want to have is the actual end goal of all this connected stuff. The aim is to smooth our experiences with our technology throughout the day and blur the seams enough to feel natural to us.
The challenge now is to make our machines “speak” human — to imbue them with context and inference and informality so that conversation flows naturally. DARPA has been working on it. So, too, Amazon and Google. In fact, most technology efforts are concerned with reducing interface friction. Improving the quality of our conversation is key to achieving that goal.
Development in IoT, augmented and mixed reality, Assistive Intelligence (my term for AI, but that’s an entirely different conversation) and even the miniaturization and extension on display in mobility and power advancements are all examples of the quest for that quality. Responsibly developed ambient contextuality, and ultimately natural conversation, will be better enabled by these technologies, and our lives will become much more conversational soon. Once we experience reliable and useful conversations with our technological world, I think we will all be hooked.