Using our voices to communicate with each other is one of the oldest and most powerful forms of connection. While using voice to communicate with machines is not new, it's only in recent years that it has become commonplace to control a machine by talking to it. We now regularly use voice to perform tasks such as searching via OK Google, controlling apps via Alexa or sending texts via Siri.
While voice control will not take over every interaction we have with machines, its popularity seems set to continue as companies scramble to capitalise on this new(ish) medium. And as voice user interfaces (VUIs) replace some of our more typical screen-based interfaces, it raises some interesting considerations for us UX’ers when designing these new experiences.
Some considerations when designing VUIs
1. The complexity of human language
Designing conversations between humans and machines is difficult. Conversations can be messy, fluid and often illogical — variables that machines generally don’t like. Although machine learning, AI and chatbots have come a long way in making conversations less stilted and more relevant, it seems there is still a way to go before machines can truly understand meaning in language.
2. Communication is emotional
We humans use emotion to communicate. The tone or type of phrase we use in a conversation can say a lot about how we feel. Even the words we don’t say can convey meaning, and as Drucker aptly put it, “The most important thing in communication is hearing what isn’t said”. Of course, it’s difficult for machines to pick up on conversational tone or correctly interpret emotions. However, some assistants, such as Siri, do seem to be making some headway in recognising and responding to emotions in a more human way.
3. The limitations of using voice to connect with technology
- There are no visible controls
- There is no affordance in voice applications, i.e. no clues as to how to engage with the object or system
- Users generally have no conceptual maps, that is, ideas of how a system could work
- Human memory has a limited capacity to remember conversations and commands, as there are no visual reminders on screen
- And while we might assume that a machine we talk to can do almost anything, voice commands can be deceptively limited in scope
Voice is an easy, natural, alternative way to engage with machines, and its popularity and usefulness in technology look set to rise. However, when designing experiences with VUIs we should not simply try to replicate human-to-human conversation, as this is probably impossible and would result in a jarring experience.
What seems more likely is that a new type of hybrid language will emerge. Machines will get better at talking to us, and we will learn how to better communicate with them. Just as we learned to speak the language of the screen and became ‘fluent’ in performing tasks such as double-clicking and pressing enter, we will need to learn a new language and turns of phrase that allow us to communicate with machines in a way they can understand and effectively execute our commands.