For years, sound design and music have been used to great effect in films (Hans Zimmer’s use of the Shepard tone), advertising (VW spot featuring Nick Drake’s “Pink Moon”), and games (Koji Kondo’s sound design for Nintendo). But when it comes to designing and developing digital products, decisions about sound rarely receive the strategic rigor routinely applied to visual design, written copy, or technology. Sound elements are often seen as critical enhancements for “lean back” entertainment experiences like games and video content but not central to the utility-focused, user-centered design we so often seek in our digital products. Sound is treated as an add-on, not as something critical to a user’s primary experience.
It wasn’t always this way. The internet used to be a loud and Flashy (pun intended) place. Sound was explored freely, along with visual elements, as we collectively figured out what to do with the internet. Remember picking your MySpace song to accompany your personal background? This open and exploratory approach eventually fell out of fashion in favor of efficient and logical flows, accessibility, mobile optimization, fast performance, and clear information hierarchy. A better-informed, savvier world of pattern libraries, responsive CSS frameworks, and Dribbble emerged, and the levers that controlled “design delight” became almost entirely visual (I’m including written copy as visual because it’s not spoken). The industry began focusing on “great user experience”–a good thing–and delight and polish became the strict purview of visual design considerations like color palette, button style, animations, microinteractions, and engaging copy. Sound in an interface began to be viewed as gaudy excess to trim, bad practice, and about as fashionable as Hammer pants.
The industry responded to serve this context. Designers and developers worked and trained in this context. Companies and tech firms staffed for this context. But nothing stays the same for long, and now that context is changing. With the rise of Zero UI and conversational interfaces, products are leaving the confines of a visual screen. For many new applications, sound is no longer the intrusive, tacky Cousin Eddy of UI; it is the UI. Sound is beginning to occupy an important seat at the table as a primary interaction medium. But, as an industry, are we ready for it?
As sound becomes more critical to product interactions, there are some important considerations for how we utilize and execute design as we enter the post-screen era:
- Breaking the Screen Barrier
While sound has generally taken a secondary role to visuals in interaction design, there has been some progress. The annoying navigation hover clicks of the Flash era have been replaced by “warm and subtle” sound design elements supporting key visual interactions. Apps like Facebook Messenger use sound judiciously to enhance context and reinforce the primary visual interaction. In this case, things would, and should, work just fine without the audio. Sound merely enhances the experience. But now, as we move beyond visual screens, we’re asking a lot more of sound as a primary rather than a secondary medium for interaction.
We can see this in interactions that are not purely visual, where the screen is only one component of a larger design system. In these cases, a screen might be only a bit player in an overall interaction that also involves physical hardware or real humans. Sound, in this case, can act as “glue” for the system, providing feedback, attention cues, and emotional closure for a task, particularly in high-distraction areas like a busy store or a public kiosk.
The development of in-store credit card payment systems is a good example of using audio as a feedback device to communicate and draw attention. Remember a few years back when stores started using chip readers for payment instead of the card swipe? The design problem was that the chip workflow required users to leave their valuable credit card inserted in the machine while it processed, increasing the risk of users leaving it behind. The initial solution was to cue users with an earcon (a brief, distinctive sound that signals an event)–in this case, a loud buzz that, while quite audible in a crowded store, sounded indistinguishable from an error message and was not what a user would expect to hear after a successful transaction.
The earcon was being used to direct a user’s attention back to their card after the chip reading process completed, but this particular earcon was completely inappropriate for what the design intended to communicate. Granted, the designers were most likely dealing with some serious considerations: potential hardware constraints, audibility for a wide user group including those with hearing issues, and deadlines for getting this technology working effectively in the stores. Thankfully, audio design for chip readers has improved since those early stages, and I have noticed more appropriate earcons.
Another example is the Google Assistant confirmation beep. This earcon provides feedback across multiple devices for the Google Home design system, which is not centered on a single screen. The percolating, distinctive beep conveys a sense of responsiveness when a user speaks the phrase “OK Google” to initiate an interaction. Using an earcon instead of words saves the user time and cognitive strain, and staying in the audio realm for voice interaction seems more natural and frees the user from processing visual feedback on a screen.
- Voice and Sound in a Hands-Free Environment
Conversational UIs are becoming more skilled at understanding human speech and intent, and products based on AI chatbots with voice interfaces, like Amazon’s Alexa-powered Echo and Google’s Home, are becoming more popular–Gartner predicts that 75% of homes will have smart speakers by 2020. While it may be fun for many to engage their Amazon Echos in conversation, many of the activities currently available via voice interface do not necessarily require the voice part to get the job done. I doubt that most people would be hamstrung if they couldn’t get their Echos to provide the day’s weather report or tell them a joke. But a conversational interface can be a particularly effective solution in scenarios where users are not able to see or touch a visual screen.
Take, for example, a sunny, dusty orange grove where crop inspectors need to observe and report the presence of pests and the growth progress of the orange crop. A visual screen would require touch, and bright sunlight would create glare. A tool like AgVoice, by contrast, allows for hands-free crop reporting with its voice interface, prompting and recording data from inspectors in the field, freeing them from paperwork, and creating a mobile and efficient workflow. Add to this the benefits of time stamping, geolocation, and cloud-based data aggregation and you have a significantly more efficient experience.
Even consumer products can be adapted for specific scenarios, like an Alexa skill called Helix being used in scientific laboratory settings to provide smart assistance for calculations or next steps in an experimental procedure while scientists have their gloved hands full.
- Sound As a Brand Differentiator
When it comes to product design, the visual design space is well-established and crowded. Brands devote oodles of money, time, and thought to how they express themselves visually. There are style and brand guidelines that entire boardrooms have to approve, and designers can spot the trends and conventions that drive strategic implementation of visual branding, whether it’s flat design, icon styles, or specific typographic treatments.
Sound, in comparison, is under-explored. The branding opportunities for sound interfaces are in many ways a blank canvas. Sure, there are jingles and celebrity voice-overs, but the strategic use of sound has not come close to receiving the scrutiny regularly given to the visual space. But as sound becomes a primary form of interaction, a brand’s audio presence is becoming a more important consideration, worthy of a closer look.
The rise of sound as a primary interface provides a real opportunity for brands to engage and express themselves in ways they’ve never done before, and the potential impact from building an emotional connection with a brand through sound has never been greater. Thoughtful sound design can establish and leverage emotional “triggers” to enhance or change behavior, such as Apple Bedtime’s gentle, elegant take on the alarm clock to help form better sleep habits. For conversational interfaces, the human-like intimacy inherent to voice communication can open doors to a different kind of relationship with customers.
Opportunities continue to unfold as sound becomes more central to digital product development in the emerging Zero UI era. These are exciting times, as it’s not often that we get to explore our relationship with technology in a whole new way.
Interested in exploring sound for your product? Give us a ring.