Last week Voice Labs released a Voice Report on the current state of the voice computing industry. The main theme of the document is the tremendous proliferation of voice computing devices (mostly Amazon Echos) over the past year or so and the sheer number of skills created to date. However, the media narrative that was picked up wasn’t as positive. Instead, both ReCode and TechCrunch spotlighted a glaring issue: people aren’t sticking with the voice apps they try. With 9000+ skills available to consumers today, are there any that are really worth using? Or are they all just silly disposable toys?
A voice computing proponent’s retort to that critique might resemble Ed Sim’s insightful tweet reply: that “retention on voice apps, sounds like mobile world.” Getting consumers to frequently & consistently integrate any application into their daily lives is a meaningful challenge, regardless of computing platform. I can definitely see the parallels with the mobile world, especially if you fast-forward a few years: perhaps consumers will have tried dozens of skills, just as they tried dozens of mobile apps when that platform emerged, and then settled on just a few that are core to their daily lives.
So while the core challenge of app retention is a concern, I believe that there is something deeper going on here. The retention issue is a symptom, not a primary cause, of the current troubles for voice computing. A couple of weeks ago I got together with Jo Jaquinta, creator of Starlanes, the “first real multi-player complex game for the Echo” which some have dubbed “the coolest thing on Echo.” His perspective hit the nail on the head about the current state of the industry: real developers won’t spend real time developing application skills until it’s worth it for them.
What are the two missing components that would make strong developers want to devote meaningful energy to creating useful and powerful voice skills? First, compensation – it just has to be worth it. Fame and glory only go so far. The current “Alexa Skill Master,” Nick Schwab, one of the top 10 independent Amazon Alexa skill developers in the world, is building apps like Ambient Noise: Thunderstorm Sounds. Sure, it’s highly rated and one of the most utilized skills, but I believe that kind of app barely scratches the surface of the potential of this medium. Voicebot.ai, an online magazine devoted solely to “all things voice web,” reported earlier this month that there will be a new payment feature coming soon to Google Assistant. Privately I’ve heard whispers that similar functionality will be coming to Alexa soon, too. And it makes sense – the mobile app stores provide a clear analogy for the platform companies to offer equivalent functionality in voice, which is a key ingredient of a rich, serious developer ecosystem. It’s just a matter of time before this issue is addressed, but today it’s certainly a gating item. When Nick Schwab is lamenting, “I would love to expand my development to Google Home, but $130 is a bitter pill to swallow to develop on a platform that doesn’t have a clear monetization strategy for developers yet,” that certainly reflects a screaming need for native revenue-generating capabilities. As a result, it’s no surprise that many of the most functionally rich skills created to date have come from larger brands. They’re monetizing their skills through their separate primary business model – whether it’s Capital One or Domino’s – and their voice skills are really showcase augmentations to that strategy.
The second missing piece required for real developers to spend real time developing application skills is distribution. Right now the state of the art is twofold. Part of the game is to cozy up to the developer evangelism teams at Google or Amazon to have them feature the skill in an email newsletter or blog post. This informal process isn’t dissimilar from the way to get your mobile app featured in the iTunes App Store. Also similar to mobile apps, (positive) early customer reviews become self-reinforcing in bringing exposure to the “top apps.” And it’s a fairly recent development for all skills to have their own permanent page on the Amazon.com website. But I don’t believe it’s enough. How is a serious development team supposed to get enough meaningful attention from consumers, across the 20M devices in homes, for the effort to be worthwhile? Or, put another way, how can consumers better discover the skills best for them, particularly via a voice interface (rather than separately on the desktop web)? Moreover, the platforms are missing the key social and invitation functionality that allowed apps in other contexts to spread more broadly & more quickly (think not just of the mobile analogy, but also of the opening up of the Facebook app platform a decade ago). Yes, there is always going to be fierce competition for attention, but right now the playbook for meaningful skill distribution is so vague that it’s dissuading real developers. Again, the large existing brands here are currently “winning,” because they’re riding on top of their own off-platform distribution.
So is retention a problem in voice computing? Yes, but that shouldn’t be the loud narrative. The current real story on voice computing is the massive consumer adoption of these devices and the rapid progress which the platform companies, Amazon & Google, are making to address the underlying issues of distribution and monetization. Once those issues are addressed, significant developer attention will be directed towards third-party skills, empowering individual developers, startups, and existing enterprises to begin taking the voice computing layer seriously.