“Will it be sunny in Miami this weekend?”
“Bring your sunglasses, it’s going to be nice in Miami.”
No, this isn’t a conversation between two friends. It’s an exchange between a user and a smartphone: the iPhone 4S, equipped with Apple’s Siri, which can understand and respond intelligently to what you say without preset commands.
“Siri is a massive new source of semantic data,” says analyst John Jackson of CCS Insight. “It’s something between a useful tool and a novelty. But Apple has succeeded in capturing human interest.”
As frustrating as Siri’s arrival is for competitors that occupied the space before it came out, it also gives them an opportunity to catch up, or even to reinvent the space.
Until now, users have mostly interacted with mobile interfaces by typing in search terms. Natural language recognition changes that. “You suddenly have the possibility of saying what you want, and the system, if properly designed, can go off and grab what you want,” says Vlad Sejnoha, CTO of Nuance Communications.
Nuance has also seen growing interest from developers who want to build speech-recognition user interfaces. From late summer to October, the number of developers signing up to integrate speech into their Android and iOS apps grew from 2,000 to more than 5,000.
Nuance recently acquired Vlingo, a voice-to-text and voice recognition technology firm, a sign that this space will continue to heat up in 2012. The company also acquired Swype and has since released a touch keyboard with Dragon voice dictation technology built in.
Traditionally, voice control on phones has been limited in appeal and functionality, but Siri is different. To activate Siri on the iPhone 4S, you can either press and hold the home button and then start speaking, or simply raise the phone to your ear as if you were taking a call.
The assistant also understands context. For instance, you could say, “What steakhouses are near me?” and Siri will ping Yelp for recommendations. Then you could say, “How about Mexican?” and it will know you are still asking about nearby restaurants. In our case, Siri said, “I found 12 Mexican restaurants, six of them are fairly close to you.” If you tap a result, you’ll be taken to the Maps app.
On Android, Google Voice Actions lets users send texts, play a song, and navigate to an address. The new Android 4.0 software, Ice Cream Sandwich, will bring more features, including voice navigation of the user interface and voice typing.
Meanwhile, Nuance is working with IBM to apply the breakthrough Watson engine to answer questions. For example, instead of flipping through your car’s manual to find out where to put the windshield-washer fluid, you could ask your car and it would give you a specific spoken answer along with a photo or video.
To understand the scope of voice recognition, Sejnoha says, we should expect it to show up in household appliances. Many industry watchers believe that Apple’s iTV, rumored to be the company’s first television, will integrate Siri-like intelligence. Perhaps that’s why Steve Jobs told biographer Walter Isaacson that the set will boast “the simplest user interface you could imagine.”
With Siri, Apple has gone beyond the scope of traditional voice-recognition technology. Users can now carry on a conversation with their smartphone, and Apple has restarted the race to reinvent voice recognition. Sejnoha says we should expect the technology to advance by leaps and bounds in the coming year.