Apple released an advertisement showcasing Siri starring former pro-wrestler turned film star, Dwayne (‘The Rock’) Johnson. Teased yesterday by Johnson on Twitter and Facebook, the video, posted to Apple’s YouTube channel, features Johnson accomplishing a long list of life goals with the help of Siri during a single day. The tongue-in-cheek spot highlights several Siri features such as:
- reading Johnson’s schedule;
- creating a reminder;
- scheduling a Lyft ride;
- getting the weather forecast;
- reading email;
- displaying photos;
- texting someone;
- converting measurements;
- playing a playlist;
- starting a FaceTime call; and
- taking a selfie.
The Siri ad is a clever and entertaining way of explaining the breadth of tasks that can be accomplished with Siri, from the basics like weather forecasts to less well-known features like taking a selfie.
Great overview by Steven Aquino on the Accessibility changes coming with iOS 11. In particular, he’s got the details on Type to Siri, a new option for keyboard interaction with the assistant:
Available on iOS and the Mac, Type to Siri is a feature whereby a user can interact with Siri via an iMessage-like UI. Apple says the interaction is one-way; presently it’s not possible to simultaneously switch between text and voice. There are two caveats, however. The first is, it’s possible to use the system-wide Siri Dictation feature (the mic button on the keyboard) in conjunction with typing. Therefore, instead of typing everything, you can dictate text and send commands thusly. The other caveat pertains to “Hey Siri.” According to a macOS Siri engineer on Twitter, who responded to this tweet I wrote about the feature, it seems Type to Siri is initiated only by a press of the Home button. The verbal “Hey Siri” trigger will cause Siri to await voice input as normal.
Technicalities aside, Type to Siri is a feature many have clamored for, and should prove useful across a variety of situations. In an accessibility context, this feature should be a boon for deaf and hard-of-hearing people, who previously may have felt excluded from using Siri due to its voice-first nature. It levels the playing field by democratizing the technology, opening up Siri to an even wider group of people.
I wish there were a way to switch between voice and keyboard input from the same UI, but retaining the ‘Hey Siri’ voice activation seems like a sensible trade-off. I’m probably going to enable Type to Siri on my iPad, where I’m typing most of the time anyway, and where I could save time with ‘Siri templates’ made with native iOS Text Replacements.
Stephen Nellis, writing for Reuters, shares an interesting look into Apple's method for teaching Siri a new language:
At Apple, the company starts working on a new language by bringing in humans to read passages in a range of accents and dialects, which are then transcribed by hand so the computer has an exact representation of the spoken text to learn from, said Alex Acero, head of the speech team at Apple. Apple also captures a range of sounds in a variety of voices. From there, an acoustic model is built that tries to predict word sequences.
Then Apple deploys “dictation mode,” its speech-to-text translator, in the new language, Acero said. When customers use dictation mode, Apple captures a small percentage of the audio recordings and makes them anonymous. The recordings, complete with background noise and mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.
After enough data has been gathered and a voice actor has been recorded to play Siri in a new language, Siri is released with answers to what Apple estimates will be the most common questions, Acero said. Once released, Siri learns more about what real-world users ask and is updated every two weeks with more tweaks.
The report also shares that one of Siri's next languages will be Shanghainese, a dialect of Wu Chinese spoken in Shanghai and surrounding areas. It will join the 21 languages Siri currently speaks, localized across a total of 36 different countries.
Debating the strengths and weaknesses of Siri has become common practice in recent years, particularly as competing voice assistants from Amazon, Google, and Microsoft have grown more intelligent. But one area where Siri has long held the lead over its competition is in supporting a large variety of languages. It doesn't seem like Apple will be slowing down in that regard.
Alongside beta versions of iOS, macOS, and tvOS, Apple today announced the release of the first beta of watchOS 3.2. The beta has yet to appear on Apple's developer portal, but it should be available soon. Besides the standard bug fixes and performance improvements, this update includes a couple new features, one of which is called Theater Mode. From Apple's developer release notes:
Theater Mode lets users quickly mute the sound on their Apple Watch and avoid waking the screen on wrist raise. Users still receive notifications (including haptics) while in Theater Mode, which they can view by tapping the screen or pressing the Digital Crown.
This sounds like an interesting new option that could be useful in scenarios besides being at the movie theater. Personally, I'm likely to use Theater Mode when I wear my Apple Watch overnight for sleep tracking. My normal practice is to turn off Raise to Wake in the Settings app before going to bed, but this could prove an easier method.
Besides Theater Mode, the most significant update in 3.2 is enhancements to Siri. Last year iOS 10 improved Siri by enabling it to handle queries from third-party apps that fit into specific categories:
- Audio and video calling
- Messaging
- Payments
- Ride booking
- Searching photos
- Workouts
Though all of those areas could be handled by Siri on iOS 10, Siri on Apple Watch was previously only able to direct you to your iPhone to perform those actions. But with watchOS 3.2, that is no longer the case, as Siri on the Watch is now able to perform these third-party requests.
watchOS 3.2 will likely see a public release this spring, once a couple months of beta testing are complete.
Some interesting thoughts about the AirPods by Steven Aquino. In particular, he highlights a weak aspect of Siri that isn't usually mentioned in traditional reviews:
The gist of my concern is Siri doesn't handle speech impediments very gracefully. (I've found the same is true of Amazon's Alexa, as I recently bought an Echo Dot to try out.) I’m a stutterer, which causes a lot of repetitive sounds and long breaks between words. This seems to confuse the hell out of these voice-driven interfaces. The crux of the problem lies in the fact that if I don’t enunciate perfectly, which leaves several seconds between words, the AI cuts me off and runs with it. Oftentimes, the feedback is weird or I’ll get a “Sorry, I didn’t get that” reply. It’s an exercise in futility, sadly.
Siri on the AirPods suffers from the same issues I encounter on my other devices. It’s too frustrating to try to fumble my way through if she keeps asking me to repeat myself. It’s for this reason that I don’t use Siri at all with AirPods, having changed the setting to enable Play/Pause on double-tap instead (more on this later). It sucks to not use Siri this way—again, the future implications are glaringly obvious—but it’s just not strong enough at reliably parsing my speech. Therefore, AirPods lose some luster because one of its main selling points is effectively inaccessible for a person like me.
That's a hard problem to solve in a conversational assistant, and exactly the kind of Accessibility area where Apple could lead over other companies.
Steven Levy, writing for Backchannel, interviewed Apple's Phil Schiller for the tenth anniversary of the iPhone's introduction:
“If it weren’t for iPod, I don’t know that there would ever be iPhone,” he says. “It introduced Apple to customers that were not typical Apple customers, so iPod went from being an accessory to Mac to becoming its own cultural momentum. During that time, Apple changed. Our marketing changed. We had silhouette ads with dancers and an iconic product with white headphones. We asked, ‘Well, if Apple can do this one thing different than all of its previous products, what else can Apple do?’”
In the story, Schiller also makes an interesting point about Siri and conversational interfaces after being asked about Alexa and competing voice assistants:
“That’s really important,” Schiller says, “and I’m so glad the team years ago set out to create Siri — I think we do more with that conversational interface than anyone else. Personally, I still think the best intelligent assistant is the one that’s with you all the time. Having my iPhone with me as the thing I speak to is better than something stuck in my kitchen or on a wall somewhere.”
“People are forgetting the value and importance of the display,” he says. “Some of the greatest innovations on iPhone over the last ten years have been in display. Displays are not going to go away. We still like to take pictures and we need to look at them, and a disembodied voice is not going to show me what the picture is.”
Ben Bajarin makes a strong point on using Siri with the AirPods:
There is, however, an important distinction to be made where I believe the Amazon Echo shows us a bit more of the voice-only interface and where I’d like to see Apple take Siri when it is embedded in devices without a screen, like the AirPods. You very quickly realize, the more you use Siri with the AirPods, how much the experience today assumes you have a screen in front of you. For example, if I use the AirPods to activate Siri and say, “What’s the latest news?” Siri will fetch the news then say, “Here is some news — take a look.” The experience assumes I want to use my screen (or it at least assumes I have a screen near me to look at) to read the news. Whereas, the Amazon Echo and Google Home just start reading the latest news headlines and tidbits. Similarly, when I activate Siri on the AirPods and say, “Play Christmas music”, the query processes and then plays. Where with the Echo, the same request yields Alexa to say, “OK, playing Christmas music from top 50 Christmas songs.” When you aren’t looking at a screen, the feedback is important. If I was to ask that same request while I was looking at my iPhone, you realize, as Siri processes the request, it says, “OK” on the screen but not in my ear. In voice-only interfaces, we need and want feedback that the request is happening or has been acknowledged.
Siri already adapts to the way it's activated – it talks more when invoked via "Hey Siri" as it assumes you're not looking at the screen, and it uses UI elements when triggered from the Home button.
Currently, activating Siri from AirPods yields the same feedback as the "Hey Siri" method. I wonder if a future version of Siri will talk even more when it detects AirPods in your ears, as that means only you will be able to hear its responses.
This is a good video by Marques Brownlee on where things stand today between Siri (iOS 10) and the Google Assistant (running Android Nougat on a Google Pixel XL). Three takeaways: Google Assistant is more chatty than old Google Voice Search; Google still seems to have an edge over Siri when it comes to follow-up questions based on topic inference (which Siri also does, but not as well); and, Siri holds up well in most types of questions asked by Brownlee.
In my daily experience, however, Siri still falls short on basic tasks too often (two examples) and handles questions inconsistently. There is also, I believe, a perception problem with Siri in that Apple fixes obvious Siri shortcomings too slowly or simply isn't prepared for new types of questions – such as asking how the last presidential debate went. In addition, being able to text with Google Assistant in Allo for iOS has reinforced a longstanding wish of mine – the ability to converse silently with a digital assistant. I hope Siri gets some kind of textual mode or iMessage integration in iOS 11.
One note on Brownlee's video: the reason Siri isn't as conversational as Google Assistant is due to the way Brownlee activates Siri. When invoked with the Home button (or by tapping the microphone icon), Siri assumes the user is looking at the screen and provides fewer audio cues, prioritizing visual feedback instead. If Brownlee had opened Siri using "Hey Siri" hands-free activation, Siri would have likely been just as conversational as Google. I prefer Apple's approach here – if I'm holding a phone, it means I can look at the UI, and there's no need to speak detailed results aloud.