In a press release today, Apple announced that it is part of a new working group with Google, Amazon, and the Zigbee Alliance called Project Connected Home over IP. According to Apple’s press release:
The goal of the Connected Home over IP project is to simplify development for manufacturers and increase compatibility for consumers. The project is built around a shared belief that smart home devices should be secure, reliable, and seamless to use. By building upon Internet Protocol (IP), the project aims to enable communication across smart home devices, mobile apps, and cloud services and to define a specific set of IP-based networking technologies for device certification.
Apple says smart home device makers IKEA, Legrand, NXP Semiconductors, Resideo, Samsung SmartThings, Schneider Electric, Signify (formerly Philips Lighting), Silicon Labs, Somfy, and Wulian will also contribute to the project. The group wants to make it easier for manufacturers of smart home devices to integrate with Amazon’s Alexa, Apple’s Siri, and Google’s Assistant and will take an open source approach to the development of the joint protocol.
This is fantastic news. To date, many smart home devices have adopted support for some, but not all, smart assistants. It has also become all too common for companies to announce that support for certain assistants is coming to their products, only to delay or abandon it altogether. With a unified approach across the three major companies behind smart assistants, support will hopefully be more consistent in the future.
Bloomberg reports that Apple will open up Siri to third-party messaging apps with a software update later this year. Third-party phone apps will be added later. According to Bloomberg’s Mark Gurman:
When the software refresh kicks in, Siri will default to the apps that people use frequently to communicate with their contacts. For example, if an iPhone user always messages another person via WhatsApp, Siri will automatically launch WhatsApp, rather than iMessage. It will decide which service to use based on interactions with specific contacts. Developers will need to enable the new Siri functionality in their apps. This will be expanded later to phone apps for calls as well.
As Gurman notes, the change in approach comes as Apple faces scrutiny, in the US and elsewhere, over the competitive implications of its dual role as app maker and App Store gatekeeper.
It’s interesting that the update is a Siri-only change. Users will still not be able to replace Messages with WhatsApp or Phone with Skype as their default messaging and phone apps, for instance, but it strikes me as a step in the right direction and a change that I hope leads to broader customization options on iOS and iPadOS.
Another year, another batch of Siri improvements aimed at enhancing what’s already there, but not radically transforming it. Siri in iOS 13 comes with a handful of changes, all of which are in line with the types of iteration we’re used to seeing for Apple’s intelligent assistant. Siri now offers suggested actions in more places and ways than before, and its voice continues to sound more human. Perhaps this year’s biggest change is a new SiriKit domain for media, which should enable – after the necessary work by third-party developers – audio apps like Spotify, Overcast, and Audible to be controlled by voice the way Apple’s native Music, Podcasts, and Books apps can be.
Earlier this month, Apple suspended its Siri grading program, in which third-party contractors listened to small snippets of audio to evaluate Siri’s effectiveness. Today in a press release, Apple explained its Siri grading program and changes the company is making:
We know that customers have been concerned by recent reports of people listening to audio Siri recordings as part of our Siri quality evaluation process — which we call grading. We heard their concerns, immediately suspended human grading of Siri requests and began a thorough review of our practices and policies. We’ve decided to make some changes to Siri as a result.
Apologizing for not living up to the privacy standards customers expect from it, Apple outlined three changes that will be implemented this fall when operating system updates are released:
First, by default, we will no longer retain audio recordings of Siri interactions. We will continue to use computer-generated transcripts to help Siri improve.
Second, users will be able to opt in to help Siri improve by learning from the audio samples of their requests. We hope that many people will choose to help Siri get better, knowing that Apple respects their data and has strong privacy controls in place. Those who choose to participate will be able to opt out at any time.
Third, when customers opt in, only Apple employees will be allowed to listen to audio samples of the Siri interactions. Our team will work to delete any recording which is determined to be an inadvertent trigger of Siri.
This is a sensible plan. It’s clear, concise, and has the benefit of being verifiable once implemented. It’s unfortunate that Siri recordings were being handled this way in the first place, but I appreciate the plain-English response and unambiguous plan for the future.
Last week, The Guardian reported on Apple’s Siri grading program in which contractors listen to snippets of audio to evaluate the effectiveness of Siri’s response to its trigger phrase. That article quoted extensively from an anonymous contractor who said they and other contractors regularly heard private user information as part of the program.
In response, Apple has announced that it is suspending the Siri grading program worldwide. While suspended, Apple says it will re-evaluate the program and issue a software update that will let users choose whether to allow their audio to be used as part of the program.
In a statement to Matthew Panzarino, the editor-in-chief of TechCrunch, Apple said:
“We are committed to delivering a great Siri experience while protecting user privacy,” Apple said in a statement to TechCrunch. “While we conduct a thorough review, we are suspending Siri grading globally. Additionally, as part of a future software update, users will have the ability to choose to participate in grading.”
In an earlier response to The Guardian, Apple had said that less than 1% of daily Siri requests are sent to humans as part of the grading program. However, that’s not very comforting to users who are left wondering whether snippets of their daily life are part of the audio shared with contractors. Consequently, I’m glad to see that Apple is re-examining its Siri quality-control efforts and has promised to give users a choice of whether they participate.
What should a wrist computer ideally do for you?
Telling the time is a given, and activity tracking has become another default inclusion for that category of gadget. But we’re talking about a computer here, not a simple watch with a built-in pedometer. The device should present the information you need, exactly when you need it. This would include notifications to be sure, but also basic data like the weather forecast and current date. It should integrate with the various cloud services you depend on to keep your life and work running – calendars, task managers, and the like. It doesn’t have to be all business though – throwing in a little surprise and delight would be nice too, because we can all use some added sparks of joy throughout our days.
Each of these different data sources streaming through such a device presents a dilemma: how do you fit so much data on such a tiny screen? By necessity a wrist computer’s display is small, limiting how much information it can offer at once. This challenge makes it extremely important for the device to offer data that’s contextual – fit for the occasion – and dynamic – constantly changing.
Serving a constant flow of relevant data is great, but a computer that’s tied to your wrist, always close at hand, could do even more. It could serve as a control center of sorts, providing a quick and easy way to perform common actions – setting a timer or alarm, toggling smart home devices on and off, adjusting audio playback, and so on. Each of these controls must be presented at just the right time, custom-tailored for your normal daily needs.
If all of this sounds familiar, it’s because this product already exists: the Apple Watch. However, most of the functionality I described doesn’t apply to the average Watch owner’s experience, because most people use a watch face that doesn’t offer these capabilities – at least not many of them. The Watch experience closest to that of the ideal wrist computer I’ve envisioned is only possible with a single watch face: the Siri face.
Before I get any further, let me tell you that some of what I’m going to say here was already covered by David Sparks in this post from almost six years ago. This was just a year and a half after the “beta” introduction of Siri with the iPhone 4S, and David was pleased with what Siri could do. I like a lot of what Siri can do with dates, too, but there are still some frustrating blind spots and inconsistencies. In fact, with one of David’s examples, Siri isn’t as convenient as it was six years ago.
Context has always been one of Siri’s weaknesses, and that’s where it failed Casey. Any normal human being would understand immediately that a question asked in January about days since a day in December is talking about the December of the previous year. But Siri ignores (or doesn’t understand) the word “since” and calculates the days until the next December 18.
Solid collection of examples of date calculations with Siri by Dr. Drang. As he notes, it’s not that Siri can’t answer complex questions involving dates – it’s that you often have to phrase your questions with an exact syntax that a computer program can understand. This is frustrating because Apple promotes Siri as a smart assistant that can infer context without a refined syntax. I still run into a similar problem with time zone conversions; of course, the old trick I used to rely on no longer works for me unless I preface the question with “Ask Wolfram”.
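The ambiguity at the heart of Dr. Drang’s example is easy to state precisely: “days since December 18,” asked in January, should resolve to the most recent past December 18, not the next one. As a minimal sketch of the inference a human listener makes (this is an illustration in Python, not how Siri actually works internally):

```python
from datetime import date

def days_since(month: int, day: int, today: date) -> int:
    """Resolve 'how many days since <month> <day>?' the way a person
    would: if that date hasn't occurred yet this year, assume the
    previous year's occurrence."""
    candidate = date(today.year, month, day)
    if candidate > today:  # still in the future this year -> roll back a year
        candidate = date(today.year - 1, month, day)
    return (today - candidate).days

# Asked on January 10, 2018: "how many days since December 18?"
print(days_since(12, 18, date(2018, 1, 10)))  # 23
```

The fix is a single comparison; the frustration is that the word “since” already carries this meaning, and a smart assistant ought to apply it without requiring a rigid syntax.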
Today Apple announced that one of its most recent high-profile hires, John Giannandrea, has been added as the twelfth member of the company’s executive team. His title is now Senior Vice President of Machine Learning and AI Strategy. From the press release:
“John hit the ground running at Apple and we are thrilled to have him as part of our executive team,” said Tim Cook, Apple’s CEO. “Machine learning and AI are important to Apple’s future as they are fundamentally changing the way people interact with technology, and already helping our customers live better lives. We’re fortunate to have John, a leader in the AI industry, driving our efforts in this critical area.”
News of Giannandrea’s hiring at Apple was first reported by The New York Times in April. Apple didn’t formally announce the hire, however, until July. And here we are just a few short months later, with another press release from Apple announcing his promotion.
Giannandrea’s role involves leadership of Siri, machine learning, and other artificial intelligence projects, all of which are squarely in his wheelhouse given his former role as Google’s chief of search and artificial intelligence. While it’s hard to say from the outside what kind of difference his influence is making at Apple, this move is a good sign that the company’s pleased with his early months of work. Perhaps we’ll get to see the fruits of his labors at WWDC 2019.
Apple’s online Machine Learning Journal has published a paper on the methodologies the HomePod uses to implement Siri functionality in far-field settings. As Apple’s Audio Software Engineering and Siri Speech Teams explain:
Siri on HomePod is designed to work in challenging usage scenarios such as:
- During loud music playback
- When the talker is far away from HomePod
- When other sound sources in a room, such as a TV or household appliances, are active
Each of those conditions requires a different approach to effectively separate a spoken Siri command from other household sounds and to do so efficiently. The report notes that the HomePod’s speech enhancement system uses less than 15% of one core of a 1.4 GHz A8 processor.
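Apple’s paper describes a sophisticated multichannel system (echo cancellation, dereverberation, and mask-based beamforming across the HomePod’s microphone array). As a rough single-channel illustration of the underlying problem – separating speech from a noisy recording when you have an estimate of the noise – here is a toy spectral-subtraction sketch in NumPy. Every name and parameter here is illustrative; none of it comes from the paper:

```python
import numpy as np

def spectral_subtraction(noisy, noise_ref, frame=256, hop=128):
    """Toy single-channel noise suppression: estimate the average noise
    magnitude spectrum from a noise-only reference clip, subtract it from
    each frame of the noisy signal, and resynthesize by overlap-add."""
    win = np.hanning(frame)

    def to_frames(x):
        n = (len(x) - frame) // hop + 1
        return np.stack([x[i * hop : i * hop + frame] * win for i in range(n)])

    # Average noise magnitude spectrum, per frequency bin
    noise_mag = np.abs(np.fft.rfft(to_frames(noise_ref), axis=1)).mean(axis=0)

    spec = np.fft.rfft(to_frames(noisy), axis=1)
    # Subtract the noise floor from each frame's magnitude, clipping at zero,
    # and keep the noisy signal's phase
    mag = np.clip(np.abs(spec) - noise_mag, 0.0, None)
    cleaned = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), axis=1)

    # Overlap-add resynthesis
    out = np.zeros((len(cleaned) - 1) * hop + frame)
    for i, fr in enumerate(cleaned):
        out[i * hop : i * hop + frame] += fr
    return out
```

A real far-field system has to do far more than this – the noise is non-stationary, the device is playing its own music, and the speaker may be across the room – which is why the paper’s multichannel, adaptive approach (and its sub-15%-of-one-core budget) is notable.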
Apple engineers tested their speech enhancement system under a variety of conditions:
We evaluated the performance of the proposed speech processing system on a large speech test set recorded on HomePod in several acoustic conditions:
- Music and podcast playback at different levels
- Continuous background noise, including babble and rain noise
- Directional noises generated by household appliances such as a vacuum cleaner, hairdryer, and microwave
- Interference from external competing sources of speech
In these recordings, we varied the locations of HomePod and the test subjects to cover different use cases, for example, in living room or kitchen environments where HomePod was placed against the wall or in the middle of the room.
The paper concludes with examples of filtered and unfiltered audio from those HomePod tests. Regardless of whether you’re interested in the details of noise reduction technology, the sample audio clips are worth a listen. It’s impressive to hear barely audible commands emerge from background noises like a dishwasher and music playback.