Federico Viticci

10594 posts on MacStories since April 2009

Federico is the founder and Editor-in-Chief of MacStories, where he has written about Apple since April 2009 with a focus on apps, developers, iPad, and iOS productivity. He is also the co-host of AppStories, a weekly podcast exploring the world of apps; Unwind, a fun exploration of media and more; and NPC: Next Portable Console, a show about portable gaming and the handheld revolution.



What Siri Isn’t: Perplexity’s Voice Assistant and the Potential of LLMs Integrated with iOS

Perplexity’s voice assistant for iOS.

You’ve probably heard that Perplexity – a company whose web scraping tactics I generally despise, and the only AI bot we still block at MacStories – has rolled out an iOS version of their voice assistant that integrates with several native features of the operating system. Here’s their promo video in case you missed it:

This is a very clever idea: while other major LLMs’ voice modes are limited to having a conversation with the chatbot (with the kind of quality and conversation flow that, frankly, annihilates Siri), Perplexity put a different spin on it: they used native Apple APIs and frameworks to make conversations more actionable (some may even say “agentic”) and integrated with the Apple apps you use every day. I’ve seen a lot of people calling Perplexity’s voice assistant “what Siri should be” or arguing that Apple should consider Perplexity as an acquisition target because of this, and I thought I’d share some additional comments and notes after having played with their voice mode for a while.
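Perplexity hasn’t published how the assistant is wired up, but the general pattern described here (an LLM that can request actions, with the app executing them against native frameworks and feeding the results back) is straightforward to sketch. Below is a minimal, hypothetical Python outline of that loop; the fake_llm helper and the toy tools are stand-ins for the real model call and for frameworks like EventKit or MusicKit, not Perplexity’s actual code:

```python
# Hypothetical sketch of an "actionable" voice assistant loop: the LLM either
# answers directly or requests a tool call, the app runs the tool against a
# native API, and the result is fed back for a final reply.
# fake_llm and the toy tools below are illustrative stand-ins only.

def create_reminder(title: str) -> str:
    # Stand-in for a call into a native reminders framework (e.g., EventKit).
    return f"Reminder created: {title}"

def play_music(query: str) -> str:
    # Stand-in for a native media-playback framework (e.g., MusicKit).
    return f"Now playing: {query}"

TOOLS = {"create_reminder": create_reminder, "play_music": play_music}

def fake_llm(messages: list[dict]) -> dict:
    """Toy model call. A real assistant would send `messages` to an LLM and
    parse a reply that is either text or a structured tool request."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"text": f"Done. {last['content']}"}
    if "remind" in last["content"].lower():
        return {"tool": "create_reminder", "args": {"title": last["content"]}}
    return {"text": "Okay."}

def handle_turn(messages: list[dict]) -> str:
    reply = fake_llm(messages)
    if "tool" in reply:
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
        reply = fake_llm(messages)
    return reply["text"]

print(handle_turn([{"role": "user", "content": "Remind me to water the plants"}]))
# -> "Done. Reminder created: Remind me to water the plants"
```

Everything interesting in Perplexity’s version lives in the pieces this sketch stubs out: the native frameworks those tool calls actually reach on iOS.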

Read more


The Current State of Major LLMs and Their Shortcuts Integrations

Earlier this week, I decided to do some research about the current state of major LLM apps and their implementations of Shortcuts actions. While millions of people are interacting with chatbots on a daily basis using their respective websites and dedicated mobile apps, I thought it’d be interesting to see how these popular services are...


How We’re Using AI

This week, Federico and John revisit the fast-paced world of artificial intelligence to describe how they’re using a variety of tools for their everyday workflows.

On AppStories+, John shares his theory of the way we’ll look at AI models in the future.


We deliver AppStories+ to subscribers with bonus content, ad-free, and at a high bitrate early every week.

To learn more about an AppStories+ subscription, visit our Plans page, or read the AppStories+ FAQ.


AppStories Episode 432: How We’re Using AI (54:14)

This episode is sponsored by:

  • Notion – Try the powerful, easy-to-use Notion AI today.

Read more



Time for Calendars

This week, Federico and John survey their favorite calendar apps, discussing the strengths and weaknesses of each.

On AppStories+, Federico shares Shortcuts tips for working with Google’s Gemini API and the highly structured data it returns. Plus he and John share their concern and cautious optimism for the future of Shortcuts.




AppStories Episode 431: Time for Calendars (34:25)

Read more



How Could Apple Use Open-Source AI Models?

Yesterday, Wayne Ma, reporting for The Information, published an outstanding story detailing the internal turmoil at Apple that led to the delay of the highly anticipated Siri AI features last month. From the article:

In November 2022, OpenAI released ChatGPT to a thunderous response from the tech industry and public. Within Giannandrea’s AI team, however, senior leaders didn’t respond with a sense of urgency, according to former engineers who were on the team at the time.

The reaction was different inside Federighi’s software engineering group. Senior leaders of the Intelligent Systems team immediately began sharing papers about LLMs and openly talking about how they could be used to improve the iPhone, said multiple former Apple employees.

Excitement began to build within the software engineering group after members of the Intelligent Systems team presented demos to Federighi showcasing what could be achieved on iPhones with AI. Using OpenAI’s models, the demos showed how AI could understand content on a user’s phone screen and enable more conversational speech for navigating apps and performing other tasks.

Assuming the details in this report are correct, I truly can’t imagine how one could possibly see the debut of ChatGPT two years ago and not feel a sense of urgency. Fortunately, other teams at Apple did, and it sounds like they’re the folks who have now been put in charge of the next generation of Siri and AI.

There are plenty of other details worth reading in the full story (especially the parts about what Rockwell’s team wanted to accomplish with Siri and AI on the Vision Pro), but one tidbit in particular stood out to me: Federighi has now given the green light to rely on third-party, open-source LLMs to build the next wave of AI features.

Federighi has already shaken things up. In a departure from previous policy, he has instructed Siri’s machine-learning engineers to do whatever it takes to build the best AI features, even if it means using open-source models from other companies in its software products as opposed to Apple’s own models, according to a person familiar with the matter.

“Using” open-source models from other companies doesn’t necessarily mean shipping consumer features in iOS powered by external LLMs. I’ve seen some people interpret this paragraph as Apple preparing to release a local Siri powered by Llama 4 or DeepSeek, and I think we should pay more attention to that “build the best AI features” (emphasis mine) line.

My read of this part is that Federighi might have instructed his team to use distillation to better train Apple’s in-house models as a way to accelerate the development of the delayed Siri features and put them back on the company’s roadmap. Given Tim Cook’s public appreciation for DeepSeek and this morning’s New York Times report that the delayed features may come this fall, I wouldn’t be shocked to learn that Federighi told Siri’s ML team to distill DeepSeek R1’s reasoning knowledge into a new variant of their ~3 billion-parameter foundation model that runs on-device. Doing that wouldn’t mean that iOS 19’s Apple Intelligence would be “powered by DeepSeek”; it would just be a faster way for Apple to catch up without throwing away the foundation model they unveiled last year (which, supposedly, had a ~30% error rate).
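For anyone unfamiliar with the technique: distillation trains a small “student” model to imitate a larger “teacher,” classically by matching the teacher’s softened output distribution (Hinton et al., 2015) rather than only the hard labels. Here’s a minimal, generic PyTorch sketch of that loss, purely illustrative and obviously not Apple’s training code; distilling from a model like R1 in practice can also just mean fine-tuning on teacher-generated outputs:

```python
# Minimal sketch of classic soft-target knowledge distillation (Hinton et al., 2015).
# A small student is trained to match the teacher's temperature-softened output
# distribution in addition to the usual hard-label loss. Illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 so gradients stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits standing in for real model outputs.
student_logits = torch.randn(8, 32000, requires_grad=True)  # batch of 8, 32k vocab
teacher_logits = torch.randn(8, 32000)
labels = torch.randint(0, 32000, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```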

In thinking about this possibility, I got curious and decided to check out the original paper that Apple published last year with details on how they trained the two versions of AFM (Apple Foundation Model): AFM-server and AFM-on-device. The latter is the smaller, ~3 billion-parameter model that gets downloaded on-device with Apple Intelligence. I’ll let you guess what Apple did to improve the performance of the smaller model:

For the on-device model, we found that knowledge distillation (Hinton et al., 2015) and structural pruning are effective ways to improve model performance and training efficiency. These two methods are complementary to each other and work in different ways. More specifically, before training AFM-on-device, we initialize it from a pruned 6.4B model (trained from scratch using the same recipe as AFM-server), using pruning masks that are learned through a method similar to what is described in (Wang et al., 2020; Xia et al., 2023).

Or, more simply:

AFM-server core training is conducted from scratch, while AFM-on-device is distilled and pruned from a larger model.
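As a toy illustration of the pruning half: structured pruning removes whole units (channels, heads, or layers) from a big model and uses what survives to initialize a smaller one, so training doesn’t start from random weights. Apple learns its pruning masks with methods along the lines of Wang et al. (2020) and Xia et al. (2023); the sketch below just uses a crude magnitude score to convey the general idea:

```python
# Toy sketch of structured pruning: score whole output channels of a linear
# layer, keep the top-k, and initialize a smaller layer from the surviving
# weights. Apple's AFM pipeline learns its pruning masks; this magnitude-based
# version only illustrates the general shape of the technique.
import torch
import torch.nn as nn

def prune_linear(layer: nn.Linear, keep: int) -> nn.Linear:
    scores = layer.weight.detach().norm(dim=1)          # importance per output channel
    keep_idx = scores.topk(keep).indices.sort().values  # channels that survive
    pruned = nn.Linear(layer.in_features, keep, bias=layer.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(layer.weight[keep_idx])
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[keep_idx])
    return pruned

big = nn.Linear(4096, 4096)      # stand-in for one layer of the larger model
small = prune_linear(big, 2048)  # half the output channels survive
print(small)                     # Linear(in_features=4096, out_features=2048, bias=True)
```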

If the distilled version of AFM-on-device that was tested until a few weeks ago produced a wrong output one third of the time, perhaps it would be a good idea to perform distillation again, this time drawing on knowledge from smarter, larger models? Say, using 250 Nvidia GB300 NVL72 servers?

(One last fun fact: per their paper, Apple trained AFM-server on 8192 TPUv4 chips for 6.3 trillion tokens; that setup still wouldn’t be as powerful as “only” 250 modern Nvidia servers today.)

Permalink

Return of the Utility Grab Bag

This week, Federico and John share some of their favorite utility apps, including Amphetamine, Text Lens, Gifski, Folder Peek, Mic Drop, Keka, and Marked.

Then, on AppStories+, Federico and John extend their conversation about utilities with six more favorites.




AppStories Episode 430: Return of the Utility Grab Bag (30:37)

This episode is sponsored by:

  • Rogue Amoeba: makers of incredibly useful audio tools for your Mac. Use the code MS2504 through the end of April to get 20% off Rogue Amoeba’s apps.

Read more

Transcriber: A Shortcut to Generate YouTube Video Transcripts

As I teased yesterday in my story about processing video transcripts on the Mac using Simon Willison’s llm CLI, I wanted to write about the shortcut that actually generates those raw video transcripts. Today, I’m happy to share Transcriber, a shortcut that takes any YouTube video URL, extracts its content, and saves a full transcript...
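The shortcut itself is built in Apple’s Shortcuts app, so there’s no code to show here, but for readers who’d rather script the same idea, a rough Python equivalent might look like the sketch below. It assumes the third-party youtube-transcript-api package; the function names are mine, not part of the shortcut:

```python
# Rough Python equivalent of the idea behind Transcriber: take a YouTube URL,
# pull the video's caption track, and save a plain-text transcript.
# Assumes the third-party youtube-transcript-api package
# (pip install youtube-transcript-api); the actual shortcut works differently.
import sys
from urllib.parse import parse_qs, urlparse

from youtube_transcript_api import YouTubeTranscriptApi

def video_id(url: str) -> str:
    # Handle both youtu.be short links and regular watch?v= URLs.
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    return parse_qs(parsed.query)["v"][0]

def save_transcript(url: str, path: str = "transcript.txt") -> None:
    segments = YouTubeTranscriptApi.get_transcript(video_id(url), languages=["en"])
    text = " ".join(segment["text"] for segment in segments)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

if __name__ == "__main__":
    save_transcript(sys.argv[1])  # python transcriber.py "https://www.youtube.com/watch?v=..."
```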