Federico Viticci

9613 posts on MacStories since April 2009

Federico is the founder and Editor-in-Chief of MacStories, where he writes about Apple with a focus on apps, developers, iPad, and iOS productivity. He founded MacStories in April 2009 and has been writing about Apple since. Federico is also the co-host of AppStories, a weekly podcast exploring the world of apps; Unwind, a fun exploration of media and more; and NPC: Next Portable Console, a show about portable gaming and the handheld revolution.


The Curious Case of Apple and Perplexity

Good post by Parker Ortolani, analyzing the pros and cons of a potential Perplexity acquisition by Apple:

According to Mark Gurman, Apple executives are in the early stages of mulling an acquisition of Perplexity. My initial reaction was “that wouldn’t work.” But I’ve taken some time to think through what it could look like if it were to come to fruition.

He gets to the core of the issue with this acquisition:

At the end of the day, Apple needs a technology company, not another product company. Perplexity is really good at, for lack of a better word, forking models. But their true speciality is in making great products, they’re amazing at packaging this technology. The reality is though, that Apple already knows how to do that. Of course, only if they can get out of their own way. That very issue is why I’m unsure the two companies would fit together. A company like Anthropic, a foundational AI lab that develops models from scratch is what Apple could stand to benefit from. That’s something that doesn’t just put them on more equal footing with Google, it’s something that also puts them on equal footing with OpenAI which is arguably the real threat.

While I’m not the biggest fan of Perplexity’s web scraping policies and its CEO’s remarks, it’s undeniable that the company has built a series of good consumer products, they’re fast at integrating the latest models from major AI vendors, and they’ve even dipped their toes in the custom model waters (with Sonar, an in-house model based on Llama). At first sight, I would agree with Ortolani and say that Apple would need Perplexity’s search engine and LLM integration talent more than the Perplexity app itself. So far, Apple has only integrated ChatGPT into its operating systems; Perplexity supports all the major LLMs currently in existence. If Apple wants to make the best computers for AI rather than being a bleeding-edge AI provider itself…well, that’s pretty much aligned with Perplexity’s software-focused goals.

However, I wonder if Perplexity’s work on its iOS voice assistant may have also played a role in these rumors. As I wrote a few months ago, Perplexity shipped a solid demo of what a deep LLM integration with core iOS services and frameworks could look like. What could Perplexity’s tech do when integrated with Siri, Spotlight, Safari, Music, or even third-party app entities in Shortcuts?

Or, look at it this way: if you’re Apple, would you spend $14 billion to buy an app and rebrand it as “Siri That Works” next year?

I Have Many Questions About Apple’s Updated Foundation Models and the (Great) ‘Use Model’ Action in Shortcuts

Apple's 'Use Model' action in Shortcuts.

I mentioned this on AppStories during the week of WWDC: I think Apple’s new ‘Use Model’ action in Shortcuts for iOS/iPadOS/macOS 26, which lets you prompt either the local or cloud-based Apple Foundation models, is Apple Intelligence’s best and most exciting new feature for power users this year. This blog post is a way for me to better explain why as well as publicly investigate some aspects of the updated Foundation models that I don’t fully understand yet.


Initial Notes on iPadOS 26’s Local Capture Mode

Now this is what I call follow-up: six years after I linked to Jason Snell’s first experiments with podcasting on the iPad Pro (which later became part of a chapter of my Beyond the Tablet story from 2019), I get to link to Snell’s first impressions of iPadOS 26’s brand new local capture mode, which lets iPad users record their own audio and video during a call.

First, some context:

To ensure that the very best audio and video is used in the final product, we tend to use a technique called a “multi-ender.” In addition to the lower-quality call that’s going on, we all record ourselves on our local device at full quality, and upload those files when we’re done. The result is a final product that isn’t plagued by the dropouts and other quirks of the call itself. I’ve had podcasts where one of my panelists was connected to us via a plain old phone line—but they recorded themselves locally and the finished product sounded completely pristine.

This is how I’ve been recording podcasts since 2013. We used to be on a call on Skype and record audio with QuickTime; now we use Zoom, Audio Hijack, and OBS for video, but the concept is the same. Here’s Snell on how the new iPadOS feature, which lives in Control Center, works:

The file it saves is marked as an mp4 file, but it’s really a container featuring two separate content streams: full-quality video saved in HEVC (H.265) format, and lossless audio in the FLAC compression format. Regardless, I haven’t run into a single format conversion issue. My audio-sync automations on my Mac accept the file just fine, and Ferrite had no problem importing it, either. (The only quirk was that it captured audio at a 48KHz sample rate and I generally work at 24-bit, 44.1KHz. I have no idea if that’s because of my microphone or because of the iPad, but it doesn’t really matter since converting sample rates and dithering bit depths is easy.)

I tested this today with a FaceTime call. Everything worked as advertised, and the call’s MP4 file was successfully saved in my Downloads folder in iCloud Drive (I wish there was a way to change this). I was initially confused by the fact that recording automatically begins as soon as a call starts: if you press the Local Capture button in Control Center before getting on a call, as soon as it connects, you’ll be recording. It’s kind of an odd choice to make this feature just a…Control Center toggle, but I’ll take it! My MixPre-3 II audio interface and microphone worked right away, and I think there’s a very good chance I’ll be able to record AppStories and my other shows from my iPad Pro – with no more workarounds – this summer.

Interview: Craig Federighi Opens Up About iPadOS, Its Multitasking Journey, and the iPad’s Essence

iPadOS 26. Source: Apple.

It’s a cool, sunny morning at Apple Park as I’m making my way along the iconic glass ring to meet with Apple’s SVP of Software Engineering, Craig Federighi, for a conversation about the iPad.

It’s the Wednesday after WWDC, and although there are still some developers and members of the press around Apple’s campus, it seems like employees have returned to their regular routines. Peek through the glass, and you’ll see engineers working at their stations, half-erased whiteboards, and an infinite supply of Studio Displays on wooden desks with rounded corners. Some guests are still taking pictures by the WWDC sign. There are fewer security dogs, but they’re obviously all good.

Despite the list of elaborate questions on my mind about iPadOS 26 and its new multitasking, the long history of iPad criticisms (including mine) over the years, and what makes an iPad different from a Mac, I can’t stop thinking about the simplest, most obvious question I could ask – one that harkens back to an old commercial about the company’s modular tablet:

In 2025, what even is an iPad according to Federighi?


Testing DeepSeek R1-0528 on the M3 Ultra Mac Studio and Installing Local GGUF Models with Ollama on macOS

DeepSeek released an updated version of their popular R1 reasoning model (version 0528) with – according to the company – increased benchmark performance, reduced hallucinations, and native support for function calling and JSON output. Early tests from Artificial Analysis report a nice bump in performance, putting it behind OpenAI’s o3 and o4-mini-high in their Intelligence Index benchmarks. The model is available in the official DeepSeek API, and open weights have been distributed on Hugging Face. I downloaded different quantized versions of the full model on my M3 Ultra Mac Studio, and here are some notes on how it went.


From the Creators of Shortcuts, Sky Extends AI Integration and Automation to Your Entire Mac

Sky for Mac.

Over the course of my career, I’ve had three distinct moments in which I saw a brand-new app and immediately felt it was going to change how I used my computer – and they were all about empowering people to do more with their devices.

I had that feeling the first time I tried Editorial, the scriptable Markdown text editor by Ole Zorn. I knew right away when two young developers told me about their automation app, Workflow, in 2014. And I couldn’t believe it when Apple showed that not only had they acquired Workflow, but they were going to integrate the renamed Shortcuts app system-wide on iOS and iPadOS.

Notably, the same two people – Ari Weinstein and Conrad Kramer – were involved with two of those three moments, first with Workflow, then with Shortcuts. And a couple of weeks ago, I found out that they were going to define my fourth moment, along with their co-founder Kim Beverett at Software Applications Incorporated, with the new app they’ve been working on in secret since 2023 and officially announced today.

For the past two weeks, I’ve been able to use Sky, the new app from the people behind Shortcuts who left Apple two years ago. As soon as I saw a demo, I felt the same way I did about Editorial, Workflow, and Shortcuts: I knew Sky was going to fundamentally change how I think about my macOS workflow and the role of automation in my everyday tasks.

Only this time, because of AI and LLMs, Sky is more intuitive than all those apps and requires a different approach, as I will explain in this exclusive preview story ahead of a full review of the app later this year.


Early Impressions of Claude Opus 4 and Using Tools with Extended Thinking

Claude Opus 4 and extended thinking with tools.

For the past two days, I’ve been testing an early access version of Claude Opus 4, the latest model by Anthropic that was just announced today. You can read more about the model in the official blog post and find additional documentation here. What follows is a series of initial thoughts and notes based on the 48 hours I spent with Claude Opus 4, which I tested in both the Claude app and Claude Code.

For starters, Anthropic describes Opus 4 as its most capable hybrid model with improvements in coding, writing, and reasoning. I don’t use AI for creative writing, but I have dabbled with “vibe coding” for a collection of personal Obsidian plugins (created and managed with Claude Code, following these tips by Harper Reed), and I’m especially interested in Claude’s integrations with Google Workspace and MCP servers. (My favorite solution for MCP at the moment is Zapier, which I’ve been using for a long time for web automations.) So I decided to focus my tests on reasoning with integrations and some light experiments with the upgraded Claude Code in the macOS Terminal.


Notes on Early Mac Studio AI Benchmarks with Qwen3-235B-A22B and Qwen2.5-VL-72B

I received a top-of-the-line Mac Studio (M3 Ultra, 512 GB of RAM, 8 TB of storage) on loan from Apple last week, and I thought I’d use this opportunity to revive something I’ve been mulling over for some time: more short-form blogging on MacStories in the form of brief “notes” with a dedicated Notes category on the site. Expect more of these “low-pressure”, quick posts in the future.

I’ve been sent this Mac Studio as part of my ongoing experiments with assistive AI and automation, and one of the things I plan to do over the coming weeks and months is to play around with local LLMs that tap into the power of Apple Silicon and the incredible performance headroom afforded by the M3 Ultra and this computer’s specs. I have a lot to learn when it comes to local AI (my shortcuts and experiments so far have focused on cloud models and the Shortcuts app combined with the LLM CLI), but since I had to start somewhere, I downloaded LM Studio and Ollama, installed the llm-ollama plugin, and began experimenting with open-weights models (served from Hugging Face as well as the Ollama library) both in the GGUF format and Apple’s own MLX framework.

LM Studio.

I posted some of these early tests on Bluesky. I ran the massive Qwen3-235B-A22B model (a Mixture-of-Experts model with 235 billion parameters, 22 billion of which are active at once) with both GGUF and MLX using the beta version of the LM Studio app, and these were the results:

  • GGUF: 16 tokens/second, ~133 GB of RAM used
  • MLX: 24 tok/sec, ~124 GB RAM

As you can see from these first benchmarks (both based on the 4-bit quant of Qwen3-235B-A22B), the Apple Silicon-optimized version of the model resulted in better performance both for token generation and memory usage. Regardless of the version, the Mac Studio absolutely didn’t care and I could barely hear the fans going.
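To put numbers on that gap, here is a minimal Python sketch that computes the relative difference between the two runs. It uses only the figures quoted above; the RAM values were approximate to begin with, so treat the memory result as a rough estimate. The `relative_change` helper is a name made up for this illustration.

```python
# Figures quoted above for the 4-bit quant of Qwen3-235B-A22B in LM Studio.
# The RAM numbers are approximate ("~133 GB", "~124 GB").
results = {
    "GGUF": {"tokens_per_sec": 16, "ram_gb": 133},
    "MLX": {"tokens_per_sec": 24, "ram_gb": 124},
}


def relative_change(new, old):
    """Return the fractional change from old to new (0.5 means +50%)."""
    return (new - old) / old


speedup = relative_change(
    results["MLX"]["tokens_per_sec"], results["GGUF"]["tokens_per_sec"]
)
ram_delta = relative_change(results["MLX"]["ram_gb"], results["GGUF"]["ram_gb"])

print(f"MLX token generation: {speedup:+.0%} vs. GGUF")  # +50%
print(f"MLX memory usage: {ram_delta:+.1%} vs. GGUF")    # -6.8%
```

In other words, on this one model and this one machine, MLX generated tokens about 50% faster while using roughly 7% less memory.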

I also wanted to play around with the new generation of vision models (VLMs) to test modern OCR capabilities of these models. One of the tasks that has become kind of a personal AI eval for me lately is taking a long screenshot of a shortcut from the Shortcuts app (using CleanShot’s scrolling captures) and feeding it either as a full-res PNG or PDF to an LLM. As I shared before, due to image compression, the vast majority of cloud LLMs either fail to accept the image as input or compress the image so much that graphical artifacts lead to severe hallucinations in the text analysis of the image. Only o4-mini-high – thanks to its more agentic capabilities and tool-calling – was able to produce a decent output; even then, that was only possible because o4-mini-high decided to slice the image into multiple parts and iterate through each one with discrete pytesseract calls. The task took almost seven minutes to run in ChatGPT.
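The slicing approach is easy to picture in code. The sketch below is not what o4-mini-high actually ran (that code isn't available); it is a hypothetical helper, `tile_bounds`, that splits a tall screenshot into overlapping vertical ranges. The overlap is an assumption of this example, added so that a line of text cut by one boundary still appears whole in the next tile.

```python
def tile_bounds(height, tile_height, overlap):
    """Split a tall image into vertical (top, bottom) pixel ranges.

    Consecutive tiles share `overlap` rows so that a line of text cut by
    one boundary appears whole in the neighboring tile.
    """
    if tile_height <= overlap:
        raise ValueError("tile_height must be larger than overlap")
    bounds = []
    top = 0
    while True:
        bottom = min(top + tile_height, height)
        bounds.append((top, bottom))
        if bottom >= height:
            break
        # Step back by `overlap` rows before starting the next tile.
        top = bottom - overlap
    return bounds


# A 5000-pixel-tall screenshot, cut into 2000-pixel tiles with 200 rows shared.
print(tile_bounds(5000, 2000, 200))
# [(0, 2000), (1800, 3800), (3600, 5000)]
```

Each range could then be cropped with Pillow's `Image.crop((0, top, width, bottom))` and passed to `pytesseract.image_to_string`, one call per tile, which would match the "discrete pytesseract calls" described above.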

This morning, I installed the 72-billion-parameter version of Qwen2.5-VL, gave it a full-resolution screenshot of a 40-action shortcut, and let it run with Ollama and llm-ollama. After 3.5 minutes and around 100 GB of RAM usage, I got a really good, Markdown-formatted analysis of my shortcut back from the model.

To make the experience nicer, I even built a small local-scanning utility that lets me pick an image from Shortcuts and runs it through Qwen2.5-VL (72B) using the ‘Run Shell Script’ action on macOS. It worked beautifully on my first try. Amusingly, the smaller version of Qwen2.5-VL (32B) thought my photo of ergonomic mice was a “collection of seashells”. Fair enough: there’s a reason bigger models are heavier and costlier to run.
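For readers who want to try something similar, here is a rough Python sketch of a script that could be called from a ‘Run Shell Script’ action. It is not the utility described above, which relies on llm-ollama; this version talks to Ollama's local HTTP API directly instead. The model tag `qwen2.5vl:72b`, the default prompt, and the function names are all assumptions of this example, so check `ollama list` for the tag installed on your machine.

```python
import base64
import json
import sys
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen2.5vl:72b"  # assumed tag; confirm with `ollama list`


def build_payload(image_bytes, prompt, model=MODEL):
    """Build the JSON body for a single, non-streaming vision request."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path, prompt="Describe this screenshot in Markdown."):
    """Send one image to the local Ollama server and return the model's text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 scan.py /path/to/image.png
    print(describe_image(sys.argv[1]))
```

In Shortcuts, the image picked by the user would be saved to a temporary file and its path passed as the script's argument. Everything stays on the Mac: the request never leaves `localhost`.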

Given my struggles with OCR and document analysis with cloud-hosted models, I’m very excited about the potential of local VLMs that bypass memory constraints thanks to the M3 Ultra and provide accurate results in just a few minutes without having to upload private images or PDFs anywhere. I’ve been writing a lot about this idea of “hybrid automation” that combines traditional Mac scripting tools, Shortcuts, and LLMs to unlock workflows that just weren’t possible before; I feel like the power of this Mac Studio is going to be an amazing accelerator for that.

Next up on my list: understanding how to run MLX models with mlx-lm, investigating long-context models with dual-chunk attention support (looking at you, Qwen 2.5), and experimenting with Gemma 3. Fun times ahead!


Post-Chat UI

Fascinating analysis by Allen Pike on how, beyond traditional chatbot interactions, the technology behind LLMs can be used in other types of user interfaces and interactions:

While chat is powerful, for most products chatting with the underlying LLM should be more of a debug interface – a fallback mode – and not the primary UX.

So, how is AI making our software more useful, if not via chat? Let’s do a tour.

There are plenty of useful, practical examples in the story showing how natural language understanding and processing can be embedded in different features of modern apps. My favorite example is search, as Pike writes:

Another UI convention being reinvented is the search field.

It used to be that finding your flight details in your email required typing something exact, like “air canada confirmation”, and hoping that’s actually the phrasing in the email you’re thinking of.

Now, you should be able to type “what are the flight details for the offsite?” and find what you want.

Having used Shortwave and its AI-powered search for the past few months, I couldn’t agree more. The moment you get used to searching without exact queries or specific operators, there’s no going back.

Experience this once, and products with an old-school text-match search field feel broken. You should be able to just find “tax receipts from registered charities” in your email app, “the file where the login UI is defined” in your IDE, and “my upcoming vacations” in your calendar.

Interestingly, Pike mentions Command-K bars as another interface pattern that can benefit from LLM-infused interactions. I knew that sounded familiar – I covered the topic in mid-November 2022, and I still think it’s a shame that Apple hasn’t natively implemented these anywhere in their apps, especially now that commands can be fuzzier (just consider what Raycast is doing). Funnily enough, that post was published just two weeks before the public debut of ChatGPT on November 30, 2022. That feels like forever ago now.
